Two Possible Paths into the Future of Wearable Computing: Part 2 – AR

The year is 2015. Wearable glasses have taken off, and they’re game-changers every bit as much as smartphones were, because these descendants of Google Glass give you access to information everywhere, all the time. You’re wearing these glasses, I’m wearing these glasses, all the early adopters are wearing them. We use them to make and receive phone calls, send and receive texts, do instant messaging, do email, get directions and route guidance, browse the Web, and do pretty much everything we do with smartphones today.

However, these glasses don’t support true AR (augmented reality); that is, they can’t display virtual objects that appear to be part of the real world. Instead, they can only display what I called HUDSpace in the last post – heads-up-display-type information that doesn’t try to seem to be part of the real world (Terminator vision, for example). No one particularly misses AR, because all the functionality of a smartphone is there in a more accessible form, and that’s enough to make the glasses incredibly useful.

But then someone comes out with a special edition of their HUDSpace glasses; the special part is that if you put a marker card down on a convenient surface, you can play virtual card and board games on it, either by yourself or with friends. This is a reasonably popular novelty; then the offerings expand to include anything you can play on a table – strategy games, RTSes, arena-type 2D arcade games extruded into 3D, and new games unique to AR – and someone comes out with a version of the glasses that doesn’t need markers, so you can play games in boring meetings, and suddenly everyone wants one. The race is on, and soon there’s room-scale AR, followed by steady progress on the long march toward general, walk-around AR.

And that’s how I think it’s most likely AR will come into our daily lives.

A quick recap

Last time, I described how my original thinking that AR was likely to be the next great platform shift had evolved to consider the possibility that VR (virtual reality) might be equally important, at least in the short and medium term, and far more tractable today, so perhaps it would make sense to pursue VR as well right now. (See the last post for definitions of AR, VR, and other terms I’ll use in this post.) Then I made the case for VR as the most promising target in the near future. I personally think that case is pretty compelling.

This time I’ll make a case for AR as more likely to succeed even in the short term (last time I explained why I think it’s the most important long-term goal), and I think that case is pretty compelling too. The truth is, given infinite resources, I’d want to pursue both as hard as possible; one doesn’t preclude the other, and both could pan out in a big way. But resources (especially time) are finite, alas, and choices have to be made, so a lot of thought has gone into choosing where to focus, and these two posts recount some of that thinking.

Of course, I hope we get this right, but as Yogi Berra put it: “It’s tough to make predictions, especially about the future.” All we can do is make our best assessment, start doing experiments, and see where that leads, constantly reevaluating and course correcting as needed (and it will be needed!). So I’m by no means laying out a roadmap of the future; this post and the last are just two possible ways wearable computing might unfold.

How AR might actually evolve

The first step in assessing whether to focus on AR or VR is to figure out how each is most likely to succeed, and then to compare the strengths and weaknesses of each of those probable paths. The likely path for VR is obvious, and already in motion: The Oculus Rift will come out, running ports of existing PC games. If it’s even moderately successful, games will be written specifically for VR, Oculus will improve the hardware and competitors will emerge, VR will likely spread to mobile and consoles, and the boom will be on.

The path AR might take to success is less clear, because there are many types of AR – tabletop, room-scale, and walk-around – and several platforms it could emerge on – PC, mobile, and console. Also, as the scenario I sketched out at the start illustrates, AR could evolve from HUDSpace. So let’s look in more detail at that scenario, and then examine why I think it might be more promising than other paths for AR and VR.

In my scenario, AR isn’t even part of the picture at first; see-through glasses emerge, but wearable computing develops along the Google Glass path, supporting only display of information that doesn’t appear to be part of the real world, rather than true AR. To be clear, it’s quite possible that Google Glass won’t be see-through, but will just provide an opaque information display above and to the side of your normal line of sight. However, I think see-through glasses have much more potential, if only because they’ll have much more screen real estate, won’t block your view, will allow for in-place annotation of the real world, and will be more comfortable to look at. That’s a good thing for my scenario, since see-through potentially leads to AR, while an opaque display out of the line of sight doesn’t.

Having information available everywhere, all the time will be tremendously valuable, and HUDSpace glasses will probably become widely used; in fact, you could make a strong argument that people who wear them will seem to be smarter than everyone else, because they will have faster access to information. Think of all the times you’ve hauled out your phone in a conversation to look something up, and now imagine you can do that without having to visibly query anything; you’ll just seem smarter. (Obviously I could be wrong in assuming that HUDSpace glasses will be widely used – it may turn out that people hate having information fed to them through glasses – but certainly there’s a strong argument to be made that better access to information is likely to be compelling, and since the rest of this scenario depends on it, I’ll just take it as a given.)

You may well wonder why these glasses wouldn’t have AR capabilities – after all, even cellphones can do AR today, right? Here I need to draw a distinction between true AR and cellphone AR. Cellphone AR, although interesting, is at best a distant cousin to true AR, for one key reason: cellphone AR doesn’t have to fool your entire visual perception system into thinking virtual images exist in the real world. By this I’m referring not to photorealistic rendering (the eye and brain are quite tolerant of cruder rendering), but rather to the requirement that virtual images appear to be solid, crisp, and in exactly the right place relative to the real world at all times as your head moves. The tolerance of the human visual system for discrepancies in those areas when viewing 3D virtual images that are supposed to appear to be part of the real world – that is, true AR – is astonishingly low; violate exceedingly tight parameters (for example, something on the order of 20 ms for latency), and virtual objects simply won’t seem like they’re part of the world around you. With cellphone AR, you’re just looking at a 2D picture, like a TV, and in that circumstance there are all sorts of automatic reflexes and visual processing that don’t kick in, which greatly relaxes hardware requirements.
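
To make the latency point concrete, here’s a back-of-the-envelope sketch of how far a virtual object drifts during a head turn. The head-turn rate, field of view, and resolution are illustrative assumptions on my part, not figures from this post; only the ~20 ms latency number comes from the paragraph above:

```python
# Rough misregistration estimate: with stale tracking data, a virtual object
# lands in the wrong place by roughly (head angular velocity) x (latency).

HEAD_TURN_DEG_PER_S = 100.0  # assumed moderate head turn; fast turns can exceed 300 deg/s
LATENCY_MS = 20.0            # the ~20 ms latency figure cited above
FOV_DEG = 40.0               # assumed horizontal field of view of the display
H_PIXELS = 1280              # assumed horizontal display resolution

error_deg = HEAD_TURN_DEG_PER_S * (LATENCY_MS / 1000.0)  # angular error in degrees
error_px = error_deg * (H_PIXELS / FOV_DEG)              # the same error in display pixels

print(f"{error_deg:.1f} degrees of error ({error_px:.0f} px)")
# -> 2.0 degrees of error (64 px) during the turn: more than enough drift for
#    the visual system to reject the object as not anchored to the real world.
```

Even at a seemingly aggressive 20 ms, an ordinary head turn displaces virtual objects by a couple of degrees, which is why registration that looks fine on a phone screen falls apart in head-mounted true AR.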

The visual system’s low tolerance for mismatches between the virtual and real worlds means that the hardware required to make true AR work well is significantly more demanding – and expensive – than the hardware needed for HUDSpace. This is particularly the case for general, walk-around AR, which has to constantly cope with new, wildly varying settings and lighting, but it’s true even for room-scale and tabletop AR, primarily due to the requirements for display technology and tracking of the real world. At some point I’ll post about those areas, but for now, trust me, it’s a lot easier to build glasses that display HUD information, or at most images that are loosely related to the real world (like floating signs in the general direction of restaurants), than it is to build glasses that display virtual images that fool your visual system into thinking they exist in the real world.

Given that true AR is hard, expensive, and not required, HUDSpace glasses will initially almost certainly not support true AR. Interestingly, because they’ll almost certainly be see-through, HUDSpace glasses won’t even support cellphone AR well.

So in this scenario, a few years from now we’re all wearing HUDSpace glasses and using them to do what we do now with a smartphone, but more effectively, because the glasses give us access to information all the time, and privately. They’ll also do things that a smartphone isn’t good at, such as popping up the names of people you encounter, which you can’t politely use your smartphone to do. The obvious difference from a smartphone is that the glasses won’t have a capacitive touchscreen, and honestly I don’t know what the input method will be, but there are several plausible answers, so I’ll assume that’ll work out and skip over it for now. Several large companies are making HUDSpace glasses, and the competition is as fierce as it is in smartphones today. All kinds of great apps are being written for the glasses, including HUDSpace versions of existing casual and location-based games, but there are no true AR apps, because the hardware doesn’t support them.

As I described in the opening, it’s at this point that someone will probably put a camera on their glasses that’s good enough for tabletop AR, probably with the help of a fiducial (a marker designed to be tracked by a camera) placed where you want the AR to appear. Add good tracking code, and you’ll be able to play any tabletop game anyone cares to write. The glasses will be networked, so you’ll be able to play any card or board game you can think of, and you’ll be able to do that either with someone sitting at the table with you or with anyone on the Internet. Better tracking hardware and software will eliminate the need for fiducials, and the Tetris or Angry Birds of tabletop AR will appear, sparking a rapidly escalating AR arms race, similar to what happened with 3D accelerators and 3D games. AR will expand to room scale, which will involve group games, of course, and a general expansion of current console gaming possibilities, but also non-game applications like construction kits (living room Minecraft- and Lego-type applications), virtual toys, and virtual pets, and at that point there will be a critical mass of AR users, hardware, and software that makes it economically and technically feasible to start chipping away at walk-around AR. It’ll probably take a decade or two, or even more, before truly general AR exists, but it’s easy to see how an accelerating curve heading in that direction could spring from the first wearable glasses that provide a good-enough tabletop AR experience.
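
As a sketch of what that first fiducial-tracking step looks like with today’s tools, here’s a minimal marker-tracking loop using OpenCV’s ArUco module (opencv-contrib-python). This is my illustration, not anything from the post; the camera intrinsics are made-up placeholders (a real system would calibrate them), and the function names are from the pre-4.7 OpenCV ArUco API, which has since been reorganized:

```python
import cv2
import numpy as np

# Placeholder intrinsics for a 1280x720 camera; calibrate these in a real system.
camera_matrix = np.array([[800.0,   0.0, 640.0],
                          [  0.0, 800.0, 360.0],
                          [  0.0,   0.0,   1.0]])
dist_coeffs = np.zeros(5)
MARKER_SIDE_M = 0.05  # physical side length of the printed marker: 5 cm

dictionary = cv2.aruco.getPredefinedDictionary(cv2.aruco.DICT_4X4_50)

cap = cv2.VideoCapture(0)
while True:
    ok, frame = cap.read()
    if not ok:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    corners, ids, _rejected = cv2.aruco.detectMarkers(gray, dictionary)
    if ids is not None:
        # Pose of each marker relative to the camera: the transform a renderer
        # would use to pin a virtual game board to the tabletop.
        rvecs, tvecs, _ = cv2.aruco.estimatePoseSingleMarkers(
            corners, MARKER_SIDE_M, camera_matrix, dist_coeffs)
        for rvec, tvec in zip(rvecs, tvecs):
            cv2.drawFrameAxes(frame, camera_matrix, dist_coeffs,
                              rvec, tvec, MARKER_SIDE_M * 0.5)
    cv2.imshow("tabletop AR tracking", frame)
    if cv2.waitKey(1) == 27:  # Esc to quit
        break
cap.release()
cv2.destroyAllWindows()
```

Of course, this only yields a pose at webcam frame rates and latencies; making that pose rock-solid, at the tolerances described above, is where the real work lies.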

Why AR is more likely to evolve from HUDSpace than to appear on its own

There are several reasons I think evolving from HUDSpace is a more likely way for AR to come into broad use than emerging as a fully-formed product on its own.

The first thing you’ll notice is that my favored scenario doesn’t involve walk-around AR at all for a long time. That’s a huge plus; even though I think walk-around AR is the end point and hugely valuable, it’s very hard to get to in any near-term timeframe. One problem with a lot of potential technological innovations is that they require abandoning existing systems and making a wholesale jump to a new system, and it’s hard to make all the parts of those sorts of transitions happen successfully at the same time. That’s certainly true of walk-around AR, which would require display, image-generation, and tracking technology that doesn’t exist today, all packaged in a form factor similar to bulky sunglasses, running on a power budget that far exceeds what’s now possible in a mobile device, along with completely new types of applications, as I discussed in the last post. Honestly, though, I used walk-around AR as a strawman in that post; it’s clear that it’s a long way away from being good enough to be a product, so it served as a useful counterpoint to illustrate the advantages of VR.

Constrained AR, both room-scale and tabletop, lies somewhere between walk-around AR and VR, and is much closer than walk-around AR to being ready for broad use, although not as close as VR. Room-scale AR has many of the same technical challenges as walk-around AR, although to a lesser degree; tracking, for example, is difficult, but there are potentially workable, albeit currently expensive, solutions. Tabletop AR, on the other hand, is relatively tractable, although not quite to VR’s level; the problem with tabletop AR is primarily that because it’s so limited, it’s simply not as compelling or novel as room-scale or walk-around AR.

AR that emerges in stages from HUDSpace glasses, on the other hand, doesn’t require any great leaps; each step is an incremental one that stands on its own. Solving those problems separately and incrementally is far more realistic, especially assuming the preexistence of a HUDSpace business that’s big enough to justify the R&D AR will need. As a starting point, tabletop AR that evolves from HUDSpace glasses involves tracking that’s doable today, optics and image projectors that will be a manageable step from HUDSpace, power and processing technologies that will be largely driven by phones, tablets, and HUDSpace glasses, and initial software that’s familiar, including at least the tabletop games I listed in the introduction.

In short, the technological path from HUDSpace glasses to HUDSpace-plus-tabletop-AR glasses seems realistic, while going from nothing directly to walk-around or even room-scale AR seems like a big stretch. That’s true not only technically, but also in a business sense, because HUDSpace-plus-tabletop-AR doesn’t require AR to justify the cost of the hardware by itself; in contrast, standalone AR systems would be in direct competition with consoles and dedicated gaming devices, with all the costs and risks that involves.

Consider two products that support AR. The first product is a special edition of a widely-used pair of HUDSpace glasses that is normally sold for $199; the special edition sells for $299 because it has cameras and more powerful processors that let it support tabletop AR gaming. The second product is a pair of AR glasses designed specifically for living-room use; it supports room-scale AR games that you can play on your own or with friends, and costs $299, plus $199 for each additional pair of glasses.

Even though the pure AR glasses are more powerful and would support a wider variety of novel experiences for the same total price, it’s hard to see how they could be successful unless the experience was truly awesome. At $299 and up, this would be going directly against existing consoles, and it’s hard to make the first games for a whole new type of gaming be killer apps that it’s worth buying the whole system for, because it takes time to figure out what unique experiences new hardware makes possible. Getting developers to devote effort to support a new, unproven platform is hard as well – it obviously can be done, but it’s a major undertaking. Also, the up-front expenditures and risk would be relatively large, since this would be a new type of product that at least overlaps with the existing console space. In short, it would require a console-scale effort, with all the risk a new console with new technology involves. A tabletop AR product would be less of a step into the unknown, and could be somewhat less expensive – but at the same time it would be more limited and less novel than room-scale AR, so there’s still the question of whether it’d be compelling enough to justify the purchase of a complete system. I’d love to be wrong – it’d be great if a standalone tabletop or room-scale AR system could be successful on its own merits. It just seems like it would have to overcome considerably greater market and technical challenges than evolution from HUDSpace glasses.

On the other hand, I have no problem imagining that a lot of people who are buying the HUDSpace glasses anyway – which they will be, because they’re very useful – would spend $100 to upgrade to make them more fun to use. The key here is that AR itself doesn’t have to justify the cost of the system, just the much smaller upgrade cost. You might say that’s not fair, that it’s not as powerful a system, but that’s the point – in the beefed-up HUDSpace case, AR doesn’t have to be compelling enough to justify the purchase of the glasses in the first place. If you want to convince people to buy a whole new system to put in their living room, or to buy a dedicated AR system for tabletop gaming, you have to get over the barrier of convincing them that they want to own yet another gaming device. If, on the other hand, you want to sell people established HUDSpace glasses with tabletop AR capability, they’ve already decided to make a purchase, and it’s just a question of whether they want to buy a cool and not very expensive option; in fact, far from being a barrier to purchase, the AR option makes the purchase of HUDSpace glasses more attractive.

Better yet, if you want to play a multiplayer game with someone else, they’re likely to have their own glasses, since HUDSpace glasses will probably be widely used, so there’s no incremental cost for multiplayer. The network effect from widespread adoption based on HUDSpace is a huge advantage for beefed-up HUDSpace glasses.

The bottom line is that the HUDSpace-plus-tabletop-AR scenario is a pull model, with the right incentives; a lot of the hardware and a sizeable market get developed for HUDSpace independent of AR, and AR then serves as an enhancement to help sell HUDSpace glasses into that existing market. In contrast, any scenario involving a standalone AR product is a push model, where a market for a new type of relatively expensive product has to be created and developed rapidly, in competition with existing consoles. It could happen, but it seems less likely to succeed.

Advantages of constrained AR over VR

Now that you know how I think AR is most likely to emerge, and that it will likely be constrained to tabletop and possibly room-scale AR for quite a while, we can return to our original question, which is whether it makes sense to pursue AR only, or a mix of AR and VR, especially in the near- and medium-term. Last time I discussed why VR was interesting; now it’s time to talk about why AR might be more interesting.

I will first note again that last time I compared VR to walk-around AR, and that that was a strawman argument. I don’t think there’s any world in which true walk-around AR is feasible in any way in the next five years. As I discussed above, the challenges that constrained AR – room-scale and tabletop – faces are similar to but far less daunting than those of walk-around AR, and constrained AR is probably doable to at least some extent in the next five years, a somewhat but not greatly longer timeframe than VR’s, so the question is which makes more sense to pursue.

First off, technically VR is easier to implement with existing and near-term technology; that’s just a fact, as evidenced by the Oculus Rift. The Rift definitely has some rough edges to smooth out, but there are ways to address those, and I expect Oculus to ship a credible product at a consumer price; in contrast, as of this writing, I have been unable to obtain a pair of AR glasses capable of being a successful consumer product. The core issue generally has to do with the great difficulty of making good enough see-through optics in glasses with acceptable form factor and weight. However, I know of several approaches in development, any of which would be sufficient if all the kinks were ironed out, and it seems probable that this will be solved relatively soon, so it’s a disadvantage for AR, but not a decisive one.

VR is also more immersive in several ways – field of view, blocking of real-world stimuli, and full control over the color and intensity of every pixel – which can make for deeper, more compelling experiences. But there are downsides as well: immersion may not be good for extended use, either because it induces unpleasant sensory overload or simply because it makes people sick. AR provides anchoring to the real world, and that helps a lot; I personally get simulator sickness quite easily with existing VR systems, but rarely have that problem with AR. I’m confident that AR will be easier for most people to use for long periods than VR.

Another advantage that comes with being less immersive is awareness of the real world around you, and that’s a big one.

For starters, being not-blind means that you can reach for your coffee or soda, find the keyboard and mouse and controller, answer the phone, and see if someone’s come into the room. This is such a big deal that I believe VR will not be widely adopted until VR headsets appear that make it possible to be not-blind instantly, most likely by being able to switch the display over to the feed from a camera on the headset with a touch of a button, but also possibly with a small picture-in-picture feed from the camera while otherwise immersed in VR, or with a display that can become transparent instantly.

Being not-blind also means that you can give only a part of your attention to AR. For example, you could have an in-progress game of chess sitting on a corner of your desk; you’d notice it every so often, but you wouldn’t have to be focused on it all the time. This lets you use AR a lot of the time, in a variety of situations. In contrast, when you’re doing something in VR, it’s the only thing you can be doing, which considerably limits the possibilities. Being not-blind also means that you can be mobile while using AR, even if only to move around a table for a better view, while VR pretty much requires you to be immobile, further limiting the possibilities.

Most important, being able to see the real world means that you can have far more social AR experiences with other people than you can with VR. Sitting around the table with your family playing a board game, sitting on the couch with a friend seeing how high you can build a tower together, or having a quadcopter dogfight are all appealing in very different ways than isolated VR experiences, and given how intensely social humans are, those are ways that are arguably more compelling. In this respect, AR experiences will be more complex and unique than VR experiences, since they will incorporate both the real world and that most unpredictable and creative of factors, other people, and consequently have greater potential.

Finally, constrained AR is on the path to walk-around AR, and walk-around AR is where I think we all end up eventually.

So, AR or VR?

At long last, to quote the renowned technology sage Meat Loaf: “What’s it gonna be, boy?” Unfortunately, after our long and interesting journey through possible futures, I’m not going to give you the crisp, decisive answer you (and I) would like, because there are two time frames and two scopes at work here.

There’s no way it makes sense to simply abandon AR for VR. Interaction with the real world and especially with other people is why AR is the right target in the long run; we live our lives in the real world and in the company of other people, and eventually AR will be woven deeply into our lives. In the medium term, I believe AR will likely emerge from HUDSpace roughly along the lines of the scenario above; another possibility is that a console manufacturer will decide to make room-scale AR a key feature, as hinted at by the purported leak of Microsoft’s Project Fortaleza a few months ago. All this makes it highly likely that work on tabletop and room-scale AR now will bear fruit in the future; it might be a little early right now to be working on that, but the problems are challenging and will take time to solve, so it makes sense to investigate them now.

In the near term, though, VR hardware will be shipping, and because the requirements are more limited, it should improve more rapidly than AR hardware. Also, it’s easier to adapt existing AAA titles to VR, and while VR won’t really take off until there are great games built around what VR can do, AAA titles should get VR off the ground and attract a hard-core gaming audience. And a lot of the work done on VR will benefit AR as well.

So my personal opinion (which is not necessarily Valve’s) is that it makes sense to do VR now, and push it forward as quickly as possible, but at the same time to continue research into the problems unique to AR, with an eye to tilting more and more toward AR over time as it matures. As I said, it’s not the definitive answer we’d all like, but it’s where my thinking has led me. However, I’ve encountered intelligent opinions from one end of the spectrum to the other, and I look forward to continuing the discussion in the comments.

33 Responses to Two Possible Paths into the Future of Wearable Computing: Part 2 – AR

  1. George Kong says:

    It seems to me that we’ll likely have AR converging from 2 directions – the Google Glasses HUD Space you talk about in the first half of your article…

    and the enthusiast developer, from the direction of augmented VR goggles (i.e. VR headsets with cameras attached to provide see-through functionality).

    The latter won’t become a standard consumer solution for AR – but it’ll likely be versatile enough (especially if you put the cameras in front of the eyes) for the enthusiast and hardcore gamer crowd to seriously experiment with AR that way.

    By the time the speed and power of mobile computing devices allow for walk-about AR, and the optics are advanced enough to ameliorate most of the stated issues, the design language and conventions for AR interfaces and even programs should be quite mature – the idea of AR should similarly be familiar to most people that use technology with any degree of regularity.

    • MAbrash says:

      George, well said – that’s a very plausible analysis, and a logical way for the two scenarios I described to play out.

      –Michael

    • Ryan says:

      I completely agree. I see the scenario of AR developing out of VR with cameras mounted in the eye locations as much more likely from a technology standpoint. There are several advantages to mounting cameras on a VR headset: (1) head tracking through optical flow or some other method; (2) the ability to see the real world as needed; (3) potential for hand tracking; (4) potential for AR use; (5) low cost to implement. Palmer Luckey is already considering adding cameras to the consumer Rift to do tracking by optical flow.

      The HUDSpace glasses, however, may have see-through displays, which is a non-starter for AR optically as I understand it, and a low FOV, making the AR experience less compelling.

      It remains to be seen how willing people are to wear the head gear. Society is surprisingly flexible, but I am a poor judge.

      • MAbrash says:

        The type of camera used for optical tracking probably wouldn’t be good for video passthrough for seeing the real world or doing AR; the characteristics are quite different. But cameras are cheap and small, so having more than one would be fine. Having cameras where your eyes should be is another matter; socially, I have a hard time seeing that.

        HUDSpace glasses may or may not be see-through initially, but I think in the long run they will be see-through because otherwise they will have very limited display space and will block part of your view.

        I don’t think we actually “completely agree,” or actually agree at all :) I don’t think video-passthrough AR will develop from VR, because AR glasses will be see-through for a long time; video-passthrough is just not going to work well enough until a lot of problems are solved, and that won’t happen for quite a few years. (I discussed this a few posts back.) So see-through HUDSpace glasses will probably be the starting point for AR. VR is interesting for AR because it shares a lot of technology, although mostly in different forms, and so will get us farther down the path to working AR technology.

        –Michael

  2. Jamie says:

    Could you unpack the problem of objects needing to be “exactly in the same place at all times.” What is wrong with something that isn’t perfectly stable? To me it just sounds different, not wrong. What’s wrong with an object shifting around a bit? Sure you won’t say “wow it’s really there in the world like everything else I see!” but you can still say “wow cool and unexpected things are happening in the world!” What is the problem with the thing in and of itself? It sounds cool to me.

    After your very first blog post I bought and read Snow Crash. Great recommendation! While reading it I realized that all the technology needed to make the Metaverse happen already exists. The specifics are just design and engineering problems; the foundation exists in level streaming (large world), dedicated servers (people controlling their own domain with mods), VR (Oculus) and so on. I think it’d work well for open world (space?) or even tying together completely different IPs. It could even be a desktop window manager.

    I’m not the kind of person that normally gets scared of technology, but when I hear about AR I worry. Not just for privacy reasons (can I mod my HUDSpace avatar to cover my face with a smiley instead of a nametag?) I worry because all my non-geek friends are fascinated by their smart phones. They sit around in groups, staring at the screen, not talking to each other, not interacting. I fear that AR will give us a society of dullards sitting on couches, mouth agape, not speaking, not interacting, just drooling as their eyes scan pictures of cats which, while cute, cannot replace the human connection.

    • MAbrash says:

      Jamie,

      Anything that isn’t stable within tight limits is rejected by the visual system as being part of the world. When you look at really stable virtual images, your mind just accepts that there’s something there. But if the virtual scene shifts when you move your head (due to latency or tracking error), it instantly turns into HUD-type stuff. Imagine if you looked at your monitor, and when you moved your head, the monitor shifted relative to the desk and keyboard, then snapped back into place. It just wouldn’t seem right. It’s no different for virtual images. But when they’re really pegged in place (and imaged in a rock-solid way as well – tear lines don’t help, for example), they’re just there. Believe me, it’s a difference of kind. That’s not to say that poorly anchored virtual images can’t have value – as I said, they can, even if they’re just HUD images. But they’re not the same as AR or VR that’s registered to the degree the visual system requires, and the effect is less powerful.

      –Michael

  3. JP says:

    “…and honestly I don’t know what the input method will be, but there are several plausible answers, so I’ll assume that’ll work out and skip over it for now.”

    I’m actually way more interested in this question than in the display and motion tracking stuff! Should I read your “assume it’ll work out” statement as one of confidence in existing areas of research, or an admission that there are no good leads? If the former, what are the promising directions? If the latter, doesn’t that dramatically limit the potential relevance and utility of AR/VR in ways that should shape Valve’s goals with it?

    In general, input interface seems to be the great unanswered question since the earliest days of VR. The field seems to attract rendering gurus who often dismiss user interface as the trivial or unexciting part – it’s also what scifi seems to hand-wave the most, or leap entirely by supposing direct neural interfaces – when it may well be the /most/ crucial factor in its relevance to society at large.

    Without some great leap forward in input interface, I don’t think I share your strong optimism that people will find HUDspace products like Google Glass a useful replacement for smartphones. It was easy for Apple et al to convince us we needed a smartphone because we could start off thinking of it as a new generation of an existing technology that we already had a clear need for… does a similar argument emerge for glasses? There are also a lot of hard-to-dismiss ergonomic factors… a phone is a modal-use, straightforwardly physical object; you take it out of your pocket and mess with it, and the mental flow is like most tool usage. People know how it fits into their lives. Glasses are quasi-modal with normal activity – which as you point out is a tremendous potential strength – but the quasi-modal interaction boundaries suggested by speech recognition, gestures, and even wearable keyboards have so many tough challenges in real-world settings. Speech recognition in particular is such a classic cautionary tale… every 5 years we think we’re at the precipice of the exponential curve that will uplift it to being a truly viable paradigm, only to see it struggle with the same old problems – human language is intractably ambiguous, people speak in such different ways, teasing a coherent signal out of real-world noise is hard, et cetera, et cetera.

    As always, thank you for sharing your ideas, your insight, and your frank self-skepticism. People attacking these problems with rigor, the very opposite of so much previous wild-eyed futurism, is very encouraging :]

    • MAbrash says:

      JP,

      First off, thanks for your kind words. It would be fun to have wild-eyed futurism, but all that goes away after a few months of serious work on AR/VR :)

      As for input: great questions! I don’t worry about it for VR because a standard game controller is good enough, and we have clear ideas (which I can’t discuss at this point, but I hope to in the future) about how to make better VR controllers. AR in general is much harder, though.

      For tabletop AR, game controllers and similar devices will work fine, although not undetectably (if you’re in a meeting), and a controller would be another thing to have to carry. Oddly, a phone would be an adequate controller for a lot of tabletop AR stuff, and people will carry phones for a while yet.

      If you’re wondering about HUDSpace controllers, I don’t know; I haven’t been thinking about that. Google can figure that out :)

      So for the next few years, I think it’s okay, and that’s pretty much my planning horizon at the moment. Beyond that, it gets fuzzy – and, as you point out, difficult. We have a number of ideas, but I think the solution will involve multiple types of input (speech, phone, eye movement, new types of devices, inference from the scene), and that’s not something that’s going to get figured out deductively; it’s too complex, and we’ll have to see how things evolve (and help them evolve).

      –Michael

      • Patrik Nordberg says:

        To begin with, thanks for the very interesting articles.

        I understand you’re not too interested in the input aspect of VR/AR technology, but my take on this is that there’s a demand for improved direct neural input devices, like the Emotiv EPOC (http://emotiv.com/store/hardware/epoc-bci/epoc-neuroheadset/). Of course there are improvements necessary for this to work with, for example, text input, but this is what I believe will be the future type of input.

        And furthermore I think that visual, and why not other sensory, input to us will not be screen-based but rather passed directly to the brain. See for example this article: http://www.gizmag.com/bva-bionic-eye-prototype-implant/23920/

        Best regards,
        Patrik

        PS. Thanks for the tip of Ready Player One, really enjoyed the read.

        • MAbrash says:

          Patrik,

          I’m very interested in the input side of VR/AR – it’s just not where my focus is at this point. There’s lots of interesting work to be done there.

          Having said that, we have looked at neuro input, and we don’t think it’s good enough yet, and it isn’t likely to be in the near future. As for directly interfacing with the brain, that may well be the ultimate approach, but it’s far enough off that it’s not even on my radar at this point. And I will point out that the retina and other early visual processing circuitry do a lot of work that’s critical to vision. Replacing all that with software and hardware would be a major challenge, and for now it’s much easier to leverage it than to replace it. Down the road, who knows?

          Glad you liked Ready Player One!

          –Michael

  4. Emanuel says:

    As you’ve mentioned in this article, a huge factor in determining when these devices become viable is materials engineering. I’m curious as to what technologies we can employ to disperse light through one side of a lens, yet have it appear opaque, or at least not diffuse the light, on the other side.

    • MAbrash says:

      Emanuel, could you explain why you think that property is key? It would be a plus to not have generated light coming out the front of AR glasses, but it doesn’t seem critical; getting the image into the pupil in a way the eye can focus on, with a wide enough FOV and enough brightness seems like more of a core (and very hard) problem.

      –Michael

      • Emanuel says:

        I imagine once AR has grown to a point where it’s a common consumer product, one of the next steps would be security, or in this case, preventing others from seeing what you’re seeing. Harvard researchers have been working on flat lenses that can converge light, like their convex counterparts. The compactness of these lenses could possibly be used in series to severely distort images projected outside of the glasses.

        http://pubs.acs.org/doi/pdf/10.1021/nl302516v

        Ultimately, yes, this is further down the road and the problems of FOV and brightness are key issues preventing the technology from flourishing currently.

        I don’t have any knowledge as to how Google or other AR developers are approaching these problems, though. I’m curious as to where they are generating the light and what material they’re projecting it onto.

        Out of my ignorance, I’m approaching this problem without any outside influence, and my first thought was of something similar to a HUD already used in military applications: a system that would generate an image and collimate the light, with a substrate within the lens projecting this parallel to the user’s point of view, as delineated in figure 1:

        http://upload.wikimedia.org/wikipedia/commons/6/6f/Reflector_reflex_sight_diagram_3.png

        Would you have any blogs or literature I could pick up in regards to the technical aspects of this technology? That would be great to delve into.

        • MAbrash says:

          I don’t have any specific reading to suggest – I haven’t found any particularly useful blogs, and no canonical book – but there are hundreds of research papers out there, and they’re easy to find. It’s all a question of what you want to learn – and the deeper you look, the more you’ll find you need to learn; it’s like peeling an onion.

          –Michael

  5. JMoak says:

    So what do you think of the HUDSpace glasses becoming just another part of a mobile device? For instance, you go out and buy the new killer phone from your store or online, but instead of just one piece, you have two: a plastic or other material brick with maybe a keyboard or other manipulation trigger, and a pair of glasses. Having a pair of those glasses becomes one way to distinguish yourself, much in the same way as I imagine owning a pair of Google Glass is soon to be.

    • MAbrash says:

      That’s definitely a possible path, one that I’m sure many companies are working on. It seems like the logical endpoint of Google Glass. But you need a new interface; you can’t just run Android as-is. And as a couple of people have pointed out, you also need to figure out how you’re going to do input. I also think it’ll be a while before people want to replace their phone with something wearable, as opposed to having both. I expect to see attempts at this, but don’t know how successful they’ll be.

      –Michael

  6. STRESS says:

    I think pushing aside the question of interaction with what you call the HUD glasses is a big mistake, as I reckon this will be the immediate stumbling block if you get it wrong.

    Voice interaction? Probably not the best choice, as you lose many plus points versus smartphones.
    Gaze interaction? Although definitely possible, accuracy might be a bit of a problem, and, more seriously, it will make you look really weird while wearing the glasses (so not cool, which will also stop its success straightaway).

    So what is left? An additional input device somewhere(?). But if that’s the case, where is the benefit compared to a touchscreen?

    Or free-hand gesture interaction? That will make you look even worse in public than gaze interaction.

    • MAbrash says:

      Input is certainly an unresolved area. See my earlier reply to JP for why I’m not personally too worried about it for VR/AR gaming in the immediate future. It’ll be interesting to see what Google and anyone else who enters the wearable space does in terms of input for general usage.

      –Michael

      • STRESS says:

        It’s true that it’s not a big deal for VR, as that’s mostly private, and for gaming in general as well, since that’s a private activity in private surroundings where you’re focusing on one specific task. But it’s far more doubtful for HUDSpace, which the path you outlined (very plausible, btw) needs to be a success, and honestly I doubt that Google has any way into this at all right now. Besides, they hardly have the greatest track record when it comes to UI in general.

        I even wonder if glasses are really the right solution for what they want to achieve. Wearable is right, but I could imagine a flexible OLED/fabric-based device that integrates nicely with your clothing being much better for them, as it doesn’t change the interaction paradigm – you can use the same touch interface – and you don’t have to wear obnoxious glasses, which are still a big barrier for many people. I can imagine that’s why the public response to the Google glasses was more on the skeptical side. I doubt we will see any product soon (if at all).

      • uh20 says:

        If it’s anything like what Valve managed to do with controller interaction in Steam Big Picture, someone will come up with at least an OK system pretty soon.

        I think the best combination would probably have to be a mixture of speech, gestures, and perhaps a bit of pressure control on the glasses themselves; all will do well enough if everything’s put together right.

  7. Dan says:

    AR just sounds like an accident waiting to happen.

    It doesn’t really offer anything that smartphones don’t, does it?

    Except, a glance at the halfwits wandering or driving around staring at their phones, with their attention completely focussed on that and not on the poor saps who have to try to navigate around these buffoons (or their dogs), tells me that AR will be a disaster.

    As a species we haven’t yet evolved to the point where we can responsibly wander around with a screen that you either look at, or you don’t. We don’t have that level of responsibility or self-control. Ergo, it’s self-evident that it’s irresponsible beyond belief to give people a screen that’s overlaying their vision with a bunch of distracting crap all the time.

    As the Colonel in First Blood nearly said, “If you sell AR devices, don’t forget one thing… a good supply of body bags” :)

    VR, on the other hand, I think has a use. Generally you’ll be sitting down in front of the TV, so it’s probably fairly reasonable to cover your head without much risk. Then you might play Team Fortress 2 and become completely immersed in the game for a few hours.

    I already experience so much immersion in TF2 that, on occasion, if someone speaks behind me when I’m playing, I use the mouse to turn around in the game instead of turning in my chair, and it takes me a moment to break that connection with the game and turn around in real life instead. For a second I have to consciously think to turn around. Perhaps my brain is just addled, but at that point it feels like my brain believes that to turn around I use the mouse, while the sound cues are telling me there’s someone behind me.

    Which I think is yet another nail in the coffin of the idea that we’ll all be walking around reading reddit and twitter as we cross the road or navigate a bustling city centre.

    But it also shows a similar effect to those experiments where they place a camera on someone and what that camera “sees” is then fed to an experimental subject’s headset. The subject can then be convinced (often using some other stimulus) that he’s sat behind his own body, or that someone else’s arm is his arm. To me these experiments show how big the potential is for VR to give a new immersive experience to games.

    Occasionally in Borderlands 2 I’ve managed to jump off a tall mountain and experience a real sensation of falling – similar to the way that, if I watch a video of someone climbing a tall building, I’ll experience vertigo.

    I’m sure these kinds of effects would be enhanced with VR and would enhance a game – I might start to feel more like I’m driving the racing car rather than sitting at my desk controlling a little man and a little car that I’m looking at on a monitor. Similarly, what is rocket jumping or sticky jumping going to be like with VR? You obviously aren’t going to get G-forces or wind, but I’m sure it’ll be significantly more immersive than watching a screen when playing today.

    If I felt I was actually inside a Team Fortress 2 map, even with mouse + keyboard movement, I’m sure it would enhance the experience no end.

    I’d buy one.

    But AR? Not only can I not see much use for it when you can simply check a screen you’re carrying instead, it sounds like an inherently dangerous distraction to give to anyone on the move, whatever mode of transport they happen to be in control of. You’d be better off researching screens and devices that are thinner, flexible, cheaper, and with lower power requirements, imo.

    I regularly cycle with a GPS attached to the stem of the handlebars on my bike. I’m generally only interested in my current speed and whether I’m following the track of my route, but that GPS device could be a smartphone if I wanted it to be. That’s all the access to information I really need; I don’t need it displayed HUD-style so it’s distracting me as I’m cycling along.

    I think AR is like 3D TV, a solution to a problem that no one actually has or cares about.

    The problem with the Terminator metaphor is, clearly, that our brains do immense amounts of visual, audio and language processing as we experience the world. We don’t, however, have debug information spilling out over our vision as we look at things or pick a phrase to say, do we? You look at a chair and you see a chair. If you built a Terminator, there’d be no reason for his visual system to look at a chair and then highlight the object and display a scrolling list of words until CHAIR appears bolded, would there? The idea that an artificial intelligence would be watching his own processing in action seems flawed; even if some process like this may well be happening in both real and artificial intelligence, it’s clearly not something we “see” happening. So, in effect, “Terminator vision” is just a sci-fi metaphor to show an audience that Arnie is a machine. If that machine could really, say, look at someone and from that figure out his weight and size (so he knows whose clothes to steal), he would, were the machine to exist, just know it. What’s displayed on his vision wouldn’t be necessary – no more than you experience the word ‘cup’ appearing when you look at a cup; you just know in your mind it’s a cup and that it’s a separate object from your desk.

    The idea that you want to look at that cup and get a menu of drink prices or something else is just so silly. There are bazillions of different things you might want to see or not see when you look at a particular object. But most of the time, I’d suggest, people don’t want to see any of them; they just want to locate the cup so they can pick it up and take a sip.

    This is AR as the office paperclip; it’s doomed to failure: “I saw you looking at a cup – do you want a coffee? A Coca-Cola? To guess the volume and surface area of the cup? To order cups from Amazon? Directions to the nearest kettle?” A nightmare.

    VR that works, on the other hand, sounds like it’ll improve the immersion in games. Of course “that works” is the key to the success of the whole thing. VR per se is not new but obviously never lived up to the imagination of the user in the past.

    • MAbrash says:

      Good take on VR.

      You could well be totally right about AR. Or completely wrong :) I certainly don’t know how things will turn out. It is hopeless to try to deduce complex outcomes; the only way to get the answer is to run the experiment, and that’s going to happen over the next few years. Should be interesting!

      I used Terminator vision as an example of what people might see in HUDSpace, in the sense of having information (rather than AR) displayed, not specific stuff like highlighting an object or listing its name. What you say about AIs is probably correct, but doesn’t seem terribly relevant to humans; I don’t know of a pathway by which we can just “know” information. With regard to the utility of having information displayed on AR glasses, I certainly think not having to get out a phone and tie up one or both hands with it has considerable value. Also, I would note that some of your objections to AR are very much like objections to smartphones before they exploded, and certainly objections to tablets – why would you want a tablet when you could have a more powerful laptop or notebook computer with a proper keyboard and pointing device, running more sophisticated software, in many cases for less money? And yet…

      Btw, having that GPS on your bike is something that I (as a long-ago biker of many miles) would say seems not necessarily all that responsible (to use your word). Every time you look down at it, you take your eyes off the road. Now, maybe you only look at it when you’re stopped, but if that’s the case, why not just take out your phone and look at it? Actually, this seems fairly analogous to AR versus a phone, now that I think of it – the GPS display is not like a smartphone, it’s like HUDSpace glasses, because it lets you get information without having to free up your hands. If you had HUDSpace glasses, you could look at the readout and see your speed without tying up your hands and without taking your eyes off the road. The readout might be distracting (although there are a lot of controls in my car that are peripherally visible when I’m looking at the road, including the speedometer, and I have no problem filtering them out), so if there was a way to call it up easily, that would be perfect. It’d be interesting to think about what that control might be on a bike – one of the more demanding situations. I can think of several possibilities, but certainly don’t have a solid answer.

      –Michael

      • As for the cycling GPS computer example, I totally agree with Michael’s point of view: as a competitive amateur triathlete I’m interested in speed, cadence, heart rate, instant and average power, distance, and sometimes even more. To read all that data I’ve got to glance at my Garmin every now and then. The same goes for running. I wouldn’t hesitate to pay some hundreds of dollars for some primitive HUD glasses that would show me those numbers with no need to really distract myself from what I’m doing.

        The same goes for the Terminator vision and a cup: something not quite practical for everyday use might well be suited for professional applications. As a scientist I wouldn’t mind knowing some specs of the things I’m dealing with on my optical table, and for engineering applications that’s even more important. I can easily imagine what it’s like to lay electric circuits in an aeroplane with hundreds of wires of different sizes, shapes and colors all around you: having them labeled through the AR glasses would make life much easier.

        One can think of many other applications of primitive “cup-like” labeling that might come in handy in various situations. I see HUD glasses more like a kind of additional content-aware desktop space. Say, driving a car I’d like to know how much petrol I’ve got; looking at my fridge I’d like to know if there’s any milk left inside; and going to a store I might like to check prices on eBay by just looking at some stuff, or have my friends’ wishlists scanned for any objects in my field of vision. It might even be that simple cup my best friend wanted for his b-day :)

  8. I see a small problem with the HUDSpace-to-constrained-AR path that you have suggested, and that is z-filtering: I don’t want to play a game of chess, try to point at a piece, and have my hand stuck behind the image. While this depth problem might be solvable with a system similar to the Kinect, I just don’t see any company in a hurry to strap an IR camera and projector to their rigs.

    I see VR as the way in; it is already a compelling product (Oculus Rift) and will tap into the graphics-arms-race fatigue that is going around, whereas AR has so many hurdles still to clear. I agree we are very social creatures; however, escapism is still a force to be reckoned with.

    That all being said, they complement each other, so it really doesn’t matter which comes first; VR is just AR with the lights off, after all.

    • MAbrash says:

      “VR is just AR with the lights off after all” – that’s a great line :)

      Agreed, it doesn’t matter which path comes first or how relatively successful either is; the question is whether either will finally make it over the hump into broad consumer acceptance. Once that happens and proves there’s a real business there, everything else will follow.

      –Michael

  9. Nestor says:

    I have to disagree with what some naysayers are stating about the irrelevance of AR. To me, AR and even HUDSpace are very relevant to the future of computing. As is VR, even if the pursuit is for better micromanagement of information on a user level and in 3D space. Productivity alone would be a huge beneficiary of these improvements in technology.

    Right now there is little reason to improve the fidelity of TVs or even OLEDs beyond 4K, at least not for small devices or even large flatscreens, because beyond 4-8K it is hard for the untrained eye to discern any improvement. Besides, Blu-ray itself is still on the uptake, and the industry can hardly justify another generation of ultra-high-res TVs taking the market by storm so soon. So where does that leave us with higher screen resolutions – where will the demand come from? Even the 3D graphics market is plateauing on this front. But personally I believe the next catalyst for ultra-high resolutions will come from VR and AR or HUDSpace, as they start to demand more screen real estate and fidelity to become more immersive and seamless.

    http://www.ign.com/videos/2012/01/11/ces-85-inch-prototype-hdtv-has-16-times-more-pixels-than-1080p

    When your eye is barely an inch from the screen, you can rest assured that beyond-4K resolution will be very compelling and useful to have. Others have mentioned that it will be impossible to convince consumers to adopt such technologies when the natural trend will be to keep gravitating towards tablet and smartphone computing. But I think that soft AR or even HUDSpace augmenting our availability of on-demand information will one day be as indispensable as GPS is today.

    I can envisage situations where the prospect of seeing everything captioned intuitively will save huge amounts of time for someone at a local furniture store, for instance. Not to mention travellers, who will be able to understand their Chinese taxi driver as they see every utterance subtitled, with the possibility of replying via an AI-assisted teleprompter guiding them through proper pronunciations and probable responses. Even simple carpentry will be possible for the uninitiated, as recent advances in triangulating and measuring in 3D space become practical. Can you imagine just wearing glasses that will allow you to measure anything in front of you without even having to go up and measure it manually? This is an engineer’s dream come true.

    http://www.zeitnews.org/applied-sciences/computer-science/measuring-objects-3d-using-only-camera-and-projector
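
    As a rough sketch of the triangulation idea behind that (the focal length, baseline, and disparity values are made up for illustration):

        # Camera/projector triangulation: with a calibrated pair separated
        # by baseline B, a feature's pixel disparity d gives depth Z = f*B/d.
        def depth_from_disparity(f_px, baseline_m, disparity_px):
            return f_px * baseline_m / disparity_px

        def width_from_pixels(depth_m, f_px, width_px):
            # Back-project a pixel extent to metric width at that depth.
            return depth_m * width_px / f_px

        z = depth_from_disparity(800, 0.10, 40)  # 800 px focal, 10 cm baseline
        print(z)                                 # 2.0 m to the surface
        print(width_from_pixels(z, 800, 320))    # a 320 px span there is 0.8 m wide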

    As far as interfaces of the future are concerned, I see things like the “Magic Finger” and other, more sophisticated methods like the ones showcased in “That’s Impossible: Mind Control” on the History Channel. There was an example of a sensor worn on a person’s voice box that could read ahead via vibration or voice-box contractions, so a user could potentially gesture like a ventriloquist, in silence, and the computer would read the words before they were even spoken. So how is that for those who dread public computing, and for the socially self-conscious among us?

    http://www.zeitnews.org/applied-sciences/computer-science/magic-finger-swipes-smartphone-remotely

    But ultimately VR, or a combination of it and AR, can allow for the elimination of multiple screens in favor of massive workspaces. Couple this with hands-free interactivity for the manipulation of data, and suddenly you can increase productivity tenfold. No more cluttered desktops, or even physically cluttered workspaces. Working in screenless office environments might be what’s in store for us, and it would be greener too. In 5-10 years you may only need a smartphone that houses all the computing power you will want, plus a Leap Motion and some future Oculus Rift iteration that renders almost everything that came before it obsolete. One thing is for sure: the screen fidelity and real estate on these things will be phenomenal!

    • STRESS says:

      Good comment, but let me ask you a question:

      > Working in screenless office environments might be what’s in store for us, and it would be greener too. In 5-10 years you may only need a smartphone that houses all the computing power you will want

      What makes you think it would be greener than the current desktop/laptop workplace? Just because a smartphone uses less energy at the moment? I doubt it. In the usage scenario you suggest, it is very seldom under real load; run it 24 hours a day at full load and it won’t stay frugal. And that’s not even counting the roughly tenfold performance increase you would need to bring it to the level of a good laptop or desktop today.

      In general, a battery-powered device is actually less green than a device that runs strictly off mains power, because of the battery itself. No battery technology is really green (all are fairly toxic, at the very least), nor very energy efficient.
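
      A back-of-envelope comparison makes the point (all figures here are assumed round numbers, not measurements):

          # Rough daily energy use: a ~5 W phone SoC run flat out around
          # the clock versus a ~150 W desktop used 8 hours a day.
          phone_full_load_w = 5
          desktop_w = 150
          phone_wh = phone_full_load_w * 24   # 120 Wh/day
          desktop_wh = desktop_w * 8          # 1200 Wh/day
          print(phone_wh, desktop_wh)
          # The raw draw still favors the phone, but closing the roughly
          # tenfold performance gap would erode most of that margin, and
          # battery manufacture and replacement carry their own costs.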

  10. Ryan says:

    Michael,

    Thanks for the continued insight into potential futures for VR and AR technology. After reading both articles, however, I have a question for you. You discuss the tight constraints on the visual display of AR objects as you move about and your visual perspective of the real-world scene shifts accordingly. If the AR object does not shift (leaving aside details of lighting and the like, which are beyond the scope of what I’m getting at), your immersion is broken and the illusion shattered; at that point, we leave AR and enter HUDSpace, as you write.

    What does this mean for sound in an AR environment? As one’s position relative to any object, AR or otherwise, shifts, so does one’s perception of the sounds generated by that object. The human ear gauges position extremely well, and despite advances in headphone and stereo virtualization, this is something that has not improved significantly in a number of years, and does not seem likely to. Similarly, current surround-sound implementations are poor at letting listeners move around an environment while retaining a high degree of positional specificity with regard to the audio source. While the current solutions work fine for communicating general gameplay information (which direction an unseen rocket blast came from, for instance), AR will demand the same precision the real-world environment provides.
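
    (For a sense of how fine the ear’s cues are, here is a toy model of just one of them, the interaural time difference, using Woodworth’s spherical-head approximation; the head radius is an assumed average, and real spatialization needs full HRTFs rather than this.)

        # Toy interaural-time-difference (ITD) model: one of several cues
        # the ear uses for azimuth. Woodworth's approximation for a
        # spherical head of radius a: ITD = (a/c) * (theta + sin(theta)).
        import math

        HEAD_RADIUS = 0.0875     # meters, rough average
        SPEED_OF_SOUND = 343.0   # m/s

        def itd_seconds(azimuth_deg):
            theta = math.radians(azimuth_deg)
            return (HEAD_RADIUS / SPEED_OF_SOUND) * (theta + math.sin(theta))

        print(itd_seconds(90) * 1e6, "microseconds")  # ~656 us at full side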

    If immersion in a walk-around AR environment is the long-term goal, then at some level current audio solutions will have to be improved upon. How that will be accomplished, both in hardware and in software, I don’t know. One early trick might be to use subtle audio cues in games to trigger past experiences and reinforce visual messages, such as the reference cues for the zombies in Left 4 Dead (in which each Special zombie has a unique audio cue announcing that it has spawned, usually playing before visual contact is established or the characters’ automated voiceover kicks in).

    As for VR environments, I think the virtualization currently used for headphone gaming will be sufficient, as VR does not try to blend with our real-world experience, but merely to replace it; we don’t expect a television character speaking from the back of a scene to sound significantly different coming out of our TV speakers, for instance.

    • MAbrash says:

      Ryan,

      Great point. I think that really well-spatialized audio combined with a low-latency HMD and excellent tracking will be far more effective than an HMD alone, but I have no experience to report at this point. I don’t see why AR and VR would have significantly different requirements for audio, though; in both cases, either the ear/brain picks up discrepancies from the real/virtual world or it doesn’t, and the spatialization requirements seem like they’d be the same in both cases. Anyway, I’m looking forward to exploring this; it just hasn’t made it to the top of the list.

      –Michael

  11. Matt says:

    I have a question about the technology needed to make these glasses possible. Assuming that HUDSpace glasses run on batteries and aren’t plugged in, as VR headsets most likely would be, how would the problem of terrible battery life be solved? I guess you could use USB to recharge the batteries, but there will still be situations where you don’t have access to power (like on a tropical vacation). Solar panels, maybe? This Cracked article has more:

    http://www.cracked.com/blog/5-things-technology-will-never-fix-and-why/

    Also, a little bit more on potential input methods (bear with me here, I know it’s not a priority for you). One possibility is speech recognition. That would be great in a perfect world with perfect speech recognition, but that isn’t the case. As the same Cracked article argues, speech recognition can’t really improve until we have “semi-intelligent” computers/robots. (Both battery life and speech recognition are covered on the first page.)

    Thanks.

    • MAbrash says:

      Matt,

      Battery life is definitely an issue. HUDSpace glasses don’t necessarily have to do much, and may in fact not be on much of the time. AR glasses are a very different story; they need to do heavy-duty computer vision work, and also need to render stereo 3D. That’s one reason why starting out with VR is so appealing – no need to run on batteries all day, or even to run on batteries at all.

      The line about “semi-intelligent” computers/robots is interesting. When I worked on natural language processing, I eventually concluded that it was an “AI-complete” problem, meaning that it wouldn’t work until there was AI to drive the parsing that was fully knowledgeable in whatever domain the text was about; there was no other way to properly drive the search and disambiguate multiple interpretations. Sounds similar. Speech can certainly work if it’s sufficiently constrained, but there are other problems with it, like privacy and social issues.
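
      (To illustrate what “sufficiently constrained” can look like, here’s a toy command grammar; the commands are hypothetical, and a real recognizer front end is assumed to produce the text.)

          # A fixed command grammar sidesteps open-ended disambiguation:
          # anything outside the grammar is simply rejected.
          import re

          GRAMMAR = re.compile(r"^(call|text|navigate to|play) (\w[\w ]*)$")

          def parse_command(utterance):
              m = GRAMMAR.match(utterance.lower())
              return (m.group(1), m.group(2)) if m else None

          print(parse_command("Call Alice"))        # ('call', 'alice')
          print(parse_command("What is justice?"))  # None: out of grammar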

      –Michael

    • MAbrash says:

      I have. Some interesting observations, and some that seem kind of out there, but Ralph’s obviously a smart guy, so maybe he’s going to look like a prophet in a few years.

      –Michael
