How Meta's VR and AR Could Transform Through AI

As everyone in tech announces AI in everything and Apple readies its first VR/AR headset for next year, Meta’s most recent news at the company’s developer-focused Connect conference straddles both sides at once. In one sense, the products announced were straightforward: a new, graphically boosted Quest 3 and Ray-Ban glasses with improved camera and audio, coming later this month. Meta also announced a series of personality-driven AI chatbots and a generative AI image and sticker-creating tool called Emu.

I’ve been following Meta’s moves in VR and AR since before Oculus was acquired by Facebook, and even visited Meta’s research labs last year for signs of where the future is headed. But at the end of 2023, it seems more than ever that the products we’ve come to recognize as “VR” and “smart glasses” are transforming. The Quest 3 has mixed-reality functions similar to the Apple Vision Pro, feeling at times like AR glasses in VR form. The glasses, next year, will start having AI onboard that can recognize objects and translate text, acting almost like display-free versions of Google Glass or some sort of early AR glasses prototype. And both should be able to run forms of Meta’s conversational AI, and possibly a lot more, thanks to Qualcomm’s latest generation of more powerful chips.


To get a better sense of how Meta will combine VR, AR and AI, I spoke with Meta’s CTO and product head, Andrew Bosworth, to ask about the future. What about Samsung’s expected device? Where is eye tracking, which was on the Quest Pro but is missing from the Quest 3? And what about Beat Saber in mixed reality?

The following conversation was lightly edited for clarity and length.

Where do you see the relationship between Meta’s Quest 3, smart glasses and AI? 
Bosworth: If you were to draw a box-and-arrows diagram of the architecture we’ve been envisioning for AR for a long time, one of the boxes is like, AI… ? (laughs) It’s so rare in this industry that a technology comes along that solves your problem without you actively pursuing it. But that’s kind of what’s happened [with AI].

If you had asked me and [Meta Reality Labs Chief Scientist] Michael Abrash, two years ago, even maybe last year, what’s the number one biggest risk to AR working … as hard as those displays are, as hard as the rendering is, it would have been the AI. Your expectations as a human to have an interface that can see what you can see, hear what you can hear, to have common sense are high. And our ability to provide common sense is low. That’s the problem.

We feel great about [this new Meta AI]; it really solved the problem for us. It was one that we thought we had more time to solve. AI has always been a critical part of our vision. It’s just that now we can actually put it into play.

For a while, Meta’s been promising AI assistant smart glasses that can see what you see. How are these steps starting to happen on Ray-Bans next year?
Bosworth: Right now, the glasses, from a power standpoint, you have to activate them. Over time, getting to the point where we have sensors that are low power enough that they’re able to detect an event that triggers an awareness that triggers the AI, that’s really the dream we’re working towards. And we’re working on those sensors, we’re working on that event detection. We just didn’t have a great solution for what we had previously called ‘the conductor,’ which is the thing that decides … is this a good time? You and I are talking face-to-face, so we should probably clear interfaces out [on a pair of future AR glasses]. If my wife texts me about groceries, keep that out. But if you text me that the kids are sick and need my help right away, pop it up. Like, how do you do that?

We’ve learned so much going from gen 1 to gen 2, going to these Ray-Ban Meta glasses. We see progress on two fronts: on the hardware, where we’re iteratively getting better at making things both better and cheaper. And we’re solving one of the critical software problems that we had with AI.
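
To make the idea of “the conductor” concrete, here is a minimal, purely illustrative sketch of the kind of triage logic Bosworth describes: deciding whether an incoming message should interrupt the wearer based on its urgency and what the wearer is doing. The fields, labels and rules below are hypothetical examples for this article, not Meta’s actual design.

```python
# Hypothetical sketch of a notification "conductor" for AR glasses.
# Nothing here is Meta's implementation; the fields and rules are invented
# purely to illustrate the kind of decision Bosworth describes.
from dataclasses import dataclass


@dataclass
class Event:
    sender: str       # e.g. "spouse", "colleague"
    urgency: int      # 0 = routine, 1 = time-sensitive, 2 = emergency
    summary: str


@dataclass
class WearerContext:
    in_conversation: bool   # glasses detect a face-to-face conversation
    battery_low: bool


def conductor(event: Event, context: WearerContext) -> str:
    """Return 'show_now', 'defer' or 'drop' for an incoming event."""
    if event.urgency >= 2:
        # "The kids are sick and need my help right away" -> always surface.
        return "show_now"
    if context.in_conversation:
        # Routine messages (e.g. groceries) wait until the conversation ends.
        return "defer"
    if context.battery_low and event.urgency == 0:
        return "drop"
    return "show_now"


# A routine grocery text arriving mid-conversation gets deferred.
print(conductor(Event("spouse", 0, "Buy milk on the way home"),
                WearerContext(in_conversation=True, battery_low=False)))
```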


Meta’s AI chatbots, with personalities and celebrity faces, will appear on Facebook apps and in VR on Quest 3, but not in smart glasses yet.

Scott Stein/CNET

Will these AI glasses have personalities, too, or just be a general assistant?
Bosworth: Meta AI is more of an agent model. So I think the future of AI is probably a split between agents — these external things that you go to, they have their own kind of atmosphere, you go to them and engage them — versus what I’m going to call a personal assistant.

AR glasses are going to see everything that I see; they’re going to see every private message that I send. They’re going to see every website that I visit. And I want them to do that because that will help them help me, and that’s going to be great. They need to be private. Like, really private. You know, like, really discreet.

Could they also, through plugins, schedule appointments for me? Of course. Can they also respond to messages for me? Of course, I can trust them. But they need to be mine. My private, personal agent. And that’s not the Meta AI assistant. Meta AI is your general-purpose agent. An agent that I can come talk to — you know, general things. That’s what we’re going to start with here. What I think will ultimately populate AR is a very personal version of that. That has, hopefully, an extended memory, has an ability to learn and know you … and an incredible amount of discretion.

The Quest 3 looks like a foot in the door for mixed reality going forward. But there’s a lot that could evolve, like augments, those widget things [you announced]. How do you see what the Quest 3 is going to be?
Bosworth: People forget that when [Oculus] Rift came out, nobody knew how to do locomotion in VR. Nobody knew how to do these basic things. One game would stumble on a great mechanic. And then every game was, like, ‘imitate that.’ Consumers learn that mechanic. And now they know it, and it’s not hard anymore.

We have a lot of ideas of why we think [mixed reality] is great. We’re doing a lot of stuff. I don’t think we know the half of it. Developers are going to discover new and exciting things. There are parts of mixed reality that I think are more developed. We have an alien invasion game, First Encounters, where aliens come through the walls. We understand these in the context of classical games. There are parts of mixed reality which are just cool versions of things that we understand well. That’s value, day one. Then there’s augments: Let’s find out.

One of the reasons it’s so important for us to start that work now is that it ultimately does become the AR ecosystem over time. It’s a long time from here to there, but you can’t start soon enough.


There are a bunch of Quest 3-ready games coming soon, but not Beat Saber (yet).

Scott Stein/CNET

Why isn’t there a mixed-reality Beat Saber yet [on Quest 3]? Because that seems like the perfect application. Same thing with Supernatural, your fitness app.
Bosworth: We did look at a version internally of a mixed-reality Beat Saber. It was tougher to do than you think. When those cubes start at a great distance from you, when you’ve got the very dark black background, you can see them, and your brain is really counting on that more than you realize. When you’ve got a busy, well-lit environment, it could be really hard to see that. My point is, it’s one of those games that you think, oh, this is drag-and-drop … and then you do it, and you’re like, actually, there’s a lot more subtlety and nuance to this than we realized. So you just keep cranking on it. Again, the reason it’s so important to get this in the hands of developers soon is so they can start to do that work.

Do you feel like you’re closer to where people won’t need controllers all the time? Do you feel like this [Quest 3] hardware could see more of that realization?
Bosworth: We continue to think that’s a possibility, especially with so much of the time being spent in social environments. More than half of the time [in VR] is being spent in social: Some of that is in social games that are using controllers, but not all of them. It’s not a question of ‘does it work?’ — obviously, it can work. But there’s certainly quite a bit of content today that people want to get access to that requires controllers.

If, at some point, you say, hey, is there enough of it that you can do with just hands that you’ve got a totally viable product, versus making somebody make a separate trip back to the store to go get the controllers that they wish they had … we’re constantly eyeing that as a way to get the devices to people in a way that’s useful to them at a lower price.


The Meta Quest 3 has upgrades, but no onboard eye tracking.

John Kim/CNET

I want to ask your thoughts on eye tracking, because it’s on the Quest Pro and also the Vision Pro and PlayStation VR 2 [but not Quest 3]. What are you thinking, interface-wise, for where the Quest platform’s at?
Bosworth: I’ll probably still use my Quest Pro for my meetings because I love the eye tracking and face tracking. We’ve been playing with eye tracking, gaze plus hands, as a user input interface for years. Eye tracking just adds a lot of cost and complexity to the hardware. You’re talking about at least two cameras per eye to do it well, not to mention the in-field illumination. Apple Vision Pro, which is a beautiful device, they’ve done in-field illumination, so the illumination is coming through the lens. If you’re doing it through the lens, you can’t wear glasses. Hence the need for prescription optical inserts.

Over time, eye tracking will eventually be part of the core platform; I think it’s a great tool. For us, it’s always a matter of the cost-benefit. What’s the tradeoff? For the average consumer that we’re trying to reach, are they going to find it worthwhile to add this extra weight, cost, thermal, battery impact for the benefit that it gets? 

There’s a lot of focus on openness and compatibility: Microsoft partnerships, Office 365 and cloud gaming. Do you see more opportunities for that dovetailing with some of the hardware that’s coming? Between Apple and whatever Samsung is developing with Google, ideally, there will be ways that these will interplay.
Bosworth: We’ve been out here for 10 years at Connect, doing this work, putting it out there. Tens of millions of units sold. How many millions and millions of dollars paid to developers, businesses built on the platform. Everyone else has zero millions.

I’m not saying it’s impossible. We certainly want to use a lot of open standards. OpenXR, Vulkan, glTF. We’ve been working on the standards game for a long time and trying to do this thing out in the open and making it easy. We run an Android-based operating system. It would be trivial for Google or somebody else to bring an app store of 2D apps to the platform. Wouldn’t even be hard. We’d be happy to have them. I hope that people do support the ecosystem. They’ve just gotta pick up the phone and call us.


Meta’s glasses can shoot photos and video, but not spatial video. Someday they might, though.

Meta

Could the Ray-Ban glasses ever do spatial video?
Bosworth: The first version of Ray-Ban Stories had a camera in each temple and was capable of stereo capture from a hardware standpoint. We never built the software. It was just not very popular with consumers. We did some user testing with stereo imagery, even on Facebook and Instagram, where you can kind of do a cool stereoscopic replay, and also in-headset. People just weren’t spending time looking at stereo photos. So we ended up not building it out to save the extra power from powering the second camera, to make it last longer and make the capture smaller. We replaced the second camera with an LED [on the new models].

I have multiple VR cameras. I did a whole year where I recorded, every week, a science Saturday with my son in VR and put it online. I really have enthusiasm for it as a creator. I can’t wait to get started; the pieces obviously aren’t in place today. But I think it’s important for us to maximize these glasses for what they are, understanding the people who are using [these glasses] aren’t necessarily trying to be VR creators. The people who are trying to be VR creators probably have better tools for the job.

Talking with Qualcomm’s Hugo Swart about the chip in the Quest 3 and Ray-Ban glasses, it seems like there’s more bandwidth for sensors, for pairing with things. Maybe watches. He mentioned wearable sensors. What do you see?
Bosworth: We have tremendous ambitions to be in that space. We’ve obviously been open with our developments on neural interfaces. We’re currently kind of wrist-based. And I think those things have to be elsewhere. So there’s opportunity there. Realistically, like almost all of our headsets, these devices are thermally limited. We’re not limited on the chips, bandwidth, input I/O, number of channels, number of pipes, that’s not the major limiter. It’s really important: you couldn’t do the number of sensors that we do without the XR flavor of chip. But once you crossed that Rubicon, you’re actually just throttled on raw ability to create thermal energy on the face. So that’s the major hurdle we have. 
