No hate, no hype: Apple's vision for Spatial Computing
Even now, it feels like everyone is taking a hate-or-hype position on Vision Pro. For some it’s the Second Coming, while others dismiss it as “ridiculous” on its face.
I want to set all of the various jockeying for position aside.
Instead, let’s talk a bit about what Apple is trying to accomplish with Vision Pro, specifically as it relates to the kinds of software you can create with it.
That is: if we take the goal of building spatial computing seriously, what could you build with that? Is it possible to achieve it using Vision Pro?
How successful is this first generation of product? What would future iterations from Apple, and competition from other players, actually look like?
Is this actually going to work?
These are the questions that most capture me in this moment. Let me know what conclusions you reach.
Business goals
I know I promised you a software discussion, but first let’s talk about the motivations for the business entity that is shipping this product.
Apple wants margins. That’s the Apple DNA: sell a thing that is more than the sum of its parts, through the combination of proprietary software, hardware and industrial processes. Make a thing only Apple can sell you.
Now, you can argue about the success of their differentiation, depending on the product line—this is the point where Android people will chime in and let us know that Android gets all kinds of different features years sooner.
Reasonable people can disagree over the success.
But it’s clear this has been the strategy, at least since the original Macintosh had its licensing business killed when Jobs returned.
So Vision Pro exists to create an opportunity to sell you an experience no one else can.
If Apple has something like that, they can make all the margin they want. They have a monopoly on the intellectual property that allows the creation of Vision Pro-like experiences, as they have one on iPhone-like and Mac-like experiences. You can prefer an Android-like experience, or a Unix-like experience. You can prefer other things.
But for Apple things, they’re the only game in town.
Vision Pro, in success, gives them a new IP monopoly and new high-margin revenues.
Success at what?
So if that’s what they’re doing as a business, what they’re doing in product is this:
They’re selling holography.
The foreshadowing is, as usual, in Microsoft’s baroque product matrix. Remember HoloLens? It was augmented reality as the primary experience: write whatever information you wanted into a fully 3D volume.
They made a go at this, hoping for another bite at the platform dominance angle. But they made the mistake of going to business users first, instead of building up the platform’s culture from personal use.
Still, they tried: holograms.
We’d love to do this without a headset, but there’s a problem: we don’t fucking know how. Holography remains a science fiction technology, at least for anything like interactive, freestanding 3D software. The technology simply doesn’t exist.
So if we can’t create actual 3D holograms in physical space, we have one fall-back position to accomplish a similar experience. We can instead interpose an illusion over the eyes that simulates the existence of 3D images in the physical space around us.
If we can convince the brain the light exists, make the images parseable by the brain circuitry that we use every waking second of the day (for sighted individuals), then we don’t need science fiction holographic projectors. We can fake it.
From there, we can create a completely new kind of software. Software that doesn’t just have 3D appearance, but 3D behavior. Depth and volume. Things you can walk through, reach through. Things that follow your gaze. Things that fill the room around you. Hundreds of times more workspace than just one monitor.
So that’s what happens when you put on Vision Pro. You just see the world around you again. But now, extra stuff is projected over it. That extra stuff being whatever developers imagine.
The problem with this, of course, is computing power.
The eyes move fast. We’re constantly sampling, constantly wandering. We have specific expectations about how light works.
And we see at a resolution that is computationally expensive to fully approximate, and more expensive still to recreate in physical pixels. Creating an image reliably enough to trick the eye at this level of intimacy is simply challenging.
Apple is, in 2024, launching at the absolute barest edge: trading off maximal quality of holographic illusion against profitability.
And we will be able to sense these physical tradeoffs when we get our devices.
30 minutes
Apple’s ability to trade access for control in the press has served them perfectly here. Every demo of Vision Pro since the summer has had a firm time limit.
Some of this, I’m sure, is practical. There’s only so much time, so much staff, so many units. Press demand is basically unlimited here. But it also gets the reviewer out of the experience before fatigue can set in.
When I got the first iPad home, I could tell one thing immediately: v2 was going to be AMAZING. But v1 was merely interesting. It was very cool, but also just a smidge too heavy to use comfortably. You couldn’t quite get lost or absorbed in your iPad activities. The weight kept you too conscious of the device, which distracted from the task.
Perfection in a digital experience is to make the physical details recede. Let the mind fully inhabit the sphere of imagination being co-created between software programmer and software inhabitor. A keyboard is a good example here. Once you’re a proficient touch typist, you’re not thinking about the keyboard. You’re not paying attention to it visually, and the feedback from the keys exists entirely for wayfinding and confirming presses. You’re thinking of words, and they’re appearing on a screen. Your mind and focus exist on the page.
Vision Pro is going to miss this mark a lot in v1.
It’s too early, and the problem is too hard for the reasons cited above. The device is bulky; you’re going to feel the fact that you’re wearing a weird electronic sock over your face.
I am certain that this is the absolute most comfortable solution for wearing such a sock. But we are not going to easily commune with the filmy veil of cyberspace in this version of the hardware. We’re not going to easily or sustainably forget we’re in there.
So for v1, the game is making the most of the time we get
If we want to edit financial documents, we have an incredibly powerful, mature solution at hand. We spend $800 on a commodity laptop, install Excel, and create the ghost plumbing of any business we want.
You don’t need to make spreadsheets in 3D. Whether you’re a developer, or a spreadsheet user, this is not time well spent in the land of holograms.
What’s something only volume can do?
That will be time well spent under the weight of this computer for the face. That’s what Vision Pro users will go through the trouble of gearing up for.
An experience so compelling, so unique, the sensation of all this glass and metal strapped to the front of the head is not only worthwhile, but the least interesting data the body is actually receiving.
Apple’s success rides on whether they make it easy enough to have amazing experiences that transcend the discomfort of a v1 device.
Can they do that?
As a software developer and product designer, I just made a $4,000 bet that they can and they will.
Apple has been building a strategy around this for years. It’s a little wild to see them actually bring all the threads together, like the latter half of a Game of Thrones season that suddenly connects all its storylines.
They’ve been shipping bits of hardware and software that contribute to this platform since 2014, when Swift was announced. Swift let them build SwiftUI, which lets you easily adapt your existing UI code across Apple platforms. They haven’t won all the developer love they could for this, and so not everyone is a strong believer in the SwiftUI strategy that would make Vision Pro a slam-dunk port on day one.
Still, SwiftUI and Apple’s accompanying frameworks, even if you started today, let you build a lot of cool shit quickly and without having to be an expert in all the math required to build a 3D experience.
They’re leveraging other work too, like everything they’ve built for AR on iPhones and iPads.
They’ve been shipping a LiDAR sensor for years.
In other words, while Vision Pro is new, its subcomponents, and their integration, have been quietly battle tested for years ahead of this launch. Despite being a v1, much of the developer tooling that feeds it all is pretty mature.
More than that, there are so many developers who already know how to write this stuff, and they’ve been building for iPhones, iPads and Macs in this language for a decade. That means that Apple has maximal chances of finding the set of experiences that keep people coming for more hologram time.
Nothing is promised. But Apple has played the game well.
What comes next?
By market cap, Apple is a three trillion dollar company.
They had $162 billion in cash sitting around last year. These numbers are unfathomably large. It’s a lot of room to place bets and shape the landscape.
Rumor sites say that by 2027, they’re going to ship a v2 of Vision Pro. (Sometime ahead of that, they’ll ship a cheaper, simpler, non-pro device, having gotten manufacturing yields up on all the crucial components. Happens with consoles all the time.)
Let’s assume that’s true. Three years is a lot of time to let hardware mature. The current device is using M2-grade chips, a configuration dating to 2022’s release. We can already buy M3-based Macs, so you can imagine quite a leap forward in performance for a 2027 iteration. That will make the holographic illusion that much more convincing.
That’s also more time to find ways to change the weight and balance of the device. The battery offloaded to the hip shows how desperate the situation is, and it means that unlike with phones and tablets, energy efficiency won’t translate immediately to lighter weight. They’ll have to find their gains elsewhere.
Still, if they could capitalize a v1, bet that they’re already paying to build the v2. It just takes that long.
And when they ship it, it’ll be less clunky, more comfortable. You’ll last a little longer, go a little deeper, be that much more absorbed by the content than you are distracted by the hardware.
So that’s what they’re doing
They want to build a platform based on spatial volume, not just flat planes.
They want to make it possible to build inside that volume with the same ease you build for the iPhone or iPad.
They’re constrained by the laws of physics and the limits of human technological prowess in computing devices.
And if they make it just good enough that you don’t mind the clunkiness, they get to make a bunch of money. But it only works if developers figure out interesting stuff that creates enormous value in the spatial computing context, such that an hour spent wearing the headset accomplishes things impossible to match with any amount of time outside of it.
It’s a tall order. But I see the mechanics, I see what they brought to the party, and I think they’ve got a better chance than anyone else who has taken a swing so far.
This has already gone on far too long, but my parting thing is: I bet Microsoft is much more capable of competing with Apple for King of Holography than Meta is. I bet they killed HoloLens because they’re rebooting their program. It’s gonna be a personal productivity rematch for these old titans, while Meta sighs and abandons the hardware to finally build a 3D social platform people use.