Peripheral Visions

More and more, “ambiently intelligent” systems are manipulating the periphery of our senses — could VR teach us to wrest back some control?

What do you perceive at the far edges of your visual field? What sounds do you hear in your attentional background as you read? It’s all a bit fuzzy, right? But you are still picking up something. In the 20th century, this area of peripheral attention became a means to influence behavior, from background music in factories to ambient advertising streaming across urban space. As I explored in Ambient Media (2016), by the 1970s this background media became increasingly personalized, transformed into not just a mechanism of crowd control but a more voluntary practice of private mood regulation.

The new ambient media aspired to be both functional and aesthetically appealing; in Brian Eno’s classic definition, it had to be “as ignorable as it is interesting.” To many, this more personal use of ambient media felt liberating, granting people the freedom to experiment with their own peripheral attention. At the same time, bringing personal choice to background mediation often simply meant more individually tailored ways to meet larger social demands for efficiency, productivity, and emotional control: Some appropriately paced music on headphones might help you get through the workout — or the workday.

In the 21st century, ambient media increasingly leverages ambient “intelligence” — using sensors, machine learning, and other forms of data-driven optimization to regulate environmental features like lighting, temperature, or indeed background music itself. Ambient intelligence comes in a range of flavors. The so-called Internet of Things positions homes, offices, and public space as venues for context awareness, responding to human movement, voices, facial expressions, or even body heat. More direct human interfaces like augmented reality and virtual reality similarly use sensors and data to monitor and transform a person’s immediate sensory surroundings, introducing virtual elements into existing spaces or even substituting for those spaces entirely.

While earlier ambient media sometimes experimented with responsive feedback loops, computing introduced the possibility for far more fine-grained control over environmental sensing and perceptual modulation. For one example of what this now looks like, consider the Retail Store Sentiment Analysis service offered by Amazon’s Rekognition facial-scanning platform. Building on machine-vision research in affective computing, the service purports to “detect emotions like happy, sad, or surprised from facial images,” analyze them in real time, and provide periodic reports on sentiment trends from each store location. In other words, rather than only track explicit customer behaviors, Rekognition treats the facial expressions of the customers themselves as ambient data on in-store emotional trends. The output of this system could become an input to help guide in-store lighting, music, or even product placement, tightening the feedback loop between ambient environment and consumer behavior. None of this requires attention from customers or store employees. In fact, it would probably function most effectively if the apparatus went unnoticed, seamlessly and “calmly” blending into the retail environment.
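To make that feedback loop concrete, here is a minimal sketch of how per-face emotion scores might be rolled up into a store-level report and then mapped to an ambient setting. It is illustrative only and is not Amazon’s implementation: the record shape (a list of emotion labels with confidence scores per face) is an assumption loosely modeled on the output of face-analysis APIs, and the function names, the threshold, and the ambience presets are all hypothetical.

```python
from collections import Counter

# Hypothetical labels treated as "negative" for this illustration.
NEGATIVE_LABELS = {"SAD", "ANGRY"}


def dominant_emotion(face):
    """Return the highest-confidence emotion label for one detected face."""
    return max(face["Emotions"], key=lambda e: e["Confidence"])["Type"]


def sentiment_report(faces):
    """Return the share of faces whose dominant emotion is each label."""
    if not faces:
        return {}
    counts = Counter(dominant_emotion(face) for face in faces)
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()}


def choose_ambience(report, threshold=0.5):
    """Pick a (hypothetical) ambience preset from the aggregated report."""
    negative_share = sum(report.get(label, 0.0) for label in NEGATIVE_LABELS)
    if negative_share >= threshold:
        return "soften_lighting_and_slow_music"
    return "default"


# Made-up sample data: three faces seen during one reporting window.
sample_faces = [
    {"Emotions": [{"Type": "HAPPY", "Confidence": 90.0},
                  {"Type": "SAD", "Confidence": 5.0}]},
    {"Emotions": [{"Type": "SAD", "Confidence": 70.0},
                  {"Type": "CALM", "Confidence": 20.0}]},
    {"Emotions": [{"Type": "SAD", "Confidence": 55.0},
                  {"Type": "HAPPY", "Confidence": 40.0}]},
]

report = sentiment_report(sample_faces)
preset = choose_ambience(report)
```

Nothing in this loop asks for anyone’s attention: faces go in, a preset comes out, and the environment shifts accordingly.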

A system like Rekognition fits easily into the dystopian vision of a total surveillance society. But ambient control will not only come from the top down, imposed by governments and large corporations on unsuspecting individuals. Much like the earlier turn to personal forms of ambient media internalized the principles of peripheral mood regulation, the “smart” automation of our shared perceptual background has a private corollary in virtual reality. While not often thought of in these terms, VR is the result of ambient intelligence put to personal use, reimagining context awareness as a way for individuals to carve out a virtual space of their own. As with ambient media’s personal turn, we should question whether the perceptual freedoms VR claims to offer really allow an escape from more public forms of environmental influence, or whether it might actually serve to wrap a person even more tightly inside a context-aware computational logic.

Virtual reality’s ability to capture both peripheral vision and hearing, paired with its movement tracking and environmental sensors, makes it the most powerful interface yet for building and manipulating a private ambient space. Compared with more environmentally blended approaches like the Internet of Things or augmented reality, virtual reality places uncompromising demands on the user’s visual field, what game designer Theresa Duringer describes as an “absolute retinal monopoly.” You literally cannot look away, short of ripping the display off your head.

Pair this with headphones, and you are perceptually locked in to the device even further. Unlike a book, a television, or a mobile screen, VR doesn’t just ask for a person’s focused attention — it wants your peripheral attention too. The head-mounted display puts as little space as possible between your eyes and the screen, much like headphones remove the gap between your ears and the speakers. Like no other device before it, virtual reality really tries to bring you fully inside an ambient space. I call this peculiar situation the VR enclosure.

In truth, the enclosure is never absolute. VR presents people with two planes of peripheral awareness: the wrap-around audiovisual field produced by the device, and the existing environment around their body — a space glimpsed, if at all, at odd angles through the narrow gap around the nose bridge of the headset. This creates complications, clearly evident in how much work it takes at VR arcades or conference demos to get people in and out of the headsets and keep them from accidentally hitting or running into each other.

While VR promises newfound perceptual freedoms, it simultaneously threatens a loss of perceptual autonomy, because it cuts individuals off from their immediate sensory environment. This can register as particularly offensive in cultures deeply invested in ideals of self-determination and self-control. Perhaps to stave off such worries, VR developers in North America frequently choose instead to trumpet virtual reality’s potential to bring people together, enabling new degrees of mobility, empathy, and interpersonal understanding. In this model, VR cuts you off from the immediate environment only so it can expand your experiential horizons. This idea echoes earlier attempts to assuage social anxieties about solitary media consumption, such as arguments that reading novels or playing video games may actually help a person understand a larger, less mediated world.

In contrast, in Japan some have embraced the VR enclosure as a major part of virtual reality’s appeal. Dentsū’s Comolu VR application, named after the Japanese verb meaning “to withdraw,” aims to provide a space to “withdraw inside your head and think,” and provides monastic VR robes to match. Japanese VR sites and social media posts applaud the ability to use the standalone Oculus Go headset to create an immersive media environment even amid the tight spatial constraints of many urban Japanese homes — to lie down and watch Netflix on a big virtual screen, for example. Other intrepid Go users have ventured out in public with the device, gathering on trains and in restaurants to test how mobile VR might function in public space. Such uses of VR draw directly on the Japanese legacy of the Walkman and the internet-enabled mobile phone, extending their capacity to carve out a private media environment in even the most public of contexts.

However, while the VR enclosure may allow a person to withdraw from the sensory constraints of their immediate physical environment, in exchange it subjects them to a digital interface expressly designed for the tracking and modulation of human perception. One particularly active area of VR display research is a technique called foveated rendering, which saves the really crisp pixels (and the processing power they demand) for where the eye is actually looking. Areas of peripheral vision render with less detail, mimicking the selective focus of the eye. Fixed foveated rendering, as on the Oculus Go, puts the sweet spot in the center of the lens, forcing you to move your whole head to reposition it. More advanced techniques use eye tracking to match high-resolution areas to where your eyes are actually pointed. This means that no matter where you look, the system is always right there with you, recalibrating what appears at the center and the edges of your attention.
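The logic of foveated rendering can be sketched in a few lines. The example below is a simplified illustration, not any headset’s actual pipeline: it divides the display into tiles and assigns each a resolution scale based on its distance from a gaze point. The tile grid, the two radii, and the three detail levels are assumptions chosen for readability. The fixed variant pins the gaze point to the center of the display, while the eye-tracked variant moves it to wherever the (hypothetical) tracker reports the eye is pointed.

```python
import math


def detail_level(tile_center, gaze_point, inner_radius=0.15, outer_radius=0.35):
    """Return a resolution scale for one tile.

    Coordinates are normalized so the display spans [0, 1] on both axes.
    Tiles near the gaze point render at full scale; farther tiles get less.
    """
    distance = math.dist(tile_center, gaze_point)
    if distance <= inner_radius:
        return 1.0   # foveal region: full detail
    if distance <= outer_radius:
        return 0.5   # near periphery: half resolution per axis
    return 0.25      # far periphery: quarter resolution per axis


def foveation_map(gaze_point, tiles_x=8, tiles_y=8):
    """Build a grid (list of rows) of resolution scales for the display."""
    grid = []
    for row in range(tiles_y):
        y = (row + 0.5) / tiles_y
        grid.append([
            detail_level(((col + 0.5) / tiles_x, y), gaze_point)
            for col in range(tiles_x)
        ])
    return grid


def relative_pixel_cost(grid):
    """Estimate pixel work relative to rendering every tile at full scale."""
    levels = [level for row in grid for level in row]
    # Scaling both axes by s shrinks the pixel count by s squared.
    return sum(level ** 2 for level in levels) / len(levels)


# Fixed foveation: the sharp region never leaves the center of the lens.
fixed_map = foveation_map((0.5, 0.5))

# Eye-tracked foveation: the sharp region follows a reported gaze position.
tracked_map = foveation_map((0.2, 0.8))
```

In the fixed map, a tile in the lower-left of the display stays blurry until the wearer turns their head. In the tracked map, that same tile sharpens the moment the eye lands on it, which is only possible because the system continuously knows where the eye is pointed.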

In VR you may be free to look in any direction, but this perceptual autonomy makes you vulnerable to new forms of ambient manipulation. Not only does eye tracking in VR offer a kind of individual signature to identify people, it will also be a boon for advertisers eager to further operationalize peripheral vision. When advertising eventually makes further inroads into 3-D virtual space, the VR enclosure will offer unprecedented potential to monitor and transform almost the entirety of a user’s visual field.

It’s worth remembering that Facebook, the same company that legitimized the current consumer VR boom by purchasing Oculus for about $3 billion in 2014, makes 98 percent of its revenue from advertising. VR may be on a similar trajectory to the internet, shifting from an initial period of experimentation to more privatized forms of platform capitalism, where engaging a user’s attention is not so much the purpose as the product. As virtual data streams merge with the real-world environmental sensing coming from augmented reality and the Internet of Things (such as Apple’s plans for VR in self-driving cars), the reach of the VR enclosure may start to extend far beyond the headset itself.

Yet by demanding such thorough perceptual immersion, the VR enclosure might also allow individuals to experiment with environmental mediation, to critically engage with ambient intelligence rather than simply submitting to it.

VR proponents often argue virtual reality represents a paradigm shift from “narrative” to “experience,” abandoning earlier modes of aesthetic distancing to bring a person fully inside an alternative sensory reality. But it may be that VR’s most important role will be as a space to more directly engage with the environmental data that surrounds us. From this perspective, VR isn’t a tool for immersion in some other place or experience, but an opportunity to be perceptually immersed in mediation itself — to explore it, to tinker with it, and perhaps to invent other ways of using it.

Examples of what such experimentation might look like are beginning to emerge. Rather than insist on full immersion at all times, Tales of Wedding Rings VR (Square Enix, 2018) reserves 360-degree environments for select moments: what in cinematic terms would serve as establishing shots or scenes of heightened emotional intensity. Otherwise it presents most of its 30-minute narrative inside a range of floating manga-style frames. The portion of the world visible in each frame shifts as viewers move their heads, as if looking through a window. Thanks to this more limited spatial focus, viewers have the chance to decide for themselves where to look and how to position themselves vis-à-vis the ever-shifting frames. Along with its black-and-white hand-drawn aesthetic, the window frame approach allows Tales of Wedding Rings VR to reduce visual overload and overall demands on viewer attention, opening up a space for viewers to consider not only the work on view but the process of mediation itself.

To some degree, this also holds true for watching 2-D cinema and television within VR — as noted, one of the prime drivers of the Oculus Go excitement in Japan. Paradoxically, while the head-mounted screen sits just millimeters from the eyes, VR can easily position virtual screens and other virtual media objects at distances and scales implausible or prohibitively expensive in the real world (and relatively unusual in an era of arm’s-length media devices). Size and position are often also user-adjustable, meaning the VR enclosure allows viewers not only to open a significant gap between themselves and the (virtual) screen, but also to consider how distance and orientation mediate the experience.

In this way, the virtual space inside the VR enclosure does provide a kind of sensory freedom, if only the freedom to experiment with sensation itself. VR can allow people to experiment with the algorithmic settings of their enclosure — to adjust lighting, field of view, reverb, background landscape, etc. — and test how their subjective experience transforms. Much like how VR is used to experience and examine architectural projects before they are physically realized, virtual space might serve as a place to learn how to live in a world of context-aware environmental media — to learn how to effectively manipulate it, but also to learn how it can manipulate you. VR can cultivate not just artificial ambient intelligence but human ambient intelligence. By taking over our perceptual field, the VR enclosure further entangles us in environmental controls. But it can also provide a space to re-examine our relationship with ambience itself.

Paul Roquet is an associate professor in Comparative Media Studies at MIT, and the author of Ambient Media: Japanese Atmospheres of Self (Minnesota, 2016).