Seeing Without Looking

In May of 2020, Sidewalk Labs announced it was abandoning its ambitious, eponymous, neighborhood-sized smart city project in downtown Toronto. The company, a small urban development-focused sister company to Google under the Alphabet umbrella, cited the “unprecedented economic uncertainty” brought on by the Covid-19 pandemic as the reason for its withdrawal. Last December, Sidewalk Labs founder and CEO Dan Doctoroff announced that he had been diagnosed with ALS, and was stepping down from the company. Sidewalk itself was atomizing, with some of its products — including a sensor kit and a data collection suite for parking utilization — being absorbed into Google. A small flock of spinoffs, including a mass timber construction company, a municipal transport planning firm, and an investment firm focused on infrastructure planning development, will all be housed within Alphabet or kept close through strategic partnerships.

While the marquee project has folded, the materials left behind by Sidewalk Toronto remain a valuable archive, a hyper-documented trove of operating theories and common sensibilities of civic technologies. Examining these dead projects is more than just an exercise in recent history. Particularly in the case of Sidewalk Toronto, the all-out persuasive public performance and the acres of media coverage the project inspired marked a key moment in setting public expectations about technology. Picking apart the promises and predictions of these endeavors can reveal the unexplored consequences of what people think a technology can do, which is different from what it actually can do but just as important. Though the project, and now the firm, has shattered, its ideologies have afterlives. They’ve slipped out into the wild, under the guise of so many new firms with ironically ungoogleable names (Replica, Canopy Buildings, Affordable Electrification).

Sidewalk is dead. Long live Sidewalk.

When it was announced in 2017, the Sidewalk Toronto project attracted a great deal of attention, both positive and negative. Critics of the project, including myself, were quick to point out the substantial risks to privacy that seemed baked into the project’s basic operating logics. Depending on whom you asked and when you asked them, the proposed neighborhood, the first built “from the internet up,” would occupy a plot of land that was either 12 acres or 880 acres of brownfield waterfront property in downtown Toronto. Sidewalk boasted this neighborhood would be “over-provisioned” with cameras, temperature and air quality sensors, microphones, pressure sensors, and other monitoring systems strewn through public and private spaces. With this “ubiquitous sensing” system, Sidewalk promised that everything from the ambient temperature of buildings to crosswalk timings to the assigned uses of public and private spaces could be “optimized.”

Sidewalk Toronto was, overall, intended as more of an urban technology proof-of-concept, or research and development lab, than a planned neighborhood. The data-fed systems Sidewalk proposed to develop for and with the neighborhood were meant to be sold, later, to other cities and municipalities. The data collected on its residents would be used to do more than optimize the immediate environment; it would be used to develop and tune a whole suite of urban technology products.

Privacy activists were not fans of this proposal. How would this data be analyzed? Who would own it? Where would it be stored? What laws would govern its use? How long would it be kept? Who would be responsible for keeping it secure and the people it described safe? Could residents or visitors opt out?

When Sidewalk Toronto published its 1,500-plus-page Master Innovation and Development Plan in early 2019, it tried to address these questions and critiques with a specific computer vision system. Described as a major component of the neighborhood’s sensing apparatus, the combined camera-computer vision-analysis system would collect raw footage, extract pertinent data, de-identify that data, produce an actionable analysis, and delete the source footage, all within the box body of the camera itself. This computer-vision-system-in-a-box could be installed on existing infrastructure, like utility poles or interior beams, using a “KOALA mount.” (So-called because of its theoretical ease of installation, as a koala clings to a tree branch.)

This contraption was presented by Sidewalk Labs as a privacy hedge. If the original recorded images were destroyed after analysis — a privacy auditor was even proposed, to observe only “very low amount of data leaving a computer-vision camera” — then surely there was no privacy violation.

Privacy is not simply concerned with being observed, or even being observed unawares. Privacy extends, as Warren and Brandeis famously described it, to the right “to be let alone,” to be free of untoward or illicit or improper influence. Further, for a community and civil society to function, it must maintain, in the Levinasian sense, the “response-ability” of all parties; that is, it must protect and preserve their ability to meaningfully respond to their own circumstances. Sidewalk Toronto’s proposed remedy saw the potential privacy violation merely in the unlikely escape or misuse of images. Its response indicated it had no sense of the ways the collection, analysis, and deployment of that data and its derivatives were themselves intrinsically the violation. What Sidewalk proposed preserved its ability to interfere with those interpolated in its data regime while destroying the “response-ability” of those individuals by destroying the basis for that response: their images.

Sidewalk’s proposed devour-digest-destroy optical system was the exposed eye of its surveillant data regime. Like other recent smart city projects, Sidewalk Toronto relied on the idea of data, specifically Big Data, as a governance mechanism, using sensors to reveal the habits, preferences, and personal truths of residents and visitors, and establishing rules and regulations for the neighborhood based on these inputs. As documented by many observers including Zeynep Tufekci, danah boyd and MC Elish, big data projects like these make several key claims in their persuasive narratives: that big data analytics are an accurate, objective way to comprehensively model real people and real populations; and that this modeling provides users (that is, those utilizing the data set, not those interpolated within the data set) with the necessary information to make better choices for and about those implicated individuals. These analytical and modeling functions are achieved through a process (very broadly speaking) of observing, recording, categorizing, and drawing relationships of correlation and inferences between sets of observations and actions.

Susan Sontag’s 1977 book On Photography provides an incisive framework of critique, not just for image-based surveillance regimes, but other big data-based systems. Sontag’s key critiques of photography — its techno-scientific claims to objectivity; its claims to comprehensiveness; its intrinsic promotion of an acquisitive perspective on the outside world; and the way these aspects are conditioned by its ubiquitous, mundane availability — are all applicable to current data regimes, whether or not those regimes make primary use of photographic images.

Photographs, Sontag notes, are thought to be “more authentic” than other more discursive ways of interpreting the outside world, such as literature, “because they are taken to be pieces of reality.” According to Sontag, this has the effect of reversing the polarity of reality: the photographic image supplants the reality of the “real world” such that “it is reality which is scrutinized and evaluated, for its fidelity to photographs.” Photographs are judged as “pieces of reality” due to their (theoretically) unmediated nature: the technoscientific situation of the camera, the film, the light, the development process, all combine to produce a product animated by “reality” rather than one animated by human senses or thought.

Surveillant big data regimes make the same inherent claim: a person reveals themselves through incidental, collectible, observable, crunchable actions. It is these data points, taken in aggregate, that should be considered as “pieces of reality,” not messy discursive encounters. The technoscientific situation of the camera, and the faith placed in it as a realistic renderer, is the same as the technoscientific situation of sensors and analytics suites and the faith placed in them. Even critics maintain this faith in the big data situation, when they claim that the bias of these systems can be remedied by adjusting or expanding their data inputs, that is, the most identifiably human aspects of the system.

Of course, photography, computer vision, and big data systems are all deeply human-mediated systems. However, each has a history — and in photography’s case, one that stretches back nearly 150 years — of intentionally obfuscating the human hand in the black box. Peter Buse has noted that Polaroid’s slogan, “The camera does the rest,” which came into use in the 1970s, was an adaptation of a much older Kodak advertising slogan dating from 1888: “You press the button… and we do the rest.” The “we” here stands in for a factory-based, labor-intensive process that was functionally opaque to the customer. Customers would mail their cameras to the Kodak factory, where workers would unload the exposed film, develop it, make prints, reload the camera with new film, and mail the whole thing back. Both slogans fence the camera-user away from the work of the production of photography, either in the black box of the camera, or the black box of the factory.

Similarly, as Thomas Smits and Melvin Wevers have noted, the presentation of hyper-modern computer vision sections off huge amounts of human labor from the audience observing those systems or those interpolated within them. Sidewalk Toronto’s KOALA computer-vision camera adds further steps of obfuscation between the data subject (the person in front of the camera), the human labor that makes computer vision possible (the image tagger, the click worker, the engineer, etc.), and those trying to use these systems as bureaucratic, mundane tools of governance. It is a black box that spawns more black boxes.

Sontag observes that since the ubiquitization of photography, the average experience of the world outside of one’s immediate geographic and temporal environs is primarily mediated through photographs. It is impossible to judge those photographs against the “reality” they capture, as the photograph is our only “evidence” that the reality was, in fact, real. Part of the critical nature of this “realistic reality” as rendered through photographs is its stability, and further its momentary stability: what Sontag calls “the insolent, poignant stasis of each photograph,” which “keeps open to scrutiny instants which the normal flow of time immediately replaces… Life is not about significant details, illuminated with a flash forever. Photographs are.”

It is the impossibility of seeing and examining these fractions of time and place without photography that makes photography a valuable and necessary, though circular, technological epistemic medium. Photography makes detailed instants available, which it declares revelatory and uniquely valuable; thus, we must have more detailed instants available, which are revelatory and uniquely valuable.

In addition to the arresting of moments, photography makes possible the collection of such moments, and the arrangement of them together with each other. Sontag is sharply critical of this mode of photography. “By furnishing this already crowded world with a duplicate one of images,” she says, “photography makes us feel that the world is more available than it really is.” She points out that this acquisitive, lepidopterist mode, where a completed inventory can and therefore must be made, is not confined to personal memory-making. It transfers to other domains, such as the bureaucratic and naturalistic, where taking “specimen after specimen, seeking an ideally complete inventory, presupposes that society can be envisaged as a comprehensible totality.”

In data systems, this comprehensive presupposition lends itself to a paranoid preoccupation with prediction and anticipation. All this data is being collected, after all, so that correct anticipatory choices can be made easily and precisely, not reactively and slowly. For Sontag, photographs and the practices of ubiquitous photography flatten the world, creating an illusion of collectible, available comprehensibility. Big data regimes and their ubiquitous collection requirements have the same effect.

Sontag sees the world of images as doubling the world of reality; and the photographed world gives the illusion of a reality “more available than it really is,” one that can be taxonomized, acquired, and grasped wholly. Here, the “availability” of the information is merely a matter of pursuing the correct technological innovations to uncover it. Data — and the images collected to generate that data — double reality in the same way. Moreover, the world of data doubles is preferred: easier to interact with, easier to access, easier to model and manipulate.

A photograph constitutes a violation, and capture, of its subject, “by seeing them as they never see themselves, by having knowledge of them they can never have; it turns people into objects that can be symbolically possessed.” The accuracy of the knowledge photography provides in this case is irrelevant. In Sontag’s analysis, the subject of the photograph has been usurped from their creative place at the center of their own story. Their own subjectivity, their ownership of themselves, has been dented by the satisfied knowledge-seeking of the photographer and the photograph. The photograph is not simply a document. It contains the psychological impact of surveillance itself. Data collection and production, through cameras and other sensors, similarly is not simply an information bit. It is an artifact of surveillance, of usurpation. It is the testimony on the subject that supersedes the testimony of its subject. As reality is judged against photographs, so is reality judged against, and expected to accord with, data.

Though Sontag is clear in her belief that photography has damaged our relationship with reality, cheapening and flattening the world into a completable, collectable, consumable set, she is reconciled to its presence. “Reconciled” may be too strong a word; it is more that Sontag concedes that the image-world created by photographs and the ubiquity of the camera is here to stay. On Photography ends with a call to moderation, as one would moderate an appetite: a making do, a making peace. She gestures at a “conservationist remedy,” where reality and the image-world are balanced with each other and neither is allowed to dominate.

Today, the ecology of images has flourished further and deeper than Sontag observed, but the counterpoint with reality that Sontag imagined has faltered. Proposed systems like Sidewalk Toronto’s ever-observing KOALA camera risk becoming self-referential and self-sustaining: images and the data derived from images, about images, and for images dominating the social, computational, and global spheres. “Smart” systems for bodies, buildings, homes, and municipalities elevate cameras, sensors, and the data they manifest to the level of directive coaches, superintendents, or governors. Programmatic actions are taken, or nudges issued, based on closed epistemic loops seeded with those original disappeared images.

Sontag saw cameras as engines that produced an image-ecology that replaced reality. Now, big data systems and their sensing and analyzing apparatuses similarly produce their own reality to regulate and predict, one that lies stiflingly heavy on our own as well.

Sidewalk Toronto’s response to critics, disappearing those initial camera images, both exacerbates the privacy violation and multiplies it. All that is left is a self-referential, recursive ecology of data, referring back to the images it has provided to itself and then destroyed. Those of us on the outside of this system (that is, its subjects) would be left only with the outward-facing manipulations it produces, with nothing but recursion to hang our response on. Sontag’s critiques of photography are fundamentally about how photographic images change how we interact with the world. Ironically, Sidewalk shows that the image does not have to persist to provoke that change.