The Next Big Cheap

Calling data “the new oil” takes its exploitation for granted

Onstage at the November 20 Democratic debate, presidential candidate and Universal Basic Income evangelist Andrew Yang used one of his precious minutes of speaking time to casually claim that “data is the new oil,” and that we need to create a “WTO for data” to help wrestle it under control. Yang’s statement continues the hackneyed but irrepressible tradition of talking metaphorically about data, which is “the new oil” unless it’s “the new nuclear waste” or, weirdly, “the new bacon.”

The prevalence of data metaphors has spawned its own subfield of meta-commentary. Scholars Cornelius Puschmann and Jean Burgess survey big-data metaphors and pull out “data as a force of nature” and “data as a resource” as the main throughlines; Sara M. Watson contrasts industrial data metaphors with embodied metaphors, and Irina Raicu summarizes the meta-summaries. Accruing like dust bunnies in the corners of our discourse, data metaphors proliferate for good reason: If we can hit on the right analogy to describe how data functions (in the world, in the economy) we might be better equipped to legislate its use, capitalize on its promise, and mitigate its harms. If it’s oil, tap it. If it’s soil, grow things from it. If it’s nuclear waste, bury it in the desert for a thousand years and be very fucking careful not to splash it on your clothes.

Data, in these examples, generally refers to “big data”: large sets of data that are collected and analyzed for use in applications like predictive and behavioral analytics. The term “Big Data” was originally used in the 1990s to describe data sets that were too large or complex to be dealt with by traditional data processing software. In this era of massive computing power, where analysis of vast data sets can be performed with standard software on any laptop, data can be aggregated, shared, sold, and repurposed for applications far beyond what we expected when we initially signed up to digitally log our jogging routes or store our photos in the cloud. Kate Crawford and danah boyd propose that, today, “Big Data is notable not because of its size, but because of its relationality to other data. Due to efforts to mine and aggregate data, Big Data is fundamentally networked.” (Big) data’s value “comes from the patterns that can be derived by making connections” between data points, be those data about individuals, online interactions, the movement of objects in space, or the growth of plants.

The desperate hunger with which companies pursue and collect data, combined with data’s interconnected, shapeshifting nature, indicates that data is more than just a new product class or “the new X”: it looks like a new frontier. At the frontier, people and natures that were previously uncapitalized are turned into things that can be extracted, traded, and used to create profit, often with a huge human and environmental cost. While the “new oil” metaphor points towards some of these risks, calling data “the new X” misses the bigger point: it takes for granted the transformation of the world into commodities for use and exploitation, a process that isn’t natural and shouldn’t be inevitable.

Borrowing a term from Marxist geographer Jason Moore, I propose that data is the new big “cheap thing” — the new commodity class that is emerging to reshape the world and provide a new arena for accumulation and enclosure. Following Erich Hörl, whose essay “The Environmentalitarian Situation” briefly mentions data as a potential new entry in Moore’s litany of “cheap things,” I want to explore how framing data as a new cheap thing — rather than “the new oil” or “the new soil” or “the new nuclear waste” — gives us a way of looking directly at the process by which things become available for use and profiteering. Thinking about data in line with other cheap commodities throughout the history of capitalism might help us imagine better frameworks for its management and regulation, and provide models for how to successfully push back against the capture and exploitation of yet another aspect of our lives and the world that sustains us.

Despite its common invocation as a gushing and unruly force of nature, “cheap data” is not a natural resource: No resources are natural. Coal, says Moore, is just “a rock in the ground.” “Only under definite historical relations,” of both power and (re)production, “did coal become fossil fuel.” It is the becoming resource, more specifically the becoming cheap resource, that turns a “rock in the ground” or, in our case, a set of networkable data points, into a new commodity that can change the way the world works. In his books Capitalism in the Web of Life and (with Raj Patel) A History of the World in Seven Cheap Things, Moore argues that this maneuver, the absorption of lives and “resources” into capitalist systems, is central to the history of capitalism.

For something (coal, data points, human life) to be born anew as a commodity, it first needs to be separated (conceptually, often physically) from the context in which it is embedded. The rise of capitalism, says Moore, was concurrent with the first big separation: The conceptual cleaving of “nature” from society. The human/nature binary is a false one, of course. Humans and our systems — social, economic, ideological — have always been enmeshed with “nature,” and the two constantly co-produce each other in what Moore calls “the web of life.” The separation that made nature available for cheap use was an act of rhetorical violence, reconfiguring nature as a non-human domain that encompasses not only trees and mountains but also (in a massive act of exclusion) Indigenous and colonized people, slaves, and most women. By separating “nature” from “society,” the colonizers and conquistadors of the early modern world created a new set of relations that conceived of nature as a “free gift,” available for appropriation and exploitation. This isn’t a new idea — “all production is appropriation of nature” is straight from the Grundrisse — but Moore’s contribution here is to develop the idea of the “web of life” and of “cheapness” as central to the appropriative maneuver.

Nature is only the first in a series of “cheap things” through which capitalism has shaped the modern world. Cheapness, Moore and Patel write, “is a strategy, a practice, a violence that mobilizes all kinds of work — human and animal, botanical and geological — with as little compensation as possible.” The cheapening of nature meant that trees, minerals, and fish were remade as independent entities available for harvest and collection, with little attention to the enmeshment and interdependence of humans and these “natural” resources. Cheap nature allowed for accumulation and profit generation, and when the rate of profits slowed, “cheap money” — massive loans and low interest — provided opportunities for expansion and further exploitation of nature’s resources. Moore and Patel chart a course through a series of additional “cheaps”: cheap work performed by Indigenous laborers, slaves, and exploited wage workers; cheap care provided by women and domestic servants that enabled labor power to be reproduced; and then cheap food, cheap energy, and (more abstractly) cheap lives, each required by the previous and enabling the next. In a cheap world, “capitalism transmutes these undenominated relationships of life-making into circuits of production and consumption,” leaving a legacy of destruction and dispossession.

Which brings us to cheap data. Just as data wasn’t always “big,” it wasn’t always cheap enough to accumulate like giant fatbergs in AWS’s digital sewers (data is the new fatberg). Governments, corporations, and institutions have long collected large data sets and wielded them as a tool of power, but those data weren’t nearly as interconnected, accessible, or easy to analyze as they are today. The transformation of data into “cheap data” required massive computing power, algorithmic accuracy, and cheap storage. Each of these was built on the backs of other cheaps: cheap energy (from fossil fuels), cheap money (often from Silicon Valley), cheap labor, and cheap nature (in the form of extracted minerals and metals) were all enlisted in the development of powerful and omnipresent computing technology used to transform data from just a collection of info points into an omnipresent strategy for profit making. This litany of enabling conditions didn’t conjure cheap data into existence. But I suspect that they created an imaginative fissure through which a new frontier could be glimpsed.

Frontiers are essential spaces in the history of capitalism. When the old methods of accumulation and profit have been tapped out, frontiers open up new arenas of existence to “cheapening” and extraction. Sociologist Wilma Dunaway describes frontiers as “zones of incorporation” where “noncapitalist zones are absorbed into the capitalist world-system.” With their often-abundant resources or entirely new life-worlds to incorporate, frontiers are, per Jason Moore, “places where the new cheap things can be seized — and the cheap work of humans and other natures can be coerced.” By separating a new “resource” from the web of life, frontierism provides a way to fix capitalism’s crises without changing any of the extractive practices that created the crisis in the first place. And so, when labor costs rise in China, T-shirt manufacturers shift production to Vietnam, or Bangladesh, or wherever the next frontier of cheap textile labor can be found. Frontiers fix the problem, and capitalism can continue at pace.

Frontier-thinking is a core tenet of the tech industry, and the language of the frontier is baked into tech discourse. Tech journalists consistently describe new areas of tech investment or market creation as “frontiers.” Jeff Bezos’s annoying plans to establish and fund space colonies are purportedly inspired by Gerard K. O’Neill’s 1976 book The High Frontier. Seasteader Patri Friedman (grandson of Milton) laid out his own case for the frontier in the libertarian blog Cato Unbound, writing “Only by starting with a blank slate can you make a better structure without having to overcome entrenched interests… Historically, the frontier has functioned as this canvas for experimentation.” A 2011 McKinsey report explicitly describes big data as “The next frontier for innovation, competition, and productivity.” While these writers and entrepreneurs may toss off the “frontier” metaphor without much thought, seasteading, space, and contemporary big data all function as (often literal) zones of incorporation where new cheap things can be seized and cheap resources can be mobilized.

What’s at risk when data is the next “big cheap”? With other “cheap things” like work, care, or nature, we might imagine a past (or future) where they exist in a non-alienated way within the web of life, highlighting the danger and tragedy of their cheapening. Big data’s emergence, however, was concurrent with its commoditization. As soon as big data became a possibility, it was cheapened, swallowed up and forced into service: Big data never existed as a commons on which we peasants could graze our electric sheep. Despite this difference, today’s emerging data ecosystem gives us some indication that the consequences of “cheap data” will follow the trajectory of other cheap things, enabling the continued and expanding subjugation of people and the environment in the name of growth and profit.

Cheap data is a new kind of frontier. Rather than moving outwards — westward, to the sea, into space — the cheap data frontier is an overlay, positioned on top of other spheres of life in order to siphon their juices. In this way, a second resource can be extracted from the people and natures already cheapened by capital. At the cheap data frontiers, industrial workers (cheap labor) like those working in Amazon fulfillment centers are tracked and monitored, doing double time for employers who profit from their labor while also accumulating screeds of data about the movement of their bodies in space, their time spent per task, and their response to incentives. Friends and families provide uncompensated but necessary social support (cheap care) for one another on digital platforms like Facebook, helping maintain social cohesion and reproducing labor forces while also producing waterfalls of valuable data for the platform owners. This magic trick, where cheap data is gleaned as a byproduct of different kinds of cheap work, is a great coup for capital and one more avenue for extraction from the rest of us. If, as Moore says, new “cheaps” emerge as strategies that allow capitalism to survive crises, then the overlaid frontier of cheap data helps solve the “crisis” of stagnant productivity and growth by enlisting all kinds of existing labor and care into service as data producing machines.

Shoshana Zuboff, in her book The Age of Surveillance Capitalism, describes the data that is sloughed off of other kinds of human activity as “behavioral surplus.” For Zuboff, it’s not data that is the new zone of extraction and exploitation, but rather human experience itself. Her concern is that we will become zombified servants of “surveillance capitalism,” a new and worse version of capitalism which aims to predict and modify our behavior in service of market objectives. The rise of cheap data, though, is not limited to data on human behavior. While Google and Facebook are indeed working to manipulate our clicks and purchasing habits, data are also being collected about everything from the movements of machinery to the growth of plants and the rate of interest. These data are used in pervasive and diverse ways (to train machine learning systems like GANs, or to predict weather, manage populations, and create new markets) that shape the world well beyond our lives as consumers. Evgeny Morozov notes that, in isolating “human behavior” as the domain of extraction and control, Zuboff limits her argument to a critique of “surveillance,” leaving capitalism itself curiously unexamined.

The “behavioral surplus” model and the metaphors that describe data as flowing, cascading, and generally spilling from us as we move through the mediated world also elide the ways in which the production of cheap data often requires concerted and tedious labor. So, while we freely upload thousands of images of our faces and families and pets which are then scraped from the web by platform owners or under Creative Commons license terms, these images often need additional tagging or categorization in order to be useful for commercial purposes (“Images do not describe themselves,” write Kate Crawford and Trevor Paglen). This is where cheap work reenters the picture.

The digital piecework of casualized workers like those contracted by Amazon’s Mechanical Turk has been essential for building the cheap data repositories that underlie many AI systems and research projects. ImageNet, the most significant image database used for visual object recognition software development, relied on MTurk workers to sort and tag millions of images, which now comprise a dataset used for everything from military research to corporate projects by companies like IBM, Alibaba, and SenseTime, who provide technology used by Chinese officials to track and detain minority Uighur populations. Recent research has highlighted the stress and horror experienced by precarious workers in the digital factory, who annotate images of ISIS torture or spend their days scanning big social platforms for hate speech and violent videos. As with all cheap things, cheap data relies on massive externalities, the ability to offload risk and harm onto other people and natures, while the profits all flow in the opposite direction.

Harm to human workers is just one of the “externalities” produced in the pursuit of cheap data. The cheap energy required for training AI models and transferring massive amounts of data to and from the “cloud” is less visible than exploited human workers, but its cumulative effects are huge. Research suggests that the energy required to train a single AI model may produce carbon dioxide emissions equivalent to five times the lifetime emissions of an average car. Similarly, the hardware needed to run all these models and collect all this data requires significant amounts of precious metals and new plastic in its construction. Cheap nature is called back into service, along with more cheap labor to extract and process it into the fiber optic cables and Ring doorbells and computer keyboards that sense, collect, and connect data. The abstracted nature of the environmental harm produced in the pursuit of cheap data contributes to what I call “technocapitalist sacrifice zones,” out-of-sight arenas of extraction and refuse that are permanently damaged as products and profits are extracted for use elsewhere.

What happens when cheap data becomes less cheap? If regulations are passed enforcing higher wages for precarious data workers or increased privacy controls, the “behavioral surplus” on which whole industries have been built becomes harder to tap. The history of cheap things gives reason to believe that data extraction will then push further and further into new and cheaper zones and frontiers. This process has already begun, with the offshoring of digital piecework and with big tech companies and foreign-owned startups alike setting up shop throughout the “Global South” in order to capture new markets and glean data from whole new population segments. Scholars Ulises Mejias and Nick Couldry explicitly call out this model of data collection as “data colonialism,” a new iteration of the colonial extraction that exploited and oppressed Indigenous people for centuries.

Even when cheap data is proposed as a humanitarian solution to a problem like poverty or labor abuses, the way cheap things work and the ownership of the data systems by capital often mean that the virtuous promises are undercut. Already, aid organizations are driving an inadvertent program of data extraction (or “surveillance humanitarianism”) in countries where they operate, requiring biometric data and accumulating massive data sets in the interest of efficiency and fraud reduction. These programs can have unintended consequences, with minor discrepancies in databases causing chaos for displaced or otherwise marginalized people, and activists rightly worry about the potential for data leaks and commercialization. Dennison Bertram writes about the way seemingly benign initiatives like the blockchain traceability startup he worked on (ostensibly designed to reduce illegal labor and get better prices for agricultural producers) provide only nominal benefits to the commodity producers while massive caches of valuable data go to the system owners. “Blockchain-powered supply chain startups like our own were promising farmers marginal increases in value,” he writes, “while simultaneously extracting data as [an] entirely new natural resource.”

If we accept that data is the new “cheap thing,” it is clear that the established models for regulating and monitoring data collection and use will be insufficient for the scope of the problem. Commentators and politicians like Andrew Yang revert to the “new oil” metaphors in part as guideposts for how to deal with the unruly nature and uneven distribution of data wealth. If “data is the new oil” then perhaps the citizens from whom data is extracted can get a share of the eventual profits, in the way that Alaskan residents receive an oil dividend each year. If data is a forest to be logged, then researchers Luke Stark and Anna Lauren Hoffmann suggest that we might require Google and Facebook to be better “stewards” of our data forests, sustainably “managing the resources” we provide them by adequately compensating their moderators, banning Nazis, and encouraging a better level of discourse.

But what if we don’t want the forest under corporate ownership at all? As Moore and Patel point out, “many of today’s politics take as given the transformation of the world into cheap things.” In the wake of the financial crash, liberal organizers campaigned for the improved regulation of housing markets, a compromise when what had been surrendered to cheap finance was housing itself. Unions fight for a $15-an-hour minimum wage, which is laudable and necessary, and yet wholly insufficient in a country where the entire “future of work” is up for grabs and available for perpetual reshaping and unbundling at the whims of Silicon Valley and corporate restructuring.

So, what would it look like to reject the regime of cheap data, and bring data — the bits of life we coproduce from our bodies with our technologies — back into the web of life? Can we “decolonize data” or reclaim a “data commons,” especially when big data itself is the direct product of previous appropriations of cheap natures? There are at least a few projects that are pushing back against the corporate data regime in genuinely radical ways that tackle the root of the problem (capitalism), not just its manifestations (surveillance).

The Indigenous Data Sovereignty movement, founded by scholars and activists from Australia, New Zealand, Canada, and the United States, uses principles from the United Nations Declaration on the Rights of Indigenous Peoples to contest the rights and abilities of governments and global corporations to collect and profit from Indigenous data. The UN principles are coupled with frameworks based on the cultural principles and worldview of each Indigenous group, which are often innately opposed to private ownership and depersonalized data. Te Mana Raraunga, New Zealand’s Māori Data Sovereignty Network, advocates for the self-governance and control of all Māori data ecosystems, accurate minimum metadata requirements reflecting that all data has whakapapa (genealogy), and collective and community data rights. If these kinds of data frameworks gain traction, they could prove a major headache for companies and governments that rely on the fungibility and reusability of data for their operations or business models.

These demands that Indigenous peoples retain sovereignty over their own data, refuse to let it be stored by AWS or reused without their consent, and re-inscribe it with Indigenous principles point towards an alternative data future in which data is slower, smaller, and less alienated. In this future, some kinds of data collection and use may be abolished entirely, as Ruha Benjamin suggests for algorithms and surveillance that amplify racial hierarchies, while other kinds of collection may continue, but in a less-networked way that is controlled and decided by the communities to whom the data pertain.

Full data sovereignty could not take place in isolation. It would ideally be part of a “reparation ecology,” which Moore and Patel discuss as a process of radical reparations that weighs historic injustices and redistributes care, land, and work, resacralizing human relations within the web of life. This is a big task, but a necessary one. Because cheap things don’t stay cheap forever, and the ongoing cheapification of big data will require an ever-expanding appropriation of land, labor, and human life. We can’t afford it.

Kelly Pendergrast is a writer, researcher, and curator based in San Francisco. She works with ANTISTATIC on technology and environmental justice, and she writes about natures, visual culture, and laboring bodies.