A few weeks ago, Nikki Stevens, Jacqueline Wernimont, and I published an exposé on the use of non-consensually gathered, highly sensitive facial-recognition datasets by the National Institute of Standards and Technology, a part of the U.S. Department of Commerce. To test and refine facial-recognition systems, NIST is using immigrant visa photos, photos of abused children, and myriad other troubling datasets, without the consent of those within them.

One innocuously named set of files stands out: the Multiple Encounter Dataset, or MEDS, which contains multiple mugshots of 380 people, all now deceased, taken during their multiple arrests by the police over their lifetimes. The dataset is meant to “assist the FBI and partner organizations refine tools, techniques, and procedures for face recognition,” NIST offers. In the manner of the U.S. carceral system, this dataset, like those drawn from photos of immigrants, is incredibly biased. Though black people make up just shy of 13 percent of the U.S. population, they make up nearly 50 percent of the MEDS photographs. What is unique about MEDS among the datasets NIST uses is that it is publicly available, uploaded to the institute’s website for retrieval by all and sundry. That also meant that I could download it and look at precisely what photographs NIST is using.

So I did. I downloaded the zip file and I looked through it, with every photo hitting me like a fist. Some crying, some with bloodstained faces, tired faces with gazes that could be read as boredom at a process that was familiar or as communicating nothing, volunteering nothing. Roughly bandaged wounds from their arrests. Tears streaked through stubble (which could be mine) around lip-glossed mouths (which could be mine). Some old and collecting social security, some so young they couldn’t even vote.

The photo that hit me the hardest was highlighted in the “description document” accompanying the dataset. It was a photo of a black man in his 50s, looking away from the camera and mouth agape, in distress. The documentation presented it as a specimen example — what the researchers who assembled the dataset, and who were testing a tool called Stasm, called an “Example of Stasm Output Requiring Manual Editing.”

That’s it. That was the only note. Where I saw a screaming middle-aged man, Andrew P. Founds, Nick Orlans, Genevieve Whiddon (of the Mitre Corporation), and Craig Watson (of NIST) saw an example of the sort of image that requires manual editing to be useful in facial-recognition testing. And edit it they subsequently did, overlaying the “source material” with linked red dots to tell the algorithm how to adjust it to fit its expectations of facial structure.

And I can’t get it out of my head, and I can’t stop asking how exactly it is that these people got to the point where they saw a person in anguish as an “example output.” What is it that let researchers get to this mind-set?

Writing about drone warfare, political scientist Neil C. Renic describes what he calls the “gardener’s vision” of warfare, in which the stronger party loses “the basic recognition that those we fight are fellow moral agents”:

When this bond of shared humanity, victimhood and moral equality is sundered, an erosion of behavioral restraint often follows … what remains is a gardener’s vision of war; a war of pest control.

Renic’s point is applicable not only to war but to any situation where those subject to an action are separated from the actor. Without communication, connection, and empathy, it becomes easy for actors to take on the “gardener’s vision”: to treat those they are acting upon as less human or not human at all and to see the process of interacting with them as one of grooming, of control, of organization. This organization, far from being a laudable form of efficiency, is inseparable from dehumanization.

This kind of dehumanization is not a new problem: large aspects of the state feature it (as when a judiciary is unrepresentative, or a lawmaking process is isolated from its consequences through geography and gerrymandering), and even more are oriented around maintaining it. Activists argue for a more diverse judiciary in part because concerns around racism, sexism, ableism, and other forms of discrimination can seem distant and irrelevant to those not subject to them. But the civil service’s one-size-fits-all, standardized approaches to what constitutes “poverty” or “acceptable forms of identification” are ultimately about organization and abstraction, and consequently treat expediency and efficiency as their own ethical virtues.

As a state action, the existence of the MEDS dataset could be explained by that administrative orientation toward separation: treating those the state interacts with as less than and distinct from the processes of the state itself. But the MEDS dataset stemmed not from a monolithic bureaucracy but from the efforts of a small team of named individuals. Even if its existence could be blamed on state bureaucracy (which obviously holds some of the responsibility here), that would do nothing to explain why so many scientists in public and private institutions have been comfortable reusing this data.

To explain that, one might look at science’s long history of dehumanizing subjects in the name of “objectivity.” Consider what it means that research participants are frequently referred to as “subjects”: On countless occasions over the past few centuries, positivist views of science have assisted scientists in dehumanizing the subjects of their research. This has been assisted by larger-scale societal biases that already treat certain populations as less human. The case of Henrietta Lacks or of the victims at Tuskegee shows how structural and individual racism play a role in dehumanization, but the case of Dan Markingson, a white man who died in 2004 during a psychiatric research study while institutionalized, shows that scientists are perfectly capable of justifying inhumanity without the pretense of race.

The problem has stemmed from the positivist treatment of people as a means to a particular end: “data.” That this data stems from human bodies, lives, and traces is (to too many scientists) an unfortunate corollary, a distraction from science’s purity. As feminist epistemologists have long noted (and Rosi Braidotti details here), those bodies and people are treated as consumable and disposable, a source of data that can often then be discarded, particularly under a capitalist regime.

The scientists preparing and reusing NIST’s data to train facial recognition models may simply be subject to these tendencies and these philosophies — comfortable with screaming, bloodied, and non-consenting research subjects because they are not people to them but abstract sources of abstract data.

But there are signs that this isn’t the only thing that is going on. It appears that data science — of which facial recognition is a part — is altering the “objective” positivist research methodology, and not in ways that produce more humane outcomes.

Science and Technology Studies have talked a lot about the implications of data science for how science is done and technology is developed, both skeptically (rejecting the idea it fundamentally changes science, as Martin Frické does here) and forebodingly (pointing to, among other things, how “the datalogical turn” affects disciplines such as sociology as well as the traditional sciences). In a particularly evocative paper, Dan McQuillan argues that algorithmic, AI-based approaches to science embody what he calls machinic neoplatonism, descended from the “Renaissance Platonism” practiced by such scientists as Copernicus and Galileo.

Neoplatonism as a philosophical approach to science, McQuillan argues, was premised on a belief in “a hidden mathematical order that is ontologically superior to the one available to our day-to-day senses.” It presumed that there is one grand unified reality that could be approached without considering the “instruments” that collected our data about it, be they telescopes or bodies. It has no need for causal explanations, no need to consider how the nature of the scientist or their perspective might alter the results; indeed, it has only incidental need for a scientist at all.

For McQuillan, this philosophy is made manifest in machine learning, which is premised on the idea that algorithms, when fed the right data, spit out the singular hidden truth of how the universe functions in a particular domain, generating not theory (because you can’t necessarily see what the algorithms are doing, or why) but the guidance we need to structure society and reality. A machine-learning system that folds proteins tells you which proteins fit, which should be further explored, but it does not tell you why — you don’t need it to. A machine-learning system that assesses facial matches tells you what faces match but not why. In other words: there’s no theory. Really, there’s no need for a human scientist at all. You just get decisions, outcomes, things you can use.

To me, this machinic neoplatonism is the central difference between traditional scientific approaches and those of data science. While the issues with traditional science remain — among them the idea of a single, objective Truth and the abstraction of data from the people who participate, voluntarily or not — and are magnified, data science puts forward the idea that this “truth” does not have to be known by the scientist and may not even be knowable to us.

This shift renders the consequences of that truth and our experiences of them irrelevant. That an algorithm “works” for any post-hoc purpose becomes a justification for that algorithm, its efficacy and its deployment, and the rationale for collecting even more data to feed it. In pursuit of the one big truth — the truth that only an algorithm can determine — anything is permissible. That “truth” is simply superior; qualitative concerns, subjective concerns such as human cost, bias, or base morality, do not even register as legitimate.

Data science represents yet another degree of abstraction away from the bodies and minds that data comes from, making those bodies and minds even more disposable and interchangeable. The abstraction and dehumanization happen not just through a one-way mirror or epidemiological records, as in conventional scientific experiments, but across time, distance, anonymity, and scale, so that it is even harder for human scientists (if they are involved at all) to see people rather than data sources.

This is especially true when “big data” is premised on getting as many data points (or photos) as possible, fading any individual into just … numbers. A gardener’s view, again, but to a far worse degree because of the myriad additional layers of abstraction — because there may not be a scientist in the loop at all, just an algorithm that demands to be fed.

If machinic neoplatonism — this degree of abstraction and reliance on the model to reveal a singular truth — is a source of data science’s inhumane outcomes, then we should be particularly concerned by projects for “better” AI that ultimately adopt the same techniques, even or especially when they are billed as more ethical and humane. These projects often attempt to engineer ethics — to make algorithms for gauging the ethics of other algorithms.

Such projects abound. The conference on Fairness, Accountability and Transparency (FAT*) often features this kind of work, from modifications to common machine learning algorithms that produce more equitable outcomes to “meta-algorithms” for assessing fairness (computationally proven meta-algorithms, at that!).

It’s safe to say I am generally skeptical about such approaches and have not so gently skewered them in the past, due to the way that they (as Greene, Hoffmann, and Stark put it in their evaluation of the wider movement for ethical AI) “frame ethical design as a project of expert oversight, wherein primarily technical, and secondarily legal, experts come together to articulate concerns and implement primarily technical, and secondarily legal solutions.”

But the problem of the gardener’s vision (the risks that come with abstraction) poses additional reasons to worry. As a prominent case study, take Aequitas. Developed at the University of Chicago, Aequitas is “an open source bias audit toolkit for machine learning developers, analysts, and policymakers to audit machine learning models for discrimination and bias, and make informed and equitable decisions around developing and deploying predictive risk-assessment tools.” Essentially, it’s a machine-learning system for identifying biases in machine-learning systems. After uploading a dataset and selecting “protected groups” (and a mathematical measure of fairness), the system spits out a friendly, human-readable report on the fairness of the dataset and the likely biases of any models trained on it. The example report Aequitas provides, for instance, uses race as a protected variable.
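
To make concrete what “a mathematical measure of fairness” can look like, here is a minimal Python sketch of one common group metric: the ratio of false positive rates between groups. It is not Aequitas’s code or API. The function names (`false_positive_rates`, `disparity`), the record fields (`group`, `label`, `predicted`), and the toy records are all hypothetical, chosen only for illustration.

```python
from collections import defaultdict


def false_positive_rates(records, group_key="group"):
    """Return the false positive rate per group.

    Each record is a dict with a group field, a true `label` (0 or 1),
    and a model's `predicted` value (0 or 1). These field names are
    assumptions made for this sketch.
    """
    false_positives = defaultdict(int)
    negatives = defaultdict(int)
    for record in records:
        if record["label"] == 0:  # only true negatives can be false positives
            negatives[record[group_key]] += 1
            if record["predicted"] == 1:
                false_positives[record[group_key]] += 1
    return {g: false_positives[g] / n for g, n in negatives.items()}


def disparity(rates, reference):
    """Express each group's rate as a multiple of the reference group's rate."""
    ref_rate = rates[reference]
    if ref_rate == 0:
        raise ValueError("reference group has a rate of zero")
    return {g: rate / ref_rate for g, rate in rates.items()}


if __name__ == "__main__":
    # Toy, invented records: four true negatives per group.
    toy = (
        [{"group": "A", "label": 0, "predicted": p} for p in (1, 0, 0, 0)]
        + [{"group": "B", "label": 0, "predicted": p} for p in (1, 1, 0, 0)]
    )
    rates = false_positive_rates(toy)
    print(rates)
    print(disparity(rates, reference="A"))
```

Everything the sketch knows about a person is three fields. How the records were gathered, and what a false positive costs the person it lands on, never enter the calculation.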

On the face of it, this tool is useful, providing a quick and easy way of evaluating (un)fairness as a red flag for possible injustices. But crucially, it does not do so through direct mechanisms — engaging with the populations impacted or interrogating how the data was gathered. It does so through a model; it reduces fairness itself to a matter of abstract mathematics, to a Truth, replicating precisely the “gardener’s vision” that makes machine-learning systems so risky in the first place.

While Aequitas and tools like it endeavor to reduce injustices, they do so by introducing an additional level of abstraction: Users and developers are now not only expected to take the attitude that the model they are developing has primacy, but also that any issues with it can be clarified and resolved by giving primacy to Aequitas, further foregrounding algorithmic systems and separating the developers and users from those subject to the algorithm’s outcomes. Aequitas does not alter the balance of power to favor algorithmic subjects but simply redistributes power between different groups of developers further and further away from the consequences of their actions.

This is not to say that the tool cannot be useful; it is fundamentally useful to highlight early warning signs. But there is no reason an algorithmic system should be seen as a better way of doing that than talking to one’s users or seeing what the algorithm’s consequences are in the social context you intend to deploy it in. Along with the overall attitude of “leave it to the experts” that Greene et al. found, and the gardener’s vision these projects implicitly depend on, algorithmic approaches to ethics and equity risk enabling further inequity, detaching developers from their victims and further centering machine learning as a singular source of a singular truth.

In exploring ways around machinic neoplatonism within data science, McQuillan has some great suggestions: adapting feminist and postcolonial critiques and methods, and explicitly building “antifascist AI,” an approach I empathize with. But for this project to be successful, it must be critical and critique-able; it must be accompanied by ethical forms and monitoring that do not themselves replicate and depend on the same systems of abstraction, power, and inhumanity that feminist and postcolonial methods seek to unravel. AI ethicists themselves must be willing to decenter their own perspectives, decenter computational approaches, and adopt the same interrogations of power and contextuality McQuillan exhorts for AI itself. Only then can we avoid being both source of, and ultimately victim to, the gardener’s vision.