Ambient Cruelty

The ability to ruin a stranger’s life is a feature, not a bug, of consumer rating systems

It is a truism, backed with some evidence, that negativity makes a person seem smarter. In the 1980s, Harvard researcher Teresa Amabile took two pieces of literary criticism from the New York Times’ book reviewing section — one positive, one negative — and showed them to 55 students. The students found the writer voicing negative opinions much more intelligent and persuasive than the one voicing praise. In fact, it was the same reviewer, and the two pieces of criticism were adapted versions of the same review. John Stuart Mill wrote, “I have observed that not the man who hopes when others despair, but the man who despairs when others hope, is admired by a large class of persons as a sage.”

In part, this is because blame is actually more rare than praise. In the past 50 years, cross-cultural studies have demonstrated a phenomenon referred to as linguistic positivity bias: human speech is studded with words like “great,” “adorable,” and “amazing,” while words like “dreadful,” “ugly,” or “terrible” show up less frequently. It may be that people use language primarily as a means of drawing closer together, which raises the frequency of words that create a feeling of community. Negative words stick out because they are not the norm, and this in turn signals to readers or listeners a person who is setting themselves apart from the group.

For this reason, negativity as a tonal choice not only lends an air of discernment, but brims with expressive opportunity: the diction of dissatisfaction offers its own satisfactions. On Twitter, a winning persona blends quotidian venting with cultural critique. Hating on things can scratch an individual itch or put a finger on shared experiences — there’s a bond in hating the same stuff, as evidenced by the popularity of the “Gopher Gripes” segment on the Gimlet podcast Reply All (the spiritual heir of a long succession of indie-media rant lines). Yelp offers a platform for individuals to denounce bad service, whether creatively or simply self-righteously. Of the services I use most frequently that are also the most universally hated, Greyhound buses seem to inspire some inventive criticism. In 2012, a user called Sonia B. typed the following ode:

Greyhound, Greyhound
You’re not that fast,
If you were in a race you’d probably finish last.

I use you sometimes when I’m going far,
Even though your service is kind of sub-par.

But when I consider gas prices these days,
You really often are the cheapest way.
(Especially the advance webfare).

I will hazard a guess and say Sonia B. probably doesn’t get a lot of opportunities to publish her verse in the traditional press. This is a major factor that motivates Yelp reviewers: not necessarily to express passionate opinions about products and services, but to express themselves period, and to make themselves visible to others. In a 2014 Fast Company interview, Yelp’s vice president said, “if you’re writing great things on Yelp, you know that a lot of people are going to read them. You’re going to have a voice. You’re going to have a megaphone. Yelp is that megaphone.”

More megaphones means more opportunities to emote. But amplifying negative expression has serious consequences in the contemporary gig economy. The freedom to vent feels empowering, but when unleashed on a reputation-based labor market, where a widespread reliance on reviews and ratings is the primary monitor of quality assurance, negative self-expression allows users of apps like Uber or TaskRabbit to enjoy the benefits of an arbitrary power of punishment free of guilt. By emphasizing the user’s “right” to have their opinions heard, and to dissatisfaction with any less-than-perfect “experience,” platforms encourage users to be cruel without feeling cruel. Normalizing negativity creates a slush fund of data that employers can use at their discretion against employees.

If Yelp offers an opportunity for a quasi-literary or artistic self-expression, apps like Uber or Ziosk (an app that encourages customers at chain restaurants to rate their service) ask users to express themselves primarily through numbers, sometimes accompanied by potted comments that can be selected from a list. Users don’t request an opportunity to fill out ratings; rather, the company prompts or requires user feedback, as if one’s satisfaction or dissatisfaction is the company’s top concern.

Under ordinary circumstances, it’s likely that most people getting a ride from A to B or ordering chili cheese fries wouldn’t be focused on how the interaction makes them feel at all. But ratings don’t just measure emotional responses: they create them. This is called the mere-measurement effect, a sociological quirk in which being asked for an opinion makes people drum one up on the spot. The effect is powerful enough to change behavior — in one study, simply being asked if they intended to vote made students more likely to do so. As users get used to being asked for their opinion on services, they are likely to develop stronger opinions. The market transaction itself — the exchange of french fries or a ride for money — fades into the background, while the affective experience of consuming the product or service looms larger.

Yelp differs from Uber and Ziosk, of course, in another significant way: The audience for Uber and Ziosk ratings is the company and the worker rather than the public at large, which makes the experience less like using a megaphone and more like using a mandatory snitch line. Yelp is a third-party platform, while Ziosk and Uber directly control whether workers keep their jobs. Yet the exercise of converting one’s feelings into commentary is basically the same, and without the textual element, these feelings are more quickly converted into hard data. In June, Buzzfeed reported that information gathered from Ziosk was being used to reduce shifts for poorly rated waitstaff. A few bad reviews on Uber can influence riders not to book a particular driver, and can get a driver kicked off the platform, even if the ratings are simply the result of mistaken judgment about whether riders are in the mood to chat.

As an expressive mode, feeling by numbers is highly idiosyncratic. To one person, a “three” might seem average; to another, a three is an insult. Personally, were I to rate all my day’s experiences, I’d always find a grain of discontent — that borscht recipe wasn’t quite sour enough, that text exchange could have been more satisfying, the view from my office window would be nicer with more trees. Five seems like a number for a perfect world, and I always feel that things could be better. I’m pretty sure this doesn’t even make me a pessimist — I expect my experiences to be about average, which should translate to a three.

And yet many platforms have established five — the highest number, from which no improvement is possible — as the standard workers should be maintaining. As a server commented in Buzzfeed’s article on Ziosk, “The company only counts fives as good scores … Everything else is basically a complaint.” For Uber and Lyft, the deactivation threshold can vary from district to district, but drivers often report a 4.6 or 4.7 as the minimum required to keep their jobs. New users don’t necessarily know this — it’s counterintuitive for perfection to be considered normal — so a driver’s ratings can paradoxically be more heavily influenced by inexperienced users. (Drivers also rate riders, but because stakes are so asymmetrical — drivers can arbitrarily lose their livelihood, while riders may simply lose out on a convenient ride — the semblance of parity is an illusion.)

Assessors aren’t there to interpret what users mean — the goal (unlike in, say, a psychology survey) isn’t really to understand the feelings of participants. I might think being solicited for my opinion means that I’m being invited to talk about myself — it feels personal, as if the company wants to get to know me. However, my answers to questions about how I feel aren’t being used to better understand me as a person; they’re being used to fire workers who make the company look bad. It’s an odd bait-and-switch: ratings assure me that my subjective experience is the most important thing, and yet, at the same time, their aggregation enables me to pretend my complaints don’t have serious consequences. I can tell myself that a driver will only get fired if enough other people also give them a low rating; at the same time, I’m being nudged into accepting the underlying premise that the power to destroy someone’s livelihood through ratings and reviews is a legitimate — even just — part of the consumer experience.

As the user, I come to believe that services should be tailored to me and my preferences: I can rate a driver down for playing music I don’t like. This power allows me to see myself as the kind of quirky individual tech companies market themselves as serving, while also elevating my own opinions to the level of quality assurance benchmarks. In my starring role as customer and de facto manager, my power over the worker is complete: I am not so much expressing an opinion about the quality of a service as giving the worker permission to keep their job.

Last year the Guardian’s digital etiquette columnist, Emma Brockes, wrote about her experience as a customer looking to book someone to put up shelves over TaskRabbit: “I found myself rejecting handymen left, right and centre because I didn’t like the look of them. One was wearing a suit, which I thought protested too much; one looked a bit shifty; I didn’t like the goatee on another. This was clearly ridiculous, and extremely vulnerable to racial and other biases.” Users become accustomed to putting workers through an idiosyncratic filter — no goatees, no suits — as though every casual laborer were a song on a playlist or a book on a shelf. If my likes and dislikes are all indulged, everything can become a curated reflection of my tastes. Given sufficient control, perhaps I can scrub goatees from my world entirely. It’s a cruelty that doesn’t feel like cruelty; it just feels like a right to choice.

But as Brockes points out, feeling justified in rejecting workers based on superficialities skids quickly into serious harm. In the same way that teaching evaluations replicate racism, sexism, and other forms of discrimination in the way students rate their professors, platforms that give the customer power to select workers proffer the opportunity not only to indulge bias but to make their world into a gated community — to experience only that which is comfortable and friction-free. The ability to curate one’s experience to this degree comes to feel like a natural right.

A 2016 study out of Northeastern University in Boston found that Black and Asian workers offering their services on TaskRabbit and Fiverr received lower ratings than white workers; they also found racial bias in TaskRabbit’s recommendation algorithm. A New York Times article from earlier this year documents a Black female driver’s experience with a racist passenger who yelled epithets at her, and how complaining to Uber simply earned her a generic response (“We’re sorry to hear about this. We appreciate you taking the time to contact us and share details.”) On a platform like TaskRabbit, where customers scroll through photos of the many casual workers they might hire, users don’t have to consciously select workers on the basis of race — they are being given permission to make decisions with financial repercussions for others based on a split-second gut feeling. The mechanism allows customers to discriminate without seeming to themselves to be doing so, and to feel justified in indulging their own moods and prejudices.

There is a paradox inherent to expressions of negative emotion: it can feel good to say something mean, and yet for most people, it feels bad to hurt others. Being negative may make me seem smart, but it also makes me seem mean, and thus unlikeable. Some people are consciously and overtly racist and sexist, but many others wield these prejudices unconsciously; few people want to believe their selection or rating of workers is racist. Rating workers on subjective criteria, with only one’s feelings as a guide, means users don’t have to articulate, even to themselves, clear reasons why the worker didn’t meet their expectations in providing a comfortable and affirming experience. By giving users an outlet for negativity in prompting them to rate or review their experiences, these platforms disguise the consumer’s financial power over workers as simple self-expression, and solicit a kind of expression people may seldom be given social permission to exercise.

When people use online megaphones for their judgments, they are also engaged in constructing narratives as a form of self-creation. Since customers are the heroes of their own stories, the journey into the marketplace takes on a Manichean tint. A collaboration between researchers from Stanford and Carnegie Mellon universities resulted in a 2014 study of the type of narrativizing that occurs in online reviews of restaurants. Among other interesting findings (cheaper foods are described using the language of drug addiction, while people describe expensive foods in the language of sex), the study found that very negative reviews of restaurants shared significant linguistic markers with another type of self-expression: trauma narratives.

“Shortly after a disaster or tragedy people experience emotional upheavals and obsessive thoughts and feelings,” the study’s authors wrote. “In this phase they share these thoughts and feelings with others, including strangers, and the phase is marked by expressions of collectively shared grief, in which people seem to emphasize their belonging to groups.” Writers of very negative reviews likewise use words like ‘we’ or ‘us’ more frequently, and past tense verbs like ‘told,’ ‘said,’ or ‘did.’ Where a simple comment on food might demand no more complicated narrative structure than “Great salmon,” the verbs in negative reviews indicate that the reader is being told a story. “One-star reviews are narratives of negative emotion, stories about something bad that happened involving what other people said and did,” the authors write. Restaurant reviews in which people sound traumatized by perceived injustice don’t tend to comment much on the food; it’s usually the perception of being treated rudely or uncaringly that seems to have pushed people into processing by writing out their feelings in a public forum.

How did consumers get the idea that pleasing service is a matter of justice? There is no fundamental human right to a short wait time, a smile, or a perfect evening out. “Good service” doesn’t have a universally agreed-upon definition, and yet customer service representatives are expected to meet every customer’s expectations as a matter of principle. Customers come to take any service that doesn’t give them the feelings they want as a personal insult. When the customer’s money doesn’t produce the hoped-for experience, this starts to provoke shock and moral outrage rather than simple disappointment. In displaying the language patterns of victims, reviewers illustrate how access to this form of expression can feel good — even healing — while drastically mischaracterizing who is harming whom.

Servers, of course, are already often working in an environment with little guarantee of fair labor practices or stable employment. It’s not enough to do the job well — restaurant servers don’t control how long it takes for the kitchen staff to prepare meals, or what those meals taste like, but customers rating their servers on Ziosk may give a low rating as a result of a long wait or a dish they didn’t enjoy. The customer is the expert on their own “experience,” and managing that experience becomes the server’s real job. When the weightiest measures of performance are not tied to the real value of a service or product, but are instead reflections of how an experience made a customer feel, workers with whom the public interacts gain an additional affective workload, and come in for the heaviest criticism. My power of self-expression becomes a power of punishment, and I am able to enjoy the privileges of both while acknowledging only the former.

As more users become aware of the consequences of bad reviews, an etiquette has evolved. In a 2018 study examining an (unnamed) online marketplace in which freelance laborers are matched with people looking to hire remote gigging workers, researchers from New York University’s Stern School of Business found a high rate of reputation inflation — in a period of six years, the percentage of laborers being rated five stars shot up from 33 to 85 percent. In another of her columns, Brockes responded to a Guardian reader’s question about whether they could give an Uber driver a bad review (the car smelled bad, the seat belt didn’t work, and the driver talked on their phone while driving). Brockes wrote: “to my mind this is relatively straightforward. You received bad service and you are a complete monster if you go ahead with a bad rating.” It has become widely known that a five-star rating is the equivalent of tipping: obligatory rather than optional.

Tipping can be embarrassing — it’s a moment when the customer demonstrates their power to withhold part of the worker’s wage arbitrarily. And yet, the server is obliged to say thank you, as if the customer were handing over something extra, a gift. Disguising the decision not to cause harm as a benevolent gesture normalizes an ambient cruelty while making the person wielding power seem generous. In the same way that the expectation of tipping means servers are (in some jurisdictions) paid less than minimum wage, the reputation layer of the gig economy builds the social exchange of ratings anxiety into our conception of work, just as “likes” and other attention metrics build it into our social lives. Work is subtly reframed as socializing, so that pleasing people is the job. Precarious laborers are further encouraged to see themselves as vulnerable — to feel lucky they are allowed to work at all. Ratings aren’t about creating information for users; they’re about creating paranoia for workers.

In the Victorian era, consumer power was posited as a way to align consumers with workers: Purchases could be mobilized for the good of society through boycotts aimed at commodities like cotton or sugar produced through the labor of enslaved people. Consumer leagues and cooperatives later sprang up to agitate for a minimum wage and safe working conditions. When channelled into collective dissent attacking prevailing systems of injustice, dissatisfaction and negative expression are powerful tools for good. Yet corporations have co-opted this potential, reducing consumer power to punitive capability. By offering widespread customer feedback opportunities, companies are able to visit the consequences of any consumer dissatisfaction onto workers.

The repercussions of using ratings as a widespread way to monitor workers go beyond specific apps or platforms; they chip away at our sense of responsibility towards one another. Through online ratings systems, consumers are being encouraged to see the cruelty of arbitrary punishment as a benefit they can enjoy. The consumer power they are encouraged to embrace asks them to review the worker rather than the company, and to focus on the minutiae of their own comfort and satisfaction rather than the labor conditions under which services are produced.

The roles of customer and worker are hardly fixed; most of us occupy both of these, and an economy in which more and more affective labor is required to stave off punishment is empowering only for companies managing through paranoia and fear. As customers are ever more closely aligned with bosses in everyday transactions, we work against our own collective interests.

Linda Besner’s most recent book is Feel Happier in Nine Seconds. Her poetry and nonfiction have appeared in the New York Times Magazine, the Boston Review, the Globe & Mail, and Enroute, and aired on CBC Radio. She lives in Montreal.