Alexej Savreux, a 34-year-old in Kansas City, says he’s done all kinds of work over the years. He’s made fast-food sandwiches. He’s been a custodian and a junk-hauler. And he’s done technical sound work for live theater.
These days, though, his work is less hands-on: He’s an artificial intelligence trainer.
Savreux is part of a hidden army of contract workers who have been doing the behind-the-scenes labor of teaching AI systems how to analyze data so they can generate the kinds of text and images that have wowed the people using newly popular products like ChatGPT. To improve the accuracy of AI, he has labeled photos and made predictions about what text the apps should generate next.
The pay: $15 an hour and up, with no benefits.
Out of the limelight, Savreux and other contractors have spent countless hours in the past few years teaching OpenAI’s systems to give better responses in ChatGPT. Their feedback fills an urgent and endless need for the company and its AI competitors: providing streams of sentences, labels and other information that serve as training data.
“We are grunt workers, but there would be no AI language systems without it,” said Savreux, who’s done work for tech startups including OpenAI, the San Francisco company that released ChatGPT in November and set off a wave of hype around generative AI.
“You can design all the neural networks you want, you can get all the researchers involved you want, but without labelers, you have no ChatGPT. You have nothing,” Savreux said.
It’s not a job that will give Savreux fame or riches, but it’s an essential and often overlooked one in the field of AI, where the seeming magic of a new technological frontier can overshadow the labor of contract workers.
“A lot of the discourse around AI is very congratulatory,” said Sonam Jindal, the program lead for AI, labor and the economy at the Partnership on AI, a nonprofit based in San Francisco that promotes research and education around artificial intelligence.
“But we’re missing a big part of the story: that this is still hugely reliant on a large human workforce,” she said.
The tech industry has for decades relied on the labor of thousands of lower-skilled, lower-paid workers to build its computer empires: from punch-card operators in the 1950s to more recent Google contractors who’ve complained about second-class status, including yellow badges that set them apart from full-time employees. Online gig work through sites like Amazon Mechanical Turk grew even more popular early in the pandemic.
Now, the burgeoning AI industry is following a similar playbook.
The work is defined by its unsteady, on-demand nature, with people employed by written contracts either directly by a company or through a third-party vendor that specializes in temp work or outsourcing. Benefits such as health insurance are rare or nonexistent — which translates to lower costs for tech companies — and the work is usually anonymous, with all the credit going to tech startup executives and researchers.
The Partnership on AI warned in a 2021 report that a spike in demand was coming for what it called “data enrichment work.” It recommended that the industry commit to fair compensation and other improved practices, and last year it published voluntary guidelines for companies to follow.
DeepMind, an AI subsidiary of Google, is so far the only tech company to publicly commit to those guidelines.
“A lot of people have recognized that this is important to do. The challenge now is to get companies to do it,” Jindal said.
“This is a new job that’s being created by AI,” she added. “We have the potential for this to be a high-quality job and for workers who are doing this work to be respected and valued for their contributions to enabling this advancement.”
A spike in demand has arrived, and some AI contract workers are asking for more. In Nairobi, Kenya, more than 150 people who’ve worked on AI for Facebook, TikTok and ChatGPT voted Monday to form a union, citing low pay and the mental toll of the work, Time magazine reported. Facebook and TikTok did not immediately respond to requests for comment on the vote. OpenAI declined to comment.
So far, AI contract work hasn’t inspired a similar movement among the Americans quietly building AI systems word by word in the U.S.
Savreux, who works from home on a laptop, got into AI contracting after seeing an online job posting. He credits the AI gig work — along with a previous job at the sandwich chain Jimmy John’s — with helping to pull him out of homelessness.
“People sometimes minimize these necessary, laborious jobs,” he said. “It’s the necessary, entry-level area of machine learning.” The $15 an hour is more than the minimum wage in Kansas City.
Job postings for AI contractors refer both to the allure of working in a cutting-edge industry and to the sometimes-grinding nature of the work. An advertisement from Invisible Technologies, a temp agency, for an “Advanced AI Data Trainer” notes that the job would be entry level with pay starting at $15 an hour, but also that it could be “beneficial to humanity.”
“Think of it like being a language arts teacher or a personal tutor for some of the world’s most influential technology,” the job posting says. It doesn’t name Invisible’s client, but it says the new hire would work “within protocols developed by the world’s leading AI researchers.” Invisible did not immediately respond to a request for more information on its listings.
There’s no definitive tally of how many contractors work for AI companies, but it’s an increasingly common form of work around the world. Time magazine reported in January that OpenAI relied on low-wage Kenyan laborers to label text that included hate speech or sexually abusive language so that its apps could do better at recognizing toxic content on their own.
OpenAI has hired about 1,000 remote contractors in places such as Eastern Europe and Latin America to label data or train company software on computer engineering tasks, the online news outlet Semafor reported in January.
OpenAI is still a small company, with some 375 employees as of January, CEO Sam Altman said on Twitter, but that number doesn’t include contractors and doesn’t reflect the full scale of the operation or its ambitions. A spokesperson for OpenAI said no one was available to answer questions about its use of AI contractors.
The work of creating data to train AI models isn’t always simple to do, and sometimes it’s complex enough to attract would-be AI entrepreneurs.
Jatin Kumar, a 22-year-old in Austin, Texas, said he has been doing AI work on contract for a year, since he graduated from college with a degree in computer science, and that it gives him a sneak peek at where generative AI technology is headed in the near term.
“What it allows you to do is start thinking about ways to use this technology before it hits public markets,” Kumar said. He’s also working on his own tech startup, Bonsai, which is making software to help with hospital billing.
Kumar, a conversational trainer, said his main work has been generating prompts: participating in a back-and-forth conversation with chatbot technology that’s part of the long process of training AI systems. The tasks have grown more complex with experience, he said, but they started off very simple.
“Every 45 or 30 minutes, you’d get a new task, generating new prompts,” he said. The prompts might be as simple as, “What is the capital of France?” he said.
Kumar said he worked with about 100 other contractors on tasks to generate training data, correct answers and fine-tune the model by giving feedback on answers.
He said other workers handled “flagged” conversations: reading over examples submitted by ChatGPT users who, for one reason or another, reported the chatbot’s answer back to the company for review. When a flagged conversation comes in, he said, it’s sorted based on the type of error involved and then used in further training of the AI models.
“Initially, it started off as a way for me to help out at OpenAI and learn about existing technologies,” Kumar said. “But now, I can’t see myself stepping away from this role.”