Krista Pawlowski recalls a pivotal moment that shaped her views on the ethics of artificial intelligence. As a worker on Amazon Mechanical Turk, a platform where businesses hire individuals for tasks such as data entry and validating AI outputs, Pawlowski spends her time reviewing, assessing, and fact-checking AI-generated text, images, and videos.
Approximately two years ago, she accepted a job categorizing tweets as racist or not, working from her dining room table. When she encountered a tweet stating, “Listen to the Mooncricket song,” she nearly clicked “no” before researching the term “Mooncricket,” only to discover it was a racial slur against Black Americans.
“I sat there contemplating how many times I might have made the same error without realizing it,” Pawlowski reflected.
The possible enormity of her own mistakes, alongside those of countless other workers like her, plunged Pawlowski into a troubling line of thought. How many others have unwittingly overlooked offensive content or, worse, allowed it through?
After years of observing the inner workings of AI systems, Pawlowski has made a personal decision to refrain from using generative AI products, and she has advised her family to do the same.
“In my house, it’s off-limits,” Pawlowski said regarding her teenage daughter’s use of tools like ChatGPT. When meeting people socially, she encourages them to question AI about topics they are knowledgeable about. This way, they can identify AI’s inaccuracies and appreciate how fallible the technology is. Each time Pawlowski looks at a new set of tasks available on the Mechanical Turk platform, she wonders if her actions might inadvertently harm others, and her answer is consistently “yes.”
Amazon said that workers have the discretion to select tasks and can review task details before accepting them. According to the company, requesters define the specifics of each task, including estimated time, payment, and the level of instruction provided.
“Amazon Mechanical Turk serves as a marketplace connecting businesses and researchers, known as requesters, with workers who perform online tasks, including labeling images, answering surveys, transcribing text, and reviewing AI outputs,” explained Amazon spokesperson Montana McLachlan.
Pawlowski isn’t alone. Twelve AI evaluators, responsible for verifying the accuracy and reasoning behind AI responses, told the Guardian that after recognizing how often chatbots and image generators get things wrong, they began cautioning friends and family against using generative AI altogether, or at least advised them to approach it warily. These evaluators work on a range of AI models, including Google’s Gemini, Elon Musk’s Grok, and other popular technologies, as well as some lesser-known bots.
One Google evaluator, who assesses responses generated by Google Search’s AI summaries, said she avoids using AI whenever possible. She expressed concern about how the company handles AI responses to health-related queries and requested anonymity to avoid professional backlash. She observed colleagues assessing AI-generated medical responses without critically evaluating them, and said she herself had to rate such queries despite lacking medical qualifications.
At home, she restricts her 10-year-old daughter from using chatbots. “Without critical thinking skills, she won’t be able to determine if the information is valid,” the evaluator stated.
“Ratings represent just one of many aggregated data points that inform us about our systems’ performance, but they do not directly affect our algorithms or models,” Google clarified in a statement. “We have implemented comprehensive safeguards to ensure that high-quality information is provided across our products.”
Bot watchers raise concerns
These individuals constitute a global workforce of tens of thousands dedicated to making chatbots more human-like. While assessing AI’s responses, they strive to prevent the dissemination of incorrect or harmful information.
However, when the people responsible for making AI appear credible trust it the least, experts say that points to a more fundamental problem.
“This suggests a tendency to prioritize product launch and scaling over thorough testing, and that the feedback from evaluators is often disregarded,” said Alex Mahadevan, director of MediaWise at Poynter, a program focused on media literacy. “So, if you observe the finalized versions of chatbots, expect to encounter similar mistakes. This can be troubling for the general public increasingly looking toward LLMs for news and information.”
AI professionals are skeptical of the models they work on because the companies building them often prioritize fast turnaround over quality. Brook Hansen, an AI worker on Amazon Mechanical Turk, said that while she distrusts generative AI conceptually, she also holds reservations about the organizations creating and deploying these tools. A significant turning point for her was realizing how little support is given to the people training these systems.
“We are expected to enhance the model, but often face vague or insufficient instructions, little training, and unrealistic deadlines,” stated Hansen, who has been involved in data work since 2010 and contributed to training some of Silicon Valley’s leading AI models. “If employees lack the necessary information, resources, and time, how can the results be safe, accurate, or ethical? The disparity between expectations and the actual support provided is a clear indication that companies prioritize speed and profit over responsibility and quality.”
Experts point out a fundamental flaw in generative AI: an inability to refrain from providing answers when none are available, often delivering false information assuredly. A NewsGuard audit of the top ten generative AI models, including ChatGPT, Gemini, and Meta AI, found that non-response rates dropped from 31% in August 2024 to 0% in August 2025. Simultaneously, these chatbots were found to be more likely to disseminate misinformation, with the rate nearly doubling from 18% to 35%. None of the companies responded to NewsGuard’s request for comment at that time.
“I don’t have any faith in the accuracy of the bot. [It] lacks ethical integrity,” said another Google AI evaluator, who sought anonymity because of a non-disclosure agreement with the contracting firm and who, like the first evaluator, warned against using AI for sensitive medical or ethical matters. “This is not an ethical robot. It is merely a robot.”
“We joke about [chatbots], wishing we could get them to stop falsifying information,” remarked an AI trainer who has worked with Gemini, ChatGPT, and Grok and who requested anonymity due to a non-disclosure agreement.
Garbage in, garbage out
Another AI evaluator, who began assessing Google’s products in early 2024, found themselves doubting the AI’s credibility after six months. Tasked with identifying the model’s limitations, they had to pose all kinds of questions to Google’s AI.
“I probed into Palestinian history, but regardless of how I rephrased my questions, I received no answers,” recalled this individual, who preferred to remain anonymous due to a non-disclosure agreement. “When asking about Israeli history, however, the AI readily provided extensive information. We reported this inconsistency, but Google seemed uninterested.” Google did not respond when asked specifically about the matter.
For this evaluator, the primary concern is the quality of the feedback that raters like them give to AI models. “After witnessing the poor quality of data intended for training the model, I realized it was utterly impossible to train it effectively under such conditions,” they said, invoking “garbage in, garbage out,” the programming principle that poor or incomplete inputs inevitably produce faulty outputs.
This evaluator said they refrain from using generative AI themselves and actively advise friends and family not to buy new phones with integrated AI, to resist automatic updates that add AI features, and to withhold personal information from chatbots.
Fragile, not futuristic
Whenever discussions of AI arise, Hansen reminds her audience that AI isn’t magical, emphasizing the invisible workforce supporting it, the unreliability of its information, and its negative environmental impacts.
“When you analyze how these systems are constructed, with their biases, rushed timelines, and constant compromises, you cease to see AI as an advancement and begin viewing it as fragile,” explained Adio Dinika, who studies the workforce behind AI at the Distributed AI Research Institute. “In my experience, those fascinated by AI are typically those who lack a deep understanding of it.”
The AI workers who spoke with the Guardian said they are committed to making better choices themselves and to raising awareness in their communities. As Hansen put it, AI “doesn’t guarantee the best information; the value lies in those working with the AI.” She and Pawlowski presented at the Michigan School Boards Association spring conference in May, addressing a room of school board members and administrators from across the state about the ethical and environmental ramifications of artificial intelligence, in hopes of fostering dialogue.
“Many attendees had never considered the human labor and environmental costs associated with AI, so they were astonished by our insights,” Hansen revealed. “While some appreciated the perspective, others pushed back, claiming we were being ‘hopeless and bleak’ about a technology they deemed exciting and filled with potential.”
Pawlowski compares the ethics of AI to those of the textile industry. In an era when consumers were unaware of how inexpensive clothing was produced, they were happy to find bargains. As stories of sweatshops emerged, however, consumers learned they had choices and responsibilities. She believes a similar awakening is needed in the AI sector.
“Where does the data originate? Is this model developed from piracy? Were the contributors fairly compensated for their efforts?” she asked. “Often, the truth remains obscure to the public, as we are only beginning to inquire. But change is feasible if we persist in questioning and advocating for better practices, just as it was in the textile industry.”
