A Fake Company Staffed Only With AI Agents Was a Total Disaster

An experiment from Carnegie Mellon put AI agents to the test and they couldn't even accomplish a quarter of their tasks.

Published on April 28, 2025

AI isn’t ready to take over all the jobs just yet, with a recent experiment demonstrating that a fake company run exclusively by AI agents would have, at best, a 24% rate of success.

Since the ChatGPT-fueled AI boom, the business world has been abuzz with talk of AI taking people’s jobs. Even Bill Gates came out and said that, in just 10 years, the majority of jobs will be obsolete.

That time definitely isn’t here yet, though: researchers put the theory to the test, with disastrous results.

Researchers Created a Fake Company Staffed Entirely by AI Agents

The experiment from Carnegie Mellon, reported first by Business Insider, saw professors from the esteemed university creating a fake company run by different AI models from companies like OpenAI, Anthropic, Meta, and Google.

The AI models were then instructed to complete tasks that employees in a small software startup would need to accomplish, like analyzing spreadsheet data, conducting performance reviews, and picking a new office space.

The fake company in question, called TheAgentCompany, was then evaluated based on how effective it was at accomplishing these basic tasks, and if you can believe it, the company was an absolute mess.

The Fake Company Staffed by AI Was a Disaster

The experiment from Carnegie Mellon showed that AI simply isn’t ready to take over the business world, with not a single model able to achieve even a moderate level of success when it comes to running a business without human intervention.

Claude from Anthropic was the highest performing AI model in the research, accomplishing just 24% of the jobs it was tasked with. Other models, like Gemini from Google and ChatGPT from OpenAI, fared even worse, achieving approximately a 10% success rate on requested tasks. The worst performing AI model was Nova from Amazon, which only accomplished 1.7% of jobs.

To make matters worse, the study found that these AI-powered companies were not only inefficient but problematically expensive, with each task averaging a cost of about $6. Considering each job was averaging approximately 30 tasks to be accomplished, those numbers could really stack up.
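To see how those numbers stack up, here is a minimal back-of-the-envelope sketch. It takes the two approximate figures quoted above at face value (about $6 per task and about 30 tasks per job); the function name and constants are illustrative and do not come from the study.

```python
# Approximate averages quoted in the article; not exact study values.
COST_PER_TASK_USD = 6.0
TASKS_PER_JOB = 30


def estimated_job_cost(cost_per_task: float, tasks_per_job: int) -> float:
    """Estimate the cost of one job as cost per task times tasks per job."""
    return cost_per_task * tasks_per_job


job_cost = estimated_job_cost(COST_PER_TASK_USD, TASKS_PER_JOB)
print(f"Estimated cost per job: ${job_cost:.2f}")
```

On those assumptions, a single job would cost roughly $180, whether or not the AI agent actually completes it.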

Why Isn’t AI Ready for Business Autonomy?

The new era of AI has certainly shown serious potential over the last few years, but there are some clear AI shortcomings that will likely prevent it from running a business on its own for a long time. Because of an inherent lack of common sense, the technology runs into some borderline embarrassing problems when it doesn’t have human intervention.

The clearest example from this experiment was the AI model’s inability to access a file containing important data needed to assign projects to other “employees.” An unexpected pop-up derailed the simple task: the AI resorted to requesting IT help and eventually abandoned the task entirely.

Any human would have easily recognized that simply pressing the small X in the upper right-hand corner of the pop-up would let them move on and accomplish the task. But alas, without common sense and a history of internet use, AI was unable to get the job done on its own.

Yes, big tech is investing in AI like crazy, lauding its potential as a complete gamechanger for everyone on the planet. And while the environmental effects will certainly impact all of us, this experiment shows that we’re still quite far from a world where humans are no longer required to keep the wheels from falling off.
