A Fake Company Staffed Only With AI Agents Was a Total Disaster

An experiment from Carnegie Mellon put AI agents to the test, and they couldn't even accomplish a quarter of their tasks.

AI isn’t ready to take over all the jobs just yet. A recent experiment showed that a fake company run exclusively by AI agents completed, at best, 24% of its assigned tasks.

Since the ChatGPT-fueled AI boom, the business world has been abuzz with talk of AI taking people’s jobs. Even Bill Gates came out and said that, in just 10 years, the majority of jobs will be obsolete.

That time definitely isn’t here yet, though: researchers put the theory to the test, with disastrous results.

Researchers Created a Fake Company Staffed Entirely by AI Agents

The experiment from Carnegie Mellon, reported first by Business Insider, saw professors from the esteemed university creating a fake company run by different AI models from companies like OpenAI, Anthropic, Meta, and Google.

The AI models were then instructed to complete tasks that employees in a small software startup would need to accomplish, like analyzing spreadsheet data, conducting performance reviews, and picking a new office space.

 

The fake company in question, called TheAgentCompany, was then evaluated based on how effective it was at accomplishing these basic tasks, and if you can believe it, the company was an absolute mess.

The Fake Company Staffed by AI Was a Disaster

The experiment from Carnegie Mellon showed that AI simply isn’t ready to take over the business world, with not a single model able to achieve even a moderate level of success when it comes to running a business without human intervention.

Claude from Anthropic was the highest-performing AI model in the research, accomplishing just 24% of the jobs it was tasked with. Other models, like Gemini from Google and ChatGPT from OpenAI, fared even worse, achieving approximately a 10% success rate on requested tasks. The worst-performing AI model was Nova from Amazon, which only accomplished 1.7% of jobs.

To make matters worse, the study found that these AI-powered companies were not only inefficient but problematically expensive, with each task averaging a cost of about $6. Considering each job was averaging approximately 30 tasks to be accomplished, those numbers could really stack up.

Why Isn’t AI Ready for Business Autonomy?

The new era of AI has certainly shown serious potential over the last few years, but there are some clear AI shortcomings that will likely prevent it from running a business on its own for a long time. Because of an inherent lack of common sense, the technology runs into some borderline embarrassing problems when it doesn’t have human intervention.

The clearest example from this experiment was an AI model’s inability to access a file containing important data needed to assign projects to other “employees.” An unexpected pop-up derailed the simple task, and the model resorted to soliciting help from IT before eventually abandoning the task entirely.

Any human would have easily recognized that by simply pressing the small X in the upper right-hand corner of the pop-up ad, you could move on and accomplish the task. But alas, without common sense and a history of internet use, AI was unable to get the job done on its own.

Yes, big tech is investing in AI like crazy, lauding its potential as a complete gamechanger for everyone on the planet. And while the environmental effects will certainly impact all of us, this experiment shows that we’re still quite far from a world where humans are no longer required to keep the wheels from falling off.


Written by:
Conor is the Lead Writer for Tech.co. For the last six years, he’s covered everything from tech news and product reviews to digital marketing trends and business tech innovations. He's written guest posts for the likes of Forbes, Chase, WeWork, and many others, covering tech trends, business resources, and everything in between. He's also participated in events for SXSW, Tech in Motion, and General Assembly, to name a few. He also cannot pronounce the word "colloquially" correctly. You can email Conor at conor@tech.co.