AI hype meets sobering reality: Companies lower expectations for AI agents, with full automation still years away.

wallstreetcn ·  Nov 5 01:00

Media reports indicate that companies are scaling back from the initial hype surrounding AI agents: while AI chatbots and coding tools have improved efficiency, AI agents capable of “taking over entire jobs” have encountered frequent setbacks during implementation. These agents are not only difficult and costly to deploy but also often produce confidently incorrect outputs, making them unsuitable for critical functions such as customer service and cybersecurity. Many companies have slowed their full automation plans, shifting toward a “human-AI collaboration” model, and now view AI agents as long-term R&D projects with limited short-term benefits. Some tech executives predict that it will take several more years for AI agents to reach true maturity and widespread adoption.

Media reports indicate that AI, through general-purpose chatbots and AI programming tools, is transforming the way people work, driving revenue growth for OpenAI, Microsoft (MSFT.US), and other firms. Companies have also been experimenting with delegating employees' tasks to artificial intelligence agents (AI agents).

However, many businesses have run into difficulties when using more sophisticated AI agents, which often “fail to perform adequately.” As a result, AI providers have had to step in directly, working with clients to troubleshoot issues and prevent AI from “causing disruptions.”

For instance, European retailer Fnac faced challenges while implementing an AI-powered customer service agent. Fnac tested models from OpenAI, Google, and other labs, but the results were unsatisfactory. Olivier Theulle, the company’s Chief Digital and E-commerce Officer, told the media that reliability was an issue: when customers reported defective products, the AI requested serial numbers but then confused them with those of other products whose numbers differed by only one digit.

Fnac generates annual revenues of $100 billion. Theulle stated that the AI agent’s performance only stabilized after partnering with Israeli company AI21 Labs and receiving assistance from its engineers. Ori Goshen, Co-CEO of AI21, said,

“The issue is that the model performs well on various benchmarks straight out of the box, but it does not fare as well in real-world enterprise environments.”

“A significant level of customization is required.”

Some companies told the media that they could only truly benefit after their in-house software engineers spent months deploying AI agents and received direct technical support from AI firms. Tech company leaders now acknowledge that enterprises cannot expect complex AI projects to run smoothly without “hands-on guidance” from AI vendors.

Venture capitalist Vinod Khosla said in an October media interview,

“It’s like saying, ‘We have a race car, and anyone can drive it,’ but ordinary people simply cannot harness the car’s full potential.”

Khosla, an early investor in OpenAI, has recently invested in an AI consulting startup that dispatches engineers to enterprises such as T-Mobile to help them implement AI within large organizations. This startup, Distyl, is just one of many companies emerging in this field, providing high-tech consulting services to businesses in need. OpenAI, Anthropic, Salesforce (CRM.US), Snowflake (SNOW.US), and other AI developers and AI agent providers have also begun hiring forward deployed engineers (FDEs) or launching similar consulting services, though this often increases their costs.

Another example is Cox Automotive, a company specializing in software for car dealerships with annual sales of $9 billion. Previously, the company developed an AI agent to create marketing web pages for dealers. Since Cox Automotive is one of the largest automotive-sector clients of Amazon (AMZN.US) AWS cloud services, it received 'white-glove treatment.'

Marianne Johnson, Chief Product Officer at Cox, told the media that AWS engineers and engineers from Anthropic, who provided the AI technology for the agent, flew to Cox's headquarters in Atlanta and worked alongside Cox’s software developers for several days to build the tool. She declined to disclose how much Cox paid AWS and Anthropic but estimated that the tool could save millions of dollars in labor costs over the next few years, as the company would no longer need to manually create websites for customers.

"It confidently spouts nonsense."

The goal of AI agents is to handle tasks such as customer service issues and IT system management. AI and cloud service providers are banking on revenue generated from enterprises using AI agents as justification for investing hundreds of billions of dollars in building AI data centers over the next year or two.

However, these vendors, along with some client executives, say that AI agents are too difficult to configure and often behave unpredictably, making them unsuitable for tasks where errors could lead to severe consequences. As a result, customers have lowered their expectations, no longer counting on AI agents to automate a wide range of tasks, and have postponed deploying AI agents in critical roles such as customer support and cybersecurity.

For example, the IT services giant Kyndryl (KD.US) began testing Microsoft's Security Copilot earlier this year. This is a chatbot designed to integrate with corporate IT systems and explain potential security vulnerabilities in plain English, effectively automating the work of a cybersecurity analyst. However, Scott Owenby, who is responsible for internal cybersecurity at the company, told the media that when Kyndryl employees tried asking basic questions, such as “Which company devices are running outdated software,” the answers provided by Security Copilot were evidently incorrect. Owenby stated,

"It confidently spouts nonsense, and I admire the confidence, but I can't trust its data."

Kyndryl spent approximately $50,000 testing Security Copilot over six months before deciding to discontinue its use. Owenby said,

"I basically burned $50,000. It's not a lot, and we would have continued using it if it had even been slightly useful, but we didn't expect it to be completely unusable."

Owenby also mentioned that other AI tools perform better, such as Palo Alto Networks (PANW.US) software that can automatically handle repetitive and tedious cybersecurity tasks, such as investigating employee logins from new locations or capturing screenshots of sensitive data. This has allowed him to reduce the size of his security team over the past year. However, he emphasized that human oversight is still necessary to monitor these AI tools, and that they cannot be given full autonomy.

“There is some hype involved.”

Bosch Power Tools, with annual revenue exceeding $5.7 billion, has been testing a chatbot for over a year, according to Florian Haustein, the company's head of digital customer experience. The chatbot is designed to answer customer inquiries about tool usage and troubleshooting.

However, Haustein noted that this chatbot frequently provides incorrect answers, some of which could even lead to user injuries. As a result, the project remains in the pilot stage. He also mentioned that Bosch is testing models from various labs, including Google and OpenAI.

Haustein told the media that Bosch has seen better results with another, less ambitious customer service chatbot, which answers only basic questions, such as where to purchase a specific product. Additionally, an AI tool provided by SAP can read customer inquiries and automatically assign them to the appropriate human employees. Haustein said,

“I think there is some hype around ‘fully AI-driven customer service.’”

“You must ensure that the answers are nearly 100% accurate… but we still see hallucinations and incorrect answers. I don’t think we’ve reached the level of confidence needed for full automation.”

Some technology providers also admit that AI agents are not yet mature. Amazon CEO Andy Jassy said during last Thursday’s earnings call:

“At this stage, building AI agents remains more challenging than anticipated.”

“But over time, much of the value companies derive from AI will come from AI agents.”

Revenue from AI agent products is difficult to calculate.

Currently, the adoption of general chatbots, programming assistants, AI search, and AI video generation tools has helped engineering, marketing, and product management teams improve efficiency, corporate executives told the media.

This has driven new revenue growth for AI vendors: according to the media's generative AI database, 20 AI-native startups led by OpenAI and Anthropic have reached an annualized revenue of $23 billion from AI office applications, compared to almost zero three years ago.

However, it is very difficult to separately calculate the revenue generated by 'AI agents.' In cloud companies such as Google, Microsoft, and Amazon, most of the revenue growth comes from large AI developers like OpenAI, Anthropic, and Meta renting servers, rather than enterprise-level AI applications.

Among enterprise software companies selling AI agents, results have been mixed. Earlier this year, Salesforce reported that its Agentforce product, used for automating sales emails, tracking invoices, and other tasks, generated over $100 million in annual revenue. ServiceNow (NOW.US) stated that its AI software for automating IT service desk tickets is expected to achieve $1 billion in revenue by the end of 2026. However, revenue growth at both companies has slowed in recent quarters compared to most of 2023.

SAP has not yet disclosed AI product revenue separately, but CEO Christian Klein said in this month’s earnings call that AI will bring 'double-digit revenue growth' in the next two years.

Many software companies offering AI agents, including Salesforce, Snowflake, and Xero, are not currently charging for these products. They hope to start charging only after customers truly recognize their value.

Paul Fipps, President of Global Customer Operations at ServiceNow, told the media that clients have recently become less excited about piloting AI capabilities because they have become more pragmatic, starting to consider what tasks AI agents can reasonably automate. Fipps said,

“Over the past 12 to 18 months, due to the rapid pace of generative AI development, many clients actively piloted these AI capabilities, pushing the pendulum to an extreme.”

“Now you see the pendulum beginning to swing back.”

He remains optimistic, believing that as AI agents continue to improve, companies will maintain significant investment in the coming years.

Currently, AI agents have been most successful in the software development field. AI coding agents are becoming a standard part of many companies' engineering teams. However, software engineers still need to review the code generated by AI, as it can make mistakes, meaning tasks cannot yet be fully automated.

"Staying realistic"

Nikesh Arora, CEO of Palo Alto Networks, stated that companies selling AI tools must be cautious not to overpromise how much work AI can automate. He believes it will take several more years before cybersecurity roles can be fully automated.

"We remain realistic; achieving full automation requires more effort, and we must be absolutely certain that when handing operations over to AI, its actions are correct because there are consequences in cybersecurity."

Nevertheless, companies still recognize the benefits brought by AI agents, even if they require 'someone to oversee them.' For example, Cirque du Soleil in Canada is using an AI agent provided by SAP to track invoices from its costume and set suppliers.

When suppliers send emails inquiring about invoice status, the AI agent checks whether the invoice has been processed in the SAP system and drafts a reply email. Previously, the company had two full-time employees handling this task; now, these two have been reassigned to other departments, with only one person needed to review the AI-generated draft before sending it out.

The operational cost of this tool is lower than the salary of a full-time employee, Vice President Philippe Lalumière told the media:

"Sometimes the emails written by AI are not very polite, but suppliers receive responses faster and more clearly, resulting in higher overall satisfaction. We haven't laid off staff because of it, but the productivity improvement is evident."

Meanwhile, other AI agent vendors are also reminding customers to treat these tools as experimental projects rather than investments that yield immediate returns.

Asha Sharma, President of Core AI Product Development at Microsoft, stated last week at The Information’s WTF Summit:

“Think of AI agents as R&D budgets… an investment that will pay off in the next five to ten years.”

“I think we are still in a very early stage… We now have millions of AI agents in production use, but people are still figuring out how to make AI agents truly useful.”

Editor/Joryn

The translation is provided by third-party software.

