Salesforce Executives Voice Concerns Over Eroding Trust in Large Language Models
In a recent earnings call for Salesforce’s second fiscal quarter, top executives candidly addressed growing skepticism toward large language models (LLMs), signaling a pivotal shift in the company’s AI strategy. Parker Harris, Salesforce’s Chief Technology Officer and co-founder, stated plainly, “I think there’s declining trust in LLMs.” The remark underscores broader industry challenges as enterprises grapple with the limitations of generative AI technologies that have dominated headlines since their explosive rise.
The discussion emerged during the Q2 2025 earnings call, where Salesforce reported robust financial results, including revenue of $9.33 billion, surpassing analyst expectations. However, beneath the positive headlines, executives highlighted persistent pain points with LLMs. Harris elaborated that while these models excel at certain tasks, their propensity for hallucinations—generating plausible but inaccurate information—has eroded confidence among users. “Customers are experiencing frustration with the unreliability,” he noted, pointing to instances where LLMs produce outputs that sound authoritative yet lack factual grounding.
Amy Weaver, Salesforce’s President and Chief Financial Officer, echoed these sentiments, emphasizing customer demands for greater reliability and control. She described how businesses are increasingly wary of deploying LLMs in mission-critical workflows due to risks of errors propagating through systems. Weaver highlighted that clients seek solutions where AI outputs are verifiable and aligned with enterprise data, rather than relying on probabilistic predictions from vast, opaque training datasets.
This declining trust manifests in Salesforce’s strategic pivot. The company is doubling down on “agentic AI,” a paradigm where AI agents perform structured, multi-step reasoning rather than free-form generation. Harris explained that agentic systems incorporate guardrails, such as retrieval-augmented generation (RAG) and integration with trusted data sources like Salesforce’s Data Cloud. These agents break down complex tasks into verifiable steps, reducing hallucination risks and enhancing transparency. For instance, an agent might query a CRM database, validate facts, and only then formulate a response, ensuring outputs are grounded in reality.
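The retrieve-validate-respond loop Harris describes can be sketched in a few lines of Python. This is a minimal illustration of the pattern, not Salesforce code; the class and function names (`CRMStore`, `answer_question`, and so on) are hypothetical stand-ins for a trusted data source and an agent step.

```python
from dataclasses import dataclass

@dataclass
class Fact:
    claim: str
    source: str  # record ID in the trusted data store, kept for auditability

class CRMStore:
    """Stand-in for a trusted enterprise data source (e.g. a CRM database)."""
    def __init__(self, records):
        self.records = records  # {record_id: {field_name: value}}

    def lookup(self, record_id, field):
        record = self.records.get(record_id)
        if record is None or field not in record:
            return None  # nothing to ground an answer in
        return Fact(claim=f"{field} = {record[field]}", source=record_id)

def answer_question(store, record_id, field):
    """Retrieve, validate, then respond; refuse rather than hallucinate."""
    fact = store.lookup(record_id, field)
    if fact is None:  # validation step: no grounding, no answer
        return "Insufficient data to answer reliably."
    # In a real agent an LLM would phrase the reply; here we template it,
    # citing the source record so the output stays verifiable.
    return f"{fact.claim} (source: {fact.source})"

store = CRMStore({"ACC-001": {"annual_revenue": "$2.4M"}})
print(answer_question(store, "ACC-001", "annual_revenue"))
print(answer_question(store, "ACC-001", "employee_count"))
```

The key design choice is the refusal branch: when retrieval finds nothing, the agent declines instead of letting a model improvise, which is precisely the hallucination risk the guardrails are meant to remove.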
Salesforce’s Agentforce platform exemplifies this approach. Launched as a suite of autonomous AI agents, Agentforce leverages LLMs as a foundational layer but overlays them with proprietary orchestration tools. During the call, executives detailed how Agentforce has seen rapid adoption, with thousands of customers piloting it. Harris cited examples where agents handle sales processes end-to-end, from lead qualification to contract negotiation, all while maintaining audit trails for accountability.
The executives also addressed competitive dynamics. While rivals like Microsoft and Google continue to push ever-larger LLMs, Salesforce is advocating for a “reasoning-first” mindset. Harris predicted that future AI success will hinge on models capable of chain-of-thought reasoning, where intermediate steps are exposed and validated. He referenced ongoing research into smaller, specialized models fine-tuned on enterprise data, which offer comparable performance with lower latency and cost.
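The “reasoning-first” idea of exposing and validating intermediate steps can be made concrete with a small pipeline sketch. This is an assumption-laden illustration of the general technique, not anything Salesforce has published: each step writes its result to an auditable trace and is checked before the next step runs.

```python
def run_pipeline(steps, state):
    """Execute (name, fn, check) steps, keeping an auditable trace.

    Each intermediate result is validated before the next step runs,
    so an invalid step is caught instead of silently propagating.
    """
    trace = []
    for name, fn, check in steps:
        state = fn(state)
        ok = check(state)
        trace.append((name, state, ok))
        if not ok:
            break  # stop before a bad result contaminates later steps
    return state, trace

# Hypothetical two-step task: parse a dollar amount, then apply a discount.
steps = [
    ("parse_amount",
     lambda s: {**s, "amount": float(s["raw"].strip("$"))},
     lambda s: s["amount"] > 0),
    ("apply_discount",
     lambda s: {**s, "total": s["amount"] * 0.9},
     lambda s: s["total"] <= s["amount"]),
]

state, trace = run_pipeline(steps, {"raw": "$100"})
for name, _, ok in trace:
    print(name, "ok" if ok else "FAILED")
print(state["total"])
```

The trace is the point: unlike a single free-form generation, every intermediate value is inspectable after the fact, which is what makes the chain-of-thought validatable.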
Weaver provided context on market trends, noting that total remaining performance obligations (RPO)—a key indicator of future revenue—grew 11% year-over-year to $27.1 billion. Much of this growth stems from AI-related contracts, particularly those involving Data Cloud and Einstein AI features. However, she cautioned that macroeconomic pressures and AI evaluation fatigue are tempering enthusiasm. Customers, she said, are in an “experimentation phase,” testing LLMs extensively before committing to production-scale deployments.
This scrutiny aligns with wider industry observations. Reports from Gartner and Forrester indicate that up to 30% of generative AI projects may be abandoned by 2025 due to poor business value or ethical concerns. Salesforce’s forthrightness differentiates it from peers who often gloss over LLM shortcomings in marketing materials.
Looking ahead, Harris outlined R&D priorities: enhancing model safety through constitutional AI principles, where models are trained to adhere to predefined rules; expanding multimodal capabilities for processing images and voice alongside text; and deepening integrations with external APIs for real-time data verification. He also teased advancements in “zero-trust AI,” ensuring no sensitive data escapes the enterprise perimeter.
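Constitutional AI as Harris describes it is a training-time technique, but the same “adhere to predefined rules” idea can also be enforced at runtime. The sketch below is purely illustrative (the rules and function names are invented for this example): candidate model output is checked against explicit rules before it is released.

```python
import re

# Hypothetical rule set: each rule is a (name, predicate) pair that a
# candidate output must satisfy before it reaches the user.
RULES = [
    ("no_ssn", lambda text: not re.search(r"\b\d{3}-\d{2}-\d{4}\b", text)),
    ("no_absolute_claims", lambda text: "guaranteed" not in text.lower()),
]

def enforce(text):
    """Return (approved, violated_rule_names) for a candidate output."""
    violated = [name for name, check in RULES if not check(text)]
    return (len(violated) == 0, violated)

print(enforce("Your renewal quote is ready for review."))
print(enforce("Approval is guaranteed; SSN 123-45-6789 on file."))
```

Because the rules are explicit data rather than learned behavior, an enterprise can audit exactly which constraint an output violated, which fits the transparency theme running through the call.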
Weaver closed on an optimistic note, affirming Salesforce’s $1 billion investment in AI infrastructure, including GPU clusters and custom silicon partnerships. “We’re building the AI operating system for the enterprise,” she declared, positioning Salesforce not as an LLM provider but as an orchestrator of reliable AI ecosystems.
These executive insights reveal a maturing AI landscape, where hype gives way to pragmatism. As trust in standalone LLMs wanes, innovations like agentic architectures promise to restore confidence, enabling AI to deliver tangible value without the pitfalls of unchecked generation.
What are your thoughts on this? I’d love to hear about your own experiences in the comments below.