Terence Tao Advocates for “Artificial General Cleverness” as a More Accurate Term for AI Capabilities
Renowned mathematician Terence Tao, a Fields Medal winner and professor at UCLA, has sparked discussion in the AI community by proposing a shift in terminology. Instead of “Artificial General Intelligence” (AGI), which implies human-like reasoning and broad adaptability, Tao suggests “Artificial General Cleverness” (AGC). This rebranding, outlined in his recent blog post, reflects a more precise characterization of what large language models (LLMs) and similar systems truly achieve: impressive pattern-matching prowess within familiar domains, rather than genuine understanding or extrapolation to novel scenarios.
Tao’s argument stems from a deep analysis of AI’s core mechanisms. Modern AI systems, particularly LLMs like those powering ChatGPT or Grok, are trained on enormous datasets comprising trillions of tokens from human-generated text. This training enables them to predict the next token with remarkable accuracy, a process Tao describes as sophisticated interpolation. When presented with inputs that resemble their training data—situations within the statistical distribution they have encountered—these models perform exceptionally, generating coherent responses that mimic expertise across diverse topics.
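To make the interpolation framing concrete, here is a toy sketch: a bigram counter rather than a transformer, trained on an invented two-sentence corpus. It predicts confidently for contexts it has seen and has nothing to offer for contexts it has not, which is the distinction Tao is pointing at.

```python
# Toy illustration of next-token prediction as interpolation over seen contexts.
# This is a sketch, not how a real LLM works; the corpus is invented.
from collections import Counter, defaultdict

corpus = "the cat sat on the mat . the dog sat on the rug .".split()

# Count bigram frequencies: an estimate of P(next | current) from the data.
bigrams = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    bigrams[prev][nxt] += 1

def predict_next(token):
    """Return the most likely continuation seen in training, or None if unseen."""
    if token not in bigrams:
        return None  # out of distribution: nothing to interpolate between
    return bigrams[token].most_common(1)[0][0]

print(predict_next("sat"))    # 'on'  (in-distribution, confident)
print(predict_next("zebra"))  # None  (never seen in training)
```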
However, Tao emphasizes a critical limitation: these systems extrapolate poorly. Tasks requiring reasoning beyond the training distribution expose the models’ brittleness. For instance, LLMs struggle with simple arithmetic when the numbers exceed those in their training examples, or when problems involve subtle shifts in context. Tao illustrates this with the “Default Reasoning Test” (DRT), a benchmark he devised to probe default assumptions in LLMs. In one variant, the model must infer implicit spatial relationships, such as the position of objects on a number line. While humans grasp these defaults intuitively, LLMs falter, often falling back on probabilistic guesses rather than robust logical inference.
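A harness for probing arithmetic extrapolation of this kind might look like the sketch below. The `ask_model` stub is a hypothetical placeholder for whatever model API you test against, and the digit ranges are illustrative rather than taken from Tao’s post.

```python
# Probe whether addition accuracy degrades as operand size grows.
# `ask_model` is a hypothetical placeholder: wire it to an LLM of your choice.
import random

def ask_model(prompt: str) -> str:
    raise NotImplementedError("replace with a call to your model")

def probe_addition(digits: int, trials: int = 20) -> float:
    """Fraction of correct answers on random additions with `digits`-digit operands."""
    correct = 0
    for _ in range(trials):
        a = random.randint(10 ** (digits - 1), 10 ** digits - 1)
        b = random.randint(10 ** (digits - 1), 10 ** digits - 1)
        reply = ask_model(f"What is {a} + {b}? Answer with the number only.")
        correct += reply.strip() == str(a + b)
    return correct / trials

# Example driver (uncomment once ask_model is implemented):
# for d in (2, 4, 8, 16):
#     print(d, probe_addition(d))
```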
Tao draws an analogy to function approximation in mathematics. LLMs excel at fitting smooth curves to dense data points (interpolation), but they cannot be trusted to extend those curves into uncharted regions (extrapolation). Numerical analysis offers a classic illustration: high-degree polynomial interpolation on equally spaced points tracks the data well deep inside the sampled interval, oscillates wildly near its edges (Runge’s phenomenon), and diverges even faster beyond them. In AI terms, this manifests as hallucinations: confident but incorrect outputs when venturing into unfamiliar territory.
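The effect is easy to reproduce. The short NumPy sketch below (an illustration, not something from Tao’s post) fits a degree-10 polynomial to equally spaced samples of Runge’s function and evaluates it deep inside, near the edge of, and beyond the sampled interval.

```python
# Runge's phenomenon in a few lines: a high-degree polynomial fits the samples,
# is already unreliable near the edge of the interval, and blows up outside it.
import numpy as np

f = lambda x: 1.0 / (1.0 + 25.0 * x**2)   # Runge's function
xs = np.linspace(-1, 1, 11)                # 11 equally spaced samples
coeffs = np.polyfit(xs, f(xs), deg=10)     # degree-10 interpolant

for x in (0.0, 0.5, 0.95, 1.5):            # interior, interior, near edge, outside
    print(f"x={x:5.2f}  true={f(x):8.4f}  poly={np.polyval(coeffs, x):12.2f}")
# Interior points are matched closely; near the edge the fit is visibly off;
# beyond the interval the values explode, the analogue of a confident hallucination.
```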
The mathematician further dissects LLM behavior through the lens of “stochastic parrots,” a phrase popularized in earlier AI critiques, but refines it with mathematical rigor. He categorizes LLM strengths into three tiers:

- Surface-Level Mimicry: Reproducing factual recall or stylistic imitation from training data.
- Shallow Reasoning: Chain-of-thought prompting elicits step-by-step outputs that align with familiar patterns, but these chains often collapse under adversarial perturbations (see the sketch after this list).
- Emergent Cleverness: Rare instances of novel problem-solving, such as solving unseen math puzzles, arise not from comprehension but from recombinations of memorized solutions.
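The following sketch shows the kind of adversarial perturbation meant in the second tier: the same word problem restated in ways that leave the answer unchanged. The base problem and the perturbations are invented for illustration; they are not Tao’s examples.

```python
# Adversarial perturbations for probing chain-of-thought robustness:
# three restatements of one problem, all with the same answer (21).
base = ("A shop sells pens for 3 dollars each. "
        "Dana buys 7 pens. How much does she spend?")

perturbations = {
    "renamed":    base.replace("Dana", "Priya").replace("pens", "markers"),
    "distractor": base + " The shop also sells erasers for 2 dollars, which Dana ignores.",
    "reworded":   "Pens cost 3 dollars apiece; if Dana purchases 7 of them, what is her total?",
}

for name, prompt in perturbations.items():
    print(f"--- {name} ---\n{prompt}\n")
# Tao's point: outputs that look like reasoning often fail to stay consistent
# across surface changes like these, which genuine reasoning would shrug off.
```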
Tao cautions against overhyping these capabilities. Even “emergent” behaviors, such as in-context learning where models adapt to new tasks from few examples, rely on implicit pattern matching rather than abstract generalization. He points to empirical evidence from benchmarks like GSM8K (grade-school math) and ARC (abstraction and reasoning), where LLMs score high on interpolation-heavy subsets but plummet on extrapolation demands.
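It helps to remember what in-context learning actually is: conditioning on examples placed in the prompt, with no weights updated. A minimal sketch follows; the sentiment-labeling task and examples are chosen purely for illustration.

```python
# Few-shot ("in-context") prompting is just string assembly: the examples sit
# in the prompt and the model completes the pattern.
examples = [
    ("great movie, loved it", "positive"),
    ("boring and too long", "negative"),
    ("the acting was superb", "positive"),
]
query = "dull plot but gorgeous visuals"

prompt = "\n".join(f"Review: {text}\nLabel: {label}" for text, label in examples)
prompt += f"\nReview: {query}\nLabel:"
print(prompt)
# Whether completing this pattern counts as "learning a task" or merely matching
# a template is exactly the distinction Tao draws.
```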
This perspective has profound implications for AI development and expectations. Pursuing AGI under current paradigms may lead to diminishing returns, as scaling compute and data amplifies interpolation prowess but does little for true generalization. Tao advocates for hybrid approaches: integrating symbolic AI, which excels at rule-based extrapolation, with neural networks’ interpolative strengths. He also calls for better evaluation metrics that prioritize out-of-distribution performance, moving beyond saturated leaderboards.
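One way such a hybrid could look in practice is a propose-and-verify loop: a neural model drafts an answer and a symbolic engine accepts it only if it checks out exactly. The sketch below assumes a hypothetical `propose` stand-in for a model call and uses SymPy for the check; it illustrates the general idea, not any specific method from Tao’s post.

```python
# Neural-proposes / symbolic-checks: the model's free-form answer is accepted
# only if an exact symbolic verification confirms it.
import sympy as sp

def propose(question: str) -> str:
    # Hypothetical placeholder: in practice, query an LLM here.
    return "x**2 - 1"

def verified_equal(expr_str: str, candidate: str) -> bool:
    """Symbolically check that `candidate` equals `expr_str`."""
    return sp.simplify(sp.sympify(expr_str) - sp.sympify(candidate)) == 0

question = "Expand (x - 1)*(x + 1)."
answer = propose(question)
print(answer, "verified:", verified_equal("(x - 1)*(x + 1)", answer))
```

The design choice here is that the interpolative component is never trusted on its own: its output only leaves the system once a rule-based extrapolation-safe check has passed.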
Tao’s proposal resonates amid hype cycles surrounding models like GPT-4 and beyond. Industry leaders often tout proximity to AGI, yet real-world deployments reveal gaps, from unreliable code generation to biased decision-making. By dubbing it AGC, Tao urges humility: these systems are extraordinarily clever tools, amplifying human productivity in bounded contexts, but not replacements for human intelligence.
In his blog, Tao experiments with prompting techniques to elicit better default reasoning, finding modest gains from explicit instructions like “think step-by-step” or role-playing as a careful reasoner. Yet, these are Band-Aids; fundamental advances may require rethinking training objectives, perhaps incorporating formal verification or causal inference modules.
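For readers who want to experiment along these lines, the sketch below assembles the same question under a few instruction framings. The framings are generic examples of the techniques described, not Tao’s exact prompts, and they still need to be wired into an evaluation harness of your own.

```python
# Build prompt variants for the same question under different instruction framings.
question = ("If a train leaves at 3:40 pm and the trip takes 95 minutes, "
            "when does it arrive?")

variants = {
    "plain":        question,
    "step_by_step": question + " Think step by step before giving the final answer.",
    "careful_role": "You are a careful reasoner who double-checks arithmetic. " + question,
}

for name, prompt in variants.items():
    print(f"[{name}]\n{prompt}\n")
```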
Ultimately, Tao’s intervention reframes the AI narrative. AGC acknowledges achievements without inflating promises, fostering realistic progress toward systems that bridge interpolation and extrapolation. As AI permeates society, such terminological precision could guide ethical deployment and resource allocation more effectively.
What are your thoughts on this? I’d love to hear about your own experiences in the comments below.