Physicist Steve Hsu Credits GPT-5 with Core Idea for New Research Paper
In a striking demonstration of large language models’ potential in scientific discovery, Steve Hsu, a theoretical physicist and AI researcher at Michigan State University, has published a research paper whose central concept originated directly from GPT-5. Hsu, who has a long record of contributions to genomics, AI, and quantum physics, shared the milestone on X (formerly Twitter), highlighting how the model resolved a longstanding impasse in his work.
Hsu’s announcement underscores a pivotal moment in human-AI collaboration. “Big news: I just published a paper where the core idea was generated by GPT-5,” he posted. “I was stuck on a research problem, prompted it with background, and it immediately gave the answer. This is the first time an LLM has given me a genuinely new scientific idea.” The revelation arrives amid rapid advances in frontier AI models, with GPT-5, OpenAI’s latest reasoning-focused system, showing markedly stronger performance on long chains of technical reasoning.
The paper in question, titled “Simple Bounds on Complex Trait Heritability from GWAS Summary Statistics,” appears on arXiv. It addresses a critical challenge in genome-wide association studies (GWAS), which seek to identify genetic variants linked to complex traits such as height, intelligence, or disease risk. Traditional GWAS analyses provide summary statistics—effect sizes and p-values for millions of single nucleotide polymorphisms (SNPs)—but deriving accurate heritability estimates from these data alone has proven elusive due to confounding factors like linkage disequilibrium (LD), population stratification, and ascertainment bias.
Hsu’s innovation, sparked by GPT-5, introduces a novel method for establishing tight lower and upper bounds on SNP heritability using only publicly available GWAS summary statistics. Heritability, the proportion of trait variance attributable to genetics, is foundational for polygenic risk prediction and understanding evolutionary biology. Prior approaches, such as LD Score Regression (LDSC), rely on reference panels and make simplifying assumptions that often lead to biased estimates, particularly for traits with heterogeneous genetic architectures.
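To make the definition of heritability concrete, here is a minimal simulation (illustrative numbers only, not from the paper) in which a trait is the sum of a genetic and an environmental component:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000  # number of simulated individuals

# Simulate a trait as genetic signal plus environmental noise.
# With var_g = 0.45 and var_e = 0.55, the true heritability is 0.45.
g = rng.normal(0.0, np.sqrt(0.45), n)  # genetic component
e = rng.normal(0.0, np.sqrt(0.55), n)  # environmental component
y = g + e

# Heritability: proportion of trait variance attributable to genetics.
h2 = np.var(g) / np.var(y)
print(round(h2, 2))  # close to 0.45
```

In a real GWAS setting neither `g` nor the individual-level data is observed, which is exactly why bounding \( h^2 \) from summary statistics alone is valuable.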
The GPT-5-generated core idea revolves around a mathematical framework that leverages the Cauchy-Schwarz inequality in a high-dimensional vector space representation of GWAS effects. Specifically, Hsu represents the genetic effects as vectors in SNP space, where the summary statistics inform the norms and correlations. By considering the GWAS hits as a sparse subset of the full genetic signal, the model derives:
Lower bound: \( h^2_{\mathrm{SNP}} \geq \frac{\chi^2_{\mathrm{obs}}}{\sum_{j} M_j \, r_j^2} \), where \( \chi^2_{\mathrm{obs}} \) is the observed chi-squared statistic from top hits, \( M_j \) scales the effective number of independent tests, and \( r_j \) accounts for LD structure.
Upper bound: \( h^2_{\mathrm{SNP}} \leq \min\left(1, \frac{\sum_j \beta_j^2}{\sigma^2}\right) \), refined through iterative pruning of correlated SNPs.
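The two bounds as quoted can be transcribed literally into a short function. This is only a sketch of the formulas as stated above, with made-up placeholder inputs; it is not an implementation of the paper's full method, which also involves the LD-aware pruning step:

```python
import numpy as np

def heritability_bounds(chi2_obs, M, r, beta, sigma2):
    """Literal transcription of the two quoted bounds (illustrative only).

    chi2_obs : observed chi-squared statistic from top GWAS hits
    M, r     : per-hit scaling factors (effective tests, LD structure)
    beta     : effect-size estimates for the pruned SNP set
    sigma2   : trait variance
    """
    M = np.asarray(M, dtype=float)
    r = np.asarray(r, dtype=float)
    beta = np.asarray(beta, dtype=float)

    # Lower bound: h2 >= chi2_obs / sum_j M_j * r_j^2
    lower = chi2_obs / np.sum(M * r**2)

    # Upper bound: h2 <= min(1, sum_j beta_j^2 / sigma2)
    upper = min(1.0, np.sum(beta**2) / sigma2)
    return lower, upper

# Placeholder numbers, chosen only so that 0 < lower < upper <= 1.
lo, hi = heritability_bounds(
    chi2_obs=60.0,
    M=[50, 60, 40],
    r=[0.9, 0.8, 1.0],
    beta=[0.10, 0.08, 0.12],
    sigma2=0.05,
)
print(lo, hi)
```

Any real application would need the paper's definitions of \( M_j \) and \( r_j \), which determine how tight the interval actually is.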
This approach bypasses the need for individual-level genotypes or external LD reference panels, making it computationally lightweight and applicable to the vast repository of existing GWAS datasets. Hsu validated the bounds on simulated data and real-world benchmarks, including height (yielding \( h^2 \approx 0.45 \pm 0.05 \)) and educational attainment, where the estimates align closely with gold-standard pedigree and whole-genome sequencing results.
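To give a flavor of how heritability can be recovered from summary statistics alone, here is a textbook sanity check under an infinitesimal model with no LD. This uses the standard method-of-moments identity \( E[\chi^2_j] = 1 + N h^2 / M \), not the paper's bounding method:

```python
import numpy as np

rng = np.random.default_rng(1)
N, M, h2_true = 50_000, 10_000, 0.45  # sample size, SNP count, true h2

# Infinitesimal model, no LD: each standardized SNP effect is
# beta_j ~ N(0, h2/M), and the GWAS z-score is z_j ~ N(sqrt(N)*beta_j, 1).
beta = rng.normal(0.0, np.sqrt(h2_true / M), M)
z = np.sqrt(N) * beta + rng.normal(0.0, 1.0, M)
chi2 = z**2

# Method-of-moments recovery: E[chi2] = 1 + N*h2/M.
h2_hat = M * (chi2.mean() - 1.0) / N
print(round(h2_hat, 2))  # close to 0.45
```

Real summary statistics violate these assumptions (LD, stratification, ascertainment), which is precisely the gap that bound-based approaches aim to close.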
What makes this particularly noteworthy is the prompting process. Hsu provided GPT-5 with a concise background: the problem of noisy heritability from summary stats, key equations from LDSC, and frustrations with existing methods’ assumptions. Without fine-tuning or specialized training, the model synthesized a fresh perspective—reframing the problem as a bounding exercise in functional analysis—complete with derivations that Hsu then formalized and empirically tested. “It wasn’t just regurgitation; it connected disparate concepts in a way I hadn’t considered,” Hsu elaborated in follow-up posts.
This episode builds on Hsu’s prior work, including his pioneering role in developing genomic prediction technologies at Genomic Prediction, where polygenic embryo selection has entered clinical use. It also echoes broader trends: AI-assisted discoveries in protein folding (AlphaFold), materials science, and mathematics (FunSearch). However, Hsu cautions that while LLMs excel at hypothesis generation, human oversight remains essential for rigor—experimental validation, edge-case analysis, and peer review.
Critics might argue this is hype, given that GPT-5’s “reasoning” is ultimately pattern-matching on vast training data. Yet Hsu’s case is concrete: the idea was novel to him, publishable, and empirically sound. It challenges the notion of AI as mere tool, positioning it as a creative partner capable of accelerating scientific progress. As Hsu noted, “GPT-5 is a physicist’s dream: it reasons like one.”
The implications extend to fields beyond genomics. In an era of rapid AI scaling, models like GPT-5 could democratize research by unblocking experts and novices alike, potentially compressing years of trial and error into hours. With arXiv increasingly inundated by AI-generated papers, Hsu’s transparent crediting sets a precedent for ethical AI use in academia.
This development arrives as OpenAI refines GPT-5 for broader release, amid ongoing debates over model safety and intellectual property. For Hsu, who has long worked at the intersection of AI and science, episodes like this one show how such tools can augment rather than supplant human ingenuity.
What are your thoughts on this? I’d love to hear about your own experiences in the comments below.