Named after the influential crystallographer Rosalind Franklin, who played a pivotal role in uncovering the structure of DNA, GPT-Rosalind marks OpenAI’s first foray into developing domain-specific models tailored for biochemistry, genomics, and protein engineering.
On Thursday, OpenAI announced GPT-Rosalind, a reasoning model built specifically for the life sciences. The model is intended to support evidence synthesis, hypothesis generation, experimental design, and other multi-step scientific workflows across biochemistry and genomics.
GPT-Rosalind is available as a research preview in ChatGPT, Codex, and the OpenAI API, but access is limited to a trusted-access program for qualified enterprise customers based in the United States.
The model honors Rosalind Franklin, the British chemist whose X-ray crystallography work was crucial in revealing the double helix structure of DNA. Franklin died in 1958, and her contribution went unrecognized when the 1962 Nobel Prize in Physiology or Medicine was awarded to Watson, Crick, and Wilkins. The choice of name acknowledges her foundational impact on modern molecular biology and highlights ongoing discussions about the historical erasure of women in science.
OpenAI positions GPT-Rosalind as an innovative tool designed to expedite the journey from scientific concepts to clinical evidence. The company estimates that the current timeline for advancing a drug from target discovery to regulatory approval in the United States spans approximately 10 to 15 years. GPT-Rosalind aims to assist during the early stages of this process by enabling queries of specialized databases, parsing scientific literature, and suggesting new experimental pathways, all within a unified interface.
Alongside the model's release, OpenAI is launching a Life Sciences research plugin for Codex. The plugin connects to more than 50 scientific tools and data sources, giving researchers programmatic access to biological databases and computational workflows.
Early adopters of GPT-Rosalind include Amgen, Moderna, and Thermo Fisher Scientific. The launch also involves collaborations with the Allen Institute and Los Alamos National Laboratory to explore AI-guided protein and catalyst design.
Performance benchmarks released by OpenAI indicate that GPT-Rosalind achieved a 0.751 pass rate on BixBench, a bioinformatics benchmark created by Edison Scientific to assess models on real-world computational biology tasks. Additionally, on LABBench2, a broader research benchmark, GPT-Rosalind outperformed GPT-5.4 in six out of eleven tasks, particularly excelling in CloningQA, which involves the comprehensive design of reagents for molecular cloning protocols.
Dyno Therapeutics, a company specializing in gene therapy and the design of AAV capsid proteins, ran a separate evaluation. Using unpublished RNA sequences to prevent benchmark contamination, Dyno assessed GPT-Rosalind on sequence-to-function prediction and sequence generation tasks. According to OpenAI, the model's best submission ranked above the 95th percentile of human experts in prediction tasks and around the 84th percentile in sequence generation, figures that were also cited in multiple media reports covering the launch.
However, the launch raises significant dual-use concerns, which OpenAI has sought to address through its access model. Experts have cautioned that AI models trained on biological data could be misused to engineer harmful pathogens. In response, OpenAI is restricting access to a vetted trusted-access program, which requires organizations to demonstrate a commitment to improving human health outcomes and to maintain robust security and governance controls. During the research preview, use of the model will not draw down customers' existing API credits.