Review coordinated by Life Science Editors Foundation
Reviewed by: Dr. Angela Andersen, Life Science Editors Foundation & Life Science Editors
Potential Conflicts of Interest: None
PUNCHLINE
Evo 2 is a biological foundation model trained on 9.3 trillion DNA bases across all domains of life. It predicts the impact of genetic variation—including in noncoding and clinically relevant regions—without requiring task-specific fine-tuning. Evo 2 also generates genome-scale sequences and epigenomic architectures guided by predictive models. Interpreting its internal representations with sparse autoencoders shows that the model rediscovers known biological features and uncovers previously unannotated patterns with potential functional significance. These capabilities establish Evo 2 as a generalist model for prediction, annotation, and biological design.
BACKGROUND
A foundation model is a large-scale machine learning model trained on massive and diverse datasets to learn general features that can be reused across tasks. Evo 2 is such a model for genomics: it learns from raw DNA sequence alone—across bacteria, archaea, eukaryotes, and bacteriophages—without explicit labels or training on specific tasks. This enables it to generalize to a wide range of biological questions, including predicting the effects of genetic variants, identifying regulatory elements, and generating genome-scale sequences or chromatin features.
Evo 2 comes in two versions: one with 7 billion parameters (7B) and a larger one with 40 billion parameters (40B). Parameter counts refer to the number of trainable weights in a model and shape its capacity to learn complex patterns. Both models were trained with a context window of up to 1 million tokens—where each token is a single nucleotide—allowing the model to capture long-range dependencies across entire genomic regions.
Evo 2 learns via self-supervised learning: the model is trained to predict each successive DNA base from the sequence that precedes it, with no labels required. Through this simple but powerful objective, the model discovers statistical patterns that correspond to biological structure and function, without being told what those patterns mean.
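The objective above can be sketched in a few lines. This is an illustrative toy, not the authors' code: the four-letter vocabulary, the tiny context window, and the uniform "model" are all simplifying assumptions.

```python
import math

# Toy sketch of self-supervised next-token training on DNA (illustrative
# only): a raw sequence is turned into (context -> next base) examples, and
# a model is scored by the negative log-likelihood of the true next base.
VOCAB = "ACGT"

def training_examples(seq, context=4):
    """Slide a window over the sequence; each prefix predicts the next base."""
    return [(seq[max(0, i - context):i], seq[i]) for i in range(1, len(seq))]

def nll(probs, target):
    """Cross-entropy for one prediction: -log P(true next base)."""
    return -math.log(probs[VOCAB.index(target)])

examples = training_examples("ACGTACGT")

# A hypothetical model that always guesses uniformly over A/C/G/T:
uniform = [0.25, 0.25, 0.25, 0.25]
loss = sum(nll(uniform, target) for _, target in examples) / len(examples)
# Uniform guessing gives a loss of ln(4) ≈ 1.386 nats per base; a trained
# model lowers this by exploiting statistical structure in the sequence.
```

No labels appear anywhere: the sequence itself supplies the prediction targets, which is what lets the model train on trillions of unannotated bases.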
QUESTION ADDRESSED
Can a large-scale foundation model trained solely on genomic sequences generalize across biological tasks—such as predicting mutational effects, modeling gene regulation, and generating realistic genomic sequences—without supervision or task-specific tuning?
SUMMARY
The authors introduce Evo 2, a foundation model for genomics that generalizes across DNA, RNA, and protein tasks. Without seeing any biological labels, Evo 2 learns the sequence rules governing coding and noncoding function, predicts variant effects—including in BRCA1/2 and splicing regions—and generates full-length genomes and epigenome profiles. It also enables epigenome-aware sequence design by coupling sequence generation with predictive models of chromatin accessibility.
To probe what the model has learned internally, the authors use sparse autoencoders (SAEs)—a technique that compresses the model’s internal activations into a smaller set of interpretable features. These features often correspond to known biological elements, but importantly, some appear to capture novel, uncharacterized patterns that do not match existing annotations but are consistently associated with genomic regions of potential functional importance. This combination of rediscovery and novelty makes Evo 2 a uniquely powerful tool for exploring both the known and the unknown genome.
KEY RESULTS
Evo 2 trains on vast genomic data using a novel architecture to handle long DNA sequences
Figures 1 + S1
Goal: Build a model capable of representing entire genomic regions (up to 1 million bases) from any organism.
Outcome: Evo 2 was trained on 9.3 trillion bases using a hybrid convolution-attention architecture (StripedHyena 2). The model achieves long-context recall and strong perplexity scaling with increasing sequence length and model size.
Evo 2 predicts the impact of mutations across DNA, RNA, and protein fitness
Figures 2A–J + S2–S3
Goal: Assess whether Evo 2 can identify deleterious mutations without supervision across diverse organisms and molecules.
Outcome: Evo 2 assigns lower likelihoods to biologically disruptive mutations—e.g., frameshifts, premature stops, and non-synonymous changes—mirroring evolutionary constraint. Predictions correlate with deep mutational scanning data and gene essentiality assays. Evo 2 embeddings also support highly accurate exon-intron classifiers.
Clarification: “Generalist performance across DNA, RNA, and protein tasks” means that Evo 2 can simultaneously make accurate predictions about the functional impact of genetic variants on transcription, splicing, RNA stability, translation, and protein structure—without being specifically trained on any of these tasks.
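The zero-shot scoring idea behind these results can be sketched as a log-likelihood ratio between the variant and reference sequences. This is the general recipe, not the authors' implementation; `seq_loglik` and the per-position probabilities stand in for an Evo 2 forward pass.

```python
import math

# Sketch of zero-shot variant scoring: a variant is scored by how much the
# model's log-likelihood drops when the reference base is replaced by the
# alternate base. More negative = more disfavored = likely more deleterious.
def seq_loglik(seq, model_probs):
    """Sum of log P(base) under a hypothetical per-position model."""
    return sum(math.log(model_probs[i][base]) for i, base in enumerate(seq))

def variant_score(ref_seq, alt_seq, model_probs):
    """Log-likelihood ratio: log P(alt) - log P(ref)."""
    return seq_loglik(alt_seq, model_probs) - seq_loglik(ref_seq, model_probs)

# A hypothetical model that strongly expects 'G' at the third position,
# mimicking a site under evolutionary constraint:
probs = [
    {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25},
    {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25},
    {"A": 0.05, "C": 0.05, "G": 0.85, "T": 0.05},
]
score = variant_score("ACG", "ACT", probs)  # G->T at the constrained site
```

Because the score is derived purely from sequence likelihoods, the same recipe applies to coding, noncoding, and splice-region variants alike, which is what "without supervision" means in practice here.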
Evo 2 achieves state-of-the-art performance in clinical variant effect prediction
Figures 3A–I + S4
Goal: Evaluate Evo 2's ability to predict pathogenicity of human genetic variants.
Outcome: Evo 2 matches or outperforms specialized models on coding, noncoding, splicing, and indel variants. It accurately classifies BRCA1/2 mutations and generalizes to novel variant types. When paired with supervised classifiers using its embeddings, it achieves state-of-the-art accuracy on BRCA1 variant interpretation.
Evo 2 representations reveal both known and novel biological features through sparse autoencoders
Figures 4A–G + S5–S7
Goal: Understand what Evo 2 has learned internally.
Outcome: Sparse autoencoders decompose Evo 2’s internal representations into distinct features—many of which align with well-known biological elements such as exon-intron boundaries, transcription factor motifs, protein secondary structure, CRISPR spacers, and mobile elements. Importantly, a subset of features does not correspond to any known annotations, yet appears repeatedly in biologically plausible contexts. These unannotated features may represent novel regulatory sequences, structural motifs, or other functional elements that remain to be characterized experimentally.
Note: Sparse autoencoders are neural networks that reduce high-dimensional representations to a smaller set of features, enforcing sparsity so that each feature ideally captures a distinct biological signal. This approach enables mechanistic insight into what the model “knows” about sequence biology.
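A minimal sketch of the idea, assuming hypothetical dimensions and untrained random weights (the paper's SAEs are trained at scale on real Evo 2 activations):

```python
import numpy as np

# Minimal sparse-autoencoder sketch (illustrative, not the paper's code):
# an internal activation x is encoded into a wider, mostly-zero feature
# vector f, then decoded back to reconstruct x. An L1 penalty on f enforces
# sparsity, so each active feature can be inspected individually.
rng = np.random.default_rng(0)
d_model, d_features = 8, 32          # hypothetical sizes; real SAEs are larger
W_enc = rng.normal(0, 0.1, (d_model, d_features))
W_dec = rng.normal(0, 0.1, (d_features, d_model))
b_enc = np.zeros(d_features)

def encode(x):
    """ReLU keeps features nonnegative; most end up exactly zero."""
    return np.maximum(0.0, x @ W_enc + b_enc)

def sae_loss(x, l1_coeff=0.01):
    f = encode(x)
    x_hat = f @ W_dec
    recon = np.mean((x - x_hat) ** 2)        # reconstruction error
    sparsity = l1_coeff * np.sum(np.abs(f))  # pushes most features to zero
    return recon + sparsity, f

x = rng.normal(size=d_model)  # stand-in for one Evo 2 activation vector
loss, features = sae_loss(x)
```

After training, one asks which inputs activate each feature: features that fire on exon-intron boundaries or known motifs are "rediscovered" biology, while consistently firing features with no matching annotation are the candidates for novel elements described above.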
Evo 2 generates genome-scale sequences with realistic structure and content
Figures 5A–L + S8
Goal: Assess whether Evo 2 can generate complete genome sequences that resemble natural ones.
Outcome: Evo 2 successfully generates mitochondrial genomes, minimal bacterial genomes, and yeast chromosomes. These sequences contain realistic coding regions, tRNAs, promoters, and structural features. Proteins encoded in the generated sequences are predicted to fold correctly and to recapitulate functional domains.
Evo 2 enables design of DNA with targeted epigenomic features
Figures 6A–G + S9
Goal: Use Evo 2 to generate DNA sequences with user-defined chromatin accessibility profiles.
Outcome: By coupling Evo 2 with predictors like Enformer and Borzoi, the authors guide generation to match desired ATAC-seq profiles. Using a beam search strategy—where the model explores and ranks multiple possible output sequences—it generates synthetic DNA that encodes specific chromatin accessibility patterns, such as writing “EVO2” in open/closed chromatin space.
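The guided-generation loop can be sketched as scorer-driven beam search. The scorer below is a toy stand-in for Enformer/Borzoi, and the four-position open/closed target is a made-up example; only the search strategy mirrors the description above.

```python
# Sketch of scorer-guided beam search: extend candidate sequences one base
# at a time, score each extension against a target chromatin profile, and
# keep only the top-k candidates. A toy scorer replaces Enformer/Borzoi.
TARGET = "OOCC"  # hypothetical target: open, open, closed, closed

def toy_score(seq):
    """Stand-in accessibility predictor: reward G/C where 'open' chromatin
    is desired and A/T where 'closed' is desired (a deliberate toy rule)."""
    return sum(
        1 for base, state in zip(seq, TARGET)
        if (state == "O") == (base in "GC")
    )

def beam_search(length, beam_width=2):
    beams = [""]
    for _ in range(length):
        candidates = [seq + base for seq in beams for base in "ACGT"]
        # Rank all extensions by the scorer; keep only the best few.
        beams = sorted(candidates, key=toy_score, reverse=True)[:beam_width]
    return beams[0]

best = beam_search(len(TARGET))
```

In the paper the same loop runs with Evo 2 proposing bases and a trained accessibility model ranking them, which is how arbitrary patterns such as "EVO2" can be written into predicted open/closed chromatin.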
STRENGTHS
First large-scale, open-source biological foundation model trained across all domains of life
Performs well across variant effect prediction, genome annotation, and generative biology
Demonstrates mechanistic interpretability via sparse autoencoders
Learns both known and novel biological features directly from raw sequence
Unsupervised learning generalizes to clinical and functional genomics
Robust evaluation across species, sequence types, and biological scales
FUTURE WORK & EXPERIMENTAL DIRECTIONS
Expand training to include viruses that infect eukaryotic hosts: Evo 2 currently excludes these sequences, in part to reduce potential for misuse and due to their unusual nucleotide structure and compact coding. As a result, Evo 2 performs poorly on eukaryotic viral sequence prediction and generation. Including these genomes could expand its applications in virology and public health.
Empirical validation of novel features: Use CRISPR perturbation, reporter assays, or conservation analysis to test Evo 2-derived features that don’t align with existing annotations.
Targeted mutagenesis: Use Evo 2 to identify high-impact or compensatory variants in disease-linked loci, and validate using genome editing or saturation mutagenesis.
Epigenomic editing: Validate Evo 2-designed sequences for chromatin accessibility using ATAC-seq or synthetic enhancer assays.
Clinical applications: Fine-tune Evo 2 embeddings to improve rare disease variant interpretation or personalized genome annotation.
Synthetic evolution: Explore whether Evo 2 can generate synthetic genomes with tunable ecological or evolutionary features, enabling testing of evolutionary hypotheses.
AUTHORSHIP NOTE
This review was drafted with support from ChatGPT (OpenAI) to help organize and articulate key ideas clearly and concisely. I provided detailed prompts, interpretations, and edits to ensure the review reflects an expert understanding of the biology and the paper’s contributions. The final version has been reviewed and approved by me.
FINAL TAKEAWAY
Evo 2 is a breakthrough in foundation models for biology—offering accurate prediction, functional annotation, and genome-scale generation, all learned from raw DNA sequence. By capturing universal patterns across life, and identifying both well-characterized and unknown sequence features, Evo 2 opens powerful new directions in evolutionary biology, genomics, and biological design. Its open release invites widespread use and innovation across the life sciences.