AI models reveal how noncoding DNA mutations shape human development and disease risk

From Pepkio Team · 17 June 2026 · 3 min read

The genome’s noncoding regions, the vast stretches of DNA that regulate gene activity rather than encode proteins, have long frustrated geneticists trying to pinpoint the causes of inherited diseases. Now, by using artificial intelligence to predict how millions of noncoding genetic variants behave across different cell types, scientists have uncovered a striking divide: common variants tend to have highly specific effects in single cell types, while ultra-rare mutations often cause broad disruptions across multiple tissues, especially in the developing fetal brain. The findings, reported today in Nature Genetics, offer a new framework for identifying elusive disease-causing mutations. The work, led by senior authors Anshul Kundaje and Stephen B. Montgomery at Stanford University with Andrew R. Marderstein as lead researcher, introduces a new predictive tool named FLARE.

To map this regulatory landscape, the researchers used deep learning models to generate 3 billion predictions of how different DNA variants alter chromatin accessibility (essentially, how open and active a stretch of DNA is) across 132 fetal and adult cell types in the brain and heart. Comparing the predicted effects of common variants with those of ultra-rare ones, the team found that the rare mutations are predicted to have much larger and broader regulatory impacts. This pattern points to strong evolutionary pressure, known as purifying selection, which weeds out highly disruptive mutations before they can become common in a population. The signal of selection was strongest in fetal neurons, underscoring how sensitive early brain development is to genetic disruption.

Building on these insights, the team developed FLARE (Functional Lasso Analysis of Regulatory Evolution). This computational model integrates the AI-driven regulatory predictions with evolutionary conservation data to pinpoint the noncoding variants with the most extreme and potentially damaging effects. When applied to genomic data from patients, FLARE prioritized previously hidden de novo (new, non-inherited) mutations linked to autism spectrum disorder and congenital heart disease. The study’s framework also helped isolate regulatory changes in complex adult conditions: ChromBPNet, a deep learning model of chromatin accessibility, prioritized candidate causal variants for Alzheimer's disease, while FLARE captured genetic heritability patterns in schizophrenia.

While this computational approach significantly narrows down the search for harmful mutations, the study authors note an important limitation: these AI-generated predictions prioritize candidates, but definitively confirming a mutation's biological effect still requires complex and costly laboratory experiments.

Ultimately, this research provides a vital map for navigating the noncoding genome, promising to accelerate the discovery of genetic drivers behind both rare developmental disorders and common adult diseases. As massive genomic datasets continue to grow, these approaches will be essential for translating raw DNA sequences into actionable clinical insights.

Reference: Marderstein, A.R., Kundu, S., Padhi, E.M. et al. Decoding common and rare noncoding variant effects across cellular and developmental contexts. Nat Genet (2026). https://doi.org/10.1038/s41588-026-02619-6