Researchers have developed a “stacking” approach that combines multiple MRI modalities to predict cognitive abilities more accurately than single-modality models. The technique achieves out-of-sample correlations of about 0.5-0.6 when predicting current cognitive performance, and it predicts childhood cognitive scores from brain scans taken at age 45 with a correlation of 0.52.

A groundbreaking study published in PNAS Nexus on 24 June 2025 has demonstrated that combining multiple brain imaging modalities through machine learning “stacking” can dramatically improve the prediction of cognitive abilities whilst addressing critical challenges in brain-wide association studies (BWAS).
The research, led by Narun Pat and colleagues from the University of Otago and collaborating institutions, analysed brain imaging data from 2,131 participants aged 22 to 100 across three major datasets in the United States and New Zealand. Their innovative approach tackles three fundamental challenges that have long plagued neuroimaging research: predictability, test-retest reliability, and cross-cohort generalizability.
Revolutionary approach combines diverse brain imaging data
Traditional brain-wide association studies typically rely on a single MRI modality to predict cognitive abilities, often yielding modest results. The stacking technique represents a paradigm shift by integrating structural MRI measures (such as cortical thickness), resting-state functional connectivity, task-based functional connectivity, and task-evoked blood-oxygen-level-dependent (BOLD) contrasts into a unified prediction model.
“Scientists have had limited success in predicting cognitive abilities from brain MRI,” the authors explain in their significance statement. “We proposed a machine learning method, called stacking, to draw information across different types of brain MRI.”
The methodology works by first building separate prediction models for each MRI modality, then treating the predicted values from these individual models as features for a higher-level “stacked” prediction model. This hierarchical approach allows researchers to capture complementary information from different brain imaging techniques that might be missed when using any single modality in isolation.
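The two-level procedure described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' pipeline: ridge regression stands in for whatever algorithms the study used, the data are synthetic, and the modality names, feature counts, fold count and `alpha` value are all hypothetical. The first level produces out-of-fold predictions for each modality so that the second-level model is not trained on predictions its inputs have already seen.

```python
import numpy as np

def ridge_fit(X, y, alpha=1.0):
    """Fit ridge regression on centred data; return (weights, intercept)."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc = X - x_mean
    w = np.linalg.solve(Xc.T @ Xc + alpha * np.eye(X.shape[1]), Xc.T @ (y - y_mean))
    return w, y_mean - x_mean @ w

def ridge_predict(model, X):
    w, b = model
    return X @ w + b

def out_of_fold_predictions(X, y, n_folds=5, alpha=1.0, seed=0):
    """Predict each participant from a model that never saw that participant."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), n_folds)
    preds = np.empty(len(y))
    for held_out in folds:
        train = np.setdiff1d(np.arange(len(y)), held_out)
        model = ridge_fit(X[train], y[train], alpha)
        preds[held_out] = ridge_predict(model, X[held_out])
    return preds

def fit_stacked(modalities, y, alpha=1.0):
    """modalities: dict mapping a modality name to an (n_participants, n_features) array."""
    names = sorted(modalities)
    # Level 1: one out-of-fold prediction column per modality.
    level1 = np.column_stack(
        [out_of_fold_predictions(modalities[name], y, alpha=alpha) for name in names]
    )
    # Refit each modality model on all training data for use at prediction time.
    base_models = {name: ridge_fit(modalities[name], y, alpha) for name in names}
    # Level 2: the "stacked" model takes the level-1 predictions as its features.
    meta_model = ridge_fit(level1, y, alpha)
    return names, base_models, meta_model

def predict_stacked(stacked, modalities):
    names, base_models, meta_model = stacked
    level1 = np.column_stack(
        [ridge_predict(base_models[name], modalities[name]) for name in names]
    )
    return ridge_predict(meta_model, level1)

# Synthetic demonstration: a latent score drives three noisy "modalities".
rng = np.random.default_rng(0)
n = 400
latent = rng.normal(size=n)
y = latent + rng.normal(scale=0.5, size=n)
modalities = {}
for name, p in [("cortical_thickness", 20), ("rest_fc", 30), ("task_contrast", 25)]:
    modalities[name] = np.outer(latent, rng.normal(size=p)) + rng.normal(scale=2.0, size=(n, p))

train_idx, test_idx = np.arange(300), np.arange(300, n)
train_data = {name: X[train_idx] for name, X in modalities.items()}
test_data = {name: X[test_idx] for name, X in modalities.items()}

stacked = fit_stacked(train_data, y[train_idx])
preds = predict_stacked(stacked, test_data)
r_stacked = np.corrcoef(preds, y[test_idx])[0, 1]  # out-of-sample Pearson correlation
```

The synthetic signal here is far stronger than anything in real MRI data, so the resulting correlation says nothing about the study's numbers; the sketch only shows how per-modality predictions become the inputs of the stacked model.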
Remarkable predictive accuracy achieved across datasets
The stacked models demonstrated exceptional predictive performance across all three datasets examined: the Human Connectome Project Young Adults (873 participants, aged 22-35), Human Connectome Project Aging (504 participants, aged 35-100), and the Dunedin Multidisciplinary Health and Development Study (754 participants, aged 45).
When predicting cognitive abilities at the time of scanning, stacked models achieved out-of-sample correlations of approximately 0.5-0.6, substantially higher than the 0.42 correlation reported in recent meta-analyses of single-modality approaches. The “Stacked: All” models, which incorporated all available MRI features, consistently outperformed individual modalities across different machine learning algorithms.
Perhaps most remarkably, the Dunedin Study’s longitudinal design enabled a unique demonstration of the technique’s power. Using multimodal MRI data collected when participants were 45 years old, the stacked models successfully predicted cognitive abilities measured at ages 7, 9, and 11 years with a correlation of 0.52.
“Using brain imaging at age 45, the model predicted childhood cognitive scores (ages 7, 9, and 11) with a .52 Pearson’s correlation – indicating a substantial degree of predictive accuracy,” according to the authors.
Task-based imaging emerges as key driver
The analysis revealed that task-evoked BOLD contrasts were the primary drivers of the improved predictive performance. Specific tasks showed particularly strong associations with cognitive abilities: the working-memory task in younger adults and the face-name task in middle-aged and older adults.
The authors note that “stacking, especially with fMRI task contrasts, allowed us to use MRI of people aged 45 years to predict their childhood cognitive abilities reasonably well.” This finding challenges the traditional reliance on resting-state connectivity and structural measures in cognitive prediction studies.
However, not all task contrasts contributed equally. Some tasks, such as gambling paradigms, showed poor predictive performance, highlighting the importance of task selection in cognitive neuroimaging studies.
Reliability challenges addressed through ensemble approach
Test-retest reliability has been a persistent concern in neuroimaging research, particularly for task-based measures. Previous studies had identified poor reliability for task contrasts across different scanning sessions, raising questions about their utility as stable markers of individual differences.
The stacking approach substantially improved test-retest reliability, achieving excellent intraclass correlations (ICC > 0.75) even when using only task-based fMRI data. The “Stacked: All” models reached ICC values of 0.79 and 0.89 for the two datasets with repeat scanning sessions.
“For test-retest reliability, stacked models reached an excellent level of reliability across HCP Young Adults and Dunedin Study, even when we only included fMRI during tasks in the models,” the authors report.
This improvement appears to result from the ensemble nature of stacking, where multiple sources of information compensate for the variability inherent in any single measure.
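For readers unfamiliar with the metric, the intraclass correlation compares between-participant variance with session-to-session variance in the predicted scores. The sketch below computes one common variant, ICC(3,1) in the Shrout and Fleiss scheme (two-way mixed, consistency, single measurement). Which ICC variant the study used is an assumption here, and the function name and the simulated scores are purely illustrative.

```python
import numpy as np

def icc_3_1(scores):
    """ICC(3,1) for an (n_participants, n_sessions) array of scores."""
    scores = np.asarray(scores, dtype=float)
    n, k = scores.shape
    grand_mean = scores.mean()
    # Partition the total sum of squares into participants, sessions and residual.
    ss_participants = k * ((scores.mean(axis=1) - grand_mean) ** 2).sum()
    ss_sessions = n * ((scores.mean(axis=0) - grand_mean) ** 2).sum()
    ss_total = ((scores - grand_mean) ** 2).sum()
    ss_residual = ss_total - ss_participants - ss_sessions
    ms_participants = ss_participants / (n - 1)
    ms_residual = ss_residual / ((n - 1) * (k - 1))
    return (ms_participants - ms_residual) / (ms_participants + (k - 1) * ms_residual)

# Simulated predicted scores from two scanning sessions of the same 500 people:
# a stable trait (sd 1) plus independent session noise (sd 0.5).
rng = np.random.default_rng(0)
trait = rng.normal(size=500)
session_1 = trait + rng.normal(scale=0.5, size=500)
session_2 = trait + rng.normal(scale=0.5, size=500)
icc_demo = icc_3_1(np.column_stack([session_1, session_2]))
```

With these simulation settings the population value is 1 / (1 + 0.25) = 0.8, so the estimate should land close to that, above the 0.75 threshold the article cites for excellent reliability. Because this variant measures consistency, a constant shift between sessions does not lower it.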
Cross-dataset generalizability demonstrates robustness
Perhaps most importantly for clinical translation, the stacked models showed significant cross-dataset generalizability. When models trained on one dataset were applied to completely independent datasets with different participants, scanners, and protocols, they maintained above-chance predictive performance.
The “Stacked: Non Task” models, which combined structural and resting-state measures, achieved a correlation of 0.25 when tested across datasets. Whilst lower than within-dataset performance, this level represents meaningful cross-sample applicability and suggests the models capture fundamental brain-cognition relationships rather than dataset-specific artefacts.
Generalizability was strongest between the two Human Connectome Project datasets, which shared similar protocols, compared to the independently conducted Dunedin Study. This pattern highlights both the promise and limitations of current approaches.
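A cross-cohort test of this kind amounts to fitting a model on one cohort and scoring it, unchanged, on another. The sketch below shows that workflow on synthetic data. It assumes features are standardised within each cohort to absorb scanner or site offsets, which is a common harmonisation choice and not necessarily what the authors did; the cohort sizes, offsets and the use of ridge regression are likewise hypothetical.

```python
import numpy as np

def zscore(X):
    """Standardise each feature within a cohort."""
    return (X - X.mean(axis=0)) / X.std(axis=0)

def fit_ridge(X, y, alpha=1.0):
    """Ridge regression on already-centred features; returns (weights, intercept)."""
    w = np.linalg.solve(X.T @ X + alpha * np.eye(X.shape[1]), X.T @ (y - y.mean()))
    return w, y.mean()

def simulate_cohort(rng, n, loadings, scanner_offset, noise_sd=2.0):
    """Synthetic cohort: a latent score drives the features, plus a site-wide offset."""
    latent = rng.normal(size=n)
    X = np.outer(latent, loadings) + rng.normal(scale=noise_sd, size=(n, loadings.size))
    y = latent + rng.normal(scale=0.5, size=n)
    return X + scanner_offset, y

rng = np.random.default_rng(1)
loadings = rng.normal(size=15)  # shared brain-cognition relationship across cohorts
X_a, y_a = simulate_cohort(rng, 500, loadings, scanner_offset=0.0)
X_b, y_b = simulate_cohort(rng, 300, loadings, scanner_offset=3.0)

# Train on cohort A only, then apply the frozen model to cohort B.
weights, intercept = fit_ridge(zscore(X_a), y_a)
pred_b = zscore(X_b) @ weights + intercept
r_cross = np.corrcoef(pred_b, y_b)[0, 1]  # cross-cohort Pearson correlation
```

In this toy setting both cohorts share exactly the same feature-to-score relationship, so the cross-cohort correlation stays high. Real cohorts differ in protocols, age ranges and cognitive measures, which is consistent with the much lower cross-dataset value of 0.25 reported in the article.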
Clinical implications and future directions
The study establishes what the authors describe as “a valuable benchmark for how stacking can strengthen the use of brain MRI as a reliable and robust neural marker of cognitive function.” This has significant implications for both research and clinical applications.
In their discussion, the authors emphasise the potential for understanding stable cognitive traits: “If the aim of BWAS is to capture the stable trait of cognitive abilities, the current approach of stacking multimodal MRI data from one time point seems appropriate.”
The ability to predict childhood cognitive abilities from middle-aged brain scans is particularly intriguing, suggesting that neural signatures of cognitive capacity remain detectable decades later. This finding could inform our understanding of cognitive development and potentially identify early markers of cognitive decline.
However, the authors acknowledge important limitations. Because task protocols differed between datasets, the task-based models could not be tested across all cohorts, so the generalizability of the most predictive models remains untested. They recommend that researchers applying these models to new data “follow the procedures of the original datasets as much as possible.”
Methodological innovation drives progress
The success of the stacking approach reflects broader trends in computational neuroscience towards ensemble methods and multimodal integration. By combining information across different scales and types of brain measurement, researchers can capture a more comprehensive picture of brain-behaviour relationships.
The authors conclude that “combining different modalities of MRI into one prediction model via stacking seems to be a viable approach to realize this dream of cognitive neuroscientists” of reliably associating cognitive abilities with brain variations.
This research represents a significant advance in brain-based prediction of cognitive abilities, offering a methodological framework that could transform how neuroscientists approach individual differences research. As the field moves towards more robust and generalisable findings, the stacking approach provides a promising pathway for developing clinically useful neural markers of cognitive function.
Reference:
Tetereva, A., Knodt, A. R., Melzer, T. R., et al. (2025). Improving predictability, reliability, and generalizability of brain-wide associations for cognitive abilities via multimodal stacking. PNAS Nexus, 4(6), pgaf175.
https://doi.org/10.1093/pnasnexus/pgaf175




