Reviewer #2 (Public Review):
The purpose of this study is to develop a tool that serves as a starting point for investigating and uncovering genes and pathways associated with aging. The tool utilizes information from the GTEx public database, which contains post-mortem human data. It identifies age-related gene expression changes in specific tissues across different age ranges, biological sexes, and medical histories.
Additionally, the authors envision the platform as continuously evolving, with ongoing development and expansion to include new data and features, ensuring it remains a cutting-edge resource for researchers studying aging.
# Strengths
voyAGEr presents a tool for exploring gene expression changes across multiple tissues in the context of aging. One of the main strengths of the tool is its intuitive and user-friendly interface, which allows biologists to navigate and explore gene expression patterns easily. Users can explore changes in the expression of single genes across multiple tissues, enabling them to identify genes of interest that can be further investigated.
A particularly noteworthy strength of the tool is its ability to show tissue-specific gene expression patterns. This feature is essential for elucidating the paradigm of tissue-specific asynchronous aging and provides a unique and valuable resource for the aging community.
Overall, the tool offers an entry point for further investigation of genes involved in aging, and its ability to show tissue-specific gene expression patterns provides a unique and valuable resource for the scientific community.
Lastly, the tool is accompanied by a clear and thorough tutorial that explains each of its functionalities and provides examples. The authors also acknowledge the limitations of the statistical inference tests used in the tool, which adds to its overall transparency.
# Weaknesses
## Underlying data analysis
In this tool/resource paper, it is crucial that the data used is up-to-date to provide the most comprehensive and relevant information to users. However, the authors utilized GTEx v7, which is an outdated (2016) version of the dataset. It is worth noting that GTEx v8 includes over 940 individuals, representing a 35% increase in individuals, and a 50% increase in the total number of samples. The authors should check the newer versions of GTEx and update the data.
The authors did not address any correction for batch effects or RNA integrity numbers, which are known to affect transcriptome profiles. For instance, our analysis of GTEx v8 Cortex tissue revealed that, after filtering out lowly expressed genes in the same way the authors did, PC1 (which accounts for 24% of the variation) had a Spearman's correlation of 0.48 (p<6.1e-16) with RNA integrity number.
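The kind of check described above can be sketched in a few lines. This is a minimal Python illustration, not the analysis we ran: it assumes a per-sample PC1 score has already been computed elsewhere, and the names `pc1_scores` and `rin_values` as well as all numbers are hypothetical.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values receive the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the window over a run of tied values.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def spearman(x, y):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical per-sample values, for illustration only.
pc1_scores = [-2.1, 0.3, -0.4, 1.5, 2.2, -1.0]
rin_values = [5.9, 6.8, 7.1, 8.0, 8.4, 6.5]
print(f"Spearman rho between PC1 and RIN: {spearman(pc1_scores, rin_values):.2f}")
```

A sizeable correlation in such a check would suggest including RNA integrity number as a covariate in the model.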
The data analyzed in the GTEx dataset is not filtered or corrected for the cause of death, which can range from violent and sudden deaths to slow deaths or cases requiring a ventilator. As a result, the data may not accurately represent healthy aging profiles but rather reflect changes in the transcriptome specific to certain diseases due to the age-related increase in disease risk. While the authors do acknowledge this limitation in the discussion, stating that it is not a healthy cohort and disease-specific analysis is not feasible due to the limited number of samples, it would be useful for users to have the option to analyze only cases of fast death, excluding ventilator cases and deaths due to disease. This is typically how GTEx data is utilized in aging studies. Alternatively, the authors should consider including the "cause of death" variable in the model.
The age distribution varies across tissues, which may affect the results of the study. The authors' claim that age distribution does not affect the outcomes is not conclusively supported. Since the study aims to provide cross-tissue analysis, it is important to note that differing age distributions across tissues can influence the overall results. To address this, the authors should downsample so that age distributions are matched across tissues and evaluate how many tissue-specific or common changes remain once the distributions are similar.
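One simple way to perform such downsampling is sketched below in Python. It is only one possible scheme, chosen for illustration: within each age bin, every tissue is randomly reduced to the smallest sample count observed for that bin in any tissue, so all tissues end up with identical age-bin counts. The function name, tissue names, sample identifiers, and age bins are hypothetical.

```python
import random
from collections import defaultdict


def downsample_to_common_age_bins(samples_by_tissue, seed=0):
    """Subsample each tissue so all tissues share the same count per age bin.

    samples_by_tissue maps a tissue name to a list of (sample_id, age_bin).
    Returns a dict mapping each tissue to the list of retained sample ids.
    """
    rng = random.Random(seed)  # fixed seed so the subsample is reproducible

    # Group sample ids by age bin within each tissue.
    binned = {}
    for tissue, samples in samples_by_tissue.items():
        by_bin = defaultdict(list)
        for sample_id, age_bin in samples:
            by_bin[age_bin].append(sample_id)
        binned[tissue] = by_bin

    all_bins = set().union(*(by_bin.keys() for by_bin in binned.values()))
    selected = {tissue: [] for tissue in binned}
    for age_bin in sorted(all_bins):
        # The smallest count for this bin across tissues sets the target.
        target = min(len(binned[t].get(age_bin, [])) for t in binned)
        for t in binned:
            selected[t].extend(rng.sample(binned[t].get(age_bin, []), target))
    return selected


# Hypothetical toy input, for illustration only.
toy = {
    "Lung": [("L1", "20-29"), ("L2", "20-29"), ("L3", "60-69"),
             ("L4", "60-69"), ("L5", "60-69")],
    "Liver": [("V1", "20-29"), ("V2", "60-69"), ("V3", "60-69")],
}
matched = downsample_to_common_age_bins(toy)
print({tissue: len(ids) for tissue, ids in matched.items()})
```

Repeating the subsampling with several seeds and re-running the analysis each time would indicate how stable the tissue-specific and shared signals are. Note that this scheme discards any age bin that is empty in at least one tissue, which may be too strict when many tissues are compared at once.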
The GTEx resource is extremely valuable; however, it comes with challenges. GTEx contains samples from multiple tissues of the same individuals, but not all tissues are collected for all individuals, so the degree of overlap in sample origin varies between tissues. This could affect the similar/different patterns observed across tissues. As this tool is meant for broader use by the community, it is crucial for the authors to either rule out this possibility by conducting a cross-tissue comparison using a non-parametric model that accounts for the dependency between samples from the same individual, or to provide information on the degree of similarity between samples so that the users can keep this possibility in mind when using the tool for hypothesis generation.
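As an illustration of the second option, the degree of donor overlap between tissues could be reported as a pairwise Jaccard index of donor sets. The Python sketch below is a minimal example under that assumption; the Jaccard index is our choice of summary, and the function name, tissue names, and donor identifiers are hypothetical.

```python
from itertools import combinations


def donor_overlap(donors_by_tissue):
    """Return the Jaccard index of donor sets for every pair of tissues.

    donors_by_tissue maps a tissue name to an iterable of donor ids.
    Keys of the result are (tissue_a, tissue_b) tuples in sorted order.
    """
    overlap = {}
    for tissue_a, tissue_b in combinations(sorted(donors_by_tissue), 2):
        donors_a = set(donors_by_tissue[tissue_a])
        donors_b = set(donors_by_tissue[tissue_b])
        union = donors_a | donors_b
        shared = donors_a & donors_b
        # Jaccard index: shared donors divided by donors seen in either tissue.
        overlap[(tissue_a, tissue_b)] = len(shared) / len(union) if union else 0.0
    return overlap


# Hypothetical toy input, for illustration only.
toy_donors = {
    "Lung": ["D1", "D2", "D3"],
    "Liver": ["D2", "D3", "D4"],
    "Heart": ["D5"],
}
for pair, jaccard in donor_overlap(toy_donors).items():
    print(pair, round(jaccard, 2))
```

Displaying such a matrix alongside cross-tissue comparisons would let users judge whether two tissues look similar partly because they share most of their donors.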
## Visualisation and analysis platform
The authors aimed to create an open-source and ever-evolving resource that could be adapted and improved with new functionality. However, this goal was only partially achieved. Although the code for the web app is open source, crucial components such as the statistical tests or the linear model are not included in the repository, limiting the tool's customizability and adaptability.
Furthermore, the authors' choice of visualization platform (R Shiny) may not be the best fit for extensibility and open-source collaboration, as it lacks modularity. A more suitable alternative could be production-oriented platforms such as Flask or FastAPI.
To facilitate collaboration and improve the tool's adaptability, data resulting from the pre-processing pipeline should be made publicly available. This would make it easier for others to contribute and extend the tool's functionality, ultimately enhancing its value for the scientific community.