Reviewer #1 (Public Review):
Summary and strength:
The authors undertook to assemble and annotate the genome sequence of the Malabar grouper, with the aim of providing molecular resources for fundamental and applied research. Even though genome assembly has become more mainstream, the task is still daunting and labor-intensive. High-quality, fully annotated genome sequences are currently of strategic importance in modern biology. The authors make use of the resource to address the endocrine control of metamorphosis, an ecologically and developmentally relevant life cycle transition. As opposed to amphibians and flatfish, where the body plan changes, metamorphosis in other fish is anatomically more subtle and much less well understood, although it is clear that thyroid hormone (TH) signaling is a key player. The authors thus provide a repertoire of TH-relevant gene expression changes during development and across metamorphosis, and correlate developmental stages with changes in gene expression. Overall, this work has strong potential to meet its target.
Weaknesses:
The manuscript needs proper editing and is not complete. Some wordings lack precision and make the text difficult to follow (e.g., line 98, "we assembled a chromosome-scale genome of ...", should instead read "we assembled a chromosome-scale genome sequence of ..."). Also, panel E of Figure 2 is missing.
The shortcomings of the manuscript are not limited to writing style. Important technical and technological information is missing or not clear enough, thereby preventing a proper evaluation of the resolution of the genomic resources provided:
- Several RNA-seq libraries from different tissues have been built to help annotate the genome and identify transcribed regions. This is fine. However, throughout the manuscript, gene expression changes are summarized in a single panel where it is not clear at all which tissue the data come from (whole embryo or a specific tissue?), or whether it is a cumulative expression level computed across several tissues (and, if so, how it was computed). This information is essential for data interpretation.
- The bioinformatic processing, especially for the assembly and annotation, is very poorly described. This is also a sensitive topic, as illustrated by the numerous "assemblathon" and "annotathon" initiatives to evaluate tools and workflows. Importantly, providing configuration files and an in-depth description of workflows and parameter settings is highly recommended. These can be made available through data store services, and such documents can even be assigned DOIs. This provides others with more information to evaluate the resolution of this work. No doubt it is well done, but especially in the field of genome assembly and annotation, high resolution is very cost- and time-intensive. Not surprisingly, most projects are conditioned by trade-offs between cost, time, and labor. The authors should provide others with the information needed to evaluate this.
- The quantified T3 and T4 levels look fairly low and are not very convincing. The work would clearly benefit from a discussion of why the signal is so low and what the current technological limitations of these quantifications are. This would really help (general) readers.
- Differential analysis highlights up to ~15,000 differentially expressed genes (DEGs), out of a predicted 26k genes. This corresponds to more than half of all genes. ANOVA-based differential analysis relies on the assumption that only a minority of genes are DEGs. Having >50% DEGs is well beyond the validity of the method. This should be addressed, or at least discussed.
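For reference, the proportion mentioned in the last point can be checked with a short calculation. This is only a minimal sketch in Python using the rounded figures quoted above (~15,000 DEGs, ~26,000 predicted genes); the function name `deg_fraction` is hypothetical and is not part of the authors' pipeline.

```python
def deg_fraction(n_deg: int, n_genes: int) -> float:
    """Return the fraction of predicted genes called differentially expressed."""
    if n_genes <= 0:
        raise ValueError("n_genes must be positive")
    return n_deg / n_genes


# Rounded figures quoted in this review (assumed, not exact counts).
fraction = deg_fraction(15_000, 26_000)

# Prints the share of predicted genes reported as DEGs, as a percentage.
print(f"{fraction:.1%}")
```

With these rounded inputs the fraction is a little under 60%, which is the basis for the ">50%" statement above.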