Reviewer #2 (Public Review):
First, I want to congratulate the author team on this manuscript, which I read with great pleasure. I think this will be a fine addition to the literature!
The present MS by Clement et al. provides a comprehensive overview of the brain shapes of lungfishes. Besides previously known/described brain endocasts, the work includes models and descriptions of previously undescribed taxa. Notably, all CT data are deposited online, following best practices for work with digital anatomy. The specimen sample is impressive, especially as the sampled material is housed in museums all over the world. Although the sample size may seem numerically low (12 taxa), this actually is a comprehensive sample of fossil (and extant) lungfishes in terms of what's preserved in the first place.
The study at hand has several goals: (1) The description of lungfish brains for taxa that were previously undescribed; (2) the quantification of aspects of brain shape using morphometric measurements; (3) the characterization of brain shape evolution of lungfishes using exploratory methods that ordinate morphometric measurements into a morphospace.
The provided 3D data and descriptions will serve as valuable comparisons in future lungfish work. This type of data is imperative for palaeontological studies in general, and the anatomical information will be extremely valuable in the future. For example, anatomical characters related to brain architecture have been shown to be informative about phylogeny in the past, and the presented data may inform future phylogenetic studies.
The quantification of brain shape via (largely linear) measurements is relatively simplistic, and can thus only detect gross trends in brain shape evolution among lungfishes. The authors describe several such trends - such as high variation in the olfactory brain region in comparison to other parts of the brain. The results and interpretations drawn by the authors are supported by their data, and the approach taken is valid, even if more sophisticated shape quantification methods (e.g. 3D landmarking) and analytical methods (e.g. explicit phylogenetic comparative methods) are available, which could provide additional insights in the future. The presented results and interpretations in this regard must be seen as a preliminary assessment of lungfish brain evolution, but the assessment is clearly written and generally well performed.
A potential shortcoming of the paper is the lack of explicit hypothesis testing, which is not problematic per se, but puts limits on the conclusions the authors can draw from their data. For example, the authors state that different anatomical parts of the labyrinth (particularly, the utricle with respect to the semicircular canals or saccule) may show modular dissociation from other labyrinth modules, based on the polarity of the eigenvector (loading) signs in the PCA. I think this is fine as a first approximation, but of course there are explicit statistical tools available to test for modularity/integration, such as two-block partial least squares analysis (Rohlf & Corti 2000, Syst. Biol.). I don't see the lack of usage of such methods as problematic, because you cannot do everything in one paper, and the authors remain careful in their interpretation. It may be advisable, however, to add the odd sentence or statement about how some findings are preliminary or hypothesized, and that these should receive further treatment and testing using other methods in the future. I think this approach is actually very rewarding, because then you can inspire future work by outlining outstanding research problems that arise from the new data presented herein.
In the following, I comment on a few aspects of the manuscript. These represent instances where I had additional thoughts or ideas on how to slightly improve various aspects of the manuscript.
1. Presentation of PCA results
The authors provide several PCA analyses (preliminary analyses on partial matrices, BPCA, InDaPCA), and are very explicit about the procedures in general. For instance, I appreciate that they explicitly state using correlation matrices for the PCA analyses because their data mix different measurement units.
Visually, the BPCA and InDaPCA are presented in figures 2 and 3, whereas the preliminary partial matrix PCAs are only reported as supplementary figures. While I don't object to any of this, I find the sequence of information given in the results section suboptimal.
The authors start by discussing the partial matrix analyses, although none of these analyses are visually/graphically depicted in the main text figures, and although their results do not seem to be of real importance for the narrative of the discussion. The other two PCA analyses are presented afterwards and separately, but they convey some common signals, particularly that the major source of variation seems to be a decreasing olfactory angle with increasing olfactory length, and a scaling relationship between all linear measurements (which all have the same eigenvector signs on the first PC axis). I wonder if an alternative way of presenting the PCA results would be better for this particular MS. For example, the authors could give "first level observations" first ("PCA analyses agree in X, Y, Z"), and then move to second order observations ("Morphospace of BPCA has some interesting taxon distribution with regard to chirodipterids"; "InDaPCA axis projections continuously retrieve clustering of specific variables"). I suspect this would shorten the text somewhat and could serve as a clearer articulation of the take-home messages?
2. Selection of PC axes for interpretation
You describe how you use the broken-stick method to decide how many PC axes are retained for the interpretation of results, which I agree is a good procedure. However, I have a few questions regarding this.
First, in line 331 (description of InDaPCA) you state that the first three axes are non-trivial "based on the screeplot" - which got me confused because it sounds a bit like eyeballing off the screeplot. Have you used the broken stick method for all your PCA analyses?
The second question relates to the results of the broken stick method, which I did not find reported. Unless I am mistaken, for the xth axis, the method sums the fractions 1/i (whereby i = x..n; n = number of axes), and divides this sum by n to get the expected proportion of variation for that axis. This number is then compared with the actual proportion of variance explained by the axis. So for the 1st of 17 axes, the broken-stick expectation is (1 + 1/2 + ... + 1/17) / 17. If you apply this to your BPCA, the third axis' value (i.e., (1/3 + ... + 1/17)/17) is 0.114, which is smaller than the reported 0.120 that PC3 explains. Thus, following the broken stick method, PC3 does explain more variation than expected (and should thus be retained, contra your comment in line 311 which refers to two non-trivial axes)? Related to this potential issue is the presentation of the BPCA results in Fig. 2: You present loadings of three PC axes, although only the first two are considered in morphospace bi-plots and although the text also mentions only two non-trivial axes. If the third axis is indeed non-trivial, then the loading presentation could be retained in the figure, but then the authors should consider showing a PC1 vs. PC3 plot in addition to the currently presented biplot showing the first and second axis only. If the third axis indeed is trivial, as currently suggested by the text, then showing its loadings is unnecessary.
It would be great if you could clarify the usage/application of the broken stick method for all your PCAs. An easy way to report the results may be to add a row to each of your PCA loading tables in the supplements, in which you divide the actual value of variation explained by the value expected under the broken stick method - this way, all axes which explain more variation than expected under the broken stick method have values larger than 1, and axes which explain less have values lower than 1.
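To make the suggestion concrete, here is a minimal sketch of how the broken-stick expectation and the proposed observed/expected row could be computed. The function names are my own and purely illustrative; the only values taken from the manuscript are the 17 axes and the 0.120 reported for PC3 of the BPCA.

```python
def broken_stick(n_axes):
    """Expected proportion of variance per axis under the broken-stick model.

    For axis k (1-based) of n axes: (1/k + 1/(k+1) + ... + 1/n) / n.
    """
    return [
        sum(1.0 / i for i in range(k, n_axes + 1)) / n_axes
        for k in range(1, n_axes + 1)
    ]


def broken_stick_ratios(explained):
    """Observed / expected proportion of variance for each axis.

    `explained` holds the proportion of variance explained by every axis
    (one entry per axis, summing to 1). Ratios above 1 mark axes that
    explain more variation than the broken-stick expectation.
    """
    expected = broken_stick(len(explained))
    return [obs / exp for obs, exp in zip(explained, expected)]


# Worked check for the third of 17 axes, as in the comment above.
expected_17 = broken_stick(17)
print(round(expected_17[2], 3))    # 0.114
print(0.120 / expected_17[2] > 1)  # True: PC3 exceeds its expectation
```

The ratio row for a supplementary table would then simply be the output of `broken_stick_ratios` applied to the full vector of explained variance proportions for the respective PCA.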
3. Missing commentary on allometry
In basically all PCA analyses, the first PC axis seems to be dominated by allometric size effects, given that all linear measurements have the same eigenvector (loading) signs. The authors do acknowledge this (lines 314-316; 335-336), but offer no further comment on size effects/allometry. For example, it would be interesting to see how the linear measurements scale with overall head size. Similarly, the authors note that the semicircular canal measurements covary strongly, as do the utricle and saccule height/length measurements (paragraph line 346). Basically, it seems that the semicircular canal measurements scale with one another: as one gets bigger, so does the other. It is interesting that the utricle does not seem to follow the same scaling pattern as the saccule and semicircular canals, and it would be good to hear if the authors think that there is a functional implication for this. Increases in utricular/saccular/semicircular canal sizes are usually explained by increased sensitivity - so is an increased utricular size a compensatory development to decreased semicircular canal + saccule size to retain an overall level of sensitivity, or is it maybe related to a relative change of importance of the specific functions, e.g. increased importance of linear accelerations in the horizontal plane with simultaneous decrease of importance of angular and vertical accelerations?
4. Labyrinth size
With the above-mentioned utricular exception, labyrinth size measurements, particularly those of the semicircular canals, seem to imply a relatively consistent scaling relationship between the canals. When one canal gets larger, so do the others, perhaps thereby retaining canal symmetry across different absolute labyrinth sizes. Labyrinth size in tetrapods is often interpreted in relation to body size/mass or head size (e.g. Melville Jones & Spells 1963, Proc. R. Soc. Lond. Biol. Sci.; Spoor & Zonneveld 1998, Yearb. Phys. Anthr.; Spoor et al. 2002, Nature; Spoor et al. 2007, PNAS; Bronzati et al. 2021, Curr. Biol.), as deviations from the expected labyrinth size per head size indicate increased or decreased relative labyrinth sensitivities. Large relative labyrinth sizes of birds and (within) mammals have generally been interpreted as indicative of "active" or "agile" behaviour, although doubt has been cast on these relationships recently (e.g., Bronzati et al. 2021). Increased sampling of relative labyrinth size from various vertebrate groups would be important to better understand labyrinth size-function relationships. Melville Jones & Spells (1963) have shown that fishes have large labyrinth sizes compared to most tetrapods, but they don't have lungfish data, and the large labyrinth sizes of fishes have often remained uncommented on in tetrapod works. I think this study offers a fantastic opportunity to provide comparative labyrinth size data for lungfishes. In this regard, it would be really interesting to quantify labyrinth size relative to head size, and show a respective (phylogenetic) regression analysis. Ideally, the size of the labyrinth could be quantified along the arc lengths of the semicircular canals, but other ways are also conceivable (for example a box volume of labyrinth size from the existing measurements, contrasted with a box volume of the skull, i.e. height*width*length).
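As a rough illustration of the box-volume idea, the sketch below fits an ordinary least-squares line to log10 labyrinth box volume against log10 skull box volume and returns the residuals, which would serve as a simple measure of relative labyrinth size. This is not a phylogenetic regression; a PGLS would additionally need the tree. All function names are hypothetical, and the numbers in the usage example are synthetic placeholders generated from an exact power law, not measurements from any specimen.

```python
import math


def box_volume(length, width, height):
    """Volume of the bounding box spanned by three linear measurements."""
    return length * width * height


def log_log_fit(x_values, y_values):
    """Ordinary least-squares fit of log10(y) on log10(x).

    Returns (slope, intercept, residuals). The slope is the scaling
    exponent; positive residuals mark taxa whose y is larger than
    expected for their x.
    """
    log_x = [math.log10(x) for x in x_values]
    log_y = [math.log10(y) for y in y_values]
    n = len(log_x)
    mean_x = sum(log_x) / n
    mean_y = sum(log_y) / n
    sxx = sum((a - mean_x) ** 2 for a in log_x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(log_x, log_y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    residuals = [b - (intercept + slope * a) for a, b in zip(log_x, log_y)]
    return slope, intercept, residuals


# Synthetic placeholder data (NOT real measurements): skull dimensions in
# arbitrary units, and labyrinth volumes built as 0.01 * skull_volume ** 0.5.
skull_dims = [(10, 5, 4), (20, 9, 7), (35, 15, 12), (60, 28, 20)]
skull_volumes = [box_volume(*dims) for dims in skull_dims]
labyrinth_volumes = [0.01 * v ** 0.5 for v in skull_volumes]

slope, intercept, residuals = log_log_fit(skull_volumes, labyrinth_volumes)
print(round(slope, 3))  # 0.5, the exponent used to build the synthetic data
```

With real data, the residuals (or their phylogenetically corrected equivalents) could be compared among lungfish taxa and against published tetrapod values.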