Motivation
The questions treated in the paper are very important and very topical. Specifically, the authors set out to find out
- how much information is contained in weak lensing bispectra,
- how much covariances change due to the inclusion of non-Gaussian statistics,
- if a new term called halo sample variance matters.
These questions are very important because much of the weak lensing signal is generated on small scales by nonlinear structures. Commonly, one increases the variance of the density on small scales in an empirical way but does not account for the deviation from a Gaussian distribution. In order to give the reader some sense of scale: The weak lensing data set from Euclid will have a total statistical significance of 1000 sigma, 80% of which originate from nonlinear structures.
Context
Nonlinear structure formation turns close to Gaussian initial conditions into a evolved density field that shows strongly non-Gaussian characteristics that are picked up by observables. As lensing depends (to lowest order) linearly on the density field, it inherits its statistics and therefore its non-Gaussianities. It is clear from the scaling behaviour of spectra and bispectra that they contain different parameter dependences and their combination should be able to improve cosmological constraints. Issues are the prediction of non-Gaussian properties of the lensing signal, the exact increase of the variance of the density field, and even more difficult from a technical point of view, the covariance properties of estimators in the non-Gaussian limit.
Methods
The methods and approximations used (flat-sky, weak lensing spectra and bispectra in Limber's approximation, halo-model, full-sky coverage) are all appropriate for demonstrating the author's point.
Alternatives
I see the necessity of using the Fisher-matrix formalism in particular when dealing with the bispectrum and all non-Gaussian covariance contributions out of numerical feasibility. Likewise, all lensing approximations are justified (most importantly the flat-sky approximation, because non-Gaussianities are most prominent on small angular scales). Using the formalism in the context of a Monte-Carlo Markov-chain would be really difficult, but would relax the assumption that the models for spectra and bispectra depend linearly on the parameters and would give rise to a Gaussian parameter likelihood.
Criticism
The authors use a Planck-type CMB-prior for their results on the statistical errors, and one can see that the prior determines size and orientation of the Fisher-ellipses a lot - I would be very interested to see the "naked" lensing results for a tomographic survey, and with the non-Gaussian contributions to the covariances individually switched on: I would argue that this gives a much better feeling about the magnitude of the effect on parameter inference.
I like a lot that the authors provide a good deal of technical detail, in particular concerning permutations of bispectra with equal tomography bin indices and equal multipoles. I would have appreciated a more detailed justification about the selection of a comparatively small number of bispectrum configurations and and accuracy of binned summation, as well as more detail on the estimation process of the covariance matrix from numerical simulations - this is a particularly tricky issue and a lot of interesting developments have been done in the recent time.
The only point where I am a bit lost would be the justification of using the halo model for computing the signal (where everything is done consistently) but the restriction to 1-halo-terms for the higher-order contributions to the covariance. Again, I appreciate this restriction on reasons of feasibility, but it would have been nice to see this in comparison to a simulation. Or, on the other hand, the halo-model could be replaced by standard perturbation theory, but again I see how this would not be able to be a feasible model for the HSV-term.
One issue which I find very difficult (and where the authors are certainly not to blame) is that implicitly the estimates of the spectrum and bispectrum assume that those are estimated from a large number of modes - I wonder if differences at low multipoles, where only small numbers of modes are available, this effect might be similarly large compared to the effects discussed here.
Open questions
Personally, I find the HSV-term very interesting because it links large-scale and small-scale modes in a suble and funny way. I wonder if there might be other applications where analogous effects can play a role.