Related to my previous comment on this sentence: after McMurray's review of our paper, we did end up including a "sensitivity analysis" in the preprint, in which we used a magnitude-based threshold; specifically, 0.075 logits, although 0.05 or 0.1 logits (or all three) would also be possible. We now discuss in the preprint that there are advantages to using a significance-based (or, as we now call it, evidence-based) criterion, which are the ones I detail in my other comment here. However, we recognise that a magnitude-based criterion could be appropriate, perhaps especially if there are differences in noise between conditions. The high-quality camera is likely to produce less noisy estimates, which could lead to earlier onsets under an evidence-based criterion even when the estimated curves over time are similar between the two conditions. That is, the two curves (in the two quality conditions) could cross the magnitude threshold at approximately the same time, but cross the evidence-based criterion at different times. If that is the case, perhaps a magnitude-based threshold would be more appropriate. I understand we don't want to preregister many different analyses (the preregistration is already quite extensive as it is), but that kind of magnitude-based sensitivity analysis could enhance the paper.
I copy here the full passage in our preprint discussing the two criteria: "We emphasise that, in our method, the criterion for onset detection is reached only when the predicted trajectory reliably exceeds the baseline value, operationalised here by the use of a one-sided confidence interval. This kind of definition has three related advantages. First, it anchors our onset criterion to evidential strength, because onsets are detected only when the baseline value is exceeded with enough certainty. This prevents small fluctuations due to measurement noise from being treated as onsets. Second, as is the case for sequential testing approaches (e.g., Ben-David et al., 2011; Ito et al., 2018) and for the bootstrap-based method (Stone, Lago, & Schad, 2021), our method does not require specifying an arbitrary threshold, thereby reducing researcher degrees of freedom. Third, because the criterion is standardised (i.e., tied to the standard error on the modelled scale), it is broadly portable across tasks and designs, effects of different magnitudes, and model specifications. By contrast, absolute magnitude thresholds and percent-of-maximum thresholds have often required study-specific criteria in different visual world studies (Galle et al., 2019; McMurray, Clayards, et al., 2008; Reinisch & Sjerps, 2013).
On the other hand, a limitation of using an evidence-based criterion is that onset estimates can depend on the uncertainty around the estimated fixation curves. For example, studies with greater uncertainty (e.g., with smaller sample sizes or greater variability) may yield later onsets, because greater uncertainty may delay when the criterion is satisfied. For this reason, we have additionally implemented a version of the GAMM-based method that uses a magnitude threshold, in which an onset is detected when the baseline is exceeded by a fixed amount. We use the magnitude criterion as a robustness check in Studies 1 and 2 (Appendix S1 in the Supplementary Materials). In addition, we evaluate the extent to which our default criterion is indeed dependent on sample size and variability in the simulations presented in Study 3."
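To make the contrast between the two criteria concrete, here is a minimal sketch (not the authors' implementation) of how the evidence-based and magnitude-based onset definitions can diverge. The baseline value, the 0.075-logit threshold, and the one-sided interval follow the quoted passage; the synthetic curve, the standard errors, and all function names are invented for illustration only.

```python
# Hypothetical sketch: comparing an evidence-based onset criterion
# (one-sided CI lower bound exceeds baseline) with a magnitude-based
# criterion (estimate exceeds baseline by a fixed 0.075 logits).
# The curve and SEs below are synthetic, not from the preprint.
import numpy as np

time = np.arange(0, 1000, 10)          # ms, sampled every 10 ms
baseline = 0.0                         # baseline value on the logit scale
# A smooth curve that rises past baseline around 400 ms:
curve = 1.0 / (1.0 + np.exp(-(time - 400) / 60.0)) - 0.5
se_low_noise = np.full_like(curve, 0.02)   # e.g. a less noisy condition
se_high_noise = np.full_like(curve, 0.08)  # e.g. a noisier condition

def onset_evidence(est, se, base, z=1.645):
    """First time the one-sided 95% CI lower bound exceeds baseline."""
    above = (est - z * se) > base
    return int(time[above][0]) if above.any() else None

def onset_magnitude(est, base, thresh=0.075):
    """First time the estimate exceeds baseline by a fixed amount (logits)."""
    above = (est - base) > thresh
    return int(time[above][0]) if above.any() else None

# Same estimated curve under different uncertainty: the magnitude-based
# onset is identical in both conditions, while the evidence-based onset
# comes earlier when the SEs are smaller.
print(onset_magnitude(curve, baseline))               # same for both SEs
print(onset_evidence(curve, se_low_noise, baseline))  # earlier
print(onset_evidence(curve, se_high_noise, baseline)) # later
```

This is exactly the scenario raised above: with identical curves but different noise levels, the magnitude criterion yields the same onset in both conditions, whereas the evidence-based criterion shifts with the uncertainty.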