Reviewer #2:
The manuscript introduces a computational account of meta-control in value-based decision making. According to this account, meta-control can be described as a cost-benefit analysis that weighs the benefits of allocating mental effort against its associated costs. The benefits of mental effort pertain to the integration of value-relevant information to form posterior beliefs about option values. Given a small set of parameters, and taking pre-choice value ratings and pre-choice uncertainty ratings as inputs, the model predicts relevant decision variables as outputs, such as choice accuracy, choice confidence, choice-induced preference changes, response time, and subjective effort ratings. The study fits the model to data from a behavioral experiment involving value-based decisions between food items. The resulting behavioral fits reproduce a number of predictions derived from the model. Finally, the article describes how the model relates to well-established accumulator models, such as the drift diffusion model or the race model.
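To check my own understanding (and for the benefit of other readers), this arbitration can be caricatured as selecting the amount of cognitive resources that maximizes anticipated choice confidence minus effort cost. The following minimal Python sketch uses placeholder functional forms of my own choosing (posterior uncertainty shrinking with invested resources, a linear effort cost); it is not the authors' implementation:

import numpy as np
from scipy.stats import norm

def expected_confidence(z, mu_delta, sigma0, beta=1.0):
    # Anticipated confidence in picking the higher-valued option after
    # investing resources z: the probability of inferring the correct sign
    # of the value difference, under a posterior whose uncertainty shrinks
    # with effort (a placeholder assumption on my part).
    sigma_post = sigma0 / np.sqrt(1.0 + beta * z)
    return norm.cdf(np.abs(mu_delta) / sigma_post)

def optimal_resources(mu_delta, sigma0, alpha=0.01):
    # Resource level maximizing expected benefit minus (linear) effort cost.
    z = np.linspace(0.0, 10.0, 1001)
    net_value = expected_confidence(z, mu_delta, sigma0) - alpha * z
    return z[np.argmax(net_value)]

print(optimal_resources(mu_delta=0.1, sigma0=1.0))  # hard choice: small value difference
print(optimal_resources(mu_delta=2.0, sigma0=1.0))  # easy choice: large value difference

With these placeholder values, the harder (smaller value difference) choice recruits more resources than the easy one, which matches my reading of the qualitative behavior the manuscript develops.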
Before I get into more detailed comments, I would like to highlight that this work addresses a timely and heavily debated subject, namely the role of cognitive control (or mental effort) in value-based decision making (see Shenhav et al., 2020). While there are plenty of models explaining value-based choice, and there is a growing number of computational accounts concerning effort allocation, little theoretical work has been done to relate the two literatures (but see Major Comment 1). This work contributes a novel and interesting step in this direction. Moreover, I had the impression that the presented model can account for a broad range of behavioral phenomena and that the authors did a commendable amount of work to validate the model (but see Major Comments 2 and 3). The manuscript is also well written in that it seems accessible to a broad audience, including non-technical readers. However, while I remain curious about what the other reviewers have to say, the manuscript fails to address a few issues, which I elaborate below.
Major Comments:
1) Model Comparison(s): While the manuscript compares the presented computational approach to existing accumulator models, it could situate itself better in the existing literature, ideally in the form of formal model comparisons. For instance, as someone less familiar with choice-induced preference changes in value-based decision making, I wonder how the model compares to existing computational work on this matter, e.g. the models described in Izuma & Murayama (2013) or the efficient coding account of Polanía, Woodford, & Ruff (2019). I do understand that the presented model can account for some phenomena that the other models cannot account for, at least not without auxiliary assumptions (e.g. subjective effort ratings), but the interested reader might want to know how well the presented model explains established decision-related variables, such as decision confidence, choice accuracy, or choice-induced preference changes, relative to existing models, by having the models contrasted formally. Finally, it would seem fair to compare the presented account to emerging, more mechanistically explicit accounts of meta-control in value-based decision making (e.g. Callaway, Rangel & Griffiths, 2020; Jang, Sharma, & Drugowitsch, 2020). As these approaches are still preprints, it may not be necessary to include them in a formal model comparison. However, the manuscript might benefit from discussing in the text how these approaches differ from the presented model.
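For concreteness, such a formal comparison could proceed via penalized per-subject fits, e.g. BIC (or, better, random-effects Bayesian model selection). A generic sketch, where the log-likelihoods and parameter counts are made-up numbers purely for illustration, not values from any of the cited papers:

import numpy as np

def bic(loglik, k, n_trials):
    # Bayesian Information Criterion: lower is better.
    return -2.0 * loglik + k * np.log(n_trials)

# Illustrative per-subject maximum log-likelihoods and parameter counts
# (invented for this sketch).
candidates = {"MCD": (-812.4, 6), "efficient_coding": (-830.1, 4)}
n_trials = 200
scores = {name: bic(ll, k, n_trials) for name, (ll, k) in candidates.items()}
print(scores, "->", min(scores, key=scores.get))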
2) Fitting Procedure: This comment concerns the validation of the described model based on its fits to behavioral data. If I understand correctly, the authors first fit the model to each participant while "[a]ll five MCD dependent variables were [...] fitted concurrently with a single set of subject-specific parameters" and then evaluate whether the model fits match the predicted qualitative relationships between experimental variables (e.g. pre-choice value ratings and pre-choice confidence ratings) and dependent variables (e.g. choice accuracy). I'm happy to be convinced otherwise, but it appears that the model's predictions could be tested in a more stringent manner. That is, it is not, by itself, compelling that the model, once fitted, matches the behavior of participants -- please note that this is not to diminish the value of the results; I still think that these results are valuable to include in the manuscript. Instead, rather than fitting the model to all dependent variables at once, it would be more compelling to fit the model to a subset of established decision-related variables (e.g. accuracy, choice confidence, choice-induced preference changes) and then evaluate how well the fitted model predicts out-of-sample variables related to effort allocation (e.g. response time and subjective effort ratings). Again, I am happy to be convinced otherwise, but the latter would seem like a much more stringent test of the model, and may serve to highlight its value for linking variables related to value-based decision making to variables related to meta-control.
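To make the proposed cross-fitting scheme concrete: estimate the parameters using only the established decision variables, then generate predictions for the held-out effort-related variables and quantify their agreement with the data. A self-contained sketch in which both the forward model and the data are toy stand-ins (not the authors' MCD implementation or their dataset):

import numpy as np
from scipy.optimize import minimize
from scipy.stats import pearsonr

rng = np.random.default_rng(0)

# Toy stand-in for one subject's data; the real variables would come from
# the experiment and are simulated here only so the sketch runs end to end.
n = 200
data = {v: rng.random(n) for v in
        ["accuracy", "confidence", "preference_change",
         "response_time", "effort_rating"]}

FIT_VARS = ["accuracy", "confidence", "preference_change"]   # fit on these
TEST_VARS = ["response_time", "effort_rating"]               # hold these out

def forward_model(params, n):
    # Placeholder forward model mapping two parameters to all five
    # dependent variables; a stand-in, not the authors' MCD equations.
    a, b = params
    z = np.linspace(0.0, 1.0, n)
    return {"accuracy": 1.0 / (1.0 + np.exp(-a * z)),
            "confidence": 1.0 / (1.0 + np.exp(-b * z)),
            "preference_change": a * z - b,
            "response_time": b * z + a,
            "effort_rating": a * b * z}

def loss(params):
    preds = forward_model(params, n)
    return sum(np.mean((preds[v] - data[v]) ** 2) for v in FIT_VARS)

fitted = minimize(loss, x0=[1.0, 1.0], method="Nelder-Mead").x
preds = forward_model(fitted, n)
for v in TEST_VARS:  # out-of-sample check on the effort-related variables
    r, _ = pearsonr(preds[v], data[v])
    print(v, round(r, 3))

On the real data, strong out-of-sample agreement for response time and effort ratings would, in my view, provide much stronger evidence for the meta-control interpretation than the concurrent fits currently reported.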
3) Parameter Recoverability: Given that many of the results rely on model fits to human participants, it would seem appropriate to include an analysis of parameter recoverability: how well can the fitting procedure recover model parameters from data generated by the model itself? I apologize if I missed this, but the manuscript doesn't appear to report this kind of analysis.
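The standard recipe would be: simulate synthetic datasets from known (e.g. fitted or randomly drawn) parameter values, refit the model to each synthetic dataset, and correlate generating with recovered parameters. A minimal sketch using a toy generative model as a stand-in for MCD (the linear model below is my own placeholder, not the authors' model):

import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(1)

def simulate(params, x, noise_sd=0.5):
    # Toy generative model standing in for MCD: y = a * x + b + noise.
    a, b = params
    return a * x + b + rng.normal(scale=noise_sd, size=x.size)

def fit(x, y):
    # Least-squares refit of the same model to a synthetic dataset.
    def sse(p):
        return np.sum((p[0] * x + p[1] - y) ** 2)
    return minimize(sse, x0=[0.0, 0.0], method="Nelder-Mead").x

x = np.linspace(0.0, 1.0, 100)
true, recovered = [], []
for _ in range(100):  # one synthetic dataset per simulated "subject"
    p_true = rng.uniform([0.5, -1.0], [3.0, 1.0])
    true.append(p_true)
    recovered.append(fit(x, simulate(p_true, x)))

true, recovered = np.array(true), np.array(recovered)
for j, name in enumerate(["a", "b"]):
    r = np.corrcoef(true[:, j], recovered[:, j])[0, 1]
    print(f"parameter {name}: recovery r = {r:.2f}")

Poor recovery for any of the MCD parameters would caution against interpreting the corresponding subject-specific estimates, so reporting these correlations (or a confusion matrix across parameters) would strengthen the paper.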
References:
Callaway, F., Rangel, A., & Griffiths, T. L. (2020). Fixation patterns in simple choice are consistent with optimal use of cognitive resources. PsyArXiv: https://doi.org/10.31234/osf.io/57v6k
Izuma, K., & Murayama, K. (2013). Choice-induced preference change in the free-choice paradigm: a critical methodological review. Frontiers in Psychology, 4, 41.
Jang, A. I., Sharma, R., & Drugowitsch, J. (2020). Optimal policy for attention-modulated decisions explains human fixation behavior. bioRxiv: 2020.08.04.237057.
Polanía, R., Woodford, M., & Ruff, C. C. (2019). Efficient coding of subjective value. Nature Neuroscience, 22(1), 134-142.
Shenhav, A., Musslick, S., Botvinick, M. M., & Cohen, J. D. (2020, June 16). Misdirected vigor: Differentiating the control of value from the value of control. PsyArXiv: https://doi.org/10.31234/osf.io/5bhwe