Reviewer #2:
In this paper the authors report data from a series of online studies and one neuroimaging study in which participants played a simple game that required them to choose between a sure outcome and a gamble. Participants reported their current mood throughout the game, and the authors compared the performance of a number of models of how the mood ratings were generated. They focus on two models: a standard model, which assumes that participants expect a 50:50 gamble, and an adapted model, which uses the average of experienced outcomes as the expected value. They frame these models in terms of recency vs. past weighting and suggest that the results provide evidence in favour of a higher weight of earlier events on reported mood.
The question of how humans combine experienced events into reported mood is topical. This paper takes an interesting approach to this issue.
I struggled a bit to understand the logic of some of the arguments in the paper, in part because important experimental and methodological detail is missing. I list my points below. The overriding question is, I think, how certain we can be that the results reported by the authors reflect a true primacy effect, as opposed to some other process (e.g. just learning an expected value) that appears in this case to be a primacy effect.
1) I didn't really understand where the weights from the primacy graph in Figure 1B came from. The recency weights make sense: there is a discount factor in the model that is less than 1, so there is an exponential discount of more distant past events. However, for the primacy model the expectation is calculated as the mean (apparently the arithmetic mean) of previous outcomes, which suggests a flat weight across previous trials, and the discount factor remains. So how does this generate the decreasing pattern of weights? It would be really useful if the authors could spell this out.
2) The models seem to differ in terms of whether they learn about the expected value of the gamble outcomes or whether they assume a 50:50 gamble (the recency model assumes this, the primacy model generates an average of all experienced outcomes). Might the benefit of the primacy model when explaining human behaviour simply be that people use experienced outcomes to generate their expectations rather than taking stated outcome probabilities as absolutes? In other words, it is not so much that people place more weight on earlier events, but that they learn.
3) Linked to the above, the structured and adaptive environments seem to have something to learn (blocks with positive vs. negative RPEs), so it is perhaps not surprising that humans show evidence of learning here and that a model with some learning outperforms one with none. The description of these environments isn't really sufficient at present. Please explain how RPEs were manipulated: was it by changing the probability of win/loss outcomes and, if so, how? Or was it by changing the magnitude of the options? For the adaptive design, was the change deterministic? That is, was the outcome, and thus the RPE, always positive if mood was low, or was this probabilistic and, if so, with what probability? Also, did the recency model still estimate its expectations here as 50:50, even when (if) this was not the case? If so, can the authors justify this?
4) What were participants told about the gambles (i.e. were they told they were 50:50, including in structured/adaptive environments)?
5) Please report the estimated parameter values of the models (and tell us where the common parameters differed between models). This would help in understanding how they are behaving.
6) In addition to changing the expectation term of the recency model, the primacy model also drops the term for the sure outcomes (because this improves the performance of the primacy model). Does this account for the relative advantage of the primacy model over the recency model? That is, if the sure outcome term is dropped from the recency model, does the primacy model still perform better?
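To make points 1 and 2 concrete, the following is a minimal sketch and not the authors' implementation. It assumes that the expectation on trial j is the arithmetic mean of outcomes 1..j and that mood depends on a sum of expectations discounted by a factor gamma; all function names and the value of gamma are hypothetical. Under these assumptions an early outcome enters every later expectation, so its accumulated weight is larger than that of a late outcome, which may be the source of the decreasing pattern in Figure 1B. The sketch also shows that the same running mean is what a delta-rule learner with learning rate 1/n produces, which is the sense in which the primacy model could equally be described as a learning model.

```python
def primacy_expectation_weights(n_trials, gamma):
    """Weight of each outcome A_i in sum_j gamma**(n_trials - j) * E_j.

    Assumption (not confirmed by the manuscript): E_j is the arithmetic
    mean of outcomes A_1..A_j, so E_j puts weight 1/j on each of them.
    """
    weights = [0.0] * n_trials
    for j in range(1, n_trials + 1):
        discount = gamma ** (n_trials - j)  # discount applied to trial j
        for i in range(j):                  # outcomes A_1..A_j contribute to E_j
            weights[i] += discount / j
    return weights


def recency_weights(n_trials, gamma):
    """Weight of each outcome when it enters the discounted sum directly."""
    return [gamma ** (n_trials - j) for j in range(1, n_trials + 1)]


def running_mean(outcomes):
    """Expectation after each trial as the mean of all outcomes so far."""
    expectations, total = [], 0.0
    for n, outcome in enumerate(outcomes, start=1):
        total += outcome
        expectations.append(total / n)
    return expectations


def delta_rule(outcomes):
    """Same expectation written as error-driven learning with rate 1/n."""
    expectations, estimate = [], 0.0
    for n, outcome in enumerate(outcomes, start=1):
        estimate += (outcome - estimate) / n  # prediction-error update
        expectations.append(estimate)
    return expectations


if __name__ == "__main__":
    gamma = 0.8  # hypothetical discount factor, chosen only for illustration
    print("primacy:", [round(w, 3) for w in primacy_expectation_weights(8, gamma)])
    print("recency:", [round(w, 3) for w in recency_weights(8, gamma)])
```

If the authors' expectation term is defined differently (for example, excluding the current trial's outcome), the exact weights change, but stating the equivalent of `primacy_expectation_weights` explicitly in the Methods would resolve the ambiguity.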