584 Matching Annotations
  1. Sep 2020
    1. executed one after another without considering feedback from the environment during the sequence

      The hierarchical account states we do not take the second stage context into account and simply pick what we already decided before we even saw the second stage context.

    2. action control in stage two should not depend on stage one

      This is key to the whole setup of the two-step markov decision task right: Once we have arrived in the second stage, we do select based on the known context, regardless of MB or MF RL. The effect happens on the switching the next trial.

    3. Here we show that first stage habitual actions, explained by the model-free evaluation in previous work, can also be explained by assuming that first stage actions chunk with second stage actions

      So this does not actually account for the model-based behaviour, which we hope we can build

    4. may best be viewed as action sequence

      So a habit - something learned according to a model-free scheme - can trigger an action sequence


    1. participant could use this information to select the action that has a relatively high expected value on common transitions. Thus, a rare transition would lead to a state with lower expected value, yielding a negative RPE

      This seems very likely to me! Is there some natural way to correct for this? Current estimation of expected value as a covariate, regress it out, or something?

    2. analysing frequency of stay on the first step choice should reveal an interaction effect between transition type and second choice feedback outcome in the preceding trial on the frequency of repeating the first step choice

      Participants have selected one spaceship because they like its major planet -> move to minor planet -> obtain reward -> select alternative spaceship (win-shift)

      Move to minor planet -> obtain no reward -> select same old spaceship again (you want to go to major planet) = lose-stay
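To make the predicted interaction concrete, a toy numpy sketch (the generative rule and all names are my own, not from the paper): a planner stays after rewarded-common and unrewarded-rare trials, so the transition x reward interaction in stay probability picks this up.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy trial log: transition (0=common, 1=rare), second-stage reward (0/1),
# and whether the first-stage choice was repeated on the next trial.
n = 4000
transition = rng.integers(0, 2, n)
rewarded = rng.integers(0, 2, n)
# Toy MB-like rule: stay after rewarded-common and unrewarded-rare trials
# (outcomes are attributed via the transition model), i.e. stay when
# transition XOR reward is true.
p_stay = 0.3 + 0.5 * np.logical_xor(transition, rewarded)
stayed = rng.random(n) < p_stay

def stay_prob(t, r):
    mask = (transition == t) & (rewarded == r)
    return stayed[mask].mean()

# Transition x reward interaction in stay frequency: nonzero for a planner,
# near zero for a purely model-free agent.
interaction = (stay_prob(0, 1) - stay_prob(0, 0)
               - stay_prob(1, 1) + stay_prob(1, 0))
```

Under a purely model-free rule (stay after reward regardless of transition), the same interaction term would come out near zero.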

    3. These findings implicate the aMCC in the processing of the SPEs

      Fronto-medial theta has before already been 'localized' to the aMCC (likely)

    4. Predicted Response-Outcome (PRO)

      But this one model did say it should! (no data)

    5. Gläscher et al. (2010) conducted an fMRI experiment using a paradigm that featured common and uncommon transitions and found that the intraparietal sulcus and lateral PFC are sensitive to SPEs

      So empirical results show: aMCC does not do this

    6. transition model

      e.g. a successor representation (but not limited to this)
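Quick reminder-to-self of the SR math (standard definition, not from this paper): under a fixed policy with one-step transition matrix T, the SR is M = (I - γT)⁻¹, and values are linear in reward, V = M r.

```python
import numpy as np

# Successor representation for a fixed policy: M[s, s'] is the expected
# discounted number of future visits to s' starting from s.
gamma = 0.9
T = np.array([[0.0, 1.0, 0.0],   # s0 -> s1
              [0.0, 0.0, 1.0],   # s1 -> s2
              [0.0, 0.0, 1.0]])  # s2 self-loops (toy absorbing state)
M = np.linalg.inv(np.eye(3) - gamma * T)

# Values follow linearly from any reward vector, which is what makes the
# SR attractive for fast revaluation.
r = np.array([0.0, 0.0, 1.0])
V = M @ r
```

The catch discussed later in these notes: M itself is policy-dependent, so a change of goals that implies a new policy also demands a new M.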


    1. ACC appears to encode hierarchical structure with distributed representations, which makes the parcellation problem even harder

      It is not immediately obvious how hierarchy can be encoded in continuous time distributed neural representations. This is a different beast from a symbolic AI algorithm that updates a and then draws b etc.

    2. Individual ACC neurons seem capable of responding to most task events, with particular mixtures of sensitivities within and across neurons continually reallocated according to changing task conditions [72]

      Very cool target for a modeling study?

    3. phasic bursts of norepinephrine, which may serve as a neural-interrupt signal [67], can reset network activity in ACC [68] and thus allow for module re-binding

      So the LC-NE system can 'reprogram' the communication through the ACC?

    4. Specifically, when task demands are high (e.g., after an error), ACC would send a synchronizing signal to lower-order modules, with consequent synchronization and thus improved communication between those lower-order modules

      ACC is now also the great orchestrator of communication everywhere - What is left for the dlPFC?

    5. ACC motivates sticking to a plan

      Framing the ACC for extended control of sequences thus states that it keeps track of how much of this cost of planning would likely still be worth it. This is basically the same idea as the 'expected value of control' theory, although the function of ACC is expanded upon much by HMB-HRL theory.

    6. At face value, such a self-regulating control mechanism is both computationally [48] and evolutionarily [49] maladaptive

      No! A self-regulating control system should sometimes turn itself off! This is the whole reason we have cost added into the mix.

    7. feedback-based control mechanisms constitute the bread-and-butter of control theory in engineering (Box 1), but these always concern the regulation of subordinate systems, never self-regulation

      So a theory in which the ACC adapts its own control by detecting conflict is not 'natural' from an engineering standpoint - it should modify subordinate systems?

    8. For example, a prominent computational model of ACC contains units that exhaustively predict all possible states of the task environment, generating prediction errors to unexpected transitions; though not explicitly used in the model for this purpose, in principle the prediction errors can provide learning signals for MB-RL [47]

      ACC-dlPFC theory as a super-learner for all unexpected events? Sounds a bit predictive-processing-ish

    9. ACC could use such models to plan over temporally extended action sequences

      So it would take on a planning function, which in much of the literature is associated with the HPC? How do the two interact? Seems like a very relevant question!


    1. can be useful in the context of multitask learning to extract useful, reusable policies

      The DR is some sort of generalized representation of shared structure of a task (family)?

    2. how it is learned

      There is no clear proposal yet as to how the DR would be learned - it could be very similar to the SR?

    3. default policy plays the role of prior over policy space and rewards play the role of the likelihood function

      Options also function as some sort of prior over action selection potentially?

    4. empirically underconstrained theoretical flexibility in specifying how a task's state space should be

      Exactly a problem of PFC research - empirically underconstrained in what should be represented

    5. finite decision problem

      So linear RL cannot be a general theory of open-ended lifelong learning

    6. distinguishing between terminal states (representing goals), and nonterminal states (those that may be traversed on the way to goals)

      How exactly is this implemented, and what does this mean for our RNN architecture, which has a goal-representation space separately?

    7. “control cost,” KL(π || π₀), which is increasing in the dissimilarity (KL divergence) between the chosen distribution π and some default distribution, π₀.

      Control cost inherent in the model! Can be linked to expected-value of control model very naturally
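Minimal sketch of that cost term (toy numbers my own): deviating from the default policy carries a KL cost; sticking to the default is free.

```python
import numpy as np

def kl_control_cost(pi, pi_default):
    """KL(pi || pi_default): cost of deviating from the default policy."""
    pi = np.asarray(pi, float)
    pi_default = np.asarray(pi_default, float)
    return float(np.sum(pi * np.log(pi / pi_default)))

default = np.array([0.5, 0.5])      # habitual / default action distribution
lazy = np.array([0.5, 0.5])         # no deviation -> zero control cost
controlled = np.array([0.9, 0.1])   # strong override -> positive cost
```

This is where the link to expected-value-of-control lives: the override is only worth paying for when the reward gained exceeds this KL term.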

    8. assumea one-to-one, deterministic correspondence between actions and successor states

      The next state is directly and exclusively dependent on the action we take

    9. The only way to find the latter using equation (2) is by iteratively re-solving the equation to repeatedly update π and S until they eventually converge to π*

      Policy iteration (solving through search)

    10. assuming that all choices are made followingpolicy 𝜋

      Since in practice the state transition function depends on the chosen actions, it is wise to write it as S(π), as the long-run state visits depend on action selection under this fixed policy

    11. The default policy and cost term introduced to make linear RL tractable offers a natural explanation for these tendencies, quantifies in units of common-currency reward how costly it is to overcome them in different circumstances, and relatedly offers a novel rationale and explanation for a classic problem in cognitive control: the source of the apparent costs of “control-demanding” actions

      Deviations from the default policy are what constitute 'control demanding' actions? => So the more we deviate from this, the more dACC activity we can expect, something like this??

    12. SR theory predicts grid fields must continually change to reflect updated successor state predictions as the animal’s choice policy evolves, which is inconsistent with evidence

      Entorhinal grid cells have this 'fourier-domain map of task space' but do not continuously change their representations to fit with different goals - as would be necessary under vanilla SR theory

    13. Fourier-domain map of task space

      Figure out what this means exactly...

    14. stable and useful even under changes in the current goalsand the decision policy they imply

      How exactly would they achieve this difference from the SR? Long-run state expectancies seem to be both the definition and the problem of the SR

    15. For instance, a change in goals implies a new optimal policy that visits a different set of states, and a different SR is then required to compute it.

      This is exactly what an option would look like!

    16. However, it simply assumes away the key interdependent optimization problem by evaluating actions under a fixed choice policy(implied by the stored state expectancies)for future steps.

      Assumes fixed, constant, probabilities of future state visits


    1. The present study confirms that the aMCC’s distributed code for temporal information is not sufficiently consistent across blocks and sequence types to be detectable using the ROI classification approach, revealing only a weak effect size in the generalization analysis. The two approaches therefore appear to provide complementary information

      Re-examine the methodology of the RSA in the aMCC study - is it very tailored?

    2. domain-general role to pars orbitalis in learning the relationship between environmental events and transition probabilities between various environmental states

      Is this successor representation learning??

    3. By contrast, inconsistent with the past literature, in the ROI analysis, we did not find evidence for involvement of the aMCC and hippocampus

      So basically our theory is already under heavy scrutiny? Exactly the opposite of what we want to see!

    4. whether the stir action was performed in a tea or a coffee sequence, irrespective of the sequence position of the stir action

      Only context, no temporal information

    5. discriminate between the first and second instance that the stir action was performed

      Temporal information -> progression through sequence information (regardless of coffee vs tea task)

    6. first defining functional or anatomical ROIs based on subject-specific data

      So it is very theoretically based, not like a random cluster search across all voxels

    7. serial rank order

      This is somewhat different from generic information maintenance in WM?

    8. contextual information

      So this is basically the same as Working Memory


    1. At the same time, we observed that dlPFC reinstatement of CTD positively scaled with the hippocampal pattern similarity between the two overlapping contexts

      So in dlPFC the CTD did have similarity over the two contexts, even though they were distinct in the HPC?

    2. hippocampal differentiation effect

      More similar task demands should yield increasingly DIFFERENT HPC representations? Because we separate them?

    3. congruency: match/mismatch between the CTD and the actual task demands on the trial)

      Some trials in context 1/2 will have task demand associated with 3/4. This is 'incongruent'

    4. Task-sets include additional instructions on “how”

      Task sets are conceptually different from the process that can identify the task set based on cueing - that is an associative / semantic process?

    5. proactively retrieves probabilistically likely task-sets

      The cueing of task sets - don't conceptualize this as being even higher in the hierarchy?


    1. Black dots indicate stable fixed points

      You can see in the DMC the RNN has created 3 stable states it can occupy - not only the fixation at the start, but also two for the sample->delay moment, dependent on the category of the sample stimulus!

    2. time (ms)

      You can see accurate decoding along a wider spectrum of time - stable maintenance of information!

    3. stable states associated with each category at the end of the sample period in the DMC task

      This really is maintenance of classificatory information after the first stimulus!

    4. test stimulus

      The second stimulus is the 'test stimulus'

    5. sample stimulus

      The first stimulus is called the 'sample stimulus'

    6. It is likely that this phenomenon is mediated by interactions among different brain regions involved in the OIC and DMC tasks. Indeed, LIP is connected with the dorsolateral prefrontal cortex (DLPFC)

      Cognitive Control area through dlPFC might be responsible for 'reprogramming' what goes on in LIP? Flexible readouts, different effect of recurrent encoding, etc?

    7. greater compression of activity among directions within categories in the DMC task

      Exactly what you would expect right, as direction does not matter for encoding category itself? We are talking about matching.

    8. compressing variability among directions within a category

      So it were mainly the response directions that were still encoded in the OIC task in LIP. In DMC this disappeared in favour of more population-level category coding.

    9. we evaluated the temporal stability of category decoding using SVM decoders that were trained at one time point and then tested at all other time points in the shared sample period

      Check the maintenance of information at time-point 1 by training a decoder on it and applying it to future time-points!
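Sketch of that cross-temporal generalization idea on a toy population, with a nearest-centroid decoder standing in for the paper's SVM (all numbers and names mine):

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy activity: trials x timepoints x neurons, with a stable category
# signal appearing at "sample onset" (t >= 2).
n_trials, n_time, n_units = 200, 6, 20
labels = rng.integers(0, 2, n_trials)
pattern = rng.normal(size=n_units)
signal = np.zeros((n_trials, n_time, n_units))
for t in range(2, n_time):                   # category info appears at t=2
    signal[:, t, :] += np.outer(2 * labels - 1, pattern)
signal += rng.normal(scale=0.5, size=signal.shape)

def centroid_decode(train_t, test_t):
    """Train a nearest-centroid decoder at train_t, test it at test_t."""
    c0 = signal[labels == 0, train_t].mean(0)
    c1 = signal[labels == 1, train_t].mean(0)
    x = signal[:, test_t]
    pred = np.linalg.norm(x - c1, axis=1) < np.linalg.norm(x - c0, axis=1)
    return (pred == labels).mean()

# Full train-time x test-time accuracy matrix: off-diagonal accuracy
# within the signal period indicates a temporally stable code.
acc = np.array([[centroid_decode(tr, te) for te in range(n_time)]
                for tr in range(n_time)])
```

High off-diagonal accuracy within the post-onset window is exactly the "stable maintenance of information" signature noted above; training in the pre-onset window stays at chance.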

    10. category computations are supported by a common subpopulation of neurons in the early sample period and different subpopulations of neurons or different readout mechanisms in the late sample period

      Some perceptual mechanism is task-independent, simply information providing and encoding. Later readout can be flexible according to task demands!

    11. attractor dynamics appears to compress category-related information to a simpler, binary format by collapsing all directions within a category towards a single population state

      Working-Memory component induced in RNN as a function of task demand

    12. graded neural activity in the OIC

      Direct classification - more room for stimulus idiosyncratic representation?

    13. categorical encoding was more abstract with binary-like neural activityin the DMC

      Working-memory component


    1. contingency-degradation

      getting reward also when you pick random actions or don't pick anything - non-contingent rewards

    2. devaluation

      Sudden loss of reward or becoming aversive - MB-RL should instantly stop approaching, while MF-RL gradually

    3. assume a predefined state and action space

      It is really the representational structure that makes a HUGE difference in the effectiveness of any learning strategy deployed over it

    4. apply MF RL updates on retrospectively inferred latent states

      Have a model at the ready but don't do anticipatory planning - only retrospective evaluation according to MF-RL learning rules

    5. necessarily pressed into the singular axis of MF–MB

      So keep an open mind about such things - seems specifically aimed at HEURISTICS - form of wrongful MB-RL?

    6. Simple strategies that rely only on working memory,

      Looks exactly like MF

    7. For MB control to materialize, the agent must identify its goal, search its model for a path leading to that goal and then act on its plan

      Hard to model and understand the exact scope of the MB controller, so default to MF evidence if not accurate?

    8. features of trajectories in the environment

      Successor representations

    9. contextual information is used to segregate circumstances in which similar stimuli require different actions

      Again, a working-memory procedure?

    10. compound representations

      Essentially leveraging working memory to create apparently more 'flexible' behaviour, while in reality MF-RL is the only real 'learning' mechanism

    11. DAergic signals support both instrumental (action–value) and non-instrumental (state–value) learning in the striatum.

      The correct error signals are provided to facilitate any form of RL based learning. In Striatum?

    12. Computational RL theory built on the principles that animal behaviourists had distilled through experimentation, to develop the method of temporal difference (TD) learning (a MF algorithm)

      Origins of RL are purely associative learning - delta-rule style
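The delta-rule core of TD(0), for my own reference (standard textbook form, not code from this paper):

```python
def td_update(V, s, r, s_next, alpha=0.1, gamma=0.9):
    """One TD(0) step: delta-rule update driven by the RPE."""
    rpe = r + gamma * V[s_next] - V[s]   # reward prediction error
    V[s] += alpha * rpe
    return rpe

# Toy chain s0 -> s1 -> end, reward 1.0 on leaving s1. Values converge to
# V(s1) = 1 and V(s0) = gamma * V(s1) = 0.9.
V = {"s0": 0.0, "s1": 0.0, "end": 0.0}
for _ in range(1000):
    td_update(V, "s0", 0.0, "s1")
    td_update(V, "s1", 1.0, "end")
```

The MF/associative point is visible here: the update needs no transition model at all, only the sampled successor state.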

    13. derive value estimates for the different states or actions available

      Practical difference between computing desired states and inferring best actions VS directly computing desired actions without explicit state values

    14. dimensionality of learning — the axes of variance that describe how individuals learn and make choices — is well beyond two

      It is not only the speed / accuracy that is being traded off - which is what MB/MF and all other two-systems seem to boil down to.

    15. System1/System2

      Many learning theories seem to boil down to this, and MF-RL / MB-RL is exactly this!


  2. Aug 2020
  3. Local file
    1. rostral ACC activity will predict the probability of switching strategies, whereas caudal ACC activity will predict the probability of staying within a strategy

      Univariate more activity in rostral (higher up the hierarchy) means switching? not perhaps more top-down control to STAY in the current strategy?

    2. important for hierarchical systems that integrateinformation over extended time periods

      In fact it might be the one fundamental reason such hierarchy is useful

    3. ill-equipped to simulate control processes that are inherently dynamic, such as the response delays introduced by switching between tasks

      Yes, or maybe also the continuous tasks as proposed by Hayden group!

    4. by extending previous work that integrated goals into RNNs [38]

      The goal-circuit model is an abstraction of the HRL-RNN model?

    5. [12,144,181]

      Relevant publications explicitly drawing from HRL-RNN theory for ACC

    6. cells in isolation, or univariate indicators of ACC function that average across the activity of entire cell populations

      Distributed patterns will not be picked up - the guiding function of ACC cannot be detected. Or perhaps, only the 'energizing' part of it

    7. caudal ACC and rostral ACC apply control signals that attenuate costs associated with the production of low-level actions

      Or is this in some way similar to a 'gating' mechanism, allowing stable representations for control to be 'updated' or amended to current needs, detected by a 'higher' system

    8. tonic dopamine levels in ACC stabilize the task in working memory

      Something like a hidden state / task set active encoding

    9. ACC damage does not interfere strongly with many of the putative functions that have been attributed to it

      Crucial aspect of the HRL theory: Everything can still happen, the ACC does not EXCLUSIVELY execute all of these functions, it simply strengthens / guides

    10. do large and sudden changes in the state space explain the conflict-likesignals that are commonly observed in ACC?

      So basically it is not necessarily an explicit encoding of error, rather the updating of the current context?

    11. distributedmanner

      What exactly is the meaning and the point of this?

    12. an arsenal of mathematical tools from dynamical systems analysis

      Reference 58 contains examples of nonlinear dynamical systems analysis for neural networks?! :o

    13. bidirectional connectivity between DLPFC and ACC

      Explicit recurrence - Learning to reinforcement learn in activity dynamics?

    14. the model will predict how hierarchical action sequences are representedat different levels of abstraction along the frontal midline

      increasingly abstract goals will be represented along a gradient of the spatial organization of the network - this might have something to do with connectivity between layers? cf. convolutional neural nets

    15. individual regions of ACC process both typesof information

      Cognitive and emotional functions of the ACC can be found in individual regions all throughout the area


    1. multiple regression problem, in which the different coding models were treated as predictor variables to the observed similarity matrix, the analysis enabled joint estimation of multiple coding models

      Super relevant: We can estimate the DEGREE to which different features are encoded on a trial-by-trial basis??
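Sketch of what that regression looks like (toy RDMs and names my own): vectorize the lower triangles of the model RDMs and the observed similarity matrix, then jointly fit the model weights with least squares.

```python
import numpy as np

rng = np.random.default_rng(2)
n_cond = 12

def random_rdm():
    """Toy symmetric model RDM with zero diagonal."""
    m = rng.random((n_cond, n_cond))
    m = (m + m.T) / 2
    np.fill_diagonal(m, 0.0)
    return m

model_a, model_b = random_rdm(), random_rdm()
# Toy "observed" matrix: a known mixture of the two models plus noise.
observed = (2.0 * model_a + 0.5 * model_b
            + rng.normal(scale=0.01, size=(n_cond, n_cond)))

# Vectorize lower triangles (off-diagonal) and regress jointly.
tri = np.tril_indices(n_cond, k=-1)
X = np.column_stack([model_a[tri], model_b[tri], np.ones(tri[0].size)])
y = observed[tri]
betas, *_ = np.linalg.lstsq(X, y, rcond=None)
```

The betas recover the degree to which each coding model contributes - the "joint estimation" the quote refers to, here in its simplest OLS form.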

    2. task-switching study using MVPA

      So the sharpening can also be seen using fMRI - single neuron recordings not necessary!

    3. increase in the gain (i.e., sharpening) of frontoparietal task-set coding

      This is exactly what is seen in the intracranial recording study Ebitz et al!

    4. by using a between-subjects RSA approach, the analysis was not optimized to capture finer-grained representational structure that could be subject-specific

      Of course between people the geometry can take on a distinct form, but even if the encoded information is the same? Shouldn't RSA abstract over that?

    5. No task-general control representations were detected

      So per task it was very uniquely constructed - not one level for WM, inhibition, etc.

    6. separately, across different sub-domains of negative affect

      Orthogonal but consistent representation!

    7. This exclusive focus on behavioral measures may be suboptimal for construct validation, as brain activity measures can provide more proximal, higher-dimensional readouts of the neural mechanisms of interest

      Is it expected that even though we cannot find common patterns in behaviour, we can still find common patterns in the brain? I thought even within the same task over different sessions, the control representation might differ...

    8. A construct validation approach is often used to address this issue.

      This is basically how IQ and the sub-measures of it are defined / measured

    9. Also relevant is the insight that RSA can be conducted in a time-dependent manner within fMRI, such that trials form the dimensions of the similarity matrices

      How about sub-trials (e.g. only the picking of sugar)?

    10. strength of conjunctive coding was robustly related to trial-by-trial response time

      So again - a scalar value strength type signal?

    11. For example, interference occurs when a goal-relevant task set and an irrelevant yet prepotent set are simultaneously active.

      Is this really a more elaborate model? Wouldn't the 'interference' simply move any RSA model closer to the midline between the color / word naming task sets? We won't have very clear access to neural firing strengths or anything, if that would be relevant.

    12. one-dimensional structure of the model

      The model only investigated the representational strength along the face-house attentional dimension (not an interesting representational geometry?)

    13. specifying and comparing representational models is more flexible within RSA

      So the benefit is that we get more insight into the geometry of the representation, not just its presence / decodability? Compare different computational models.

    14. classification-based decoding, which we simply refer to here as “classification”, and RSA

      Distributed patterns can be subdivided like this: decoding and encoding models

    15. type and form of informationencoded in LPFC and associated regions of the FPN and CO

      So there is relevant information encoded in task set control reps: but there will probably still be a relevant 'intensity' summarizable in a scalar value?

    16. independent of particular stimuli, responses, or other task information

      Abstract control related factors such as 'congruency' abstract control signals away from directly related stimulus signals

    17. highly abstracted, one-dimensional factors

      This is the main issue right - Setting up experiments to identify one 'factor' of cognitive control, e.g. 'Congruency' in stroop tasks. It is much more complex multidimensional than that.


    1. inferred in the current experiment because the future value of a patch is, by design, different from its past value

      dACC continuously learning a variable - the slope specifically? Or is it 'simply' trying to predict the prediction errors of lower layers. It seems like the latter would generalize less!

    2. The opposing time-linked signals observed do not suggest that dACC and the other regions integrate rewards to a simple mean estimate (as RL-simple would), but instead point towards a comparison of recent and past reward rates necessary for the computation of reward trends.

      But this was based on full-region regression weights over different time bins, computed from a particular choice moment. How can you determine separate representations from whole-area betas, with recent rewards increasing and past rewards decreasing the activity of the same whole region simultaneously?

    3. which was updated on every time step using a higher-order PE, PE* (that is, the difference of observed PE and expected PE)

      So there is some explicit hierarchy in the prediction errors - but is that really what goes on, or is there an explicit estimation of a trend, not necessarily based on the prediction errors themselves? It is mathematically identical!

    4. it is also possible that PEs may be used as a decision variable to guide decisions

      The ability of PEs to directly influence decision making, and not just learning, goes above and beyond simple-RL


    1. this relationship is heterogeneous; of these 58 neurons, 31.03% (n = 18/58) showed a positive slope and 18.97% (n = 11/58) showed a negative slope

      Distance to prey is an important variable, and it is the actual code for time of impending reward - but it is not encoded by an overall rise in activity (typical fMRI analysis assumption!) It is encoded distributed over the neurons (perhaps RSA, but could still mask it if very single-neuron heavy?)

    2. The encoding of the control-relevant variable becomes higher when the expected reward for controlling becomes higher!


  4. Jul 2020
    1. Finally, some research in deep RL proposes to tackle exploration by sampling randomly in the space of hierarchical behaviors (Machado et al., 2017; Jinnai et al., 2020; Hansen et al., 2020). This induces a form of directed, temporally extended, random exploration reminiscent of some animal foraging models (Viswanathan et al., 1999).

      Sampling from hierarchical behaviours?

    2. An example is prediction learning, in which the agent is trained to predict, on the basis of its current situation, what it will observe at future time steps (Wayne et al., 2018; Gelada et al., 2019).

      This might be what can get confused for successor representation signal from dACC?

    3. Song et al. (2017) trained a recurrent deep RL model on a series of reward-based decision making tasks that have been studied in the neuroscience literature, reporting close correspondences between the activation patterns observed in the network's internal units and neurons in dorsolateral prefrontal, orbitofrontal, and parietal cortices (Figure 2C).

      Relevant stuff!


    1. An important future goal is to create multiscale neural computational models that better predict more complex real world behaviors

      Is this something that we induce automatically with recurrence?

    2. participants made decisionsbased on the instantaneous reward rate and the reward rate trend

      Trend is captured at a higher level representation?

    3. conflict between short-term (safe options) and long-term (risky options) was mediated by the dorsal anterior cingulate cortex (dACC)

      dACC signals their conflict - does it do so by strengthening the representations of the one it favours?

    4. superimposition of computations at the shorter time scale (a trial) and the longer time scale (a block of trials)

      now, participants have to continuously evaluate which task they should be engaged in (which computations to perform?)

    5. human memoryforaging

      Memory retrieval / search = foraging?

    6. one simple decision: when to leave a patch

      The key inference to make in the Marginal Value Theorem
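The MVT leave rule, sketched on a toy exponentially depleting patch (all parameters mine): leave when staying longer no longer improves the overall reward rate; longer travel times between patches predict longer stays.

```python
import numpy as np

def optimal_leave_time(r0, decay, travel_time, dt=0.01, horizon=50.0):
    """Patch yields rate r0 * exp(-decay * t); return the stay duration
    that maximizes overall rate gain(t) / (t + travel_time)."""
    t = np.arange(dt, horizon, dt)
    gain = (r0 / decay) * (1 - np.exp(-decay * t))   # cumulative intake
    overall_rate = gain / (t + travel_time)
    return t[np.argmax(overall_rate)]

# MVT's classic prediction: costlier travel -> stay longer in each patch.
short_travel = optimal_leave_time(1.0, 0.2, travel_time=1.0)
long_travel = optimal_leave_time(1.0, 0.2, travel_time=10.0)
```

At the optimum, the instantaneous patch rate equals the long-run average rate of the environment - the "when to leave" decision in one scalar comparison.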

    7. One research area that examined decisions of multiple time scales is foraging theory

      Foraging == The study of the tradeoff between exploiting the current local habit / inertia VS finding a different niche?

    8. miss the crucial resources that could be available if the animal maintains larger-scale computations about the wider environment

      This is basically an explore-exploit tradeoff, but now not over one uniform action space but explicitly emphasizing local//global environment?

    9. summarized in the theories of hierarchical reinforcement learning

      !! Oh yeah baby

    10. shifts of anterior to posterior brain areas

      General organizational principle: The more something is habit-formed, the more posterior it shifts? The more control it requires, the more frontal it is?

    11. habit formation occurs when repetitive computations are streamlined

      Habits automatize local goals so agent can allocate processing to more complex global goals?

    12. simplest form of multiscale processing,but it is ubiquitous.

      Simple bias in favour of repeating previously rewarded action = simple operant conditioning?

    13. stable or slowly changing environments

      Requires less to no flexibility

    14. effectively modulatebetween local tasks while also considering multiple global goals and contextual factors

      Comes together nicely with a view on RL as central controller for 'homeostasis', or something like that. Would require highly hierarchical system of goals and representations.

    15. It has been historically assumed that the information processing in each trial is independent from information processing from other trials, and that once one trial completes, all the information processing is reset

      No inter-trial dependencies - but of course there are processing benefits / interferences and maybe trial-by-trial updates of response caution known as speed-accuracy tradeoff

    16. many experimental paradigms focus on short spatial or temporal scales.

      Also a problem for 'real' HRL or meta-learning?

    17. area-restrictedsearch

      Foraging strategy: Limit your attention to locations that were previously rewarding (?)

    18. An overarching analogy is foraging

      Foraging requires dynamic allocation and weighting of attention and evidence between multiple sources

    19. the degree to which working memory is considered

      An A. Collins & M. J. Frank paper: How much RL is actually WM?

    20. The requirement to integrate information over spatial and temporal scales in a wide variety of environments would seem to be a common feature underlying intelligent systems, and one whose performance has a profound impact on behavior [16–21]

      Exactly: Generalization over representations and over temporal grain of behaviour

    21. much of the progress made in the latest“AI spring” are, as we describe below, achievements of multiscale processing.

      Generalization and broader tasks are the hallmark of the current success of AI


  5. Jun 2020
    1. Cells that report choice independently of task should lie on the diagonal (i.e., an angle of π/4). Instead, the distribution of angles was significantly bimodal across all cells

      So the MFC has different cells specialized for different tasks (familiarity vs categorization)? Disappointing - hoped for generalized / remapping.

    2. Choice decoding in the MFC was strongest shortly after stimulus onset, well before the response was made

      So it is not the actual execution of an action which is the information that is picked up - it is the direction the contextual state-space representation heads towards?

    3. Decoding accuracy for choices was highest in the MFC

      MFC is more task-relevant for response selection

    4. In contrast, in the MFC (Fig. 3E, right), the relative positions of the four conditions were not preserved.

      MFC seems to rely on different representations. Familiarity-category pairs are not preserved over tasks, hence not really represented fundamentally. In the HPC it seems the determining factor (for Dim 1).

    5. In the HA, the ability to decode category was not significantly different between the two tasks

      HPC / Amyg encode stimulus category in memory & re-represent, regardless of task context. Part of stimulus representation in general.

    6. In the MFC, decoding accuracy for image category was significantly higher in the memory task

      MFC encodes task variable = stimulus category only during relevant context / task-set

    7. MS cell responses reflected a memory process: they strengthened over blocks as memories became stronger

      Identified memory-selective cells fire more and more strongly with repeated presentation of a stimulus & distinguish false from true negatives!

    8. We first trained a decoder to discriminate task type on trials where the subject was instructed to reply with a button press, and then we tested the performance of this decoder on trials where the subject was instructed to use saccades

      Decoder should generalize task classification across response modalities - does so in MFC (dACC and SMA)
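      The cross-modality test quoted above can be sketched in a few lines: fit a linear task-type decoder on button-press trials only, then score it on saccade trials. Everything below is simulated toy data with assumed variable names, not the authors' code or recordings.

      ```python
      import numpy as np

      rng = np.random.default_rng(0)
      n_neurons, n_trials = 50, 200

      # Simulated firing rates: task type shifts a fixed subset of neurons,
      # independently of response modality (the MFC-like case).
      task = rng.integers(0, 2, n_trials)       # 0 = memory, 1 = categorization
      modality = rng.integers(0, 2, n_trials)   # 0 = button press, 1 = saccade
      task_axis = rng.normal(size=n_neurons)
      X = rng.normal(size=(n_trials, n_neurons)) + np.outer(task, task_axis)

      # Minimal least-squares linear decoder (targets coded as ±1).
      def fit_decoder(X, y):
          Xb = np.column_stack([X, np.ones(len(X))])
          w, *_ = np.linalg.lstsq(Xb, 2 * y - 1, rcond=None)
          return w

      def predict(w, X):
          Xb = np.column_stack([X, np.ones(len(X))])
          return (Xb @ w > 0).astype(int)

      train = modality == 0   # train on button-press trials only
      test = modality == 1    # test on held-out saccade trials
      w = fit_decoder(X[train], task[train])
      acc = (predict(w, X[test]) == task[test]).mean()
      print(f"cross-modality decoding accuracy: {acc:.2f}")
      ```

      High accuracy here only means the simulated task signal is modality-invariant by construction; the point is the train/test split across modalities, which is what makes the decoder generalization claim testable.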

    9. Cells showed significant modulation of their firing rate during the baseline period as a function of task type

      So already from instruction there is a reconfiguration - very complicated mapping from linguistic input to representations for decision making...

    10. Subjects indicated choices using either saccades (leftward or rightward eye movement) or button press while maintaining fixation at the center of the screen

      Different response modalities allow for disambiguation of coding towards a very specific execution - if the encoding is similar, it's really more cognitive/central!

    11. We found that neuronal populations within the MFC formed two separate decision axes

      So movement through state space in a unique but intra-task consistent direction?

    12. MFC [dorsal anterior cingulate cortex (dACC

      MFC = dACC = MCC

    13. insensitive to response modality

      So it's really about the abstract task demands, not the concrete action output

    14. The strength and geometry of representations of familiarity were task-insensitive in the HA but not in the MFC

      This is what creates the 'shadowing' pattern?

    15. whether an image was novel or familiar, or whether an image belonged to a given visual category

      recognition memory vs categorization: yes or no responses in both cases. Both use 'pictures', so stimuli can be the same across tasks

    16. phase-locking of MFC activity to oscillations in the HA

      HPC memory representations and dACC task set representations?


    1. pattern of conflict modulation during one correct response is orthogonal to the pattern during another correct response

      i.e. it is not a 'general boosting' effect -> only on average the activity of neurons can still increase, but it is all about upregulating the relevant neurons for this correct response

    2. higher when Ericksen conflict was present (Figure 2A)

      Yeah, in single neurons you can show the detection of general conflict this way, and it was not partitionable into different responses...

    3. representational geometry

      nice wording similar to RSA

    4. with Ericksen conflict than it was for trials without Ericksen

      what about simon?

      This does mean: Conflict increases representation shifting response toward correct action!

    5. AUC

      This axis has more predictive power when there is conflict than when there is no conflict (task is already so easy that the information is not needed, or at least a lot less?)

    6. G)

      Very clear effect! Suspicious? How exactly did they even select the pseudo-populations? It's not exactly clear to me from the methods.

    7. amplification hypothesis, conversely, does not predict a unified conflict detection axis in the population. Instead, it makes a prediction that is exactly contrary to the epiphenomenal view: that conflict should shift population activity along task-variable coding dimensions, but in the opposite direction. That is, conflict is predicted to amplify task-relevant neural responses

      conflict means more control will be exerted. Heavier representation of whatever info it is that dACC encodes that 'pushes' for the correct action. This function of dACC would be in line with the context layer!?
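      The two population-level predictions contrasted above reduce to the sign of a projection: score each trial along the coding dimension for the correct response and compare conflict vs no-conflict trials. The sketch below simulates the amplification scenario under assumed gains (1.0 vs 1.5); all names and numbers are illustrative, not from the paper.

      ```python
      import numpy as np

      rng = np.random.default_rng(3)
      n_neurons = 40

      # Unit-norm coding dimension for one correct response.
      coding_dim = rng.normal(size=n_neurons)
      coding_dim /= np.linalg.norm(coding_dim)

      base = rng.normal(size=(200, n_neurons))

      # Amplification scenario: conflict trials carry a stronger signal
      # along the coding dimension (assumed gain 1.5 vs 1.0).
      no_conflict = base[:100] + 1.0 * coding_dim
      conflict = base[100:] + 1.5 * coding_dim

      proj_nc = no_conflict @ coding_dim
      proj_c = conflict @ coding_dim

      # Amplification predicts a positive shift; the epiphenomenal view
      # predicts a negative one.
      shift = proj_c.mean() - proj_nc.mean()
      print(f"mean shift along coding dimension under conflict: {shift:.2f}")
      ```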

    8. At the population level, then, the epiphenomenon hypothesis predicts that conflict should decrease the amount of information about the correct response and shift neuronal population activity down along the axis in firing rate space that encodes this response

      Because a smaller share of the neurons 'fighting' for the correct response are active, at least in total.

    9. Neurons that were tuned for a specific correct response were often tuned to prefer the same Simon/Ericksen distractor response

      DLPFC is tuned to action-outcomes? -> in single neurons!

    10. In fact, the majority of conflict-sensitive dACC neurons were not selective for either correct response or distractor responses (66.7%

      So the conflict is represented separately, not having much to do with action-outcomes.

    11. did still signal either Ericksen or Simon conflict

      Simply the C-term in the ANOVA which is a binary coder for the general presence? Would also have more trials where its parameter is influential, does that influence estimation?

    12. neurons did not encode the distractor response

      So on trials with a unique distractor response, that action-outcome was not represented at all? It's interesting but then where does the actual conflict take place?

    13. significant proportion of neurons were selective for the correct response

      So desired action-outcome is represented. I think that was already known about dACC.

    14. separate pools of neurons corresponding to the two conflicting actions, and that conflict increases activity because it uniquely activates both pools

      more neurons activate for the different possible action outcomes = more activity overall --> conflict signal. Makes sense.

    15. Furthermore, the population of cells whose responses were significantly affected by Eriksen conflict was almost entirely non-overlapping with the population significantly affected by Simon conflict (specifically, only one cell was significantly modulated by both)

      Really separate representations for different aspects of the current task-set?

    16. additive model was a better fit to the data than other, more flexible models

      So separate statistical significance testing shows effect for Eriksen, not for Simon, but regression model shows through model comparison that it's best to ascribe to them the same effect...

    17. (n=15/145) neurons had significantly different firing rates between Simon and no-conflict trials

      No significant main effect but more single cells had a significant effect...? -> also directionality is not all positive, some positive some negative

    18. A small number of individual neurons also had different activity levels on Eriksen conflict and no conflict trials (8.2%, n=12/145 neurons, within-cell t-test)

      Note the difference between 'averaged over all neurons' (first report) and 'within one specific neuron' (this report)

    19. activity was higher on Ericksen conflict trials than on no conflict trials

      for Eriksen flankers there is a main effect of conflict (vs no-conflict). Simon was not statistically significant. Was it mainly a power issue?

    20. Ericksen

      So is it Ericksen or Eriksen??

    21. 12 task conditions

      Here they acknowledge 12 task conditions, not 9.

    22. Within each task condition (combination of correct response and distractor response), firing rates from separately recorded neurons were randomly drawn with replacement to create a pseudotrial firing rate vector for that task condition, with each entry corresponding to the activity of one neuron in that condition

      Definition of pseudotrial
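      The pseudotrial construction quoted above is easy to make concrete: within one condition, resample each neuron's rate independently (with replacement) from that neuron's own trials, since the neurons were recorded in separate sessions. A minimal sketch with simulated rates and assumed names:

      ```python
      import numpy as np

      rng = np.random.default_rng(1)

      # rates[i] -> 1-D array of firing rates recorded for neuron i in ONE
      # task condition; trial counts differ because recordings are separate.
      rates = [rng.poisson(lam=5 + i, size=rng.integers(20, 40))
               for i in range(10)]

      def make_pseudotrial(rates, rng):
          """One pseudotrial vector: one resampled rate per neuron."""
          return np.array([rng.choice(r) for r in rates])

      # Stack many pseudotrials into a trials-by-neurons pseudopopulation
      # matrix (the X used for the decoding analyses).
      X = np.stack([make_pseudotrial(rates, rng) for _ in range(100)])
      print(X.shape)  # (100, 10)
      ```

      The resampling destroys any real trial-by-trial noise correlations between neurons, which is the usual caveat of pseudopopulation analyses.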

    23. pseudotrial vector x

      one trial for all different neurons in the current pseudopopulation matrix?

    24. The separating hyperplane for each choice i is the vector (a) that satisfies: βi · a = 0. Meaning that βi is a vector orthogonal to the separating hyperplane in neuron-dimensional space, along which position is proportional to the log odds of that correct response: this is the coding dimension for that correct response

      Makes sense: If Beta is proportional to the log-odds of a correct response, a is the hyperplane that provides the best cutoff, which must be orthogonal. Multiplying two orthogonal vectors yields 0.
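      The orthogonality noted above can be checked numerically: for a linear decoder with weights beta and intercept b0, the log-odds at x is beta @ x + b0, so a step along beta changes the log-odds by t * ||beta||², while a step along any direction a with beta @ a == 0 (i.e. within the hyperplane) leaves it unchanged. Toy numbers only, not the paper's fitted weights:

      ```python
      import numpy as np

      beta = np.array([2.0, -1.0, 0.5])  # assumed decoder weights
      b0 = 0.3                           # assumed intercept

      def log_odds(x):
          return beta @ x + b0

      x = np.array([1.0, 0.0, -2.0])

      # Step along the coding dimension: log-odds changes linearly.
      t = 0.7
      delta_along = log_odds(x + t * beta) - log_odds(x)
      assert np.isclose(delta_along, t * (beta @ beta))

      # Step along a hyperplane direction (orthogonal to beta): no change.
      a = np.array([1.0, 2.0, 0.0])      # beta @ a = 2 - 2 + 0 = 0
      assert np.isclose(log_odds(x + a), log_odds(x))
      print("orthogonal step leaves log-odds unchanged")
      ```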

    25. X is the trials by neurons pseudopopulation matrix of firing rates

      So these pseudopopulations were random agglomerates of single neurons that were recorded, so many fits for random groups, and the best were kept?

    26. re-representing high-dimensional neural activity in a small number of dimensions that correspond to variables of interest in the data

      Essentially this is kind of like constructing dissimilarity matrices over large groups of voxels?

    27. 4917.0 (1) 5826.5 (1)*

      Additive model is the winner in single cell firing rates -> coding simply for the notion of conflict? cf. the population coding from dimensionality reduction!

    28. Subtracting this expectation from the observed pattern of activity left the residual activity that could not be explained by the linear co-activation of task and distractor conditions

      So this is what to analyze: If this still covaries with conflict in some way it means we go beyond epiphenomenal?

    29. Within each neuron, we calculated the expected firing rate for each task condition, marginalizing over distractors, and for each distractor, marginalizing over tasks.

      Distractor = specific stimulus / location (e.g. '1' or 'left')?

      Task = conflict condition (e.g. Simon or Ericksen)?
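      The marginalization-plus-residual step described in the two quotes above has a standard additive form: expected cell value = row marginal + column marginal - grand mean, and the residual is whatever that cannot explain. Whether this exact formula matches the paper's computation is an assumption; the sketch just shows the mechanics on a toy condition matrix:

      ```python
      import numpy as np

      rng = np.random.default_rng(2)

      # observed[i, j]: one neuron's mean rate for correct-response i and
      # distractor j (3 x 3 toy matrix of condition averages).
      observed = rng.normal(10, 2, size=(3, 3))

      row_means = observed.mean(axis=1, keepdims=True)  # marginal over distractors
      col_means = observed.mean(axis=0, keepdims=True)  # marginal over responses
      grand = observed.mean()

      # Additive expectation and the residual it leaves behind.
      expected = row_means + col_means - grand
      residual = observed - expected

      # By construction the residual has no additive structure left:
      # its row and column means are ~zero.
      print(np.allclose(residual.mean(axis=0), 0),
            np.allclose(residual.mean(axis=1), 0))
      ```

      Anything in the residual that still covaries with conflict is then, as the note above puts it, evidence beyond the epiphenomenal (purely additive) account.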

    30. condition-averaged within neurons (9 data points per neuron, reflecting all combinations of the 3 correct responses, 3 Ericksen distractors, and 3 Simon distractors)

      How do all combinations of 3 responses lead to only 9 data points per neuron? 3x2x2 = 12.