29 Matching Annotations
  1. Nov 2023
  2. Sep 2023
    1. 07:00 focus on reward, not process (summit syndrome), “is suffering going to pay off” (see zk fixation on results) “living life in expectation of a better future is a game of suffering for an outcome or avoiding it” (10:00)

  3. Feb 2023
    1. Definition 3.2 (simple reward machine).

      The MDP does not change; its dynamics are the same with or without the RM, just as they are with or without a standard reward model. Additionally, the rewards from the RM can be non-Markovian with respect to the MDP because the RM inherently has a kind of memory of where you've been, limited to the agent's "movement" (almost "in its mind") along the goals for this task.
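
A minimal sketch of the point above, assuming a simplified reward machine in which each step emits a single string label (rather than a full truth assignment over propositions) and using a hypothetical "coffee, then office" task. The class and field names are illustrative, not taken from the paper. It shows the same label, "office", paying differently depending on what came before, which is what makes the reward non-Markovian with respect to the MDP state alone.

```python
from dataclasses import dataclass


@dataclass
class SimpleRewardMachine:
    """Hypothetical minimal RM: transitions and rewards keyed by (state, label)."""
    initial: str
    terminal: frozenset
    delta_u: dict  # (rm_state, label) -> next rm_state
    delta_r: dict  # (rm_state, label) -> reward

    def step(self, u, label):
        # Assumption: unlisted (state, label) pairs self-loop with zero reward.
        return self.delta_u.get((u, label), u), self.delta_r.get((u, label), 0.0)


# Illustrative task: pick up coffee first, then reach the office.
rm = SimpleRewardMachine(
    initial="u0",
    terminal=frozenset({"done"}),
    delta_u={("u0", "coffee"): "u1", ("u1", "office"): "done"},
    delta_r={("u1", "office"): 1.0},
)


def run(labels):
    """Feed a sequence of labels through the RM and collect the rewards."""
    u, rewards = rm.initial, []
    for label in labels:
        if u in rm.terminal:
            break
        u, r = rm.step(u, label)
        rewards.append(r)
    return rewards


with_coffee = run(["coffee", "office"])  # office is rewarded after coffee
without_coffee = run(["office"])         # same label, no reward without coffee
```

The environment is never consulted here: the RM only tracks progress through the task, so the MDP's dynamics are untouched.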

    2. We then show that an RM can be interpreted as specifying a single reward function over a larger state space, and consider types of reward functions that can be expressed using RMs

      So by specifying a reward machine you are augmenting the state space of the MDP with higher level goals/subgoals/concepts that provide structure about what is good and what isn't.
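
One way to picture the "larger state space" is as pairs (s, u) of an environment state and an RM state. The sketch below assumes a toy four-cell corridor, a hypothetical labelling of two cells, and dict-based transition tables; none of these names come from the paper. Over the pairs, the reward depends only on the current pair and action.

```python
# RM as plain dicts (hypothetical task: coffee first, then office).
DELTA_U = {("u0", "coffee"): "u1", ("u1", "office"): "done"}
DELTA_R = {("u1", "office"): 1.0}

# Toy corridor MDP with positions 0..3; the labelling marks two cells.
LABELS = {1: "coffee", 3: "office"}


def env_step(s, a):
    """MDP dynamics, unchanged by the RM. The action a is -1 or +1."""
    return max(0, min(3, s + a))


def product_step(state, a):
    """One step in the cross-product whose states are (s, u) pairs."""
    s, u = state
    s_next = env_step(s, a)
    label = LABELS.get(s_next)  # None when the cell carries no label
    # Assumption: missing (u, label) entries self-loop with zero reward.
    u_next = DELTA_U.get((u, label), u)
    reward = DELTA_R.get((u, label), 0.0)
    return (s_next, u_next), reward


state = (0, "u0")
trajectory = []
for a in (+1, +1, +1):
    state, r = product_step(state, a)
    trajectory.append((state, r))
```

Stepping right from position 2 yields reward 1.0 from `(2, "u1")` but 0.0 from `(2, "u0")`, so the subgoal structure lives entirely in the `u` component of the augmented state.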

    3. However, an agent that had access to the specification of the reward function might be able to use such information to learn optimal policies faster.

      Fascinating idea, why not? Why are we hiding the reward from the agent really?

    4. Reward Machines: Exploiting Reward Function Structure in Reinforcement Learning

      [Icarte, JAIR, 2022] "Reward Machines: Exploiting Reward Function Structure in Reinforcement Learning"

    1. Using Reward Machines for High-Level Task Specification and Decomposition in Reinforcement Learning

      [Icarte, PMLR, 2018] "Using Reward Machines for High-Level Task Specification and Decomposition in Reinforcement Learning"

  4. Dec 2022
    1. just wanted to have an overview of these categories to get people thinking and doing in this level. And the challenge of course is the cornucopias and the Vikings are distracting us from what really needs to be done. And so this whole conversation, we're thinking two or three steps ahead from something that 00:51:27 our culture is not giving us the status, reward, and emotional signals of yet.

      !- good point : rewards for Arcadians not yet in place - Nate makes a good point. The system design thinking required, the futures thinking now required is not being rewarded by the current system because its value is so far not recognized. Arcadians are on the bleeding edge and must be a tough and resilient bunch with autonomy to recognize that it will be an uphill battle

  5. Aug 2022
  6. Jul 2022
  7. Mar 2022
  8. Jan 2022
  9. Nov 2021
    1. The dopamine reward system has also been shown to be stimulated by most drugs of abuse and plays an important role in addiction [33]. An important question is whether jhana meditators are subject to addiction and tolerance effects that can result from stimulation of the dopamine reward system.

      The question of potential addiction to self-induced states that activate the dopamine (and/or other neurochemical) reward system(s) is important. From a more philosophical angle, should we welcome beneficial addictions that, if cultivated, might significantly improve individual and group quality of life? Isn't this related to our high regard for replacing detrimental habits with positive ones? Habit formation and maintenance also depend on activation of neural reward systems (see Nir Eyal's book, Hooked).

    2. We report the first neural recording during ecstatic meditations called jhanas and test whether a brain reward system plays a role in the joy reported. Jhanas are Altered States of Consciousness (ASC) that imply major brain changes based on subjective reports: (1) external awareness dims, (2) internal verbalizations fade, (3) the sense of personal boundaries is altered, (4) attention is highly focused on the object of meditation, and (5) joy increases to high levels. The fMRI and EEG results from an experienced meditator show changes in brain activity in 11 regions shown to be associated with the subjective reports, and these changes occur promptly after jhana is entered. In particular, the extreme joy is associated not only with activation of cortical processes but also with activation of the nucleus accumbens (NAc) in the dopamine/opioid reward system. We test three mechanisms by which the subject might stimulate his own reward system by external means and reject all three. Taken together, these results demonstrate an apparently novel method of self-stimulating a brain reward system using only internal mental processes in a highly trained subject.

      I can find no other research on this particular matter. It would be helpful to have other studies to validate or invalidate this one. This method of reward requires a highly-trained participant and involves no external means.

  10. Sep 2021
    1. Investing, in simplest terms, is taking one finite resource and trying to allocate it to maximize for an ideal outcome. Whether you’re allocating money, time, energy, or attention. Everyone is an allocator of something. Investing is an opportunity to evaluate what you believe. To gain conviction. And then to act on that conviction.

      Trying to hit bullseye, getting the grand reward. Using the information at hand to act on what's best.

  11. May 2021
  12. Apr 2021
  13. Oct 2020
    1. If a behavior is insufficient in any of the four stages, it will not become a habit. Eliminate the cue and your habit will never start. Reduce the craving and you won’t experience enough motivation to act. Make the behavior difficult and you won’t be able to do it. And if the reward fails to satisfy your desire, then you’ll have no reason to do it again in the future. Without the first three steps, a behavior will not occur. Without all four, a behavior will not be repeated.
    2. Second, rewards teach us which actions are worth remembering in the future. Your brain is a reward detector
    3. The first purpose of rewards is to satisfy your craving
  14. Sep 2020
  15. Jul 2020
  16. Jun 2020
  17. May 2020
  18. Apr 2020
    1. “Even if experts are saying it’s really not going to make a difference, a little [part of] people’s brains is thinking, well, it’s not going to hurt. Maybe it’ll cut my risk just a little bit, so it’s worth it to wear a mask,” she says.
  19. Jan 2020
    1. Look over your list. Do they contain words like published, awarded, graduated, built, founded or created? Or do they contain mostly adjectives like nice, caring, loving, honest and smart? If you’re in the first sentence it’s likely you’re an SC. If the majority of your responses are in the second sentence you are likely an RC.

      The difference is whether you list egocentric traits (I'm impressive, I feel better than others, I feel worthy in myself) or things that influence the surrounding world (I do social work to help refugees, I published a theory to improve the current state of philosophy, I completed a project or a school, I created something that now generates some kind of value).

      The Replication Creators are creative just for themselves, so they get just short-term rewards.

      The Skilled Creators are creative in order to share with others, so they get long-term rewards.

  20. Feb 2014
    1. Intellectual property is far more egalitarian. Of limited duration and obtainable by anyone, intellectual property can be seen as a reward, an empowering instrument, for the talented upstarts Burke sought to restrain. Intellectual property is often the propertization of what we call "talent." It tends to shift the balance toward the talented newcomers whom Burke mistrusted

      intellectual property is often the propertization of what we call talent.

    1. MINTURN, J. The plaintiff occupied the position of a special police officer, in Atlantic City, and incidentally was identified with the work of the prosecutor of the pleas of the county. He possessed knowledge concerning the theft of certain diamonds and jewelry from the possession of the defendant, who had advertised a reward for the recovery of the property. In this situation he claims to have entered into a verbal contract with defendant, whereby she agreed to pay him $500 if he could procure for her the names and addresses of the thieves. As a result of his mediation with the police authorities the diamonds and jewelry were recovered, and plaintiff brought this suit to recover the promised reward.
      • Plaintiff makes a verbal contract with defendant. In return for $500, plaintiff will procure the names and addresses of the thieves who stole defendant's jewels.
      • Plaintiff had knowledge of whereabouts of jewels at contract formation.
      • Plaintiff is a special police officer and has dealings with prosecutor's office.
      • Defendant published advertisement for reward.
      • Plaintiff finds stolen goods and arranges return.