- Jul 2022
-
ieeexplore.ieee.org
-
A recent overview of RL methods used for autonomous driving.
-
- Jun 2022
-
assets.pubpub.org
-
Discussion on
Bellinger C, Drozdyuk A, Crowley M, Tamblyn I. Balancing Information with Observation Costs in Deep Reinforcement Learning. Proceedings of the Canadian Conference on Artificial Intelligence [Internet]. 2022 May 27; Available from: https://caiac.pubpub.org/pub/0jmy7gpd
-
- May 2022
-
www.ncbi.nlm.nih.gov
-
Another piece to the "what can we do with eligibility traces" puzzle for Deep RL.
-
-
arxiv.org
-
Question: What happened to Eligibility Traces in the Deep RL era? This paper highlights some of the reasons they are not used widely and proposes a way they could still be effective.
-
-
arxiv.org
-
Hypothesis page for discussing this high-level description of DeepMind's new Gato framework.
-
- Mar 2022
-
arxiv.org
-
The paper that introduced the MineRL challenge dataset.
-
- Jan 2022
-
www.grandin.com
-
reinforcement
"Reinforcement means to the act of reinforcing."
-
- Jul 2021
-
psyarxiv.com
-
Palminteri, S. (2021). Choice-confirmation bias and gradual perseveration in human reinforcement learning [Preprint]. PsyArXiv. https://doi.org/10.31234/osf.io/dpqj6
-
- Jun 2021
-
-
Chadi, M.-A., & Mousannif, H. (2021). Reinforcement Learning Based Decision Support Tool For Epidemic Control [Preprint]. PsyArXiv. https://doi.org/10.31234/osf.io/tcr8s
-
- Mar 2021
-
www.opendemocracy.net
-
Using chemicals to improve our economy of attention and become emotionally "fitter" is an option that penetrated public consciousness some time ago.
The same is true of reinforcement learning algorithms.
-
- Sep 2020
-
-
Ozaita, J., Baronchelli, A., & Sánchez, A. (2020). The emergence of segregation: From observable markers to group specific norms. ArXiv:2009.05354 [Physics, q-Bio]. http://arxiv.org/abs/2009.05354
-
-
journals.sagepub.com
-
Ludwig, V. U., Brown, K. W., & Brewer, J. A. (2020). Self-Regulation Without Force: Can Awareness Leverage Reward to Drive Behavior Change? Perspectives on Psychological Science, 1745691620931460. https://doi.org/10.1177/1745691620931460
-
- Jul 2020
-
-
Harvey, A., Armstrong, C. C., Callaway, C. A., Gumport, N. B., & Gasperetti, C. E. (2020). COVID-19 Prevention via the Science of Habit Formation [Preprint]. PsyArXiv. https://doi.org/10.31234/osf.io/57jyg
-
- May 2020
-
-
Radulescu, A., Holmes, K., & Niv, Y. (2020). On the convergent validity of risk sensitivity measures [Preprint]. PsyArXiv. https://doi.org/10.31234/osf.io/qdhx4
-
-
-
psyarxiv.com
-
Hertz, U. (2020). Cognitive learning processes account for asymmetries in adaptations to new social norms [Preprint]. PsyArXiv. https://doi.org/10.31234/osf.io/7thku
-
-
-
Liu, L., Wang, X., Tang, S., & Zheng, Z. (2020). Complex social contagion induces bistability on multiplex networks. ArXiv:2005.00664 [Physics]. http://arxiv.org/abs/2005.00664
-
- Apr 2020
-
-
Ting, C., Palminteri, S., Lebreton, M., & Engelmann, J. B. (2020, March 25). The elusive effects of incidental anxiety on reinforcement-learning. https://doi.org/10.31234/osf.io/7d4tc
-
- Mar 2019
-
cjc.ict.ac.cn
-
A Survey of Deep Reinforcement Learning
-
-
github.com
-
reinforcement-learning code and paper tutorials
-
- Feb 2019
-
gitee.com
-
We present MILABOT: a deep reinforcement learning chatbot developed by the Montreal Institute for Learning Algorithms (MILA) for the Amazon Alexa Prize competition. MILABOT is capable of conversing with humans on popular small talk topics through both speech and text. The system consists of an ensemble of natural language generation and retrieval models, including template-based models, bag-of-words models, sequence-to-sequence neural network and latent variable neural network models. By applying reinforcement learning to crowdsourced data and real-world user interactions, the system has been trained to select an appropriate response from the models in its ensemble. The system has been evaluated through A/B testing with real-world users, where it performed significantly better than many competing systems. Due to its machine learning architecture, the system is likely to improve with additional data.
-
- Jul 2016
-
thesocialwrite.com
-
Think of all the hard work and the sweat you put into the things that you're proudest of.
Always feels good to say, "I worked out today!"
-