Neuroscience and Reinforcement Learning
#neuroscience #reinforcement_learning
https://www.princeton.edu/~yael/ICMLTutorial.pdf
Deep Learning and Computational Neuroscience
#neuroscience #reinforcement_learning
https://link.springer.com/article/10.1007/s12021-018-9360-6
Solving Rubik’s Cube with a Robot Hand
This is fascinating, make sure you read it.
Summary: The OpenAI team trained a pair of neural networks to solve the Rubik’s Cube with a human-like robot hand. The networks are trained entirely in simulation, using the same reinforcement learning code as OpenAI Five paired with a new technique called Automatic Domain Randomization (ADR). The system can handle situations it never saw during training, such as being prodded by a stuffed giraffe. This shows that reinforcement learning isn’t just a tool for virtual tasks; it can solve physical-world problems requiring unprecedented dexterity.
https://openai.com/blog/solving-rubiks-cube/
#reinforcement_learning #machine_learning #robotics
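The post describes ADR only at a high level: training starts with no randomization, and the ranges of the simulator parameters are widened automatically as the policy succeeds, producing an ever-harder curriculum. A toy sketch of that loop follows; the parameter names, step size, and success threshold are invented for illustration, not taken from OpenAI’s code:

```python
import random

# Toy sketch of Automatic Domain Randomization (ADR): each simulator parameter
# is sampled from an interval that starts out degenerate and is widened
# whenever the policy performs well enough at the interval's boundaries.
ranges = {"cube_size_m": [0.055, 0.055], "friction": [1.0, 1.0]}  # hypothetical
STEP, SUCCESS_THRESHOLD = 0.01, 0.8

def sample_env_params():
    """Draw one randomized environment configuration for a training episode."""
    return {name: random.uniform(lo, hi) for name, (lo, hi) in ranges.items()}

def adr_update(name, boundary_success_rate):
    """Widen a parameter's range when the policy copes at its boundary;
    pull it back toward the midpoint when the boundary is still too hard."""
    lo, hi = ranges[name]
    if boundary_success_rate >= SUCCESS_THRESHOLD:
        ranges[name] = [lo - STEP, hi + STEP]
    else:
        mid = (lo + hi) / 2
        ranges[name] = [min(lo + STEP, mid), max(hi - STEP, mid)]
```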
Proximal Policy Optimization
Blog post (with a link to the paper):
https://openai.com/blog/openai-baselines-ppo/
YouTube Video:
https://www.youtube.com/watch?v=5P7I-xPq8u8&list=PLLO4N3-FoY3feUsA3_XZvn5sXy9Ms8ayE&index=2
#reinforcement_learning #optimization
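For quick reference, the clipped surrogate objective at the core of PPO (from the paper the post accompanies), where r_t(θ) is the probability ratio between the new and old policies and Â_t is an advantage estimate:

```latex
L^{\mathrm{CLIP}}(\theta)
  = \hat{\mathbb{E}}_t\left[\min\left(r_t(\theta)\,\hat{A}_t,\;
      \operatorname{clip}\!\left(r_t(\theta),\,1-\epsilon,\,1+\epsilon\right)\hat{A}_t\right)\right],
\qquad
r_t(\theta) = \frac{\pi_\theta(a_t \mid s_t)}{\pi_{\theta_{\mathrm{old}}}(a_t \mid s_t)}
```

The clip keeps each update from moving the policy too far from the one that collected the data, which is what makes PPO comparatively simple to tune.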
PyTorch tutorials of various RL algorithms:
Actor-Critic / Proximal Policy Optimization (PPO) / ACER / DDPG / Twin Delayed DDPG (TD3) / Soft Actor-Critic (SAC) / Generative Adversarial Imitation Learning (GAIL) / Hindsight Experience Replay (HER)
https://github.com/higgsfield/RL-Adventure-2
#reinforcement_learning #pytorch
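As a flavor of what the notebooks implement, here is a minimal, generic sketch of the advantage actor-critic loss (the first item in the list above). It is an illustration, not code from the repo; the tensor shapes and coefficients are assumptions:

```python
import torch

def actor_critic_loss(log_probs, values, returns, entropy,
                      value_coef=0.5, entropy_coef=0.01):
    """Advantage actor-critic loss over a batch of T transitions.
    log_probs: log pi(a_t | s_t) of the actions taken, shape (T,)
    values:    critic estimates V(s_t), shape (T,)
    returns:   discounted (or bootstrapped) return targets, shape (T,)
    entropy:   per-step policy entropy, shape (T,)
    """
    advantages = returns - values.detach()           # keep the critic out of the actor's gradient
    policy_loss = -(log_probs * advantages).mean()   # policy-gradient term with a value baseline
    value_loss = (returns - values).pow(2).mean()    # critic regression toward the targets
    return policy_loss + value_coef * value_loss - entropy_coef * entropy.mean()
```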
Distill and Transfer Learning for Robust Multitask Reinforcement Learning
"Most deep reinforcement learning algorithms are data inefficient in complex and rich environments, limiting their applicability to many scenarios. One direction for improving data efficiency is multitask learning with shared neural network parameters, where efficiency may be improved through transfer across related tasks. In practice, however, this is not usually observed, because gradients from different tasks can interfere negatively, making learning unstable and sometimes even less data efficient. Another issue is the different reward schemes between tasks, which can easily lead to one task dominating the learning of a shared model. We propose a new approach for joint training of multiple tasks, which we refer to as Distral (DIStill & TRAnsfer Learning). Instead of sharing parameters between the different workers, we propose to share a distilled policy that captures common behavior across tasks. Each worker is trained to solve its own task while constrained to stay close to the shared policy, while the shared policy is trained by distillation to be the centroid of all task policies. Both aspects of the learning process are derived by optimizing a joint objective function. We show that our approach supports efficient transfer on complex 3D environments, outperforming several related methods. Moreover, the proposed learning process is more robust and more stable---attributes that are critical in deep reinforcement learning."
https://www.youtube.com/watch?v=scf7Przmh7c
#reinforcement_learning #multi_task_learning #transfer_learning
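Sketching the idea in symbols: as best I can reconstruct from the abstract, the joint objective sums, over tasks i, each task’s expected discounted reward plus a term pulling the task policy π_i toward the shared distilled policy π_0 and an entropy-like bonus, roughly as below (α and β are trade-off hyperparameters; treat the exact form as an assumption and check the paper):

```latex
J\left(\pi_0, \{\pi_i\}\right)
  = \sum_i \mathbb{E}_{\pi_i}\left[\sum_t \gamma^t\left(
      R_i(a_t, s_t)
      + \frac{\alpha}{\beta}\,\log \pi_0(a_t \mid s_t)
      - \frac{1}{\beta}\,\log \pi_i(a_t \mid s_t)
    \right)\right]
```

Maximizing over π_0 alone makes the distilled policy the centroid of the task policies; maximizing over each π_i trades task reward against staying close to π_0.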
"Most deep reinforcement learning algorithms are data inefficient in complex and rich environments, limiting their applicability to many scenarios. One direction for improving data efficiency is multitask learning with shared neural network parameters, where efficiency may be improved through transfer across related tasks. In practice, however, this is not usually observed, because gradients from different tasks can interfere negatively, making learning unstable and sometimes even less data efficient. Another issue is the different reward schemes between tasks, which can easily lead to one task dominating the learning of a shared model. We propose a new approach for joint training of multiple tasks, which we refer to as Distral (DIStill & TRAnsfer Learning). Instead of sharing parameters between the different workers, we propose to share a distilled policy that captures common behavior across tasks. Each worker is trained to solve its own task while constrained to stay close to the shared policy, while the shared policy is trained by distillation to be the centroid of all task policies. Both aspects of the learning process are derived by optimizing a joint objective function. We show that our approach supports efficient transfer on complex 3D environments, outperforming several related methods. Moreover, the proposed learning process is more robust and more stable---attributes that are critical in deep reinforcement learning."
https://www.youtube.com/watch?v=scf7Przmh7c
#reinforcement_learning #multi_task_learning #transfer_learning
YouTube
Distill and transfer learning for robust multitask RL
Most deep reinforcement learning algorithms are data inefficient in complex and rich environments, limiting their applicability to many scenarios. One direction for improving data efficiency is multitask learning with shared neural network parameters, where…
Dopamine and temporal difference learning: A fruitful relationship between neuroscience and AI
https://deepmind.com/blog/article/Dopamine-and-temporal-difference-learning-A-fruitful-relationship-between-neuroscience-and-AI
#reinforcement_learning #machine_learning #neuroscience #artificial_intelligence
Book: The Soar Cognitive Architecture
Introduction: in development for thirty years, Soar is a general cognitive architecture that integrates knowledge-intensive reasoning, reactive execution, hierarchical reasoning, planning, and learning from experience, with the goal of creating a general computational system that has the same cognitive abilities as humans. In contrast, most AI systems are designed to solve only one type of problem, such as playing chess, searching the Internet, or scheduling aircraft departures. Soar is both a software system for agent development and a theory of what computational structures are necessary to support human-level agents. Over the years, both software system and theory have evolved. This book offers the definitive presentation of Soar from theoretical and practical perspectives, providing comprehensive descriptions of fundamental aspects and new components. The current version of Soar features major extensions, adding reinforcement learning, semantic memory, episodic memory, mental imagery, and an appraisal-based model of emotion. This book describes details of Soar's component memories and processes and offers demonstrations of individual components, components working in combination, and real-world applications. Beyond these functional considerations, the book also proposes requirements for general cognitive architectures and explicitly evaluates how well Soar meets those requirements.
https://dl.acm.org/doi/book/10.5555/2222503
#cognitive_science #neuroscience #reinforcement_learning #artificial_intelligence
Reinforcement Learning and Optimal Control (draft version; PDF attached, 2.7 MB)
I'm desperately looking for the final published version of this book; if you find it, please let me know.
#reinforcement_learning #optimal_control
AlphaGo - The Movie | Full Documentary
Summary: with more board configurations than there are atoms in the universe, the ancient Chinese game of Go has long been considered a grand challenge for artificial intelligence. On March 9, 2016, the worlds of Go and artificial intelligence collided in South Korea for an extraordinary best-of-five-game competition, coined The DeepMind Challenge Match. Hundreds of millions of people around the world watched as a legendary Go master took on an unproven AI challenger for the first time in history.
https://www.youtube.com/watch?v=WXuK6gekU1Y
#artificial_intelligence #reinforcement_learning #deep_learning
Ilya Sutskever: Deep Learning | Lex Fridman Podcast #94
Brief biography (from the episode description): Ilya Sutskever is the co-founder of OpenAI and one of the most cited computer scientists in history, with over 165,000 citations. To host Lex Fridman, he is one of the most brilliant and insightful minds ever in the field of deep learning; there are very few people Fridman would rather talk to and brainstorm with about deep learning, intelligence, and life, on and off the mic.
https://www.youtube.com/watch?v=13CZPWmke6A
#deep_learning #artificial_intelligence #reinforcement_learning
Reinforcement learning in the brain
Abstract: A wealth of research focuses on the decision-making processes that animals and humans employ when selecting actions in the face of reward and punishment. Initially such work stemmed from psychological investigations of conditioned behavior, and explanations of these in terms of computational models. Increasingly, analysis at the computational level has drawn on ideas from reinforcement learning, which provide a normative framework within which decision-making can be analyzed. More recently, the fruits of these extensive lines of research have made contact with investigations into the neural basis of decision making. Converging evidence now links reinforcement learning to specific neural substrates, assigning them precise computational roles. Specifically, electrophysiological recordings in behaving animals and functional imaging of human decision-making have revealed in the brain the existence of a key reinforcement learning signal, the temporal difference reward prediction error. Here, we first introduce the formal reinforcement learning framework. We then review the multiple lines of evidence linking reinforcement learning to the function of dopaminergic neurons in the mammalian midbrain and to more recent data from human imaging experiments. We further extend the discussion to aspects of learning not associated with phasic dopamine signals, such as learning of goal-directed responding that may not be dopamine-dependent, and learning about the vigor (or rate) with which actions should be performed that has been linked to tonic aspects of dopaminergic signaling. We end with a brief discussion of some of the limitations of the reinforcement learning framework, highlighting questions for future research.
https://psycnet.apa.org/record/2009-07078-003
#reinforcement_learning #neuroscience
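The "temporal difference reward prediction error" the abstract refers to is, in its simplest (TD(0)) form, the mismatch between received and predicted value, which phasic dopaminergic firing is proposed to encode:

```latex
\delta_t = r_{t+1} + \gamma\, V(s_{t+1}) - V(s_t),
\qquad
V(s_t) \leftarrow V(s_t) + \alpha\, \delta_t
```

Here V is the learned state-value estimate, γ the discount factor, and α a learning rate.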