Solving Rubik’s Cube with a Robot Hand
This is fascinating; make sure you read it.
Summary: The OpenAI team trained a pair of neural networks to solve the Rubik’s Cube with a human-like robot hand. The neural networks are trained entirely in simulation, using the same reinforcement learning code as OpenAI Five paired with a new technique called Automatic Domain Randomization (ADR). The system can handle situations it never saw during training, such as being prodded by a stuffed giraffe. This shows that reinforcement learning isn’t just a tool for virtual tasks, but can solve physical-world problems requiring unprecedented dexterity.
https://openai.com/blog/solving-rubiks-cube/
#reinforcement_learning #machine_learning #robotics
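For anyone wondering what ADR looks like in practice, here is a rough sketch of the core idea as described in the blog post: start with no randomization, then widen a simulator parameter's range whenever the policy performs well at the edge of that range. This is not OpenAI's code. The class name `ADRParam`, the function `adr_update`, the thresholds, and the step size are all hypothetical, chosen only for illustration.

```python
import random


class ADRParam:
    """One randomized simulator parameter (hypothetical helper, not from the post)."""

    def __init__(self, lo, hi, step):
        self.lo = lo      # current lower bound of the randomization range
        self.hi = hi      # current upper bound of the randomization range
        self.step = step  # how far a bound moves per update (assumed value)

    def sample(self, rng):
        """Draw a value for this parameter for one training episode."""
        return rng.uniform(self.lo, self.hi)


def adr_update(param, boundary, performance, t_low=0.2, t_high=0.8):
    """Move one bound based on performance measured with the parameter pinned there.

    performance is assumed to be a success rate in [0, 1];
    t_low and t_high are assumed thresholds.
    """
    if boundary not in ("lo", "hi"):
        raise ValueError("boundary must be 'lo' or 'hi'")
    if performance >= t_high:
        # The policy copes well at this edge: make the environment harder.
        if boundary == "hi":
            param.hi += param.step
        else:
            param.lo -= param.step
    elif performance <= t_low:
        # The policy struggles at this edge: narrow the range, never crossing bounds.
        if boundary == "hi":
            param.hi = max(param.lo, param.hi - param.step)
        else:
            param.lo = min(param.hi, param.lo + param.step)
    return param


# Start with a single, non-randomized value and let the range grow.
friction = ADRParam(lo=1.0, hi=1.0, step=0.1)
adr_update(friction, "hi", performance=0.9)  # widen the upper bound
adr_update(friction, "lo", performance=0.9)  # widen the lower bound
rng = random.Random(0)
value = friction.sample(rng)  # a friction value somewhere in the widened range
```

The point of the design is that the difficulty of the simulated environments tracks what the policy can currently handle, so nobody has to hand-tune the randomization ranges.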
A fascinating research paper at the intersection of Graph Neural Networks and Reinforcement Learning for tackling robotics challenges.
https://openreview.net/pdf?id=S1sqHMZCb
#robotics #deep_learning #geometric_deep_learning
Synthesis and Stabilization of Complex Behaviors through Online Trajectory Optimization
Abstract: We present an online trajectory optimization method and software platform applicable to complex humanoid robots performing challenging tasks such as getting up from an arbitrary pose on the ground and recovering from large disturbances using dexterous acrobatic maneuvers. The resulting behaviors, illustrated in the attached video, are computed only 7x slower than real time, on a standard PC. The video also shows results on the acrobot problem, planar swimming and one-legged hopping. These simpler problems can already be solved in real time, without pre-computing anything.
Video of their experiments: https://youtu.be/anIsw2-Lbco
Paper: https://homes.cs.washington.edu/~todorov/papers/TassaIROS12.pdf
#model_predictive_control #optimal_control #robotics
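To make "online trajectory optimization" a bit more concrete, here is a minimal sketch of the receding-horizon loop behind model predictive control: re-plan from the current state at every step, apply only the first control, repeat. It is a toy, not the paper's method. The paper uses iLQG on full humanoid models; this sketch swaps in a plain finite-horizon LQR backward pass on a scalar linear system x' = a*x + b*u. All names (`plan_first_control`, `run_mpc`) and all parameter values are assumptions made for illustration.

```python
def plan_first_control(x, a, b, q, r, horizon):
    """Plan over a short horizon and return only the first control.

    Finite-horizon LQR backward pass for x' = a*x + b*u with stage cost
    q*x^2 + r*u^2. This stands in for the paper's iLQG optimizer.
    """
    p = q    # cost-to-go weight at the end of the horizon
    k = 0.0  # feedback gain, updated as we sweep backward in time
    for _ in range(horizon):
        k = (a * b * p) / (r + b * b * p)  # gain for this stage
        p = q + a * a * p - a * b * p * k  # Riccati update of the cost-to-go
    return -k * x  # after the sweep, k is the gain for the first stage


def run_mpc(x0, steps, a=1.0, b=0.1, q=1.0, r=0.1, horizon=20, pushes=None):
    """Closed-loop simulation: re-plan at every step, apply the first control.

    pushes is a hypothetical dict {step: offset} of external disturbances.
    """
    x = x0
    trajectory = [x]
    for t in range(steps):
        u = plan_first_control(x, a, b, q, r, horizon)  # re-plan from where we are now
        x = a * x + b * u                               # apply only the first control
        if pushes and t in pushes:
            x += pushes[t]                              # knock the system off course
        trajectory.append(x)
    return trajectory


calm = run_mpc(x0=1.0, steps=50)                      # decays toward zero
pushed = run_mpc(x0=1.0, steps=50, pushes={10: 0.5})  # recovers after a push at step 10
```

Because the plan is recomputed from the measured state each step, a disturbance does not need special handling: the next plan simply starts from wherever the push left the system. That is the property the abstract refers to when it mentions recovering from large disturbances without pre-computing anything.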
Model Predictive Control with iLQG and MuJoCo
Tassa, Erez and Todorov, "Synthesis and stabilization of complex behaviors through online trajectory optimization," IROS 2012.