Our two previous blog entries implied that there is a role games can play in driving the development of Reinforcement Learning algorithms. As the world’s most popular creation engine, Unity is at the crossroads between machine learning and gaming. It is critical to our mission to enable machine learning researchers with the most powerful training scenarios, and for us to give back to the gaming community by enabling them to utilize the latest machine learning technologies. As the first step in this endeavor, we are excited to introduce Unity Machine Learning Agents Toolkit.
Machine Learning is changing the way we expect to get intelligent behavior out of autonomous agents. Whereas in the past the behavior was coded by hand, it is increasingly taught to the agent (either a robot or virtual avatar) through interaction in a training environment. This method is used to learn behavior for everything from industrial robots, drones, and autonomous vehicles, to game characters and opponents. The quality of this training environment is critical to the kinds of behaviors that can be learned, and there are often trade-offs of one kind or another that need to be made. The typical scenario for training agents in virtual environments is to have a single environment and agent which are tightly coupled. The actions of the agent change the state of the environment, and provide the agent with rewards.
The typical Reinforcement Learning training cycle.
At Unity, we wanted to design a system that provide greater flexibility and ease-of-use to the growing groups interested in applying machine learning to developing intelligent agents. Moreover, we wanted to do this while taking advantage of the high quality physics and graphics, and simple yet powerful developer control provided by the Unity Engine and Editor. We think that this combination can benefit the following groups in ways that other solutions might not:
We call our solution Unity Machine Learning Agents Toolkit (ML-Agents toolkit for short), and are happy to be releasing an open beta version of our SDK today! The ML-Agents SDK allows researchers and developers to transform games and simulations created using the Unity Editor into environments where intelligent agents can be trained using Deep Reinforcement Learning, Evolutionary Strategies, or other machine learning methods through a simple to use Python API. We are releasing this beta version of Unity ML-Agents toolkit as open-source software, with a set of example projects and baseline algorithms to get you started. As this is an initial beta release, we are actively looking for feedback, and encourage anyone interested to contribute on our GitHub page. For more information on Unity ML-Agents toolkit, continue reading below! For more detailed documentation, see our GitHub Wiki.
A visual depiction of how a Learning Environment might be configured within Unity ML-Agents Toolkit.
The three main kinds of objects within any Learning Environment are:
The states and observations of all agents with brains set to External are collected by the External Communicator, and communicated to our Python API for processing using your ML library of choice. By setting multiple agents to a single brain, actions can be decided in a batch fashion, opening the possibility of getting the advantages of parallel computation, when supported. For more information on how these objects work together within a scene, see our wiki page.
With Unity ML-Agents toolkit, a variety of training scenarios are possible, depending on how agents, brains, and rewards are connected. We are excited to see what kinds of novel and fun environments the community creates. For those new to training intelligent agents, below are a few examples that can serve as inspiration. Each is a prototypical environment configurations with a description of how it can be created using the ML-Agents SDK.
Beyond the flexible training scenarios made possible by the Academy/Brain/Agent system, the Unity ML-Agents toolkit also includes other features which improve the flexibility and interpretability of the training process.
Above each agent is a value estimate, corresponding to how much future reward the agent expects. When the right agent misses the ball, the value estimate drops to zero, since it expects the episode to end soon, resulting in no additional reward.
Different possible configurations of the GridWorld environment with increasing complexity.
Two different camera views on the same environment. When both are provided to an agent, it can learn to utilize both first-person and map-like information about the task to defeat the opponent.
As mentioned above, we are excited to be releasing this open beta version of Unity Machine Learning Agents Toolkit today, which can be downloaded from our GitHub page. This release is only the beginning, and we plan to iterate quickly and provide additional features for both those of you who are interested in Unity as a platform for Machine Learning research, and those of you who are focused on the potential of Machine Learning in game development. While this beta release is more focused on the former group, we will be increasingly providing support for the latter use-case. As mentioned above, we are especially interested in hearing about use-cases and features you would like to see included in future releases of Unity ML-Agents Toolkit, and we will be welcoming Pull Requests made to the GitHub Repository. Please feel free to reach out to us at firstname.lastname@example.org to share feedback and thoughts. If the project sparks your interests, come join the Unity Machine Learning team!