*Based on rlstructures v0.2*

**Other Tutorials:**

- Introduction to rlstructures
- Understanding the library
- REINFORCE with rlstructures
- Learning multiple policies at once
- A2C with GAE
- rlstructures and GPUs

The described techniques are available in the *rlalgos/reinforce_device* directory, which illustrates the use of GPUs for REINFORCE. Note that the ability to use a GPU for loss computation is also provided for DQN and A2C in the repository.

Let us start again from the REINFORCE implementation presented in a previous tutorial. In such an implementation, GPUs can be used at three different locations to speed up the learning process:

- At the loss computation level, to accelerate the computation of the loss.
- …
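As a rough sketch of the first point, loss computation can be moved to a GPU by transferring the model and the sampled trajectories to the learning device before the loss is evaluated. The snippet below is a minimal, hypothetical illustration in plain PyTorch: the `to_device` helper and the flat dict of tensors are stand-ins for rlstructures' actual trajectory structures, and the linear "policy" is a placeholder.

```python
import torch

# Hypothetical helper: move a dict of trajectory tensors to the learning
# device. The real rlstructures batchers produce richer structures
# (DictTensor / TemporalDictTensor); a plain dict keeps the sketch
# self-contained.
def to_device(trajectories, device):
    return {k: v.to(device) for k, v in trajectories.items()}

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

model = torch.nn.Linear(4, 2).to(device)  # stand-in for the policy network

# Fake batch of trajectories: B=8 episodes, T=5 timesteps, 4-dim observations.
trajectories = {
    "obs": torch.randn(8, 5, 4),
    "action": torch.randint(0, 2, (8, 5)),
    "reward": torch.randn(8, 5),
}
trajectories = to_device(trajectories, device)

# The loss is now computed entirely on `device`.
logits = model(trajectories["obs"])
logp = torch.log_softmax(logits, dim=-1)
chosen = logp.gather(-1, trajectories["action"].unsqueeze(-1)).squeeze(-1)
loss = -(chosen * trajectories["reward"]).mean()
```

Only the final scalar loss needs to travel back to the CPU (e.g. for logging), which keeps transfer overhead small relative to the computation itself.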

In this tutorial, we describe how actor-critic methods can be implemented. In particular, we:

- Show how auto-reset environments can be used to avoid wasting computation time
- Explain how the actor-critic loss can be computed with recurrent architectures
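To illustrate the first point: with auto-reset environments, a single trajectory slot can contain several consecutive episodes, so quantities such as discounted returns must be reset at episode boundaries. A minimal sketch, assuming a per-timestep `done` flag marking the last step of each episode (this is an illustration of the idea, not rlstructures' actual loss code):

```python
import torch

def masked_discounted_returns(rewards, done, gamma=0.9):
    """Discounted returns over auto-reset trajectories.

    rewards, done: tensors of shape (B, T); done[b, t] == 1 marks the last
    step of an episode, so the return must not bootstrap past that step.
    """
    B, T = rewards.shape
    returns = torch.zeros_like(rewards)
    running = torch.zeros(B)
    for t in reversed(range(T)):
        # R_t = r_t + gamma * R_{t+1}, with R_{t+1} zeroed at episode ends.
        running = rewards[:, t] + gamma * (1.0 - done[:, t].float()) * running
        returns[:, t] = running
    return returns
```

For example, with rewards `[1, 1, 1]`, an episode boundary after the second step, and `gamma = 0.5`, the return at the first step is `1 + 0.5 * 1 = 1.5` rather than leaking the third step's reward across the boundary.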

A first step is to implement the underlying model and the corresponding agent. In our case, the model is a classic recurrent model that outputs, at each timestep, both action probabilities and a critic value. Note that the critic and action…
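As a sketch of such a model (the sizes, names, and GRU trunk are illustrative choices, not rlstructures' actual implementation):

```python
import torch
import torch.nn as nn

class RecurrentActorCritic(nn.Module):
    """Sketch of a recurrent actor-critic: a shared GRU trunk with two heads,
    one producing action probabilities and one producing the critic value."""

    def __init__(self, obs_dim, n_actions, hidden=32):
        super().__init__()
        self.gru = nn.GRU(obs_dim, hidden, batch_first=True)
        self.pi = nn.Linear(hidden, n_actions)  # actor head
        self.v = nn.Linear(hidden, 1)           # critic head

    def forward(self, obs, h=None):
        # obs: (B, T, obs_dim); h: optional (1, B, hidden) recurrent state
        z, h = self.gru(obs, h)
        probs = torch.softmax(self.pi(z), dim=-1)  # (B, T, n_actions)
        values = self.v(z).squeeze(-1)             # (B, T)
        return probs, values, h

model = RecurrentActorCritic(obs_dim=4, n_actions=3)
probs, values, h = model(torch.randn(2, 5, 4))
```

Sharing the recurrent trunk between the two heads is a common design choice; the returned hidden state `h` lets the agent carry its state across batcher calls.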

In this tutorial, we illustrate the flexibility of *rlstructures* by showing how the previous implementation of REINFORCE can easily be modified to implement a completely different model used in an unsupervised RL setting, where multiple policies are learned simultaneously. The model implemented is the *DIAYN* model proposed in https://arxiv.org/abs/1802.06070 (but in a REINFORCE version).

```bibtex
@inproceedings{DBLP:conf/iclr/EysenbachGIL19,
  author = {Benjamin Eysenbach and Abhishek Gupta and Julian Ibarz and Sergey Levine},
  title  = {Diversity is All You Need…
```
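At the core of DIAYN is an intrinsic reward that replaces the environment reward: r(s, z) = log q(z | s) − log p(z), where q is a learned skill discriminator and p(z) is the prior the skill z was sampled from. Below is a minimal sketch of this reward with a toy linear discriminator standing in for the real, trained one (all names and sizes are illustrative):

```python
import torch
import torch.nn as nn

n_skills, obs_dim = 4, 8
discriminator = nn.Linear(obs_dim, n_skills)  # toy q(z | s)

def diayn_reward(obs, z):
    # obs: (B, obs_dim) observations; z: (B,) skill indices sampled
    # from a uniform prior p(z) = 1 / n_skills.
    log_q = torch.log_softmax(discriminator(obs), dim=-1)
    log_q_z = log_q.gather(1, z.unsqueeze(1)).squeeze(1)
    log_p_z = -torch.log(torch.tensor(float(n_skills)))  # uniform prior
    return log_q_z - log_p_z

obs = torch.randn(6, obs_dim)
z = torch.randint(0, n_skills, (6,))
r = diayn_reward(obs, z)
```

The reward is high when the discriminator can recognize which skill produced the state, which pushes the skills to visit distinguishable (hence diverse) parts of the state space.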

In this tutorial we:

- Implement a parallelized version of REINFORCE using rlstructures
- Show how we can add a parallel evaluation of the learned policy without slowing down the learning process (i.e., evaluation that runs as fast as possible)

The complete source code is available in the rlstructures tutorial repository: http://github.com/facebookresearch/rlstructures

rlstructures provides an *rlalgos.logger* object that can log scalars, images, text, etc., both in *tensorboard* format and in CSV format for future…
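The exact logger API is not reproduced here; as an illustration of the CSV side of the idea, a minimal scalar logger might look like the following. The class name and the `add_scalar` signature are hypothetical, loosely modeled on tensorboard-style loggers, and are not the actual *rlalgos.logger* interface.

```python
import csv
import io

class CsvScalarLogger:
    """Toy scalar logger writing CSV rows (iteration, name, value)."""

    def __init__(self, stream):
        self.writer = csv.writer(stream)
        self.writer.writerow(["iteration", "name", "value"])

    def add_scalar(self, name, value, iteration):
        self.writer.writerow([iteration, name, value])

# Log a fake metric over three learning iterations into an in-memory buffer;
# a real logger would write to a file alongside the tensorboard event files.
buf = io.StringIO()
logger = CsvScalarLogger(buf)
for it in range(3):
    logger.add_scalar("avg_reward", 1.0 * it, it)
```

Keeping a plain CSV copy next to the tensorboard logs makes later offline analysis (pandas, spreadsheets) straightforward.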

In this article, we detail the different concepts used by **rlstructures** that allow anyone to implement their own RL algorithm. The concepts are:

- **Data structures:** rlstructures provides two main data structures that are used everywhere, namely **DictTensor** and **TemporalDictTensor** (and **Trajectories**, which is just a pair of one DictTensor and one TemporalDictTensor)
- **Agent API:** the agent API allows one to implement policies acting on a batch of environments in a simple way
- **Batcher…**
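To give a feel for the DictTensor idea (a dictionary of tensors that all share the same batch dimension, manipulated as one object), here is a deliberately simplified toy version; the real DictTensor/TemporalDictTensor API in rlstructures is much richer than this sketch, and the method names here are illustrative.

```python
import torch

class MiniDictTensor:
    """Toy version of the DictTensor idea: a dict of tensors constrained
    to share their first (batch) dimension."""

    def __init__(self, tensors):
        sizes = {v.size(0) for v in tensors.values()}
        assert len(sizes) == 1, "all tensors must share the batch dimension"
        self.tensors = tensors

    def n_elems(self):
        # Size of the shared batch dimension.
        return next(iter(self.tensors.values())).size(0)

    def __getitem__(self, idx):
        # Select one batch element, keeping the dict-of-tensors structure.
        return MiniDictTensor(
            {k: v[idx : idx + 1] for k, v in self.tensors.items()}
        )

dt = MiniDictTensor({"obs": torch.randn(5, 3), "reward": torch.zeros(5)})
```

Enforcing a shared batch dimension is what makes it safe to slice, stack, and route such structures through batchers without per-key bookkeeping.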

**Link:** https://github.com/facebookresearch/rlstructures

This tutorial is the first of a series explaining how **rlstructures** can be used to implement complex RL algorithms that work at scale (multiple CPUs, multiple GPUs). We will publish new tutorials every one or two weeks, focused on classical RL algorithms but also on non-conventional ones (e.g. hierarchical RL, unsupervised RL) and non-conventional application domains (e.g. RL for computer vision, RL for compiler optimization, etc.).

Research Scientist at Facebook/FAIR -- publications are my own