Every Monday, I present 4 publications from my research area. Let’s discuss them!


Paper 1: Synthetic Returns for Long-Term Credit Assignment

Raposo, D., Ritter, S., Santoro, A., Wayne, G., Weber, T., Botvinick, M., van Hasselt H. & Song, F. (2021). Synthetic Returns for Long-Term Credit Assignment. arXiv preprint arXiv:2102.12425.

Good actions produce high rewards. Moreover, the principle of causality tells us that a cause always precedes its effect. Put the two together: a good action is associated with a high reward in the future. A logician would ask: is the converse true? Does a high reward imply that all the preceding actions were good? No. And yet it is on this false assertion that all RL algorithms are based…
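To make the point concrete, here is a minimal sketch of the usual discounted return, which is not the paper's synthetic-returns method. The reward sequence and the discount factor `gamma` are made-up values for illustration. Notice that every action taken before the final reward receives some credit, whether or not it actually helped.

```python
def discounted_returns(rewards, gamma=0.99):
    """Compute the discounted return G_t = r_t + gamma * G_{t+1} for each step."""
    returns = []
    running = 0.0
    # Walk backwards so each step can reuse the return of the next step.
    for reward in reversed(rewards):
        running = reward + gamma * running
        returns.append(running)
    returns.reverse()
    return returns


# Hypothetical episode: three steps with no reward, then a reward of 1.
rewards = [0.0, 0.0, 0.0, 1.0]
returns = discounted_returns(rewards, gamma=0.9)

for t, g in enumerate(returns):
    print(f"step {t}: return = {g:.3f}")
```

All four steps end up with a positive return, even if only one of the actions was really responsible for the reward.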



Paper 1: Replacing Rewards with Examples: Example-Based Policy Search via Recursive Classification

Eysenbach, B., Levine, S., & Salakhutdinov, R. (2021). Replacing Rewards with Examples: Example-Based Policy Search via Recursive Classification. arXiv preprint arXiv:2103.12656.

The core of reinforcement learning is the reward function, which measures how well the agent does. In some cases, this reward function is easy to describe: in video games, for instance, we can simply take the score. In other cases, it is hard to specify. Take the example from the article: closing a drawer. …



Paper 1: Learning to Fly — a Gym Environment with PyBullet Physics for Reinforcement Learning of Multi-agent Quadcopter Control

[Paper] Panerati, J. et al.

It is quite common to see robotic environments featuring robotic arms or navigation robots. But have you ever tried your learning algorithms on drones? That’s what the authors of this paper propose: an open-source OpenAI Gym environment based on PyBullet for different tasks involving one or more quadcopters.



Paper 1: First return, then explore

[Paper] in Nature: Ecoffet, A., Huizinga, J. et al.

Let’s start with a very important article, recently published in Nature. The authors tackle a fundamental problem in reinforcement learning: exploration.
They suggest that one obstacle to efficient exploration is the difficulty of returning to an interesting state. For example, an agent manages to reach an advanced state in a Mario game, and thus gets a high reward. In the next training episode, however, the agent is unable to return to that state to continue exploring the game. …
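Here is a toy sketch of the idea suggested by the paper's title, "first return, then explore". It is not the authors' implementation: the chain environment, the uniform choice of the state to return to, and every name below (`GOAL`, `step`, `replay`, `return_then_explore`) are assumptions made up for illustration. The agent keeps an archive of visited states together with an action sequence that reaches each one, replays that sequence to get back there, and only then explores.

```python
import random

GOAL = 10  # hypothetical: the chain has states 0..GOAL


def step(state, action):
    """Deterministic toy dynamics: move left (-1) or right (+1) on the chain."""
    return max(0, min(GOAL, state + action))


def replay(trajectory, start=0):
    """Return to a state by replaying the actions that reached it."""
    state = start
    for action in trajectory:
        state = step(state, action)
    return state


def return_then_explore(iterations=200, explore_steps=5, seed=0):
    rng = random.Random(seed)
    archive = {0: []}  # state -> shortest known action sequence reaching it
    for _ in range(iterations):
        # 1) Pick a previously visited state (uniformly here, for simplicity).
        target = rng.choice(sorted(archive))
        trajectory = list(archive[target])
        # 2) First return: replay the stored actions to get back to it.
        state = replay(trajectory)
        # 3) Then explore: take a few random actions from there.
        for _ in range(explore_steps):
            action = rng.choice((-1, 1))
            state = step(state, action)
            trajectory.append(action)
            # Remember new states, or shorter ways to reach known ones.
            if state not in archive or len(trajectory) < len(archive[state]):
                archive[state] = list(trajectory)
    return archive


archive = return_then_explore()
print("furthest state reached:", max(archive))
```

Because the toy dynamics are deterministic, replaying a stored action sequence always brings the agent back to the same state, which is exactly what makes the "return" step cheap in this sketch.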


Photo by NordWood Themes on Unsplash

Create the directory for your package

Here is the minimal directory structure you need to create a package.

my_package
├── my_package
│   └── __init__.py
└── setup.py

Let’s see what’s in each of the files.

my_package/__init__.py is the main file, where you’ll put all your functions, classes, objects… Here is an example:

def my_function():
    print("Hello world!")

And that’s it!

setup.py is the file used for installation. Here is a minimal example:

from setuptools import setup, find_packages

setup(
    name="my_package",
    version="0.1",
    packages=find_packages(),
)

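find_packages() automatically discovers the packages in your project (the directories that contain an __init__.py file) so that they are all included when your package is installed. Note that it does not install dependencies: those are declared separately with the install_requires argument of setup().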
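If you are curious about what find_packages() actually returns, here is a small sketch that builds a throwaway project tree in a temporary directory and runs it there. The `utils` subpackage, the `notes` folder and the `discover` helper are hypothetical names added for illustration; they are not part of the package above.

```python
import tempfile
from pathlib import Path

from setuptools import find_packages


def discover(layout):
    """Create the given relative file paths in a temp dir and list the packages found."""
    with tempfile.TemporaryDirectory() as root:
        for relative_path in layout:
            path = Path(root) / relative_path
            path.parent.mkdir(parents=True, exist_ok=True)
            path.touch()
        # `where` tells find_packages which directory to scan.
        return sorted(find_packages(where=root))


found = discover([
    "setup.py",
    "my_package/__init__.py",
    "my_package/utils/__init__.py",  # hypothetical subpackage
    "notes/todo.txt",                # no __init__.py, so not a package
])
print(found)
```

Only the directories that contain an `__init__.py` show up in the result, which is why that file matters even when it is empty.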

Let’s test it

Open a terminal, and…


Photo by NASA on Unsplash

Big Data: what is it, exactly? Looking at the search trends on Google Trends, we can see that the term was rarely used before 2012.

Quentin Gallouédec

PhD student in machine learning. Engineer from Ecole Centrale de Lyon, France. https://qgallouedec.github.io
