Raposo, D., Ritter, S., Santoro, A., Wayne, G., Weber, T., Botvinick, M., van Hasselt, H., & Song, F. (2021). Synthetic Returns for Long-Term Credit Assignment. arXiv preprint arXiv:2102.12425.
Good actions produce high rewards. Moreover, the principle of causality tells us that a cause always precedes its effect. Put the two together: a good action is associated with a high reward in the future. A logician would ask: is the converse true? Does a high reward imply that all preceding actions were good? No. And yet it is on this false assertion that all RL algorithms are based…
Eysenbach, B., Levine, S., & Salakhutdinov, R. (2021). Replacing Rewards with Examples: Example-Based Policy Search via Recursive Classification. arXiv preprint arXiv:2103.12656.
The core of reinforcement learning is the reward function, which measures how well the agent is doing. In some cases, this reward function is easy to describe: in video games, we can simply take the score. In other cases, it is not easy to specify a reward function. Take the example from the article: closing a drawer. …
[Paper] by Panerati J. et al.
It is quite common to see robotic environments containing robotic arms or navigation robots. But have you ever tried your learning algorithms on drones? That’s what the authors of this paper propose: an open-source OpenAI Gym environment based on PyBullet for different tasks involving one or more quadcopters.
[Paper] in Nature, by Ecoffet A., Huizinga J. et al.
Let’s start with a very important article, recently published in Nature. The authors tackle a fundamental problem in reinforcement learning: exploration.
They suggest that one of the obstacles to efficient exploration is the difficulty of returning to an interesting state. For example, an agent manages to reach an advanced state in Mario, and thus gets a high reward. Nevertheless, in the next training episode, this agent is not able to return to that state to continue exploring the game. …
Here is the minimal directory structure you need to create a package.
│ └── __init__.py
Let’s see what’s in each of the files.
It is the main file, where you’ll put all your functions, classes, objects… Here is an example:
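As a minimal sketch, assuming the file described here is the package's `__init__.py`, its content could look like the following. The function name `say_hello` is hypothetical and only serves as an illustration; it is not taken from the original post.

```python
# Hypothetical content of the package's __init__.py (illustration only).

def say_hello(name: str) -> str:
    """Return a short greeting for the given name."""
    return f"Hello, {name}!"
```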
And that’s it!
This file is used for installation. Here is a minimal example:
from setuptools import setup, find_packages
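find_packages() scans the project directory and returns the list of packages it finds (every folder containing an __init__.py), so that all of them are included when your package is installed. Third-party dependencies are declared separately, through the install_requires argument of setup().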
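To get a feel for what `find_packages()` returns, here is a small sketch that builds a throwaway project in a temporary directory and runs the discovery on it. The folder names `mypackage` and `notes` and the helper `discover` are hypothetical, chosen only for this illustration. Only the folder that contains an `__init__.py` is reported as a package.

```python
import tempfile
from pathlib import Path

from setuptools import find_packages


def discover(root: Path) -> list:
    """Return the sorted names of the packages found under root."""
    return sorted(find_packages(where=str(root)))


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)

    # A real package: a folder containing an __init__.py file.
    (root / "mypackage").mkdir()
    (root / "mypackage" / "__init__.py").write_text("")

    # A plain folder without __init__.py: not reported as a package.
    (root / "notes").mkdir()
    (root / "notes" / "readme.txt").write_text("not a package")

    found = discover(root)

print(found)
```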
Open a terminal, and…
So what is Big Data? Looking at search trends in Google Trends, we notice that the term was very rarely used before 2012.