7 changes: 7 additions & 0 deletions docs/user/reward.rst
@@ -28,6 +28,13 @@ Customization of the reward
In grid2op you can customize the reward function / reward kernel used by your agent. By default, when you create an
environment, its creator has already specified a reward for you and you have nothing to do:

.. note::
    In the mathematical MDP notation, the reward kernel is usually written as a function of the state,
    the action and the next state. In grid2op's implementation, reward classes also receive contextual
    flags such as ``has_error``, ``is_illegal`` and ``is_ambiguous``. These flags make it possible to distinguish
    the action submitted by the agent from the action actually applied by the environment, for
    example when an out-of-bounds redispatching action is replaced by a do-nothing action.
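
The flags described in the note above can be illustrated with a minimal, standalone sketch. It does not import
grid2op: ``FlagAwareReward`` and its ``penalty`` / ``bonus`` values are hypothetical, and the argument order is
assumed to mirror the ``__call__`` signature of grid2op reward classes. In a real project you would most likely
subclass ``grid2op.Reward.BaseReward`` instead.

.. code-block:: python

    class FlagAwareReward:
        """Hypothetical stand-in mirroring the assumed call signature of a grid2op reward."""

        def __init__(self, penalty=-1.0, bonus=1.0):
            # Illustrative values only, not taken from grid2op
            self.penalty = penalty
            self.bonus = bonus

        def __call__(self, action, env, has_error, is_done, is_illegal, is_ambiguous):
            # Any of these flags signals that the action submitted by the agent
            # may differ from the one the environment actually applied.
            if has_error or is_illegal or is_ambiguous:
                return self.penalty
            return self.bonus


    reward = FlagAwareReward()

    # `action` and `env` are unused placeholders in this sketch
    rejected = reward(None, None, has_error=False, is_done=False,
                      is_illegal=True, is_ambiguous=False)
    accepted = reward(None, None, has_error=False, is_done=False,
                      is_illegal=False, is_ambiguous=False)
    print(rejected, accepted)

Here the reward depends only on the flags. A realistic reward would also inspect ``action`` and ``env``.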

.. code-block:: python

    import grid2op