Grid2op · william-xue · May 31, 2026
diff --git a/docs/user/reward.rst b/docs/user/reward.rst
@@ -28,6 +28,13 @@ Customization of the reward
 In grid2op you can customize the reward function / reward kernel used by your agent. By default, when you create an
 environment a reward has been specified for you by the creator of the environment and you have nothing to do:
 
+.. note::
+    In the mathematical MDP notation, the reward kernel is often written as a function of the state,
+    the next state and the action. In grid2op's implementation, reward classes also receive contextual
+    flags such as ``has_error``, ``is_illegal`` and ``is_ambiguous``. These flags make it possible to distinguish
+    the original action submitted by the agent from the action actually applied by the environment, for
+    example when an out-of-bounds redispatching action is replaced by a do-nothing action.
+
 .. code-block:: python
 
     import grid2op