This repository implements the Team Belief DAG (TB-DAG) approach for computing correlated team max-min equilibria in imperfect-information team games, based on the paper "Team Belief DAG: Generalizing the Sequence Form to Team Games for Fast Computation of Correlated Team Max-Min Equilibria via Regret Minimization" (Zhang et al., ICML 2023).
- Final Report: `Chris Ge Alan Lee Sharvaa Selvan 6.S890 Final Project.pdf`
- Video Presentation: https://www.youtube.com/watch?v=zhahQ4fz03Q
We implement TB-DAG fictitious play to compute correlated equilibria for Team Leduc Hold'em, a 4-player poker variant where:
- Team 13 (Players 1 & 3) plays against Team 24 (Players 2 & 4)
- Teammates share payoffs and can correlate strategies via a pre-game correlation device
- Teammates cannot communicate during play
Our implementation achieved exploitability scores of:
- Team 13: 0.053
- Team 24: 0.046
These scores bound how much the opposing team could gain per hand, in expectation, by best-responding to our computed strategies; lower values mean closer to equilibrium.
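For intuition, exploitability can be sketched on a two-player zero-sum matrix game (a stand-in for the full team game; the function and matrix below are illustrative, not from this repo):

```python
import numpy as np

def exploitability(A, x, y):
    """Per-side exploitability of the profile (x, y) in a zero-sum matrix
    game with row-player payoff matrix A: the gain available to each side
    from best-responding to the other's fixed strategy. Both are 0 exactly
    at equilibrium."""
    value = x @ A @ y                     # expected row-player payoff
    row_gain = np.max(A @ y) - value      # row player's best deviation gain
    col_gain = value - np.min(x @ A)      # column player's best deviation gain
    return row_gain, col_gain

# Matching pennies: uniform play is the equilibrium, so both gains are 0.
A = np.array([[1.0, -1.0], [-1.0, 1.0]])
uniform = np.array([0.5, 0.5])
row_gain, col_gain = exploitability(A, uniform, uniform)
```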
```bash
# Clone the repository
git clone <repository-url>
cd 6.s890pset2

# Install dependencies
pip install numpy
```

The `leduc_files/` directory containing the Leduc poker game tree and TB-DAG cache files is too large for GitHub. Download it from Dropbox and extract to the project root. After downloading:

```bash
unzip leduc_files.zip -d .
```

```bash
# Run 100 iterations of fictitious play
python3 scripts/run_leduc_fp.py --iterations 100

# Continue an existing run with 200 more iterations
python3 scripts/continue_fp.py leduc_output_0 --iterations 200
```

```bash
# Run fictitious play on the small toy game
python3 tests/test_fp_correlated.py

# Run systematic tests
python3 tests/systematic_debug.py
```

```
.
├── src/                        # Core Implementation
│   ├── tdp_parser.py           # Parse game trees into Team Decision Problems
│   ├── TBDAG.py                # Convert TDP to Team Belief DAG (Algorithm 1)
│   ├── fictitious_play.py      # Fictitious play with incremental updates
│   ├── correlated_strategy.py  # Manage mixtures of correlated strategies
│   ├── strategy_utils.py       # Strategy conversion utilities
│   └── strategy_io.py          # I/O utilities for strategy files
│
├── scripts/                    # Runner Scripts
│   ├── run_leduc_fp.py         # Main script for Leduc poker
│   ├── continue_fp.py          # Continue existing runs
│   └── tbdag_cache.py          # Cache TB-DAGs for faster loading
│
├── toy_files/                  # Toy game for testing
│   ├── toy_tree.txt            # Game tree definition
│   ├── toy_game_documentation.md
│   ├── player_*_infosets.txt   # Information set files
│   └── create_toy_pickle.py    # Create cached TB-DAGs
│
├── leduc_files/                # Leduc poker game files (download separately)
│   ├── leduc_tree.txt          # Game tree
│   ├── player_*_infosets.txt   # Information sets
│   └── *.pkl                   # Cached TB-DAGs
│
├── visualizations/             # TB-DAG visualizations
│   ├── visualize.py            # Generate PDF visualizations
│   ├── tb dag.pdf              # Paper's TB-DAG figure
│   └── toy_*.pdf               # Toy game visualizations
│
├── tests/                      # Test suite
│   ├── systematic_debug.py     # Main test for resume consistency
│   └── test_*.py               # Various unit tests
│
└── summaries/                  # Development documentation
```
The TB-DAG transforms an imperfect-information team game into a perfect-information decision problem in three steps:

1. Build Connectivity Graph: Connect game histories that share reachable information sets
2. Create Active Nodes (Beliefs): Represent joint team states; actions are prescriptions
3. Create Inactive Nodes (Observations): Represent public observations; children are connected components
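The connected-components step above can be sketched with a union-find over histories; `shares_infoset` is a hypothetical predicate standing in for the repo's actual reachability check:

```python
from itertools import combinations

def connected_components(histories, shares_infoset):
    """Partition histories into connected components of the connectivity
    graph: two histories are linked if they share a reachable infoset."""
    parent = {h: h for h in histories}

    def find(h):                      # find the root, with path halving
        while parent[h] != h:
            parent[h] = parent[parent[h]]
            h = parent[h]
        return h

    for h1, h2 in combinations(histories, 2):
        if shares_infoset(h1, h2):    # hypothetical reachability predicate
            parent[find(h1)] = find(h2)

    components = {}
    for h in histories:
        components.setdefault(find(h), set()).add(h)
    return list(components.values())
```

Each resulting component becomes one child of an inactive (observation) node.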
```
Initialize: Random pure strategies for both teams
For t = 1, 2, ..., T:
    η = 1/(t+1)   # Learning rate

    # Team 1 best responds to Team 2's mixture
    s₁ᵗ = BestResponse(Team1_TBDAG, Team2_strategy)
    Team1_strategy ← (1-η)·Team1_strategy + η·s₁ᵗ

    # Team 2 best responds to Team 1's mixture
    s₂ᵗ = BestResponse(Team2_TBDAG, Team1_strategy)
    Team2_strategy ← (1-η)·Team2_strategy + η·s₂ᵗ
```
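The same update rule can be sketched on a zero-sum matrix game standing in for the two team TB-DAGs (best responses here are simple argmax/argmin over pure actions; ties break toward the first index):

```python
import numpy as np

def fictitious_play(A, T=2000):
    """Fictitious play on a zero-sum matrix game with row-player payoffs A,
    using the same eta = 1/(t+1) averaging as the TB-DAG version."""
    n, m = A.shape
    x = np.ones(n) / n          # row player's mixture (stand-in for Team 1)
    y = np.ones(m) / m          # column player's mixture (stand-in for Team 2)
    for t in range(1, T + 1):
        eta = 1.0 / (t + 1)
        # Row player best responds to the column player's current mixture.
        br1 = np.zeros(n)
        br1[np.argmax(A @ y)] = 1.0
        x = (1 - eta) * x + eta * br1
        # Column player best responds to the row player's updated mixture.
        br2 = np.zeros(m)
        br2[np.argmin(x @ A)] = 1.0
        y = (1 - eta) * y + eta * br2
    return x, y

# Matching pennies: the unique equilibrium is uniform play for both sides,
# and the averaged mixtures converge toward it.
A = np.array([[1.0, -1.0], [-1.0, 1.0]])
x, y = fictitious_play(A)
```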
- Incremental Reach Probability Updates: O(terminals) per iteration instead of O(terminals × strategies)
- Cached TB-DAGs: Pickle TB-DAGs for instant loading on subsequent runs
- Memory-Efficient Mode: Discard behavioral strategies after computing reach probabilities
- Epsilon-Based Tie-Breaking: Ensures deterministic resume from checkpoints
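The incremental reach update exploits linearity: since the team strategy is a convex mixture over pure strategies, each terminal's reach probability under the new mixture is the same convex combination of reach vectors, one vectorized pass over terminals (array names below are illustrative):

```python
import numpy as np

def update_terminal_reach(mix_reach, br_reach, eta):
    """One O(terminals) update: reach probabilities are linear in the
    mixture weights, so mixing strategies mixes their reach vectors."""
    return (1.0 - eta) * mix_reach + eta * br_reach

mix = np.array([0.2, 0.8, 0.0])   # terminal reach under the current mixture
br  = np.array([1.0, 0.0, 0.0])   # terminal reach under the new best response
new = update_terminal_reach(mix, br, 0.5)   # → [0.6, 0.4, 0.0]
```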
Strategies are saved in the following format:
```
leduc_output/
├── team13/
│   ├── strategy0-player1.npy   # numpy arrays [num_infosets, num_actions]
│   ├── strategy0-player3.npy
│   ├── strategy1-player1.npy
│   └── ...
├── team24/
│   └── ...
├── team13.zip                  # Submission zip
├── team24.zip
└── meta-strategy.csv           # Mixture weights
```
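A strategy file in this layout is a plain `[num_infosets, num_actions]` NumPy array and round-trips with `np.save`/`np.load`; the toy array and temp path below are illustrative:

```python
import os
import tempfile

import numpy as np

# A toy behavioral strategy: 3 infosets x 2 actions; each row is a
# probability distribution over that infoset's actions.
strategy = np.array([[0.7, 0.3],
                     [0.5, 0.5],
                     [1.0, 0.0]])

path = os.path.join(tempfile.mkdtemp(), "strategy0-player1.npy")
np.save(path, strategy)

loaded = np.load(path)   # same [num_infosets, num_actions] array back
```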
- Zhang et al., "Team Belief DAG: Generalizing the Sequence Form to Team Games for Fast Computation of Correlated Team Max-Min Equilibria via Regret Minimization", ICML 2023
- Project Report
- Video Presentation
- Chris Ge
- Alan Lee
- Sharvaa Selvan
MIT 6.S890 Final Project