Skip to content

ChrisG777/Team-Leduc-Poker-Challenge

Repository files navigation

Team Belief DAG for Correlated Equilibria in Team Leduc Hold'em

This repository implements the Team Belief DAG (TB-DAG) approach for computing correlated team max-min equilibria in imperfect-information team games, based on the paper "Team Belief DAG: Generalizing the Sequence Form to Team Games for Fast Computation of Correlated Team Max-Min Equilibria via Regret Minimization" (Zhang et al., ICML 2023).

Project Links

Overview

We implement TB-DAG fictitious play to compute correlated equilibria for Team Leduc Hold'em, a 4-player poker variant where:

  • Team 13 (Players 1 & 3) plays against Team 24 (Players 2 & 4)
  • Teammates share payoffs and can correlate strategies via a pre-game correlation device
  • Teammates cannot communicate during play

Key Results

Our implementation achieved exploitability scores of:

  • Team 13: 0.053
  • Team 24: 0.046

These scores represent the maximum amount an opponent could gain per hand by deviating from equilibrium play.

Installation

# Clone the repository
git clone <repository-url>
cd 6.s890pset2

# Install dependencies
pip install numpy

Downloading Leduc Files

The leduc_files/ directory containing the Leduc poker game tree and TB-DAG cache files is too large for GitHub. Download it from Dropbox and extract to the project root:

Download leduc_files.zip

After downloading:

unzip leduc_files.zip -d .

Quick Start

Running Fictitious Play on Leduc Poker

# Run 100 iterations of fictitious play
python3 scripts/run_leduc_fp.py --iterations 100

# Continue an existing run with 200 more iterations
python3 scripts/continue_fp.py leduc_output_0 --iterations 200

Testing on the Toy Game

# Run fictitious play on the small toy game
python3 tests/test_fp_correlated.py

# Run systematic tests
python3 tests/systematic_debug.py

Project Structure

.
├── src/                       # Core Implementation
│   ├── tdp_parser.py          # Parse game trees into Team Decision Problems
│   ├── TBDAG.py               # Convert TDP to Team Belief DAG (Algorithm 1)
│   ├── fictitious_play.py     # Fictitious play with incremental updates
│   ├── correlated_strategy.py # Manage mixtures of correlated strategies
│   ├── strategy_utils.py      # Strategy conversion utilities
│   └── strategy_io.py         # I/O utilities for strategy files
│
├── scripts/                   # Runner Scripts
│   ├── run_leduc_fp.py        # Main script for Leduc poker
│   ├── continue_fp.py         # Continue existing runs
│   └── tbdag_cache.py         # Cache TB-DAGs for faster loading
│
├── toy_files/                 # Toy game for testing
│   ├── toy_tree.txt           # Game tree definition
│   ├── toy_game_documentation.md
│   ├── player_*_infosets.txt  # Information set files
│   └── create_toy_pickle.py   # Create cached TB-DAGs
│
├── leduc_files/               # Leduc poker game files (download separately)
│   ├── leduc_tree.txt         # Game tree
│   ├── player_*_infosets.txt  # Information sets
│   └── *.pkl                  # Cached TB-DAGs
│
├── visualizations/            # TB-DAG visualizations
│   ├── visualize.py           # Generate PDF visualizations
│   ├── tb dag.pdf             # Paper's TB-DAG figure
│   └── toy_*.pdf              # Toy game visualizations
│
├── tests/                     # Test suite
│   ├── systematic_debug.py    # Main test for resume consistency
│   └── test_*.py              # Various unit tests
│
└── summaries/                 # Development documentation

Algorithm Overview

Team Belief DAG Construction

The TB-DAG transforms an imperfect-information team game into a perfect-information decision problem:

  1. Build Connectivity Graph: Connect game histories that share reachable information sets
  2. Create Active Nodes (Beliefs): Represent joint team states; actions are prescriptions
  3. Create Inactive Nodes (Observations): Represent public observations; children are connected components

Fictitious Play

Initialize: Random pure strategies for both teams
For t = 1, 2, ..., T:
    η = 1/(t+1)  # Learning rate

    # Team 1 best responds to Team 2's mixture
    s₁ᵗ = BestResponse(Team1_TBDAG, Team2_strategy)
    Team1_strategy ← (1-η)·Team1_strategy + η·s₁ᵗ

    # Team 2 best responds to Team 1's mixture
    s₂ᵗ = BestResponse(Team2_TBDAG, Team1_strategy)
    Team2_strategy ← (1-η)·Team2_strategy + η·s₂ᵗ

Key Optimizations

  1. Incremental Reach Probability Updates: O(terminals) per iteration instead of O(terminals × strategies)
  2. Cached TB-DAGs: Pickle TB-DAGs for instant loading on subsequent runs
  3. Memory-Efficient Mode: Discard behavioral strategies after computing reach probabilities
  4. Epsilon-Based Tie-Breaking: Ensures deterministic resume from checkpoints

Output Format

Strategies are saved in the following format:

leduc_output/
├── team13/
│   ├── strategy0-player1.npy     # numpy arrays [num_infosets, num_actions]
│   ├── strategy0-player3.npy
│   ├── strategy1-player1.npy
│   └── ...
├── team24/
│   └── ...
├── team13.zip                    # Submission zip
├── team24.zip
└── meta-strategy.csv             # Mixture weights

References

  • Zhang et al., "Team Belief DAG: Generalizing the Sequence Form to Team Games for Fast Computation of Correlated Team Max-Min Equilibria via Regret Minimization", ICML 2023
  • Project Report
  • Video Presentation

Authors

  • Chris Ge
  • Alan Lee
  • Sharvaa Selvan

MIT 6.S890 Final Project

About

6.S890 Team Leduc Poker Challenge

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors

Languages