CellScientist

CellScientist is an autonomous AI agent framework designed for Virtual Cell Modeling (VCM). It employs a Dual-Space Bilevel Optimization strategy to align symbolic scientific hypotheses with computational code implementation.

The system operates through a structured Task Hypergraph, performing evolutionary optimization to discover robust biological models.

🛠️ Installation

conda create --name CellScientist python=3.11.14
conda activate CellScientist
cd CellScientist
pip install torch==2.0.1+cu118 torchvision==0.15.2+cu118 torchaudio==2.0.2+cu118 -f [https://download.pytorch.org/whl/cu118/torch_stable.html](https://download.pytorch.org/whl/cu118/torch_stable.html)
pip install torch-scatter torch-sparse torch-cluster torch-spline-conv -f [https://data.pyg.org/whl/torch-2.0.1+cu118.html](https://data.pyg.org/whl/torch-2.0.1+cu118.html)
pip install -r requirements.txt

📂 Project Structure

CellScientist/
├── Design_Analysis/          # [Phase 1] Exploration & Hypergraph Initialization
│   ├── cellscientist_phase_1.py
│   └── design_analysis_config.json
│
├── Generate_Execution/       # [Phase 2] Top-Down Instantiation & Code Generation
│   ├── cellscientist_phase_2.py
│   └── generate_execution_config.json
│
├── Review_Feedback/          # [Phase 3] Bottom-Up Refinement
│   ├── cellscientist_phase_3.py
│   ├── review_feedback_config.json
│   └── CodeEvo/              # Evolution History & Artifacts
│
├── pipeline_config.json      # ⭐ Unified pipeline-level configuration (recommended)
├── run_cellscientist.py      # 🚀 Unified pipeline runner
├── requirements.txt
└── README.md

Design_Analysis/ – handles design and analytical logic
Generate_Execution/ – manages generation and execution processes
Review_Feedback/ – reviews and iteratives optimization process
llm_providers.json – defines available LLM configurations
requirements.txt – Python dependencies
run_cellscientist.py – The master script that validates configurations and executes Phase 1, 2, and 3 sequentially in isolated environments.

⚙️ Experiment Settings & Environment

Hardware & Software Infrastructure

Experiments are conducted on high-performance nodes tailored.

CPU: Dual Intel Xeon Platinum 8336C @ 2.30GHz
GPU: NVIDIA RTX 5880 Ada Generation (48GB VRAM)
Memory: 512 GB DDR4 ECC
Software: Python 3.11.14, PyTorch 2.0.1+cu118, PyG 2.3.0, CUDA 11.8

Hyperparameters (Key Configurations)

The Dual-Space Bilevel Optimization is controlled via hierarchical configs:

LLM Engine: Gemini 3 Pro (Temp: 0.5 - 0.7)
Design Phase: 4 parallel hypothesis branches; Max 3 self-correction fix rounds.
Execution Phase: Global timeout 100h; Step timeout 5h; Max 5 debugging rounds.
Review Phase: Max 10 optimization iterations; Optimized via Pearson Correlation Coefficient (PCC).

Cost Efficiency

CellScientist minimizes cost through a Contextual Memory mechanism that reduces token load by ~60% in later iterations.

Average Run (3-5 iterations): $1.00 - $2.00 USD
Complex Run (10 iterations): < $5.00 USD

🚀 Usage

Method I: The Unified Pipeline

python run_cellscientist.py

Reads pipeline_config.json (if present)
Automatically merges shared parameters into each phase config
Ensures consistent dataset, paths, GPU, and LLM settings across all phases

Method II: Manual Phase Execution

You can also run individual phases manually if you need to debug a specific step.

Phase 1: Design & Analysis

cd Design_Analysis
python cellscientist_phase_1.py design_analysis_config.json

Phase 2: Generation & Execution

cd Generate_Execution
python cellscientist_phase_2.py --config generate_execution_config.json run --use-idea

Phase 3: Review & Optimization

cd Review_Feedback
python cellscientist_phase_3.py --config review_feedback_config.json

📊 Outputs

All experiment artifacts are automatically organized in ../results/<dataset_name>/:

pipeline_summary.json: The global scoreboard containing success rates, budget usage, and performance metrics across all phases.
logs/<timestamp>/:
Contains detailed execution logs (phase1.log, phase2.log, phase3.log) for full traceability.
advanced_metrics/: Analysis of mechanism diversity and code complexity.
design_analysis/: Generated scientific hypotheses and initial data artifacts.
generate_execution/: Source code instantiation, execution logs, and intermediate notebooks.
review_feedback/: Final optimized model, evolutionary history artifacts, and optimization process logs.
Optimization history (optimization_history.md, optimization_tree.txt).
notebook_best.ipynb: The final, best-performing model code validated by the system.

Name		Name	Last commit message	Last commit date
Latest commit History 226 Commits
Design_Analysis		Design_Analysis
Final_Report/prompts		Final_Report/prompts
Generate_Execution		Generate_Execution
Review_Feedback		Review_Feedback
README.md		README.md
pipeline_config.json		pipeline_config.json
requirements.txt		requirements.txt
run_cellscientist.py		run_cellscientist.py
runner_advanced_metrics.py		runner_advanced_metrics.py
runner_config.py		runner_config.py
runner_metrics.py		runner_metrics.py
runner_report.py		runner_report.py
runner_utils.py		runner_utils.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

CellScientist

🛠️ Installation

📂 Project Structure

⚙️ Experiment Settings & Environment

Hardware & Software Infrastructure

Hyperparameters (Key Configurations)

Cost Efficiency

🚀 Usage

Method I: The Unified Pipeline

Method II: Manual Phase Execution

📊 Outputs

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

CellScientist

🛠️ Installation

📂 Project Structure

⚙️ Experiment Settings & Environment

Hardware & Software Infrastructure

Hyperparameters (Key Configurations)

Cost Efficiency

🚀 Usage

Method I: The Unified Pipeline

Method II: Manual Phase Execution

📊 Outputs

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages