Framework Support — Testing LangGraph, CrewAI, OpenAI, Claude, and More

EvalView supports the most popular AI agent frameworks out of the box. Each framework has a dedicated adapter that handles its specific API format, tool call extraction, and response parsing automatically.

Supported Frameworks

| Framework | Adapter | Auto-Detect | Default Port | Endpoint |
|---|---|---|---|---|
| LangGraph | langgraph | ✅ | 8000 | /api/chat or /invoke |
| LangServe | http or streaming | ✅ | 8000 | /agent or /agent/stream |
| CrewAI | crewai | ✅ | 8000 | /crew |
| OpenAI Assistants | openai-assistants | N/A | N/A | Uses OpenAI API |
| TapeScope | streaming | ✅ | 3000 | /api/unifiedchat |
| Generic REST | http | — | Any | Any |
| Generic Streaming | streaming | — | Any | Any |

Quick Start

Auto-Detection (Recommended)

# Start your agent server first
# Then let EvalView detect it automatically
evalview connect

The connect command will:

  1. Try common endpoints
  2. Detect which framework is running
  3. Configure the correct adapter automatically
  4. Update .evalview/config.yaml

Manual Configuration

Edit .evalview/config.yaml:

adapter: langgraph  # or crewai, http, streaming, etc.
endpoint: http://localhost:8000/api/chat
timeout: 30.0

Framework-Specific Guides

1. LangGraph

What it supports:

  • Standard invoke endpoint
  • Streaming responses
  • Message-based APIs
  • Thread tracking

Setup:

# Start LangGraph agent
cd /path/to/langgraph-agent
python main.py
# or
uvicorn main:app --reload --port 8000

# Connect EvalView
evalview connect

Config:

adapter: langgraph
endpoint: http://localhost:8000/api/chat
streaming: false  # Set to true for streaming endpoints
timeout: 30.0

model:
  name: gpt-4o-mini

Test Case Example:

name: "LangGraph Test"
input:
  query: "What is the weather in SF?"
  context: {}

expected:
  tools: [tavily_search]  # Update with your actual tools
  output:
    contains: ["San Francisco", "weather"]

thresholds:
  min_score: 70
  max_cost: 0.50
  max_latency: 10000

Expected Response Format:

{
  "messages": [
    {"role": "user", "content": "..."},
    {"role": "assistant", "content": "..."}
  ],
  "thread_id": "...",
  "intermediate_steps": [...]
}
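
If you host the graph yourself, a minimal sketch of a server route that produces this shape could look like the following (FastAPI is assumed, and graph stands in for your compiled LangGraph graph defined elsewhere; adjust the field names to your graph's state schema):

# Minimal sketch, not EvalView's adapter code: a FastAPI route that wraps a
# compiled LangGraph graph and returns the fields shown above.
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

class ChatRequest(BaseModel):
    query: str
    context: dict = {}

@app.post("/api/chat")
async def chat(req: ChatRequest):
    # `graph` is your compiled StateGraph, defined elsewhere in the app.
    state = await graph.ainvoke({"messages": [("user", req.query)]})
    return {
        # Convert message objects to plain dicts so the response is JSON-serializable.
        "messages": [{"role": m.type, "content": m.content} for m in state["messages"]],
        "thread_id": req.context.get("thread_id", "default"),
        "intermediate_steps": state.get("intermediate_steps", []),
    }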

2. CrewAI

What it supports:

  • Task-based execution
  • Multi-agent crews
  • Usage metrics

Setup:

# Start CrewAI API
cd /path/to/crewai-agent
python api.py  # or however you serve it

# Connect
evalview connect

Config:

adapter: crewai
endpoint: http://localhost:8000/crew
timeout: 120.0  # CrewAI can be slow

Test Case Example:

name: "CrewAI Research Test"
input:
  query: "Research AI trends in 2025"
  context: {}

expected:
  tools: []  # CrewAI uses agents, not direct tools
  output:
    contains: ["AI", "trends", "2025"]

thresholds:
  min_score: 75
  max_cost: 2.00
  max_latency: 60000  # 60 seconds

Expected Response Format:

{
  "result": "Final crew output",
  "tasks": [
    {
      "id": "task-1",
      "description": "Research task",
      "output": "...",
      "status": "completed"
    }
  ],
  "usage_metrics": {
    "total_tokens": 1500,
    "total_cost": 0.045
  }
}
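
If you need to put such an endpoint in front of an existing crew, a rough sketch might look like this (FastAPI is assumed, my_crew is a placeholder for a Crew defined elsewhere, and the task/usage fields are left for you to map from your CrewAI version's output object):

# Rough sketch: expose an existing crew at the /crew path the adapter calls.
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

class CrewRequest(BaseModel):
    query: str
    context: dict = {}

@app.post("/crew")
def run_crew(req: CrewRequest):
    # `my_crew` is a Crew defined elsewhere; kickoff runs the crew to completion.
    output = my_crew.kickoff(inputs={"query": req.query})
    return {
        "result": str(output),   # the final crew answer
        "tasks": [],             # optionally map output.tasks_output into the shape above
        "usage_metrics": {},     # optionally map output.token_usage (tokens, cost)
    }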

3. OpenAI Assistants

What it supports:

  • OpenAI Assistants API
  • Function calling
  • Code interpreter
  • File search/retrieval

Setup:

# Set your OpenAI API key
export OPENAI_API_KEY=sk-...

# No server needed - uses OpenAI API directly

Config:

adapter: openai-assistants
assistant_id: asst_xxxxxxxxxxxxx  # Your assistant ID
timeout: 120.0

Test Case Example:

name: "OpenAI Assistant Test"
input:
  query: "Calculate the fibonacci sequence up to 10"
  context:
    assistant_id: asst_xxxxxxxxxxxxx  # Can override here too

expected:
  tools: [code_interpreter]
  output:
    contains: ["fibonacci", "0, 1, 1, 2, 3, 5, 8"]

thresholds:
  min_score: 80
  max_cost: 0.50
  max_latency: 30000

Notes:

  • Requires openai Python package: pip install openai
  • Uses threads and runs under the hood
  • Automatically polls for completion
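
For reference, the thread-and-run flow the adapter automates looks roughly like this standalone sketch (using the openai SDK's beta Assistants interface; this is not EvalView's internal code):

# Standalone sketch of the Assistants flow: create a thread, add the user
# message, start a run, poll until it finishes, then read the reply.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def ask_assistant(assistant_id: str, query: str) -> str:
    thread = client.beta.threads.create()
    client.beta.threads.messages.create(thread_id=thread.id, role="user", content=query)
    run = client.beta.threads.runs.create_and_poll(thread_id=thread.id, assistant_id=assistant_id)
    if run.status != "completed":
        raise RuntimeError(f"Run ended with status: {run.status}")
    messages = client.beta.threads.messages.list(thread_id=thread.id)
    # Newest message comes first by default; take the assistant's latest text reply.
    return messages.data[0].content[0].text.value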

4. LangServe

What it supports:

  • Standard REST endpoints
  • Streaming via Server-Sent Events
  • Batch processing

Setup:

# Start LangServe
cd /path/to/langserve-app
python server.py

# Connect
evalview connect

Config (non-streaming):

adapter: http
endpoint: http://localhost:8000/agent/invoke
timeout: 30.0

Config (streaming):

adapter: streaming
endpoint: http://localhost:8000/agent/stream
timeout: 60.0
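
On the server side, these paths come from LangServe's add_routes, which mounts /agent/invoke, /agent/stream, and /agent/batch for any Runnable. A minimal sketch (my_runnable is a placeholder for your chain or agent):

# Minimal LangServe sketch: mount a Runnable under /agent.
from fastapi import FastAPI
from langserve import add_routes

app = FastAPI()

# `my_runnable` is your chain or agent (any LangChain Runnable), defined elsewhere.
add_routes(app, my_runnable, path="/agent")

# Serve with: uvicorn server:app --port 8000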

5. Generic HTTP/REST

Use the http adapter for any custom REST API that follows the request/response contract below.

Config:

adapter: http
endpoint: http://localhost:YOUR_PORT/YOUR_PATH
timeout: 30.0
headers:
  Authorization: Bearer YOUR_TOKEN
  Content-Type: application/json

Expected Request Format:

{
  "query": "User query here",
  "context": {}
}

Expected Response Format:

{
  "session_id": "...",
  "output": "Final response",
  "steps": [
    {
      "id": "step-1",
      "name": "Step name",
      "tool": "tool_name",
      "parameters": {...},
      "output": {...},
      "latency": 123,
      "cost": 0.001
    }
  ],
  "cost": 0.05,
  "tokens": 1000
}
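
If you're writing the server yourself, a minimal handler for this contract could look like the sketch below (FastAPI is assumed, run_my_agent is a hypothetical stand-in for your agent logic, the /agent path is just an example, and fields you don't track are stubbed with empty or zero values):

# Sketch of a custom endpoint matching the request/response contract above.
import uuid
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

class AgentRequest(BaseModel):
    query: str
    context: dict = {}

@app.post("/agent")
def run_agent(req: AgentRequest):
    # run_my_agent is a hypothetical stand-in for your own agent call.
    answer = run_my_agent(req.query, req.context)
    return {
        "session_id": str(uuid.uuid4()),
        "output": answer,
        "steps": [],   # fill in per-step traces as shown above if you have them
        "cost": 0.0,   # report real cost/token figures if you track them
        "tokens": 0,
    }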

Creating Custom Adapters

If your framework isn't supported, create a custom adapter:

# evalview/adapters/my_adapter.py
from evalview.adapters.base import AgentAdapter
from evalview.core.types import ExecutionTrace, StepTrace, StepMetrics, ExecutionMetrics
from datetime import datetime

class MyAdapter(AgentAdapter):
    @property
    def name(self) -> str:
        return "my-adapter"

    async def execute(self, query: str, context=None) -> ExecutionTrace:
        # 1. Call your agent API
        # 2. Parse response
        # 3. Extract steps and output
        # 4. Return ExecutionTrace
        pass

Register in cli.py:

from evalview.adapters.my_adapter import MyAdapter

# In _run_async():
elif adapter_type == "my-adapter":
    adapter = MyAdapter(...)

See ADAPTERS.md for the full guide.


Troubleshooting

Connection Failed

# Test endpoint manually
curl -X POST http://localhost:8000/api/chat \
  -H "Content-Type: application/json" \
  -d '{"query": "test"}'

# Check if server is running
lsof -i :8000

# Try auto-detect
evalview connect

Wrong Adapter Detected

Manually set in .evalview/config.yaml:

adapter: langgraph  # Override auto-detection

Response Format Mismatch

Run with --verbose to see the actual response:

evalview run --verbose

Then adjust your test case or create a custom adapter.

Timeout Issues

Increase the timeout in .evalview/config.yaml:

timeout: 120.0  # 2 minutes

Framework Comparison

| Feature | LangGraph | CrewAI | OpenAI Assistants | LangServe |
|---|---|---|---|---|
| Streaming | ✅ | ❌ | ❌ | ✅ |
| Multi-step | ✅ | ✅ | ✅ | ✅ |
| Self-hosted | ✅ | ✅ | ❌ | ✅ |
| Tool tracking | ✅ | Partial | ✅ | ✅ |
| Cost tracking | Manual | ✅ | ✅ | Manual |

Best Practices

  1. Always use evalview connect first - Let it auto-detect
  2. Start with verbose mode - Understand API responses
  3. Check framework docs - Verify endpoint paths
  4. Use framework-specific adapters - Better parsing and metrics
  5. Monitor timeouts - Some agents can be slow
