Commit bf16b9d

Feature/context management (#7)
* Context management
* Checkpointing and sub-workflow context
* Advanced Context Strategies
* formatting
1 parent 2f280f3 commit bf16b9d

19 files changed, +3697 -4 lines changed

Cargo.toml

Lines changed: 12 additions & 0 deletions
@@ -61,6 +61,18 @@ reqwest = { version = "0.11", features = ["json", "stream"] }
 # Optional - HTTP server (for examples)
 actix-web = { version = "4"}
 
+[[bin]]
+name = "advanced_workflow_demo"
+path = "src/bin/advanced_workflow_demo.rs"
+
+[[bin]]
+name = "advanced_strategies_demo"
+path = "src/bin/advanced_strategies_demo.rs"
+
+[[bin]]
+name = "chat_history_demo"
+path = "src/bin/chat_history_demo.rs"
+
 [[bin]]
 name = "hello_workflow"
 path = "src/bin/hello_workflow.rs"
Lines changed: 330 additions & 0 deletions
@@ -0,0 +1,330 @@
# Advanced Context Management Strategies

This document describes the advanced context management strategies available in Phase 6 of the workflow chat history implementation.

## Overview

In addition to the basic strategies (TokenBudgetManager and SlidingWindowManager), Phase 6 adds two advanced strategies:

1. **MessageTypeManager** - Priority-based pruning by message type
2. **SummarizationManager** - Compression of old messages into a summary (currently template-based; LLM-based summarization is planned)

## MessageTypeManager

### Purpose
Prioritizes messages by type and importance, keeping system prompts and recent conversation pairs while pruning less critical messages like old tool calls.

### Use Cases
- **Multi-agent workflows** where recent dialogue is critical
- **Tool-heavy conversations** with many tool calls that become less relevant over time
- **Conversational agents** that need to maintain recent context

### Configuration

```rust
use agent_runtime::MessageTypeManager;

// Keep max 20 messages, preserve last 5 user/assistant pairs
let manager = MessageTypeManager::new(20, 5);
```

**Parameters:**
- `max_messages`: Maximum total messages to keep in history
- `keep_recent_pairs`: Number of recent user/assistant conversation pairs to always preserve

### Behavior

**Priority Levels:**
1. **Critical** - System messages (always kept)
2. **High** - User and Assistant messages (preserved by recency)
3. **Low** - Tool messages (pruned first)

**Algorithm:**
1. Always preserve all system messages
2. Identify and protect the last N user/assistant pairs
3. Remove low-priority messages (tool calls) first
4. If still over limit, sort by priority and truncate
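
The four steps above can be sketched in a few lines of standalone Rust. This is only an illustration, not the crate's implementation: `Role`, `Msg`, `priority`, and `prune_by_type` are hypothetical names, and a "pair" is approximated as the last `2 * keep_recent_pairs` user/assistant messages. Note that if the protected messages alone exceed `max_messages`, this sketch keeps them all rather than breaking a protected pair.

```rust
// Hypothetical stand-ins for the crate's message types (assumption).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Role { System, User, Assistant, Tool }

#[derive(Debug, Clone, PartialEq)]
struct Msg { role: Role, content: String }

/// Lower number means higher priority (Critical, High, Low).
fn priority(role: Role) -> u8 {
    match role {
        Role::System => 0,
        Role::User | Role::Assistant => 1,
        Role::Tool => 2,
    }
}

fn prune_by_type(history: Vec<Msg>, max_messages: usize, keep_recent_pairs: usize) -> Vec<Msg> {
    let n = history.len();
    if n <= max_messages {
        return history; // nothing to prune
    }

    // Steps 1-2: protect system messages and the most recent dialogue messages.
    let mut protected = vec![false; n];
    let mut dialogue_left = keep_recent_pairs * 2;
    for i in (0..n).rev() {
        match history[i].role {
            Role::System => protected[i] = true,
            Role::User | Role::Assistant if dialogue_left > 0 => {
                protected[i] = true;
                dialogue_left -= 1;
            }
            _ => {}
        }
    }

    // Steps 3-4: order unprotected messages so the lowest priority comes
    // first (tool messages), oldest first within the same priority.
    let mut candidates: Vec<usize> = (0..n).filter(|&i| !protected[i]).collect();
    candidates.sort_by_key(|&i| (std::cmp::Reverse(priority(history[i].role)), i));

    // Drop just enough candidates to get back under the limit.
    let excess = n - max_messages;
    let mut dropped = vec![false; n];
    for &i in candidates.iter().take(excess) {
        dropped[i] = true;
    }

    // Keep the survivors in their original order.
    history
        .into_iter()
        .enumerate()
        .filter(|(i, _)| !dropped[*i])
        .map(|(_, m)| m)
        .collect()
}
```

With a history of one system message, two user/assistant exchanges, and two tool messages, `prune_by_type(history, 5, 1)` removes both tool messages and leaves the dialogue intact.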

### Example

```rust
use std::sync::Arc;

// agent1..agent3 are workflow steps defined elsewhere
let workflow = Workflow::builder()
    .with_chat_history(Arc::new(MessageTypeManager::new(15, 3)))
    .add_step(agent1) // Researcher
    .add_step(agent2) // Analyst (uses tools)
    .add_step(agent3) // Reporter
    .build();

// After execution:
// - System prompts: Preserved
// - Last 3 user/assistant pairs: Preserved
// - Old tool calls: Pruned
// - Total messages: ≤ 15
```

### Advantages
- ✅ Maintains conversation coherence
- ✅ Preserves critical system instructions
- ✅ Removes verbose tool outputs automatically
- ✅ Simple, predictable behavior

### Limitations
- Token count is not considered (only message count)
- A few very long messages can still exceed the model's context window
- Fixed priority scheme (not customizable)

## SummarizationManager

### Purpose
Compresses old conversation history into summary messages when token limits are approached, preserving recent messages intact.

### Use Cases
- **Long-running workflows** with extensive history
- **Research pipelines** where old findings should be summarized
- **Multi-stage analysis** where early stages can be compressed

### Configuration

```rust
use agent_runtime::SummarizationManager;

// Max 18k input tokens
// Trigger summarization at 15k tokens
// Target ~500 tokens for summaries (reserved for future use)
// Keep last 10 messages untouched
let manager = SummarizationManager::new(18_000, 15_000, 500, 10);
```

**Parameters:**
- `max_input_tokens`: Maximum tokens allowed for input
- `summarization_threshold`: Token count that triggers summarization
- `summary_token_target`: Target size for compressed summaries (reserved for future use)
- `keep_recent_count`: Number of most recent messages to preserve unsummarized

### Behavior

**Algorithm:**
1. Monitor the total token count
2. When the count exceeds `summarization_threshold`:
   - Split history into "old" (to summarize) and "recent" (keep as-is)
   - Preserve system messages from the old section
   - Create a summary of the non-system old messages
   - Combine: system messages + summary + recent messages
3. If the result is still over the limit, apply emergency truncation
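
As a rough illustration of the algorithm above, the following standalone sketch splits a history, keeps system messages, inserts a placeholder summary, and then truncates if needed. Everything here is an assumption made for the example: `Msg`, `estimate_tokens` (a crude 4-characters-per-token heuristic), `compress`, the one-line summary text, and the truncation policy (drop the oldest non-system message first). The crate's actual behavior may differ.

```rust
// Hypothetical message type for this sketch (assumption).
#[derive(Debug, Clone, PartialEq)]
struct Msg { is_system: bool, content: String }

/// Crude estimate: roughly 4 characters per token (assumption).
fn estimate_tokens(msgs: &[Msg]) -> usize {
    msgs.iter().map(|m| (m.content.len() + 3) / 4).sum()
}

fn compress(
    history: Vec<Msg>,
    max_input_tokens: usize,
    summarization_threshold: usize,
    keep_recent_count: usize,
) -> Vec<Msg> {
    // Step 1: below the threshold, leave history untouched.
    if estimate_tokens(&history) <= summarization_threshold {
        return history;
    }

    // Step 2: split into "old" and "recent".
    let split = history.len().saturating_sub(keep_recent_count);
    let mut old = history;
    let recent = old.split_off(split);

    // Keep system messages from the old section; summarize the rest.
    let (mut result, summarized): (Vec<Msg>, Vec<Msg>) =
        old.into_iter().partition(|m| m.is_system);
    if !summarized.is_empty() {
        result.push(Msg {
            is_system: true, // the summary is stored as a system message here
            content: format!(
                "Summary of previous conversation: {} earlier messages compressed.",
                summarized.len()
            ),
        });
    }
    result.extend(recent);

    // Step 3: emergency truncation, oldest non-system message first.
    while estimate_tokens(&result) > max_input_tokens {
        match result.iter().position(|m| !m.is_system) {
            Some(i) => {
                result.remove(i);
            }
            None => break, // only system messages remain
        }
    }
    result
}
```

Storing the summary as a system message mirrors the `ChatMessage::system(summary)` call shown under Future Enhancements; it also means the emergency pass in this sketch never removes the summary itself.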

**Summary Format:**
```text
Summary of previous conversation:

- 5 user inputs and 5 assistant responses
- Initial topic: Analyze Q4 sales data and identify trends...
- Latest response: Based on the analysis, I recommend increasing...

[This is a compressed summary. Original messages were removed to save context space.]
```
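
A template like the one above could be produced along these lines. This is a sketch under assumptions, not the crate's code: `Role`, `Msg`, `preview`, and `build_summary` are hypothetical names, and the 50-character preview length is arbitrary.

```rust
// Hypothetical message types for this sketch (assumption).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Role { User, Assistant }

struct Msg { role: Role, content: String }

/// Truncate to `max_chars` characters, appending "..." when text was cut.
fn preview(text: &str, max_chars: usize) -> String {
    if text.chars().count() <= max_chars {
        text.to_string()
    } else {
        let head: String = text.chars().take(max_chars).collect();
        format!("{}...", head)
    }
}

/// Build a template-based summary of the messages being compressed.
fn build_summary(old: &[Msg]) -> String {
    let users = old.iter().filter(|m| m.role == Role::User).count();
    let assistants = old.iter().filter(|m| m.role == Role::Assistant).count();

    let mut out = String::from("Summary of previous conversation:\n\n");
    out.push_str(&format!(
        "- {} user inputs and {} assistant responses\n",
        users, assistants
    ));
    // The first user message stands in for the "initial topic".
    if let Some(first) = old.iter().find(|m| m.role == Role::User) {
        out.push_str(&format!("- Initial topic: {}\n", preview(&first.content, 50)));
    }
    // The last assistant message stands in for the "latest response".
    if let Some(last) = old.iter().rev().find(|m| m.role == Role::Assistant) {
        out.push_str(&format!("- Latest response: {}\n", preview(&last.content, 50)));
    }
    out.push_str(
        "\n[This is a compressed summary. Original messages were removed to save context space.]",
    );
    out
}
```

Because only counts and two previews survive, everything between the first topic and the latest response is lost, which is the trade-off noted under Limitations.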

### Example

```rust
use std::sync::Arc;

// researcher, analyzer, deep_analyzer and reporter are steps defined elsewhere
let workflow = Workflow::builder()
    .with_chat_history(Arc::new(SummarizationManager::new(
        18_000, // Max input tokens
        15_000, // Trigger at 15k
        500,    // Summary target
        10,     // Keep last 10 messages
    )))
    .add_step(researcher)    // Stage 1
    .add_step(analyzer)      // Stage 2
    .add_step(deep_analyzer) // Stage 3
    .add_step(reporter)      // Stage 4
    .build();

// After execution with 30+ messages:
// - System prompts: Preserved
// - Messages 1-20: Summarized into compact summary
// - Messages 21-30: Kept verbatim
// - Final message count: ~12 messages (system + summary + last 10)
```

### Advantages
- ✅ Preserves information from old messages
- ✅ Keeps recent context intact
- ✅ Token-aware (not just message count)
- ✅ Handles very long workflows

### Limitations
- Current implementation uses template-based summaries (not LLM-generated)
- Summary quality is bounded by the template, which records message counts, the initial topic, and the latest response
- Adds computational overhead (when enhanced with LLM calls)
- May lose nuance from original messages

### Future Enhancements

The `summary_token_target` parameter is reserved for future LLM-based summarization:

```rust
// Future enhancement: call an LLM to create intelligent summaries
async fn create_llm_summary(
    messages: &[ChatMessage],
    target_tokens: usize,
    llm_client: &dyn ChatClient,
) -> Result<ChatMessage, ContextError> {
    let prompt = format!(
        "Summarize the following conversation in approximately {} tokens:\n\n{}",
        target_tokens,
        format_messages(messages)
    );

    // `?` needs a Result return type; this assumes the client's error
    // converts into ContextError
    let summary = llm_client.complete(prompt).await?;
    Ok(ChatMessage::system(summary))
}
```

## Strategy Comparison

| Feature | TokenBudget | SlidingWindow | MessageType | Summarization |
|---------|-------------|---------------|-------------|---------------|
| **Metric** | Tokens | Message count | Message count + type | Tokens |
| **Pruning** | Oldest first | FIFO | Priority-based | Compression |
| **Preserves** | System + recent | Recent only | System + pairs | System + recent |
| **Best For** | General use | Simple cases | Multi-agent | Long workflows |
| **Overhead** | Low | Very low | Low | Medium |
| **Information Loss** | High | High | Medium | Low |

## Choosing a Strategy

### Use **TokenBudgetManager** when:
- You need flexible token management (any context size/ratio)
- Simple pruning is sufficient
- General-purpose workflows

### Use **SlidingWindowManager** when:
- You want predictable, simple behavior
- Message count matters more than tokens
- Stateless or short workflows

### Use **MessageTypeManager** when:
- You have multi-agent conversations
- Tool calls create noise in history
- Recent dialogue is most important
- You want to maintain conversation coherence

### Use **SummarizationManager** when:
- Workflows can become very long
- Old context should be compressed, not discarded
- You need to preserve information over time
- Token limits are strict

## Combining Strategies

A workflow uses one strategy at a time, but you can chain strategies across workflows by checkpointing one and restoring its context into the next:

```rust
// Example: Apply MessageType first, then Summarization
let checkpoint1 = workflow1.execute_with(MessageTypeManager::new(30, 5)).await;
let restored = deserialize(checkpoint1);

let workflow2 = Workflow::builder()
    .with_restored_context(restored)
    // Wrapped in Arc, matching the other with_chat_history examples
    .with_chat_history(Arc::new(SummarizationManager::new(18_000, 15_000, 500, 10)))
    .add_step(next_agent)
    .build();
```

## Performance Considerations

### MessageTypeManager
- **Time Complexity**: O(n log n) for sorting protected messages
- **Space Complexity**: O(n) for tracking indices
- **Best Case**: Few messages, no pruning needed
- **Worst Case**: Many messages, frequent pruning

### SummarizationManager
- **Time Complexity**: O(n) for splitting and filtering
- **Space Complexity**: O(n) for creating new history
- **Best Case**: Below threshold, no summarization
- **Worst Case**: Frequent summarization with LLM calls (future)

## Testing

Both strategies include comprehensive test coverage:

### MessageTypeManager Tests
- Creation and configuration
- Should-prune logic
- Priority-based pruning
- System message preservation
- Recent pair extraction

### SummarizationManager Tests
- Creation and configuration
- Threshold-based pruning
- Summary generation
- Recent message preservation
- System message handling
- Emergency truncation

## Demonstration

Run the comprehensive demo:

```bash
cargo run --bin advanced_strategies_demo
```

This demonstrates:
1. MessageTypeManager with multi-agent workflow
2. SummarizationManager with multi-stage pipeline
3. Side-by-side strategy comparison

## API Reference

### MessageTypeManager

```rust
impl MessageTypeManager {
    pub fn new(max_messages: usize, keep_recent_pairs: usize) -> Self;
}

#[async_trait]
impl ContextManager for MessageTypeManager {
    async fn should_prune(&self, history: &[ChatMessage], _: usize) -> bool;
    async fn prune(&self, history: Vec<ChatMessage>)
        -> Result<(Vec<ChatMessage>, usize), ContextError>;
    fn estimate_tokens(&self, messages: &[ChatMessage]) -> usize;
    fn name(&self) -> &str;
}
```

### SummarizationManager

```rust
impl SummarizationManager {
    pub fn new(
        max_input_tokens: usize,
        summarization_threshold: usize,
        summary_token_target: usize,
        keep_recent_count: usize
    ) -> Self;
}

#[async_trait]
impl ContextManager for SummarizationManager {
    async fn should_prune(&self, _: &[ChatMessage], current_tokens: usize) -> bool;
    async fn prune(&self, history: Vec<ChatMessage>)
        -> Result<(Vec<ChatMessage>, usize), ContextError>;
    fn estimate_tokens(&self, messages: &[ChatMessage]) -> usize;
    fn name(&self) -> &str;
}
```

## Summary

Phase 6 adds sophisticated context management for advanced use cases:

- **MessageTypeManager**: Intelligent priority-based pruning
- **SummarizationManager**: Compression instead of deletion
- **Comprehensive tests**: 15 tests covering all scenarios
- **Demo application**: Real-world examples

These strategies complement the basic strategies to provide a complete toolkit for workflow context management.
