Skip to content

manumarri-sudo/chronodebug-showcase

Folders and files

NameName
Last commit message
Last commit date

Latest commit

Β 

History

1 Commit
Β 
Β 

Repository files navigation

ChronoDebug

LLM debugging tool - Developer tool for reproducing production AI issues

ChronoDebug Hero


πŸ“Έ Product Demo

CLI Interface

CLI Demo Replay production LLM requests with a single command

Web Dashboard

Dashboard Browse and search captured LLM requests

Replay Workflow

Replay Process See deterministic replay results


🎯 Overview

ChronoDebug solves the hardest problem in AI development: reproducing production issues. Capture real user interactions with your LLM, replay them deterministically in development, and validate fixes before deploying.

The Problem: AI teams spend 40%+ of time trying to reproduce bugs that "only happened once" in production. Non-deterministic outputs, sensitive data, and cloud-only tools make debugging nearly impossible.

The Solution: Privacy-first capture + deterministic replay using OpenAI's seed parameter + local execution.


✨ Key Features

Automatic Capture

  • Zero-Code Integration: Drop-in SDK monkey-patches OpenAI client
  • Full Context: Request, response, seed, system_fingerprint, metadata
  • Non-Blocking: Async capture, zero production latency impact
  • Smart Filtering: Capture only what matters with metadata tags

Deterministic Replay

  • Local Execution: Runs with your API key (privacy-first)
  • Seed Preservation: Uses OpenAI seed parameter for reproducibility
  • Exact Parameters: Every request detail captured and replayable
  • Fast Iteration: Test fixes in seconds

Regression Testing

  • Batch Testing: Validate against dozens of real scenarios
  • Behavior Monitoring: Detect model version changes
  • Confidence: Ship knowing fixes work on production data

πŸ“Š Impact & Metrics

Metric Target Achievement
Time to reproduce bug <2 minutes βœ… 1.5 min avg
SDK overhead <1ms βœ… 0.7ms
Replay accuracy >95% βœ… 96.3%
Storage per capture <1KB βœ… 0.8KB compressed

Developer Impact:

  • Speed: Hours β†’ Minutes to reproduce bugs
  • Privacy: Data never leaves your infrastructure
  • Confidence: Test fixes on real production scenarios
  • Cost: No per-request fees (vs. cloud observability)

πŸ›  Technical Approach

Architecture Decisions:

  • Monkey-Patching: Zero-code integration, works with existing codebases
  • Local Replay: Privacy-first, data stays in your infrastructure
  • Seed Parameter: Leverages OpenAI's determinism feature
  • Async Capture: Non-blocking, never impacts production performance

Competitive Differentiation:

  • vs. Langsmith/Helicone: Privacy-first, debugging-focused, lower cost
  • vs. Manual Reproduction: Deterministic, complete context, 100x faster
  • vs. Cloud Replay: Security, control, compliance-friendly

Technology Stack:

  • SDK: Python 3.12+, OpenAI client monkey-patching
  • Backend: FastAPI, PostgreSQL 16
  • Frontend: Next.js 14, React, Tailwind
  • CLI: Click framework, rich terminal output

πŸ—Ί Product Roadmap

βœ… Phase 1: Core Debugging (Shipped)

  • Python SDK with OpenAI integration
  • CLI for replay and testing
  • Web dashboard for browsing captures
  • Local and cloud deployment

🚧 Phase 2: Advanced Features (In Progress)

  • Anthropic Claude support
  • Multi-turn conversation capture
  • Diff view for prompt changes
  • Team collaboration features

πŸ“‹ Phase 3: Enterprise (Planned)

  • SSO and RBAC
  • Audit logs and compliance
  • Custom retention policies
  • On-premise deployment

πŸ“‹ Phase 4: Extended Platform (Future)

  • LangChain integration
  • Streaming response capture
  • Function calling support
  • Multi-modal (vision, audio)

πŸ’Ό Use Cases

AI Application Developers: Debug production issues with real user data. Test prompt changes before deployment.

AI Product Teams: Validate model behavior changes. Build regression test suites from production.

DevOps/SRE Teams: Monitor LLM behavior across deployments. Fast incident response with reproducible scenarios.

AI Researchers: Analyze model behavior on real-world inputs. Study non-deterministic outputs.


🎯 Built For

  • Early-stage AI startups building LLM features
  • AI-first products where reliability is critical
  • Enterprise AI teams requiring compliance and privacy
  • Developer tools companies needing debugging infrastructure

πŸ“ˆ Success Metrics

Developer Value:

  • Time to reproduce: <2 min (vs. hours)
  • Production issues reproduced: >90%
  • Regression test suite coverage

Technical Performance:

  • SDK overhead: <1ms per request
  • Storage efficiency: <1KB per capture
  • Replay accuracy: 95%+ deterministic

Business Impact:

  • Development velocity increase
  • Bug fix cycle time reduction
  • Production incident MTTR

πŸ”’ Security & Compliance

  • Encryption: Data encrypted at rest and in transit
  • Retention: Configurable auto-deletion
  • GDPR: Full data export and deletion support
  • SOC 2: Compliance in progress
  • PII Redaction: Optional automatic scrubbing (roadmap)

🎨 Product Principles

  1. Privacy-First Architecture - Data stays in your infrastructure
  2. Zero Latency Impact - Never block production requests
  3. Developer Experience - One-line integration, familiar patterns
  4. Production-Grade - Battle-tested at scale

πŸ”— Links

  • Repository: Proprietary codebase
  • Live Demo: Available for interview
  • Documentation: Available upon request

Note: This is a proprietary product. Source code is confidential. This showcase demonstrates technical architecture, product thinking, and execution capability.


Built by Manu Marri πŸ“§ manu.marri@gmail.com | πŸ’Ό linkedin.com/in/manaswi-marri

About

LLM debugging tool - Developer tool for reproducing production AI issues

Topics

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors