AI Customer Support

Full Documentation

Conversational AI customer support agent with RAG retrieval, tool calling, intent classification, escalation routing, and full OpenTelemetry observability.

Java 25 | Spring Boot 4.0.3 | Spring AI 2.0 | WebFlux | pgvector | OTel Java Agent

Architecture

```
Message → Classify → Retrieve → Generate → PII Scrub → Route
              │          │          │                      │
          gpt-4.1-mini  pgvector   gpt-4.1              Escalate?
          (fast)        (RAG)      (capable + tools)
```

A five-stage pipeline with three layers of OTel instrumentation: the Java Agent (HTTP/DB/Spring auto-instrumentation), Spring AI's built-in observations (ChatModel/VectorStore via Micrometer), and manual spans (pipeline stages, domain metrics, gateway contract).

Quick Start

```bash
# Copy and configure environment
cp .env.example .env
# Set OPENAI_API_KEY in .env

# Start all services
docker compose up -d

# Run smoke tests
./scripts/test-api.sh

# Send a message
curl -X POST http://localhost:8080/api/chat \
  -H "Content-Type: application/json" \
  -d '{"message":"What is the status of order ORD-10001?"}'
```

API Endpoints

| Method | Path | Description |
|--------|------|-------------|
| POST | /api/chat | Send message, get JSON response with intent + content |
| POST | /api/chat/stream | Send message, get SSE streaming response |
| GET | /api/conversations | List all conversations |
| GET | /api/conversations/{id} | Get conversation with message history |
| POST | /api/conversations/{id}/resolve | Resolve a conversation |
| GET | /api/products | List all products |
| GET | /api/products/{sku} | Get product by SKU |
| GET | /api/orders/{orderId} | Get order by order ID |
| GET | /api/health | Health check |
| GET | /api/failures | List failure scenarios (failure-injection profile) |
| POST | /api/failures/{scenario} | Trigger failure scenario |
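
The two chat endpoints map naturally onto Spring AI's ChatClient. A minimal sketch, assuming the auto-configured ChatClient.Builder; the controller below is illustrative, not the repo's actual class (the real /api/chat response also carries the classified intent):

```java
import org.springframework.ai.chat.client.ChatClient;
import org.springframework.http.MediaType;
import org.springframework.web.bind.annotation.*;
import reactor.core.publisher.Flux;
import reactor.core.publisher.Mono;
import java.util.Map;

// Illustrative controller; the real service adds intent classification, RAG, and pipeline spans.
@RestController
@RequestMapping("/api/chat")
class ChatControllerSketch {

    private final ChatClient chatClient;

    ChatControllerSketch(ChatClient.Builder builder) {
        this.chatClient = builder.build();
    }

    // POST /api/chat — returns the full answer as JSON
    @PostMapping
    Mono<Map<String, String>> chat(@RequestBody Map<String, String> body) {
        return Mono.fromCallable(() -> chatClient.prompt()
                        .user(body.get("message"))
                        .call()
                        .content())
                .map(answer -> Map.of("content", answer));
    }

    // POST /api/chat/stream — token-by-token SSE stream
    @PostMapping(path = "/stream", produces = MediaType.TEXT_EVENT_STREAM_VALUE)
    Flux<String> stream(@RequestBody Map<String, String> body) {
        return chatClient.prompt()
                .user(body.get("message"))
                .stream()
                .content();
    }
}
```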

Data

TechMart e-commerce store: 50 knowledge-base (KB) articles across 10 categories, 30 products, 20 customers, 25 orders, 10 returns. KB articles are embedded into pgvector on first startup via Spring AI's OpenAI embedding model (text-embedding-3-small).
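
Ingestion amounts to wrapping each article in a Spring AI Document and handing it to the pgvector-backed VectorStore, which embeds it with the configured OpenAI model before inserting. A minimal sketch, assuming the article text is already in memory; the runner, sample text, and metadata keys are illustrative, and the real startup logic also guards against re-embedding:

```java
import org.springframework.ai.document.Document;
import org.springframework.ai.vectorstore.VectorStore;
import org.springframework.boot.ApplicationRunner;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import java.util.List;
import java.util.Map;

@Configuration
class KbIngestionSketch {

    // Runs at startup; VectorStore.add() embeds each Document with the configured
    // embedding model (text-embedding-3-small) and inserts it into pgvector.
    @Bean
    ApplicationRunner ingestKnowledgeBase(VectorStore vectorStore) {
        return args -> {
            List<Document> articles = List.of(
                    new Document("Returns are accepted within 30 days of delivery...",
                            Map.of("category", "returns", "title", "Return Policy")));
            vectorStore.add(articles);
        };
    }
}
```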

Tool Calling

Spring AI @Tool-annotated methods available to the LLM:

| Tool | Description |
|------|-------------|
| getOrderStatus | Look up order status by order ID |
| getOrderHistory | Get customer's recent orders by email |
| initiateReturn | Start a return for a delivered order |
| getReturnStatus | Check return status by return ID |
| searchProducts | Search catalog by name/category |
| getProductInfo | Get product details by SKU |
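
Each tool is a plain bean method carrying Spring AI's @Tool annotation; the description is what the model sees when deciding whether to call it. A minimal sketch of getOrderStatus with a hypothetical in-memory lookup in place of the real order repository:

```java
import org.springframework.ai.tool.annotation.Tool;
import org.springframework.ai.tool.annotation.ToolParam;
import org.springframework.stereotype.Component;
import java.util.Map;

@Component
class OrderToolsSketch {

    // Hypothetical lookup; the real project reads order data from its database.
    private final Map<String, String> orders =
            Map.of("ORD-10001", "SHIPPED — expected delivery in 2 days");

    @Tool(description = "Look up order status by order ID, e.g. ORD-10001")
    String getOrderStatus(@ToolParam(description = "The order ID") String orderId) {
        return orders.getOrDefault(orderId, "No order found with ID " + orderId);
    }
}
```

A bean like this is typically exposed to the model per request, for example via the ChatClient prompt's tools(...) method.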

Observability

Every message produces a trace with:

  • support_conversation — root pipeline span
  • classify_intent — intent classification (fast model)
  • rag_retrieval — pgvector similarity search with match count
  • gen_ai.chat {model} — LLM calls with full GenAI semconv attributes
  • generate_response — response generation (capable model)
  • escalation_check — escalation rule evaluation

  • GenAI metrics: token usage, operation duration, cost, retry count, fallback count, error count
  • Domain metrics: conversation turns, conversation duration, escalation count, tool calls, RAG similarity
  • PII filter: email, phone, SSN, and credit card redaction, recorded as span events
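
The PII filter pairs regex redaction with a span event, so the trace shows that something was removed without recording the value itself. A minimal sketch using the OTel API, with illustrative patterns and event/attribute names:

```java
import io.opentelemetry.api.common.AttributeKey;
import io.opentelemetry.api.common.Attributes;
import io.opentelemetry.api.trace.Span;
import java.util.regex.Pattern;

class PiiScrubberSketch {

    // Illustrative patterns; the real filter also covers phone numbers and credit cards.
    private static final Pattern EMAIL = Pattern.compile("[\\w.+-]+@[\\w-]+\\.[\\w.]+");
    private static final Pattern SSN   = Pattern.compile("\\b\\d{3}-\\d{2}-\\d{4}\\b");

    String scrub(String text) {
        long redactions = 0;
        for (Pattern pattern : new Pattern[]{EMAIL, SSN}) {
            var matcher = pattern.matcher(text);
            while (matcher.find()) {
                redactions++;
            }
            text = pattern.matcher(text).replaceAll("[REDACTED]");
        }
        if (redactions > 0) {
            // Record the redaction on the current span (count only, never the raw value).
            Span.current().addEvent("pii.redacted",
                    Attributes.of(AttributeKey.longKey("pii.redaction_count"), redactions));
        }
        return text;
    }
}
```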

Three-Layer OTel

  1. Java Agent (zero-code): HTTP server spans, JDBC/R2DBC database spans, Spring framework spans
  2. Spring AI built-in (Micrometer): ChatModel and VectorStore observation spans
  3. Manual spans (OTel API): Pipeline stages, gateway contract compliance, domain metrics (see the sketch below)
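
The manual layer uses the OTel API directly, so stage spans nest under whatever parent the Java Agent created for the HTTP request. A minimal sketch of one pipeline stage; the tracer name and attribute key are illustrative:

```java
import io.opentelemetry.api.GlobalOpenTelemetry;
import io.opentelemetry.api.trace.Span;
import io.opentelemetry.api.trace.StatusCode;
import io.opentelemetry.api.trace.Tracer;
import io.opentelemetry.context.Scope;

class PipelineSpansSketch {

    private final Tracer tracer = GlobalOpenTelemetry.getTracer("ai-customer-support");

    String classifyIntent(String message) {
        Span span = tracer.spanBuilder("classify_intent").startSpan();
        try (Scope ignored = span.makeCurrent()) {
            String intent = "ORDER_STATUS";              // stand-in for the fast-model call
            span.setAttribute("support.intent", intent);  // domain attribute on the stage span
            return intent;
        } catch (RuntimeException e) {
            span.recordException(e);
            span.setStatus(StatusCode.ERROR);
            throw e;
        } finally {
            span.end();
        }
    }
}
```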

Verify Telemetry

```bash
./scripts/verify-scout.sh
```

Development

```bash
make check    # build + test
make build    # compile
make test     # run tests
```

LLM Providers

| Provider | Models | Usage |
|----------|--------|-------|
| OpenAI | gpt-4.1 (capable), gpt-4.1-mini (fast) | Default primary |
| Anthropic | claude-haiku-4-5-20251001 | Fallback (auto model switch via FALLBACK_MODEL) |
| Ollama | Any local model | LLM_PROVIDER=ollama |
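
Fallback boils down to holding a second ChatClient and switching to it when the primary call fails; the repo selects the fallback model via FALLBACK_MODEL. A minimal sketch of that idea, with illustrative wiring and error handling:

```java
import org.springframework.ai.anthropic.AnthropicChatModel;
import org.springframework.ai.chat.client.ChatClient;
import org.springframework.ai.openai.OpenAiChatModel;
import org.springframework.stereotype.Component;

@Component
class FallbackChatSketch {

    private final ChatClient primary;   // OpenAI gpt-4.1 by default
    private final ChatClient fallback;  // Anthropic claude-haiku, chosen via FALLBACK_MODEL

    FallbackChatSketch(OpenAiChatModel openAi, AnthropicChatModel anthropic) {
        this.primary = ChatClient.create(openAi);
        this.fallback = ChatClient.create(anthropic);
    }

    String complete(String userMessage) {
        try {
            return primary.prompt().user(userMessage).call().content();
        } catch (RuntimeException ex) {
            // Provider outage or rate limit: retry once on the fallback model
            // (the real service also increments a fallback counter metric).
            return fallback.prompt().user(userMessage).call().content();
        }
    }
}
```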

Failure Injection

Activate with the failure-injection Spring profile. Eight scenarios exercise observability under failure:

  1. hallucinated-order — nonexistent order lookup
  2. escalation-thrash — angry customer triggering escalation
  3. tool-loop — ambiguous input causing repeated tool calls
  4. rag-miss — question outside KB coverage
  5. rate-limit — high-volume request
  6. streaming-interrupt — long response for SSE interruption
  7. sensitive-data — PII in input, verify redaction
  8. context-overflow — large conversation history

Sample Conversations

  • "What is the status of order ORD-10001?"
  • "I want to return my headphones, order ORD-10005"
  • "What products do you have in the audio category?"
  • "I'm really frustrated, nothing is working. Let me talk to a human."
  • "What is your return policy?"