[Example Request] Retrieval eval notebook for financial document RAG — precision/recall benchmarking on SEC filings with Pinecone

## What example would you like to see?

A Jupyter notebook demonstrating **retrieval quality benchmarking** for financial document RAG using Pinecone. Financial documents (SEC filings, earnings calls, and financial reports) are among the highest-value use cases for Pinecone, yet no existing example shows how to *evaluate* retrieval precision on structured financial data.

## Why this example is needed

Existing Pinecone examples show how to **build** RAG pipelines, but not how to **measure retrieval quality** on domain-specific data. Financial documents have specific challenges:

1. **Heterogeneous structure** — 10-Ks mix prose (MD&A, Risk Factors) with tables (Balance Sheets, Income Statements) and footnotes
2. **Section-boundary bleed** — vector search retrieves chunks from adjacent sections that are semantically similar but not contextually relevant to the query
3. **Numerical precision** — retrieving the *right* financial figure matters ("current assets FY2024" vs. "total assets FY2023")
4. **No existing benchmark** — there's no Pinecone example showing retrieval precision@k on a financial QA dataset

## Proposed notebook outline

```
1. Dataset: FinanceBench (public) or custom 10-K QA pairs
2. Indexing: Chunk 10-K with section-aware metadata
   - doc_type, section, fiscal_year, chunk_role (table/prose/footnote)
3. Query evaluation:
   - For each QA pair, retrieve top-k chunks
   - Score precision@1, precision@5, NDCG@10
4. Metadata filter comparison:
   - Baseline: no filters (pure vector search)
   - Filtered: section + fiscal_year filters applied
5. Show: how metadata filtering improves precision from ~0.55 → ~0.82
   on financial queries
```
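Step 2 of the outline could look roughly like the sketch below: each chunk carries the four proposed metadata fields and is turned into a record in the `id` / `values` / `metadata` shape that `index.upsert` accepts. The `Chunk` dataclass, the `to_record` helper, the chunk ID scheme, and the sample text are all hypothetical, and the three-dimensional embedding is a toy stand-in for a real model's output.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    """One chunk of a filing plus the labels proposed as metadata (hypothetical type)."""
    chunk_id: str
    text: str
    doc_type: str     # e.g. "10-K"
    section: str      # e.g. "balance_sheet", "risk_factors"
    fiscal_year: str  # kept as a string so it matches an "$eq" filter on "2024"
    chunk_role: str   # "table", "prose", or "footnote"


def to_record(chunk: Chunk, embedding: list[float]) -> dict:
    """Build one upsert record: an ID, a vector, and flat metadata."""
    return {
        "id": chunk.chunk_id,
        "values": embedding,
        "metadata": {
            "doc_type": chunk.doc_type,
            "section": chunk.section,
            "fiscal_year": chunk.fiscal_year,
            "chunk_role": chunk.chunk_role,
            "text": chunk.text,
        },
    }


# Placeholder chunk; the ID and text are made up for illustration.
chunk = Chunk(
    chunk_id="example-10k-2024-bs-0001",
    text="Total current assets ...",
    doc_type="10-K",
    section="balance_sheet",
    fiscal_year="2024",
    chunk_role="table",
)
record = to_record(chunk, embedding=[0.1, 0.2, 0.3])

# With a real index and real embeddings this would be sent with:
# index.upsert(vectors=[record])
```

Keeping `fiscal_year` as a string is a choice made here only so the record lines up with a string-valued `$eq` filter; a numeric field would work as well if the filter uses a number.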

## Example code sketch

```python
from pinecone import Pinecone

pc = Pinecone(api_key="...")
index = pc.Index("financial-rag")

# Baseline: pure semantic search.
# query_embedding is assumed to be precomputed with the same embedding model used at index time.
results_baseline = index.query(
    vector=query_embedding,
    top_k=5,
    include_metadata=True
)

# Filtered: section-aware + fiscal year
results_filtered = index.query(
    vector=query_embedding,
    top_k=5,
    filter={
        "section": {"$in": ["balance_sheet", "income_statement"]},
        "fiscal_year": {"$eq": "2024"}
    },
    include_metadata=True
)

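# Eval: compare precision@k
def precision_at_k(results, ground_truth_ids, k=5):
    """Fraction of the top-k retrieved chunk IDs that are labeled relevant."""
    retrieved_ids = [r.id for r in results.matches[:k]]
    hits = len(set(retrieved_ids) & set(ground_truth_ids))
    # Divide by k even when a filter returns fewer than k matches,
    # so a short result list is not scored as if it were complete.
    return hits / k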
```
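The sketch above scores a single query. To cover the NDCG metric and the baseline vs. filtered comparison from the outline, the notebook could average metrics over all QA pairs, roughly as follows. This is a minimal sketch under stated assumptions: relevance is binary, retrieved IDs are unique, `evaluate` and `ndcg_at_k` are hypothetical helper names, and the two dictionaries of ranked IDs are toy stand-ins for real Pinecone query results.

```python
import math
from statistics import mean


def precision_at_k(retrieved_ids, relevant_ids, k):
    """Fraction of the top-k retrieved IDs that are relevant."""
    return len(set(retrieved_ids[:k]) & set(relevant_ids)) / k


def ndcg_at_k(retrieved_ids, relevant_ids, k):
    """Binary-relevance NDCG@k: rewards relevant chunks ranked near the top."""
    relevant = set(relevant_ids)
    dcg = sum(
        1.0 / math.log2(rank + 2)
        for rank, chunk_id in enumerate(retrieved_ids[:k])
        if chunk_id in relevant
    )
    # Ideal ranking places every relevant chunk first.
    ideal_hits = min(len(relevant), k)
    idcg = sum(1.0 / math.log2(rank + 2) for rank in range(ideal_hits))
    return dcg / idcg if idcg > 0 else 0.0


def evaluate(retrieve, qa_pairs, k=5):
    """Average both metrics over (query, relevant_ids) pairs.

    `retrieve` is any callable that maps a query to a ranked list of chunk IDs.
    """
    precisions, ndcgs = [], []
    for query, relevant_ids in qa_pairs:
        ranked = retrieve(query)
        precisions.append(precision_at_k(ranked, relevant_ids, k))
        ndcgs.append(ndcg_at_k(ranked, relevant_ids, k))
    return {"precision": mean(precisions), "ndcg": mean(ndcgs)}


# Toy data: two queries, one relevant chunk each (IDs are made up).
qa_pairs = [("q1", ["a"]), ("q2", ["d"])]
baseline_runs = {"q1": ["x", "a", "y"], "q2": ["d", "z", "w"]}
filtered_runs = {"q1": ["a", "x", "y"], "q2": ["d", "z", "w"]}

baseline = evaluate(baseline_runs.get, qa_pairs, k=3)
filtered = evaluate(filtered_runs.get, qa_pairs, k=3)
print("baseline:", baseline)
print("filtered:", filtered)
```

In the toy data both runs have the same precision@3, but the filtered run ranks the relevant chunk first for `q1`, so only NDCG separates them. In the notebook, `retrieve` would wrap `index.query(...)` and return `[m.id for m in results.matches]`.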

## Why I'm well-placed to contribute this

I've been building financial RAG eval frameworks and have a working prototype in my [finrag-eval](https://github.com/Ruthwik-Data/finrag-eval) project. I can contribute this notebook as a PR if the team is interested. I'm also familiar with FinanceBench as a QA dataset and can set up the ground truth labels.

Happy to discuss the scope and any preferred notebook format.
