AI-powered CLI tool that extracts structured data from receipt PDFs using Claude and uploads them to Paperless-ngx.

itacentury/docuparse

Docuparse

AI-powered bill parser that extracts structured data from PDF receipts using Claude and optionally uploads them to Paperless-ngx.

Requirements

Installation

pip install -e .

Configuration

Create a .env file in the project root:

ANTHROPIC_API_KEY=your-api-key

# Optional: Paperless-ngx integration
PAPERLESS_URL=https://your-paperless-instance.com
PAPERLESS_API_TOKEN=your-paperless-token

To disable Paperless upload, set PAPERLESS_UPLOAD_ENABLE = False in docuparse/config.py.
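As a rough illustration of how these values might be consumed, here is a minimal sketch that reads them from the process environment. The `Settings` class, `load_settings` function, and `can_upload` property are hypothetical names chosen for this example and are not taken from docuparse/config.py. The sketch also assumes the .env values have already been exported into the environment.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    """Hypothetical settings container; not docuparse's actual config object."""

    anthropic_api_key: str
    paperless_url: Optional[str]
    paperless_api_token: Optional[str]

    @property
    def can_upload(self) -> bool:
        # Uploading only makes sense when both optional values are present.
        return bool(self.paperless_url and self.paperless_api_token)


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read the variables named in the .env example from a mapping."""
    api_key = env.get("ANTHROPIC_API_KEY", "").strip()
    if not api_key:
        # The API key is the only value the README lists as non-optional.
        raise RuntimeError("ANTHROPIC_API_KEY is not set")
    return Settings(
        anthropic_api_key=api_key,
        # Empty strings are treated the same as missing values.
        paperless_url=env.get("PAPERLESS_URL") or None,
        paperless_api_token=env.get("PAPERLESS_API_TOKEN") or None,
    )
```

Passing the environment in as a mapping keeps the function easy to test: calling `load_settings({"ANTHROPIC_API_KEY": "k"})` yields settings whose `can_upload` is `False`.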

Usage

docuparse

Or run it as a module:

python -m docuparse

A file dialog opens to select PDF bills. The extracted data is saved to ~/Downloads/bills-YYYY-MM-DD.json.
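The output file name pattern can be sketched as follows. This is an illustration only: it assumes the date in the file name is the day the tool runs, and the helper names `output_path` and `save_bills` are hypothetical rather than docuparse functions. The structure of each extracted record is not specified here, so the sketch accepts any JSON-serializable list.

```python
import json
from datetime import date
from pathlib import Path
from typing import Optional


def output_path(day: Optional[date] = None, downloads: Optional[Path] = None) -> Path:
    """Build a path of the form <downloads>/bills-YYYY-MM-DD.json."""
    day = day or date.today()  # assumption: the run date is used
    downloads = downloads or Path.home() / "Downloads"
    # date.isoformat() produces the YYYY-MM-DD form used in the file name.
    return downloads / f"bills-{day.isoformat()}.json"


def save_bills(bills: list, path: Path) -> None:
    """Write the extracted records as pretty-printed UTF-8 JSON."""
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(
        json.dumps(bills, indent=2, ensure_ascii=False),
        encoding="utf-8",
    )
```

For example, `output_path(date(2024, 1, 5))` returns a path whose file name is `bills-2024-01-05.json`.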

Development

pip install -e ".[dev]"

Lint and format:

ruff check docuparse/
ruff format docuparse/

Type checking:

mypy docuparse/

Run tests:

pytest
