AI-powered bill parser that extracts structured data from PDF receipts using Claude and optionally uploads them to Paperless-ngx.
- Python 3.12+
- Anthropic API Key
- (Optional) Paperless-ngx instance
pip install -e .

Create a .env file in the project root:
ANTHROPIC_API_KEY=your-api-key
# Optional: Paperless-ngx integration
PAPERLESS_URL=https://your-paperless-instance.com
PAPERLESS_API_TOKEN=your-paperless-token

To disable Paperless upload, set PAPERLESS_UPLOAD_ENABLE = False in docuparse/config.py.
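As an illustration of how these variables fit together, the sketch below reads them from an environment mapping and treats the Paperless-ngx values as optional. It is not the project's actual configuration code: `Settings` and `load_settings` are hypothetical names, and it assumes the .env values have already been loaded into the process environment.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    """Hypothetical container for the values defined in .env."""

    anthropic_api_key: str
    paperless_url: Optional[str] = None
    paperless_api_token: Optional[str] = None

    @property
    def paperless_configured(self) -> bool:
        # Upload is only possible when both optional values are present.
        return bool(self.paperless_url and self.paperless_api_token)


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Build Settings from an environment mapping (assumed populated from .env)."""
    api_key = env.get("ANTHROPIC_API_KEY", "").strip()
    if not api_key:
        raise RuntimeError("ANTHROPIC_API_KEY is required")
    return Settings(
        anthropic_api_key=api_key,
        # Empty strings are normalized to None so "unset" has one representation.
        paperless_url=env.get("PAPERLESS_URL") or None,
        paperless_api_token=env.get("PAPERLESS_API_TOKEN") or None,
    )
```

Passing the mapping as a parameter keeps the function easy to exercise with a plain dictionary instead of the real environment.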
docuparse

Or run as a module:
python -m docuparse

A file dialog opens to select PDF bills. The extracted data is saved to ~/Downloads/bills-YYYY-MM-DD.json.
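The dated output filename can be reproduced with the standard library, which is handy when a script needs to locate the latest export. The following is a minimal sketch under the assumption that YYYY-MM-DD is the local date of the run; `output_path` is a hypothetical helper, not a function exposed by docuparse.

```python
from datetime import date
from pathlib import Path
from typing import Optional


def output_path(day: Optional[date] = None, home: Optional[Path] = None) -> Path:
    """Return <home>/Downloads/bills-YYYY-MM-DD.json for the given day."""
    day = day or date.today()    # assumption: the tool uses the local date
    home = home or Path.home()   # resolves "~" for the current user
    # date.isoformat() yields the YYYY-MM-DD form used in the filename.
    return home / "Downloads" / f"bills-{day.isoformat()}.json"
```

For example, `output_path(date(2024, 1, 5))` points at `bills-2024-01-05.json` in the current user's Downloads folder.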
pip install -e ".[dev]"

Lint and format:
ruff check docuparse/
ruff format docuparse/

Type checking:
mypy docuparse/

Run tests:
pytest