This directory contains GitHub Actions workflows for automated building and deployment of the sglawwatch-zeeker databases.
Purpose: Specialized workflow for the about_singapore_law resource.
Triggers:
- Manual trigger only: Workflow dispatch with optional force rebuild
Features:
- Builds only the `about_singapore_law` resource
- Fast execution (typically 5-10 minutes)
- Detailed logging and validation
- S3 deployment with error handling
- Automatic backup creation after deployment
Purpose: Comprehensive workflow for all resources in the project.
Triggers:
- Manual trigger only: Workflow dispatch with resource selection
Features:
- Can build specific resources or all resources
- Comprehensive database validation
- Support for all environment variables
- Extended timeout (45 minutes)
- Detailed statistics and summaries
- Automatic backup creation after deployment
Purpose: Create date-based backup archives in S3.
Triggers:
- Manual trigger only: Workflow dispatch with optional date and dry-run
Features:
- Creates backups with specific dates or today's date
- Dry-run mode for testing without uploading
- Stores archives at `s3://bucket/archives/YYYY-MM-DD/`
- Automatically builds database if not present locally
- Lightweight and fast execution (< 15 minutes)
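The date-based archive layout can be sketched in a few lines of Python. This is an illustration only, not the workflow's actual code: the helper name `archive_key` is hypothetical, and it assumes that "today's date" means the current UTC date.

```python
from datetime import datetime, timezone
from typing import Optional


def archive_key(backup_date: Optional[str] = None,
                filename: str = "sglawwatch.db") -> str:
    """Build the object key for a dated backup under archives/YYYY-MM-DD/.

    Hypothetical helper: the real workflow may derive the path differently.
    """
    if backup_date:
        # Raises ValueError if the string is not a valid calendar date.
        day = datetime.strptime(backup_date, "%Y-%m-%d").date()
    else:
        # Assumption: an omitted date falls back to today's date in UTC.
        day = datetime.now(timezone.utc).date()
    return f"archives/{day.isoformat()}/{filename}"
```

Calling `archive_key("2024-01-31")` yields `archives/2024-01-31/sglawwatch.db`, which is appended to the bucket name to form the full `s3://` location.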
Purpose: Automated daily sync of headlines resource with incremental updates.
Triggers:
- Scheduled: Daily at 3 AM UTC
- Manual trigger: Workflow dispatch with force rebuild option
Features:
- Incremental sync from existing S3 database (faster updates)
- Force rebuild option for complete refresh
- Smart build logic with sync-from-s3 capability
- Automatic backup creation after deployment
- Detailed statistics and recent headlines reporting
Purpose: Validate deployed databases and monitor system health.
Triggers:
- Manual trigger: For on-demand health checks
- Automatic trigger: After successful deployments
Features:
- Downloads and validates deployed databases from S3
- Comprehensive health checks and data quality validation
- Reports on database statistics and freshness
- Alerts on issues or anomalies
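A comparable check can be run by hand against a downloaded copy of the database. The sketch below is not the health-check workflow itself. It assumes that each resource (for example `headlines` or `about_singapore_law`) is stored as a table of the same name, and the function names are hypothetical.

```python
import os
import sqlite3


def table_row_counts(db_path: str) -> dict[str, int]:
    """Map each user table in a SQLite file to its row count."""
    if not os.path.exists(db_path):
        # sqlite3.connect() would silently create an empty file otherwise.
        raise FileNotFoundError(db_path)
    conn = sqlite3.connect(db_path)
    try:
        names = [
            row[0]
            for row in conn.execute(
                "SELECT name FROM sqlite_master "
                "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' "
                "ORDER BY name"
            )
        ]
        counts = {}
        for name in names:
            # Quote the identifier so unusual table names do not break the query.
            quoted = '"' + name.replace('"', '""') + '"'
            counts[name] = conn.execute(
                f"SELECT COUNT(*) FROM {quoted}"
            ).fetchone()[0]
        return counts
    finally:
        conn.close()


def health_issues(counts: dict[str, int], expected_tables: list[str]) -> list[str]:
    """Report expected tables that are missing or empty."""
    issues = []
    for table in expected_tables:
        if table not in counts:
            issues.append(f"missing table: {table}")
        elif counts[table] == 0:
            issues.append(f"empty table: {table}")
    return issues
```

An empty list from `health_issues` means every expected table exists and holds at least one row. Freshness checks depend on the table schema and are left out here.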
Configure these secrets in your GitHub repository (Settings > Secrets and variables > Actions):
S3_BUCKET=your-s3-bucket-name
AWS_ACCESS_KEY_ID=your-aws-access-key
AWS_SECRET_ACCESS_KEY=your-aws-secret-access-key
S3_ENDPOINT_URL=https://your-s3-endpoint.com
JINA_API_TOKEN=your-jina-api-token
OPENAI_API_KEY=your-openai-api-key
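Before running a build locally, it can help to confirm that the same variables are present in your environment. The following is a minimal sketch. It assumes all six names listed above are required, although `S3_ENDPOINT_URL` may only matter for non-AWS services, and the helper name `missing_secrets` is hypothetical.

```python
import os
from typing import Mapping, Optional, Sequence

# Names taken from the secrets list above.
REQUIRED_SECRETS = (
    "S3_BUCKET",
    "AWS_ACCESS_KEY_ID",
    "AWS_SECRET_ACCESS_KEY",
    "S3_ENDPOINT_URL",
    "JINA_API_TOKEN",
    "OPENAI_API_KEY",
)


def missing_secrets(env: Optional[Mapping[str, str]] = None,
                    required: Sequence[str] = REQUIRED_SECRETS) -> list[str]:
    """Return the required names that are unset or blank in the environment."""
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name, "").strip()]


if __name__ == "__main__":
    missing = missing_secrets()
    if missing:
        raise SystemExit("Missing configuration: " + ", ".join(missing))
    print("All required variables are set.")
```

Passing a shorter `required` sequence lets you skip variables that do not apply to your setup.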
Your S3 bucket should have the following structure after deployment:
your-bucket/
├── latest/
│   └── sglawwatch.db          # Latest database
└── assets/
    └── databases/
        └── sglawwatch/
            └── metadata.json  # Database metadata
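One way to confirm the layout after a deployment is to compare a listing of the bucket's object keys against the two expected paths. The sketch below assumes you already have the keys as strings, from whichever S3 client you use. The helper name `missing_objects` is hypothetical.

```python
from typing import Iterable

# Object keys implied by the bucket structure above (bucket name excluded).
EXPECTED_KEYS = (
    "latest/sglawwatch.db",
    "assets/databases/sglawwatch/metadata.json",
)


def missing_objects(listed_keys: Iterable[str]) -> list[str]:
    """Return the expected keys that do not appear in the bucket listing."""
    present = set(listed_keys)
    return [key for key in EXPECTED_KEYS if key not in present]
```

An empty result means both the database and its metadata file were found.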
- Go to the `Actions` tab in your GitHub repository
- Select `Deploy About Singapore Law Database`
- Click `Run workflow`
- Optionally check `Force full rebuild`
- Go to the `Actions` tab in your GitHub repository
- Select `Build and Deploy Sglawwatch Database`
- Click `Run workflow`
- Optionally specify a specific resource name
- Optionally check `Force full rebuild`
- Database file: `sglawwatch.db`
- Metadata: `metadata.json`
- Retention: 7-14 days depending on workflow
Each successful workflow run creates a summary showing:
- Database statistics (record counts, file size)
- Sample of processed data
- Deployment timestamp and trigger information
- Database deployed to: `s3://your-bucket/latest/sglawwatch.db`
- Metadata deployed to: `s3://your-bucket/assets/databases/sglawwatch/metadata.json`
1. Build Failures
- Check API key validity (JINA_API_TOKEN, OPENAI_API_KEY)
- Verify website accessibility and HTML structure changes
- Review resource-specific error messages
2. Deployment Failures
- Verify S3 credentials and permissions
- Check S3 bucket exists and is accessible
- Ensure S3_ENDPOINT_URL is correct for non-AWS services
3. Timeout Issues
- Individual resources timing out: Check website response times
- Overall workflow timing out: Increase the `timeout-minutes` value
Workflow status appears in:
- GitHub Actions tab with success/failure status
- Deployment summaries with detailed statistics
- Artifacts section with downloadable database files
- Run builds regularly to keep data current
- Review deployment summaries for data quality changes
- Test locally first using the test script before deploying
- Check artifacts if deployment fails but build succeeds
- Update API keys before they expire to prevent failures
- Use health checks after deployments to verify data integrity
- Local Development:

      # Test locally first
      uv run zeeker build about_singapore_law

      # Verify database
      sqlite3 sglawwatch.db ".tables"

- Push to Repository:
  - Workflow triggers automatically on resource file changes
  - Monitor Actions tab for build status
- Manual Deployment:
  - Use workflow dispatch for immediate deployment
  - Useful for urgent updates or testing
- Backup Management:
  - Each deployment automatically creates a dated backup in S3 archives
  - Use the `Backup Database to S3 Archives` workflow for on-demand backups
  - Support for specific dates and dry-run testing
  - Archives stored at: `s3://bucket/archives/YYYY-MM-DD/sglawwatch.db`
- Production Deployment:
  - Run workflows manually as needed to keep data current
  - Check for any structural changes in source websites
  - Monitor database size and record count trends