Lawyer who ships code. I build the tooling under the legal work — NER models, document pipelines, contract evals, redlining services — instead of describing it in a memo.
Based in New York and São Paulo. Sole counsel at Teachable. Building Cicero — a legal workbench that turns messy inputs (email threads, third-party paper, drafts, quick requests) into structured outputs.
- GLiNER for legal NER — training, fine-tuning, ONNX export, Rust inference, benchmarking. End-to-end pipeline from FlashDeBERTa-based architectures down to runtime.
- PII detection & synthesis for pt-BR — Brazilian Portuguese is under-served in privacy tooling; building both detectors and realistic synthetic generators.
- Contract extraction & redlining — clause extraction, docx round-trips, HTML↔DOCX conversion, diff-aware n8n nodes.
- SEC / EDGAR data infrastructure — bulk pullers, parsers, doc-to-dict utilities.
- Legal-AI evaluation — benchmark of contract-drafting AI vs. human lawyers (450 outputs, 72 surveyed lawyers).
PyTorch · Transformers · ONNX · Rust (inference) · FastAPI · pydantic-ai · TypeScript · Astro · Cloudflare Workers · pt-BR NLP · docx / OOXML internals
- gliner-onnx-benchmark — benchmarks across ONNX runtimes
- gliner-training-utils — training tooling
- gliner2_finetune — fine-tuning recipes
- ptbr_gliner_showcase — pt-BR demos
- gliner-looong — long-context experiments
- clause-extract — clause-level extraction
- sec-edgar-bulker — SEC EDGAR bulk fetching
- sec-parser-2 · secdoctodict — parsing & structuring
- amd_pii_generator · pii_showcase — pt-BR PII synthesis & detection
- planalto_dot2dict — Brazilian gov data utilities
- contractrec · contract_translator — contract tooling
- Legal-Eaze · Tourcicero
- tabajara-stemmer — classic pt-BR stemmer
- html-to-docx · html-to-docx2
- n8n-nodes-docx-diff — diff nodes for n8n
- redline-endpoint — redlining as a service
- claude-code-system-prompts
- ai-pr-reviewer
- modal-for-noobs — Modal helpers
- app-gestao-advocacia — law-firm ops app
12+ years in BigLaw and in-house. Cross-border US/Brazil practice — contracts, governance, privacy (LGPD), compliance, AI adoption, legal operations. Trilingual: English · Português · Español.
arthur.law · LinkedIn · Hugging Face · contact@arthur.law