…xruntime to openvino after benchmarks
Pull request overview
This PR improves PDF→Markdown extraction for the Knowledge Flow “medium/rich” ingestion profiles by expanding Docling/RapidOCR configuration (notably OpenVINO support) and aligning deployment configs accordingly, with a small frontend UX/i18n tweak in the agent asset manager drawer.
Changes:
- Add build-time patching to Docling so RapidOCR’s `"openvino"` backend can resolve default model artifact paths, and add the `openvino` dependency on Linux.
- Extend PDF pipeline configuration to support `ocr_backend` and `force_full_page_ocr`, and update multiple environment/deployment YAMLs to use Docling parsing with table structure extraction and OpenVINO OCR in rich profiles.
- Update the frontend asset manager drawer title to display `agentName` (fallback to `agentId`) and add an “Add more files” label.
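As a rough illustration of the second change, the two new options could be modelled as below. This is a minimal sketch, not the PR's actual schema: only the field names `ocr_backend` and `force_full_page_ocr` come from the summary above, while the class name `PdfPipelineOptions`, the helper `pdf_options_from_mapping`, the default values, and the set of allowed backends are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Any, Mapping

# Assumption: the backends accepted here are illustrative; the real schema
# in common/structures.py may allow a different set.
ALLOWED_OCR_BACKENDS = ("onnxruntime", "openvino")


@dataclass(frozen=True)
class PdfPipelineOptions:
    """Hypothetical container for the two OCR options named in the PR."""

    ocr_backend: str = "onnxruntime"   # assumed default
    force_full_page_ocr: bool = False  # assumed default

    def __post_init__(self) -> None:
        # Reject unknown backends early instead of failing deep inside OCR.
        if self.ocr_backend not in ALLOWED_OCR_BACKENDS:
            raise ValueError(
                f"unsupported ocr_backend {self.ocr_backend!r}; "
                f"expected one of {ALLOWED_OCR_BACKENDS}"
            )


def pdf_options_from_mapping(raw: Mapping[str, Any]) -> PdfPipelineOptions:
    """Build options from a parsed YAML profile section (hypothetical helper)."""
    return PdfPipelineOptions(
        ocr_backend=str(raw.get("ocr_backend", "onnxruntime")),
        force_full_page_ocr=bool(raw.get("force_full_page_ocr", False)),
    )


# Example: a "rich" profile that opts into OpenVINO and full-page OCR.
rich = pdf_options_from_mapping(
    {"ocr_backend": "openvino", "force_full_page_ocr": True}
)
```

Validating the backend name at construction time means a typo in a deployment YAML surfaces at startup rather than on the first PDF that needs OCR.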
Reviewed changes
Copilot reviewed 19 out of 20 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| scripts/patch_docling_rapidocr_openvino.py | Build-time patch script to extend Docling RapidOCR default model mapping for OpenVINO. |
| knowledge-flow-backend/pyproject.toml | Adds Linux-only openvino dependency alongside pinned Docling. |
| knowledge-flow-backend/uv.lock | Locks openvino/openvino-telemetry versions for Linux builds. |
| knowledge-flow-backend/knowledge_flow_backend/core/processors/input/pdf_markdown_processor/pdf_markdown_processor.py | Wires new RapidOCR options into Docling pipeline and switches to in-memory markdown export. |
| knowledge-flow-backend/knowledge_flow_backend/common/structures.py | Adds ocr_backend and force_full_page_ocr to PDF pipeline config schema. |
| knowledge-flow-backend/knowledge_flow_backend/application_context.py | Logs the new PDF OCR configuration fields in the config summary. |
| knowledge-flow-backend/dockerfiles/Dockerfile-prod | Runs the Docling patch script during image build. |
| knowledge-flow-backend/dockerfiles/Dockerfile-dev | Runs the Docling patch script during dev image build. |
| knowledge-flow-backend/config/configuration.yaml | Updates default profile settings and PDF processing options (Docling parse, tables, OCR tuning). |
| knowledge-flow-backend/config/configuration_prod.yaml | Aligns prod profile PDF settings (tables + OpenVINO OCR in rich). |
| knowledge-flow-backend/config/configuration_test.yaml | Aligns test profile PDF settings (tables + OpenVINO OCR in rich). |
| knowledge-flow-backend/config/configuration_gcp.yaml | Aligns GCP profile PDF settings and processor selection. |
| knowledge-flow-backend/config/configuration_worker.yaml | Aligns worker profile PDF settings and processors. |
| knowledge-flow-backend/config/configuration_bench.yaml | Adjusts benchmark defaults and PDF profile options. |
| deploy/charts/fred/values.yaml | Updates Helm values to reflect new processing profile defaults and OCR/table settings. |
| deploy/local/k3d/values-local.yaml | Updates local k3d values to include/align medium & rich profiles with Docling parse and OpenVINO OCR. |
| frontend/src/locales/en/translation.json | Updates asset manager title interpolation and adds “addMoreFiles” string. |
| frontend/src/locales/fr/translation.json | Updates asset manager title interpolation and adds “addMoreFiles” string. |
| frontend/src/components/agentHub/AgentGridManager.tsx | Passes agentName into the asset manager drawer. |
| frontend/src/components/agentHub/AgentConfigWorkspaceManagerDrawer.tsx | Uses agent name in title (fallback to id) and uses new i18n key for “add more files”. |
…backend but no image processing
…ediumrich-profiles
…tion or class' Co-authored-by: Copilot Autofix powered by AI <223894421+github-code-quality[bot]@users.noreply.github.com>
…ediumrich-profiles
…ediumrich-profiles
…ediumrich-profiles
| GitGuardian id | GitGuardian status | Secret | Commit | Filename | |
|---|---|---|---|---|---|
| 27366443 | Triggered | Generic Password | 830873c | deploy/local/k3d/values-local.yaml | View secret |
| 17205519 | Triggered | Generic Password | 830873c | deploy/local/k3d/values-local.yaml | View secret |
| 17205519 | Triggered | Generic Password | 3f59a2a | deploy/local/k3d/values-local.yaml | View secret |
🛠 Guidelines to remediate hardcoded secrets
- Understand the implications of revoking this secret by investigating where it is used in your code.
- Replace and store your secrets safely, following best practices for secret management.
- Revoke and rotate these secrets.
- If possible, rewrite git history. Rewriting git history is not a trivial act. You might completely break other contributing developers' workflow and you risk accidentally deleting legitimate data.
To avoid such incidents in the future, consider:
- following best practices for managing and storing secrets, including API keys and other credentials
- installing secret detection on pre-commit to catch secrets before they leave your machine and to ease remediation.
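One common way to follow the "replace and store your secrets safely" advice is to read credentials from the environment at runtime instead of committing them to a values file. The sketch below is a minimal example under that assumption. The helper name `read_secret` and the variable name `EXAMPLE_DB_PASSWORD` are hypothetical and are not taken from `deploy/local/k3d/values-local.yaml`.

```python
import os
from typing import Mapping, Optional


def read_secret(name: str, env: Optional[Mapping[str, str]] = None) -> str:
    """Return the secret stored in environment variable `name`.

    `env` defaults to the process environment; passing a mapping makes the
    function easy to exercise in tests without touching real credentials.
    """
    source = os.environ if env is None else env
    value = source.get(name, "")
    if not value:
        # Fail fast with the variable name only; never echo secret values.
        raise RuntimeError(f"missing required secret: set {name} in the environment")
    return value


# Example with an explicit mapping (hypothetical variable name and value).
password = read_secret("EXAMPLE_DB_PASSWORD", {"EXAMPLE_DB_PASSWORD": "from-vault"})
```

In a Kubernetes deployment the environment variable would typically be populated from a Secret object rather than from a literal in the Helm values.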
🦉 GitGuardian detects secrets in your source code to help developers and security teams secure the modern development process. You are seeing this because you or someone else with access to this repository has authorized GitGuardian to scan your pull request.
This reverts commit 1f7dd59.
…ediumrich-profiles