Summary
generate_presentation (#2778, PR #3016) currently emits text-only slides (titles, bodies, bullets, speaker notes). Real decks usually include images: charts, screenshots, photo placeholders. ppt-rs already supports image embedding via Image::from_bytes(...).add_to(...) on the slide builder. Wire it through so the agent can dispatch a deck with images alongside text.
Problem
Users asking the agent to "make a deck with the Q3 revenue chart" or "include a screenshot of the dashboard" get a text-only deck back today. The tool spec on SlideSpec has no images field, the Rust engine doesn't accept image payloads, and there's no path from a chat-uploaded image or a Composio-fetched file to a slide-embedded asset.
Solution (optional)
Extend SlideSpec (src/openhuman/tools/impl/presentation/types.rs) with an optional images: Vec<SlideImage> field. SlideImage carries either an in-workspace artifact id, a [FILE:path] marker (per #2777, "Extend multimodal input to accept document and file attachments beyond images"), or a remote URL with a size cap.
Resolve image references in the engine (engine.rs build_slides): walk each SlideImage, fetch bytes (with a deadline-bounded HTTP client for URLs, tokio::fs::read for FILE markers, artifacts::store::get_artifact_bytes for artifact ids), decode + validate (PNG/JPEG/WebP only, max 5 MB), and hand off to ppt-rs's Image::from_bytes.
Layout heuristics: when a slide has both bullets and images, lay them out side-by-side; when it has only images, go full-bleed. Start with a fixed two-column grid for ≤2 images and a 2x2 grid for 3-4. Defer arbitrary positioning.
Tests: round-trip a PNG, then assert that an image entry is present in [Content_Types].xml and that a ppt/media/image1.png entry exists in the resulting zip. Size-cap rejection. MIME rejection.
Orchestrator prompt update (agent_registry/agents/orchestrator/prompt.md): document that generate_presentation accepts an images array, and clarify that the grounding rule covers image fetches (e.g. the agent must use research/memory_tree before claiming a chart shows X).
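The validation and layout steps in the solution can be sketched roughly as follows. This is a minimal, std-only illustration and not the engine's actual code: ImageKind, ImageError, sniff_kind, validate_image, and image_grid are hypothetical names, "5 MB" is read as 5 * 1024 * 1024 bytes, and the check only inspects leading magic bytes rather than fully decoding the image.

```rust
/// Size cap from the issue; reading "5 MB" as 5 * 1024 * 1024 bytes is an assumption.
const MAX_IMAGE_BYTES: usize = 5 * 1024 * 1024;

/// Formats the issue allows on a slide.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ImageKind {
    Png,
    Jpeg,
    WebP,
}

/// Hypothetical rejection reasons; the real engine may use its own error type.
#[derive(Debug, PartialEq, Eq)]
pub enum ImageError {
    Empty,
    TooLarge { len: usize },
    UnsupportedFormat,
}

/// Detect the format from leading magic bytes instead of trusting a
/// caller-supplied MIME type or file extension.
pub fn sniff_kind(bytes: &[u8]) -> Option<ImageKind> {
    const PNG_MAGIC: &[u8] = &[0x89, b'P', b'N', b'G', 0x0D, 0x0A, 0x1A, 0x0A];
    const JPEG_MAGIC: &[u8] = &[0xFF, 0xD8, 0xFF];
    if bytes.starts_with(PNG_MAGIC) {
        Some(ImageKind::Png)
    } else if bytes.starts_with(JPEG_MAGIC) {
        Some(ImageKind::Jpeg)
    } else if bytes.starts_with(b"RIFF") && bytes.get(8..12) == Some(&b"WEBP"[..]) {
        // WebP is a RIFF container: "RIFF", a 4-byte length, then "WEBP".
        Some(ImageKind::WebP)
    } else {
        None
    }
}

/// Apply the size cap first, then the format check.
pub fn validate_image(bytes: &[u8]) -> Result<ImageKind, ImageError> {
    if bytes.is_empty() {
        return Err(ImageError::Empty);
    }
    if bytes.len() > MAX_IMAGE_BYTES {
        return Err(ImageError::TooLarge { len: bytes.len() });
    }
    sniff_kind(bytes).ok_or(ImageError::UnsupportedFormat)
}

/// Image grid as (columns, rows): two columns for up to 2 images, 2x2 for 3-4.
/// Returns None for zero images and for counts the issue defers.
pub fn image_grid(image_count: usize) -> Option<(usize, usize)> {
    match image_count {
        1 | 2 => Some((2, 1)),
        3 | 4 => Some((2, 2)),
        _ => None,
    }
}
```

Checking the size before sniffing the header keeps the rejection path cheap for oversized payloads. Bytes that pass validate_image would then be handed to ppt-rs for embedding.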
Acceptance criteria
SlideSpec.images accepted: passing SlideImage entries on a slide attaches the image at generation time.
Image references resolve from an in-workspace artifact id, a [FILE:path] marker (depends on #2777, "Extend multimodal input to accept document and file attachments beyond images"), or a remote URL (deadline-bounded, ≤5 MB, allowed MIME).
… .github/workflows/pr-ci.yml.
Related
… [FILE:path] resolution
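As a rough illustration of how the three reference kinds might be told apart before fetching, here is a small std-only sketch. Everything in it is an assumption: the ImageSource enum and parse_image_source function are hypothetical, and so is the string encoding (a "[FILE:<path>]" marker, an http(s) prefix for URLs, and anything else treated as an artifact id). The real SlideImage type may well be a tagged struct instead.

```rust
use std::path::PathBuf;

/// Where a slide image comes from. Variant names are hypothetical; the issue
/// only says SlideImage carries an artifact id, a [FILE:path] marker, or a URL.
#[derive(Debug, PartialEq, Eq)]
pub enum ImageSource {
    Artifact(String),
    File(PathBuf),
    Url(String),
}

/// Classify a raw reference string under the assumed encoding described above.
/// Returns None for blank input and for malformed [FILE:...] markers.
pub fn parse_image_source(raw: &str) -> Option<ImageSource> {
    let raw = raw.trim();
    if raw.is_empty() {
        return None;
    }
    if let Some(rest) = raw.strip_prefix("[FILE:") {
        // A marker must close with ']' and name a non-empty path.
        return match rest.strip_suffix(']') {
            Some(path) if !path.is_empty() => Some(ImageSource::File(PathBuf::from(path))),
            _ => None,
        };
    }
    if raw.starts_with("https://") || raw.starts_with("http://") {
        return Some(ImageSource::Url(raw.to_string()));
    }
    // Fallback: treat any other non-empty token as an artifact id.
    Some(ImageSource::Artifact(raw.to_string()))
}
```

Under these assumptions, "[FILE:charts/q3.png]" maps to a File source, an https link to a Url source, and a bare token to an Artifact source; each branch would then go to its own fetcher (tokio::fs::read, the HTTP client, or the artifact store).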