Description
Hi,
I wanted to ask whether batched inference is supported for Ming-Lite-Omni-1.5. All the usage examples in the repo (examples/) and in cookbook.ipynb only demonstrate inference on a single input (multi-turn conversations are still a single conversation, hence only one sample per forward pass).
I tried batching inputs (text-only, passing None for all other input modalities in the processor), but I kept getting errors about past_token_values and rope_deltas. When I passed fully batched inputs to the other modalities as well (List[None]), I instead got an error along the lines of "processor cannot convert list of None to list of images".
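For reference, this is roughly the pattern I was trying. It is only a sketch assuming the standard Hugging Face AutoProcessor/chat-template API; build_batch is a hypothetical helper, and the model id and processor keyword names are assumptions that may not match Ming-Lite-Omni-1.5 exactly:

```python
# Hypothetical helper: wrap plain strings into chat-style message lists,
# one text-only conversation per batch element (no images/audio/video).
def build_batch(texts):
    return [
        [{"role": "user", "content": [{"type": "text", "text": t}]}]
        for t in texts
    ]

batch = build_batch(["What is 2 + 2?", "Name a prime number."])

# The batched call I attempted (commented out because it needs the actual
# weights; model id and argument names are assumptions, not confirmed API):
#
# from transformers import AutoProcessor, AutoModelForCausalLM
# processor = AutoProcessor.from_pretrained(
#     "inclusionAI/Ming-Lite-Omni", trust_remote_code=True
# )
# prompts = [
#     processor.apply_chat_template(m, add_generation_prompt=True)
#     for m in batch
# ]
# inputs = processor(text=prompts, return_tensors="pt", padding=True)
# out = model.generate(**inputs, max_new_tokens=64)
# # ^ this is where the past_token_values / rope_deltas errors appeared
```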
Even the vLLM inference examples only cover single input samples. Is there really no way to do batched inference?