
[OVEP] [CVS-177257] Adding support for OV 2026.1 and Float8 precisions#991

Open
n1harika wants to merge 4 commits into ovep-develop from niharika/fp8_cpu

Conversation

@n1harika

This PR introduces OV 2026.0 and 2026.1 to data_ops.cc and capability.cc, and adds the Float8 precisions for CPU/NPU/GPU that are enabled starting from this OV version.


@ankitm3k left a comment


overall lgtm

std::make_pair(V_2024_4, ONNX_NAMESPACE::TensorProto_DataType::TensorProto_DataType_INT4));
supported_types_cpu_.insert(
std::make_pair(V_2024_4, ONNX_NAMESPACE::TensorProto_DataType::TensorProto_DataType_UINT4));
supported_types_npu_.insert(


Is this support valid for NPU only? What about CPU & GPU FP8 support?


@MayureshV1 left a comment


@n1harika: FP8 has two modes, E4M3 and E5M2. Please add support for both.

std::make_pair(V_2024_4, ONNX_NAMESPACE::TensorProto_DataType::TensorProto_DataType_INT4));
supported_types_initializer_.insert(
std::make_pair(V_2024_4, ONNX_NAMESPACE::TensorProto_DataType::TensorProto_DataType_UINT4));
supported_types_initializer_.insert(


@n1harika: Please add support for E4M3 as well as E5M2 (see ONNX FE support).

Author

@n1harika Mar 26, 2026


Sure, added.
Just FYI, I haven't tested the E5M2 type, as the model I generated doesn't use it. There are also two other ONNX FP8 precisions, E4M3FNUZ and E5M2FNUZ; I will add these when the ONNX FE supports them.
