Enabled MoE for both BF16 and INC based FP8 on Gaudi #1309

Draft
gyou2021 wants to merge 5 commits into HabanaAI:dev/qwen3-habanamain from gyou2021:gyou/dev/qwen3-habanamain

@gyou2021

Enabled optimized MoE (combination of dynamic MoE and static MoE) for both BF16 and INC-based FP8.

xuechendi and others added 4 commits May 8, 2025 06:48
Signed-off-by: Chendi Xue <chendi.xue@intel.com>
Signed-off-by: Ganmei You <ganmei.you@intel.com>
Signed-off-by: Chendi Xue <chendi.xue@intel.com>
Signed-off-by: Chendi Xue <chendi.xue@intel.com>
Signed-off-by: gyou2021 <ganmei.you@intel.com>
if hasattr(layer.moe_op, "w13_weight"):
    layer.moe_op.w13_weight = layer.w13_weight
if hasattr(layer.moe_op, "w2_weight"):
    layer.moe_op.w2_weight = layer.w2_weight

@xuechendi commented on May 28, 2025

This is only for static moe, right?

Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes.
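
The exchange above can be illustrated with a minimal sketch, assuming the `hasattr` guard is what limits the assignment to the static MoE path. The classes `StaticMoeOp`, `DynamicMoeOp`, and `Layer`, and the helper `sync_moe_weights`, are hypothetical stand-ins and not names from this PR; only the guarded assignments mirror the snippet under review.

```python
class StaticMoeOp:
    """Hypothetical static MoE op that keeps its own references to the expert weights."""

    def __init__(self):
        self.w13_weight = None
        self.w2_weight = None


class DynamicMoeOp:
    """Hypothetical dynamic MoE op that does not define weight attributes (assumption)."""


class Layer:
    """Hypothetical layer that owns the weights and wraps one MoE op."""

    def __init__(self, moe_op, w13_weight, w2_weight):
        self.moe_op = moe_op
        self.w13_weight = w13_weight
        self.w2_weight = w2_weight


def sync_moe_weights(layer):
    # Same guarded assignments as the reviewed snippet: the op only receives
    # a weight reference if it already declares the matching attribute.
    if hasattr(layer.moe_op, "w13_weight"):
        layer.moe_op.w13_weight = layer.w13_weight
    if hasattr(layer.moe_op, "w2_weight"):
        layer.moe_op.w2_weight = layer.w2_weight


# Plain lists stand in for weight tensors in this sketch.
static_layer = Layer(StaticMoeOp(), [1.0, 2.0], [3.0])
dynamic_layer = Layer(DynamicMoeOp(), [1.0, 2.0], [3.0])
sync_moe_weights(static_layer)
sync_moe_weights(dynamic_layer)
```

Under these assumptions, the static op ends up pointing at the layer's own weight objects (no copy is made), while the dynamic op is left untouched because it never declared the attributes.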

@xuechendi

Please push to vllm-fork branch, otherwise we can't trigger UT.

@xuechendi

Don't you need to update qwen RMSnorm to HPU RMSNorm?

@gyou2021 changed the title from "Enabled MoE for both BF16 and INC based FP8." to "Enabled MoE for both BF16 and INC based FP8 on Gaudi" on Jun 3, 2025
Signed-off-by: gyou2021 <ganmei.you@intel.com>
@xuechendi force-pushed the dev/qwen3-habanamain branch from d3012e3 to f143727 on June 17, 2025 15:10

@michalkuligowski left a comment

No update for 1 month; converting to draft.

@michalkuligowski marked this pull request as draft on July 25, 2025 11:54

3 participants