Releases · NVIDIA-NeMo/Export-Deploy

16 Apr 20:49

svcnvidia-nemo-ci

v0.5.0

04ca37f

NVIDIA NeMo-Export-Deploy 0.5.0 Latest

Changelog Details

Version bump to 0.5.0rc0.dev0 by @github-actions[bot] :: PR: #580
ci: Add secrets detector by @chtruong814 :: PR: #578
Add apply_chat_template to HF vllm Ray deployment by @athitten :: PR: #581
Onur/remove nemo2 trtllm support by @oyilmaz-nvidia :: PR: #576
Remove MM trt-llm files for nemo2 by @oyilmaz-nvidia :: PR: #583
ci: Adding to codeowners by @chtruong814 :: PR: #585
Remove more nemo2 and unused code. by @oyilmaz-nvidia :: PR: #584
docs: Remove uv sync with uv_args by @thomasdhc :: PR: #586
Update to use latest MBridge by @chtruong814 :: PR: #589
Add inference_max_seq_len to ray mbridge deployment path by @athitten :: PR: #588
Remove nemo imports by @oyilmaz-nvidia :: PR: #594
ci: Fix wheel build test and publish by @chtruong814 :: PR: #595
ci: Re-enable onnx test by @chtruong814 :: PR: #597
ci: Update release-docs workflow to use FW-CI-templates v0.72.0 by @chtruong814 :: PR: #599
feat: Pass ETP and Sequence Parallel to inframework Ray deployment by @ko3n1g :: PR: #600
ci: Update release workflows to include changelog and docs by @chtruong814 :: PR: #604
build: Remove torchao by @chtruong814 :: PR: #606
build: Upgrade vllm to 0.14.1 by @chtruong814 :: PR: #609
Add support for stop_words in Ray MBridge deployment by @athitten :: PR: #605
Add vllm docs for mbridge ckpt by @oyilmaz-nvidia :: PR: #573
Docs update: remove nemo2 and fix import by @oyilmaz-nvidia :: PR: #608
Update CI docker image and set vllm eager enforce_eager to False by @chtruong814 :: PR: #614
Fix building doc and remove all nemo 2.0 docs by @oyilmaz-nvidia :: PR: #615
Fix multimodal deployment sampling params by @meatybobby :: PR: #602
docs: Enable nightly docs build on main branch by @chtruong814 :: PR: #619
Set materialize_only_last_token_logits=False when log_probs = True by @athitten :: PR: #613
ci: Add-credentials-for-docs by @ko3n1g :: PR: #623
Fix release workflow reference by @chtruong814 :: PR: #625
Fix mbridge inference for latest mbridge by @oyilmaz-nvidia :: PR: #627
feat: Add support for batching of Ray Serve requests by @pthombre :: PR: #629
Remove all nemo2 imports from old repo by @oyilmaz-nvidia :: PR: #628
build: Bump export-deploy dependencies for 26.04 by @chtruong814 :: PR: #633
Docs: remove vLLM install step from mbridge vllm quickstart by @oyilmaz-nvidia :: PR: #618
Announce Python 3.12 migration by @ko3n1g :: PR: #630
ci: Enable claude review by @thomasdhc :: PR: #635
ci: Fix sso user check by @chtruong814 :: PR: #637
chore: test FW-CI-templates ko3n1g/fix/linkcheck-retry-backoff by @ko3n1g :: PR: #638
ci: upgrade GitHub Actions for Node.js 24 compatibility by @ko3n1g :: PR: #639
Add legacy_model_format param by @oyilmaz-nvidia :: PR: #641
chore: Move to Py3.12 by @ko3n1g :: PR: #631
cp: build: Bump vLLM to address CVE (644) into r0.5.0 by @svcnvidia-nemo-ci :: PR: #645
cp: Fix MLA model issues (647) into r0.5.0 by @svcnvidia-nemo-ci :: PR: #649
cp: build: Set trt-llm and vllm for 26.04 (650) into r0.5.0 by @svcnvidia-nemo-ci :: PR: #651
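
The entry for PR #613 couples two settings: when log probabilities are requested, materialize_only_last_token_logits is set to False. A plausible reason is that scoring prompt tokens needs logits at every position, not only the last one. The sketch below illustrates that coupling only; `InferenceFlags` and `resolve_flags` are hypothetical names, and the assumption that the optimization is otherwise enabled is not taken from the release notes.

```python
from dataclasses import dataclass


@dataclass
class InferenceFlags:
    """Hypothetical container for the two settings named in PR #613."""

    log_probs: bool = False
    materialize_only_last_token_logits: bool = True


def resolve_flags(log_probs: bool) -> InferenceFlags:
    # Assumption: the "last token only" optimization is on by default and is
    # switched off whenever log probabilities are requested.
    return InferenceFlags(
        log_probs=log_probs,
        materialize_only_last_token_logits=not log_probs,
    )
```

Calling `resolve_flags(True)` yields a configuration with the last-token optimization disabled, while `resolve_flags(False)` leaves it enabled.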

Contributors

thomasdhc, meatybobby, and 6 other contributors

26 Feb 00:19

svcnvidia-nemo-ci

v0.4.0

2ba74b0

NVIDIA NeMo-Export-Deploy 0.4.0

Highlights

vLLM support for Megatron-Bridge LLM checkpoints.
Remove NeMo 2.0 support.
Deployment of Megatron-Bridge VLM checkpoints.

Changelog Details

Eval logprob benchmarks support for HF via vLLM with Ray by @athitten :: PR: #479
feat: add labeler by @pablo-garay :: PR: #483
Support apply_chat_template in NeMo MM in-framework deployment by @meatybobby :: PR: #440
NeMo-Export-Deploy 0.2.1 changelog by @pablo-garay :: PR: #489
Add torch_dtype and default values by @oyilmaz-nvidia :: PR: #466
Fix max token input by @oyilmaz-nvidia :: PR: #478
Remove scheduled cron job from release workflow by @pablo-garay :: PR: #494
feat: Add anchor by @pablo-garay :: PR: #495
[Eval] Fixes for compatibility between Pytriton, Ray deployments with nemo-run by @athitten :: PR: #501
New script path by @oyilmaz-nvidia :: PR: #487
Update trt-llm doc for nemo 2 by @oyilmaz-nvidia :: PR: #506
Change type for --runtime_env in ray in-fw deployment script by @athitten :: PR: #505
fix : New peft release adjust fix by @pablo-garay :: PR: #514
fix: ensure vLLM receives valid params regardless of env changes by @pablo-garay :: PR: #516
Fix minor doc issue by @oyilmaz-nvidia :: PR: #521
Update changelog for release 0.3.0 by @oyilmaz-nvidia :: PR: #522
Update nvidia-sphinx-theme by @chtruong814 :: PR: #528
Update changelog for version 0.3.1 by @pablo-garay :: PR: #537
Minor fixes for MBridge nemotron deployment by @athitten :: PR: #518
docs: Update docs version to latest by @chtruong814 :: PR: #553
docs: Fixing version1.json by @aschilling-nv :: PR: #554
Properly Handle DynamicInferenceRequestRecord with latest Mcore by @chtruong814 :: PR: #559
Add vllm support for mbridge by @oyilmaz-nvidia :: PR: #555
Temp fix for k8s issue by @ko3n1g :: PR: #565
ci: Enable AWS runners by @chtruong814 :: PR: #557
docs: Release docs by @ko3n1g :: PR: #566
Remove nemo from in-framework deployment by @oyilmaz-nvidia :: PR: #568
Fix chat endpoint support for Ray in-framework MBridge deployment by @athitten :: PR: #572
build: Update dependencies for 26.02 by @chtruong814 :: PR: #567
Remove nemo2 vllm support by @oyilmaz-nvidia :: PR: #571
Update multimodal in-framework FastAPI from NeMo to Megatron Bridge by @meatybobby :: PR: #511
Fix chat endpoint support for HF deployment with Ray by @athitten :: PR: #575
Add Ray Serve Deployment Support for Multimodal Models by @meatybobby :: PR: #574
cp: Add apply_chat_template to HF vllm Ray deployment (581) into r0.4.0 by @ko3n1g :: PR: #582
cp: Remove more nemo2 and unused code. (584) into r0.4.0 by @ko3n1g :: PR: #587
cp: docs: Remove uv sync with uv_args (586) into r0.4.0 by @ko3n1g :: PR: #591
cp: Add inference_max_seq_len to ray mbridge deployment path (588) into r0.4.0 by @ko3n1g :: PR: #593
cp: Fix wheel build test and publish (#595) in r0.4.0 by @chtruong814 :: PR: #596
cp: Re-enable onnx test (#597) in r0.4.0 by @chtruong814 :: PR: #598
cp: ci: Update release-docs workflow to use FW-CI-templates v0.72.0 (599) into r0.4.0 by @ko3n1g :: PR: #601
cp: ci: Update release workflows to include changelog and docs (604) into r0.4.0 by @ko3n1g :: PR: #607
cp: build: Remove torchao (606) into r0.4.0 by @ko3n1g :: PR: #610
cp: build: Upgrade vllm to 0.14.1 (#609) into r0.4.0 by @chtruong814 :: PR: #611
docs: Update docs for 0.4.0 by @chtruong814 :: PR: #612
cp: Update CI docker image and set vllm eager enforce_eager to False (614) into r0.4.0 by @svcnvidia-nemo-ci :: PR: #617
docs: Update docs version for 0.4.0 release by @chtruong814 :: PR: #620
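
Each changelog entry above follows the pattern "title by @author :: PR: #number". For readers who want to process these lists, here is a minimal sketch of a parser for that line format; `Entry`, `ENTRY_RE`, and `parse_entry` are illustrative names and not part of Export-Deploy.

```python
import re
from typing import NamedTuple, Optional

# Matches lines such as:
#   "Add vllm support for mbridge by @oyilmaz-nvidia :: PR: #555"
ENTRY_RE = re.compile(r"^(?P<title>.+) by @(?P<author>\S+) :: PR: #(?P<pr>\d+)$")


class Entry(NamedTuple):
    title: str
    author: str
    pr: int


def parse_entry(line: str) -> Optional[Entry]:
    """Return the parsed entry, or None if the line is not a changelog entry."""
    match = ENTRY_RE.match(line.strip())
    if match is None:
        return None
    return Entry(match["title"], match["author"], int(match["pr"]))
```

Cherry-pick entries keep the original PR number inside the title (for example "(#595)"), so the `pr` field always refers to the PR that landed on the release branch.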

Contributors

pablo-garay, meatybobby, and 6 other contributors

15 Dec 23:36

chtruong814

v0.3.1

44a30f0

NVIDIA NeMo-Export-Deploy 0.3.1

Fix vLLM top_p parameter handling in HuggingFace Ray deployment (#524)
Pin peft dependency to <0.14.0 for compatibility (#524)
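
The release notes do not say how top_p handling was fixed. vLLM rejects top_p values outside the range (0, 1], so one common approach is to coerce the value before it reaches the engine. The sketch below shows that idea under stated assumptions; `normalize_top_p` is a hypothetical helper and is not the code from #524.

```python
from typing import Optional


def normalize_top_p(top_p: Optional[float]) -> float:
    """Hypothetical helper: coerce top_p into the (0, 1] range."""
    # Assumption: a missing or non-positive value means "nucleus sampling
    # disabled", which corresponds to top_p = 1.0.
    if top_p is None or top_p <= 0.0:
        return 1.0
    # Values above 1.0 are clamped to the upper bound.
    return min(float(top_p), 1.0)
```

With this helper, a client that sends top_p=0.0 to mean "no nucleus sampling" results in 1.0 being passed on, and in-range values such as 0.9 pass through unchanged.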

04 Dec 00:55

chtruong814

v0.3.0

2cdaf51

NVIDIA NeMo-Export-Deploy 0.3.0

Update TensorRT-LLM export to use the NeMo->HF->TensorRT-LLM export path.
Add chat template support for VLM deployment.
Bug fixes and folder renames (for example, nlp to llm).

22 Oct 23:36

chtruong814

v0.2.1

950000c

NVIDIA NeMo-Export-Deploy 0.2.1

Bug fixes for HuggingFace model deployment (#459)
- Fixed HuggingFace deployable implementations for both Triton and Ray Serve backends
- Improved tokenizer handling in HuggingFace deployment scripts
Minor fixes for Ray deployment (#464)
- Additional bug fixes in Ray deployment utilities

09 Oct 20:01

chtruong814

v0.2.0

726695b

NVIDIA NeMo-Export-Deploy 0.2.0

Megatron-LM and Megatron-Bridge model deployment support with Triton Inference Server and Ray Serve
Multi-node, multi-instance Ray Serve deployment for NeMo 2, Megatron-Bridge, and Megatron-LM models
Update vLLM export to use NeMo->HF->vLLM export path
Multi-Modal deployment for NeMo 2 models with Triton Inference Server
NeMo Retriever Text Reranking ONNX and TensorRT export support

18 Aug 06:32

chtruong814

v0.2.0rc2

7867110

NVIDIA NeMo-Export-Deploy 0.2.0rc2 Pre-release

Prerelease: NVIDIA NeMo-Export-Deploy 0.2.0rc2 (2025-08-18)

15 Aug 08:24

chtruong814

v0.1.1

ca72da9

NVIDIA NeMo-Export-Deploy 0.1.1

ci: Mock DCO check

Signed-off-by: oliver könig <okoenig@nvidia.com>

14 Aug 15:54

chtruong814

v0.2.0rc1

62485cc

NVIDIA NeMo-Export-Deploy 0.2.0rc1 Pre-release

Prerelease: NVIDIA NeMo-Export-Deploy 0.2.0rc1 (2025-08-14)

03 Aug 16:48

chtruong814

v0.2.0rc0

657c525

NVIDIA NeMo-Export-Deploy 0.2.0rc0 Pre-release

Prerelease: NVIDIA NeMo-Export-Deploy 0.2.0rc0 (2025-08-03)
