Bxyu/swe rl dev 20260120 #599

Closed
bxyu-nvidia wants to merge 352 commits into main from bxyu/swe-rl-dev-20260120

@bxyu-nvidia
Contributor

No description provided.

bxyu-nvidia and others added 30 commits November 3, 2025 10:55
Signed-off-by: Brian Yu <bxyu@nvidia.com>
This PR emits a warning for every server that fails validation and prints
the malformed or incorrect parts of its config. It also adds a new
optional env variable `error_on_almost_servers` that causes a ValueError
to be raised.

Currently, during gym spinup, any server that fails validation is
silently dropped, and only a generic AssertionError is raised.
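
A minimal sketch of the behavior described above, not the PR's actual implementation. The function name `validate_servers`, the `required_keys` tuple, and the shape of the config dict are all hypothetical; only the `error_on_almost_servers` name and the warn-or-raise behavior come from the description.

```python
import warnings


def validate_servers(configs, required_keys=("entrypoint", "port"),
                     error_on_almost_servers=False):
    """Return the configs that pass validation; warn about (or reject) the rest.

    `required_keys` is a hypothetical stand-in for the real validation rules.
    """
    valid, problems = {}, {}
    for name, cfg in configs.items():
        missing = [key for key in required_keys if key not in cfg]
        if missing:
            problems[name] = missing
        else:
            valid[name] = cfg

    # Surface every failing server and what is wrong with it,
    # instead of dropping it silently.
    for name, missing in problems.items():
        warnings.warn(f"Server {name!r} failed validation: missing keys {missing}")

    # Opt-in strict mode: fail loudly with the details attached.
    if problems and error_on_almost_servers:
        raise ValueError(f"Servers failed validation: {problems}")
    return valid
```

With the flag left at its default, the caller gets back only the valid servers plus one warning per failing server; with the flag set, the same input raises a ValueError that names the failing servers.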

---------

Signed-off-by: Frankie Siino <fsiino@nvidia.com>
### Mini-SWE-Agent Environment

This PR adds a new agent that integrates the Mini-SWE-Agent harness for
evaluating and training language models on software engineering tasks.
The agent uses the SWE-Gym dataset (2,438 training instances from 11
Python repos) and SWE-bench Verified (500 human-validated test
instances) to solve real-world GitHub issues in containerized
environments (Docker/Singularity).

---------

Signed-off-by: Sugam Devare <sdevare@nvidia.com>
This PR changes the file structure of the `resources_servers/` folder to
place example-only servers in the `/examples` subdirectory. It also
changes the tables in the README to more clearly distinguish these
servers.

---------

Signed-off-by: Frankie Siino <fsiino@nvidia.com>
This change adds a simple verification process for contributing resource
servers.

---------

Signed-off-by: Frankie Siino <fsiino@nvidia.com>
Signed-off-by: Brian Yu <bxyu@nvidia.com>
Signed-off-by: Khushi Bhardwaj <kbhardwaj@nvidia.com>
**Changes:**
- simplify product description
- added reference to broader NeMo software suite

---------

Signed-off-by: Chris Wing <cwing@nvidia.com>
- Update description to match README
- Change development status from Production/Stable to Beta
- Update keywords to be RL/LLM-focused
- Fix Python version classifier from 3.10 to 3.12
- Remove irrelevant classifiers (IT audience, Image Recognition,
Utilities)
- Correct authors and maintainers fields

Signed-off-by: Chris Wing <cwing@nvidia.com>
Update the contribution requirements for models. Remove Qwen 235B
requirement, and keep only Qwen 3 30B for reward profiling and training.

Signed-off-by: banghuaz <banghuaz@nvidia.com>
Per OSRB requests, we need to update all license headers.

Signed-off-by: banghuaz <banghuaz@nvidia.com>
This change adds support for Hugging Face dataset management
(upload/download/delete GitLab artifact(s)).

Addresses items 2 and 3 from
#81

---------

Signed-off-by: Frankie Siino <fsiino@nvidia.com>
#306)

updates for site config only

---------

Signed-off-by: Lawrence Lane <llane@nvidia.com>
Signed-off-by: Brian Yu <bxyu@nvidia.com>
Co-authored-by: Brian Yu <bxyu@nvidia.com>
Signed-off-by: Lawrence Lane <llane@nvidia.com>
- Added quickstart as the top-most section of the home page.
- Added home <self> to the TOC, as the evaluator docs do, so "home" is the
first page in the sidebar.

I keep hitting the problem where resources don't close, so I added a
final cleanup step. LMK what you'd like changed.

---------

Signed-off-by: Lawrence Lane <llane@nvidia.com>
Signed-off-by: Brian Yu <bxyu@nvidia.com>
Co-authored-by: Brian Yu <bxyu@nvidia.com>
Credit to @lbliii and see #257

---------

Signed-off-by: Brian Yu <bxyu@nvidia.com>
)

Credit to @lbliii and see #257

---------

Signed-off-by: Brian Yu <bxyu@nvidia.com>
Update resource server names to the following:

library_judge_math -> math_with_judge
multiverse_math_hard -> math_advanced_calculations
comp_coding -> code_gen
python_math_exec -> math_with_code
workbench -> workspace_assistant
multineedle -> example_multi_step
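
For anyone updating local configs or scripts, the renames above can be applied mechanically. The sketch below is illustrative only: the mapping is copied from the list, while the helper `migrate_path` is a hypothetical name and not part of the repository.

```python
# Old -> new resource server names, copied from the list above.
RENAMES = {
    "library_judge_math": "math_with_judge",
    "multiverse_math_hard": "math_advanced_calculations",
    "comp_coding": "code_gen",
    "python_math_exec": "math_with_code",
    "workbench": "workspace_assistant",
    "multineedle": "example_multi_step",
}


def migrate_path(path: str) -> str:
    """Rewrite old server names in a '/'-separated path.

    Only whole path segments are replaced, so a file such as
    'workbench.yaml' is left untouched.
    """
    return "/".join(RENAMES.get(segment, segment) for segment in path.split("/"))
```

For example, `migrate_path("resources_servers/library_judge_math")` yields `resources_servers/math_with_judge`.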

---------

Signed-off-by: banghuaz <banghuaz@nvidia.com>
Co-authored-by: Brian Yu <bxyu@nvidia.com>
Credit to @lbliii and see #257

---------

Signed-off-by: Brian Yu <bxyu@nvidia.com>
…prompts (#328)

Signed-off-by: Abhibha Gupta <abhibhag@nvidia.com>
Signed-off-by: banghuaz <banghuaz@nvidia.com>
Signed-off-by: Brian Yu <bxyu@nvidia.com>
Signed-off-by: Brian Yu <bxyu@nvidia.com>
- Alters the table metadata for the README with more info for the user. 
- Temporarily hides the verified column
- Refactor

---------

Signed-off-by: Frankie Siino <fsiino@nvidia.com>
Remove verification from contributing doc.

---------

Signed-off-by: Frankie Siino <fsiino@nvidia.com>
Signed-off-by: Chris Wing <cwing@nvidia.com>
Signed-off-by: Brian Yu <bxyu@nvidia.com>
Signed-off-by: Brian Yu <bxyu@nvidia.com>
#355)

- Updated config path in rl-training-with-nemo-rl.md tutorial
- Updated HuggingFace dataset naming example in how-to-faq.md
- Corrects path from resources_servers/library_judge_math to
resources_servers/math_with_judge

Signed-off-by: Shashank Verma <shashank3959@gmail.com>
Update secret detector to work with forks

Signed-off-by: Charlie Truong <chtruong@nvidia.com>
Signed-off-by: Brian Yu <bxyu@nvidia.com>
Signed-off-by: Brian Yu <bxyu@nvidia.com>
Signed-off-by: Brian Yu <bxyu@nvidia.com>
Signed-off-by: Brian Yu <bxyu@nvidia.com>
Signed-off-by: Brian Yu <bxyu@nvidia.com>
Signed-off-by: Brian Yu <bxyu@nvidia.com>
Signed-off-by: Brian Yu <bxyu@nvidia.com>
Signed-off-by: Brian Yu <bxyu@nvidia.com>
Signed-off-by: Brian Yu <bxyu@nvidia.com>
Signed-off-by: Brian Yu <bxyu@nvidia.com>
Signed-off-by: Brian Yu <bxyu@nvidia.com>
Signed-off-by: Brian Yu <bxyu@nvidia.com>
Signed-off-by: Brian Yu <bxyu@nvidia.com>
Signed-off-by: Brian Yu <bxyu@nvidia.com>
Signed-off-by: Brian Yu <bxyu@nvidia.com>
Signed-off-by: Brian Yu <bxyu@nvidia.com>
Signed-off-by: Brian Yu <bxyu@nvidia.com>
Signed-off-by: Brian Yu <bxyu@nvidia.com>
Signed-off-by: Brian Yu <bxyu@nvidia.com>
Signed-off-by: Brian Yu <bxyu@nvidia.com>
Signed-off-by: Brian Yu <bxyu@nvidia.com>
Signed-off-by: Brian Yu <bxyu@nvidia.com>
Signed-off-by: Brian Yu <bxyu@nvidia.com>
Signed-off-by: Brian Yu <bxyu@nvidia.com>
Signed-off-by: Brian Yu <bxyu@nvidia.com>
vadam5 closed this on Mar 10, 2026
vadam5 force-pushed the bxyu/swe-rl-dev-20260120 branch from 40874f8 to c7099bb on March 10, 2026 23:08
@vadam5
Contributor

vadam5 commented Mar 11, 2026

Sorry folks, this PR was closed by mistake when someone on our side force-pushed diverging refs to GitHub. We are working to remedy this and re-open the PR.

vadam5 mentioned this pull request on Mar 11, 2026
@vadam5
Contributor

vadam5 commented Mar 11, 2026

Replacement PR opened here: #873
