Skip to content

fix: include port 65535 in free port search#2040

Closed
nightcityblade wants to merge 1 commit into
NVIDIA-NeMo:mainfrom
nightcityblade:fix/issue-2023
Closed

fix: include port 65535 in free port search#2040
nightcityblade wants to merge 1 commit into
NVIDIA-NeMo:mainfrom
nightcityblade:fix/issue-2023

Conversation

@nightcityblade
Copy link
Copy Markdown
Contributor

Description

Closes #2023.

Updates get_free_port() to include port 65535 in the search range, matching the valid TCP port range and the existing error message. Adds a focused regression test for starting the search at 65535.

Usage

from nemo_curator.core.utils import get_free_port

port = get_free_port(65535)

Checklist

  • I am familiar with the Contributing Guide.
  • New or Existing tests cover these changes.
  • The documentation is up to date with these changes.

Testing

  • uv run --python 3.12 --group linting ruff check nemo_curator/core/utils.py tests/core/test_utils.py — passed
  • uv run --python 3.12 --group test python -m pytest tests/core/test_utils.py — blocked on macOS because nemo_curator raises ValueError on non-Linux systems during import
  • Direct boundary check with get_free_port(65535) and a fake socket — passed

Signed-off-by: nightcityblade <nightcityblade@gmail.com>
@nightcityblade nightcityblade requested a review from a team as a code owner May 30, 2026 15:12
@nightcityblade nightcityblade requested review from praateekmahajan and removed request for a team May 30, 2026 15:12
@copy-pr-bot
Copy link
Copy Markdown

copy-pr-bot Bot commented May 30, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps Bot commented May 30, 2026

Greptile Summary

This PR fixes an off-by-one error in get_free_port where range(start_port, 65535) excluded the last valid TCP port (65535), making the function unable to return it even though the error message already referenced 65535 as the upper bound.

  • nemo_curator/core/utils.py: Upper bound of range() changed from 65535 to 65536, so port 65535 is now reachable; the existing error message "No free port found between {start_port} and 65535" is now consistent with the actual search range.
  • tests/core/test_utils.py: Adds a regression test using a FakeSocket that only allows binding on port 65535, directly exercising the boundary case and confirming the loop now reaches it.

Confidence Score: 5/5

Safe to merge — the change is a minimal, well-targeted one-character fix with a matching regression test.

The fix is a single-integer change to the range upper bound, the error message was already consistent with the intended behavior, the Ray worker port exclusion range (10002–19999) is far from 65535 so the filter does not interfere, and the new test directly exercises the boundary via a focused fake socket.

No files require special attention.

Important Files Changed

Filename Overview
nemo_curator/core/utils.py Single off-by-one fix: range upper bound changed from 65535 to 65536 so port 65535 is now included in the search, matching the existing error message and the valid TCP port range.
tests/core/test_utils.py Adds a focused regression test using a monkeypatched FakeSocket that only accepts binding on port 65535, correctly exercising the boundary case introduced by the fix.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A["get_free_port(start_port)"] --> B["for port in range(start_port, 65536)"]
    B --> C{Port in Ray worker\nrange 10002–19999?}
    C -- Yes --> B
    C -- No --> D["socket.bind('localhost', port)"]
    D -- Success --> E["return port"]
    D -- OSError --> F{get_next_free_port\nAND port == start_port?}
    F -- Yes, first port --> G["raise RuntimeError\n(port in use)"]
    F -- No --> B
    B -- range exhausted --> H["raise RuntimeError\n(no free port 65535)"]
Loading

Reviews (1): Last reviewed commit: "fix: include port 65535 in free port sea..." | Re-trigger Greptile

@svcnvidia-nemo-ci svcnvidia-nemo-ci added the waiting-on-maintainers Waiting on maintainers to respond label Jun 1, 2026
@sarahyurick
Copy link
Copy Markdown
Contributor

Closing in favor of #2024 which was opened first.

@sarahyurick sarahyurick closed this Jun 2, 2026
@svcnvidia-nemo-ci svcnvidia-nemo-ci removed the waiting-on-maintainers Waiting on maintainers to respond label Jun 2, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

fix: get_free_port() off-by-one — port 65535 never checked despite appearing in error message

3 participants