Skip to content

fix(utils): include port 65535 in get_free_port() scan range#2024

Open
miclaldogan wants to merge 5 commits into
NVIDIA-NeMo:mainfrom
miclaldogan:fix/get-free-port-off-by-one
Open

fix(utils): include port 65535 in get_free_port() scan range#2024
miclaldogan wants to merge 5 commits into
NVIDIA-NeMo:mainfrom
miclaldogan:fix/get-free-port-off-by-one

Conversation

@miclaldogan
Copy link
Copy Markdown

Summary

  • range(start_port, 65535) in get_free_port() excluded port 65535, the last valid TCP port
  • Changed upper bound to 65536 so the full valid range 0–65535 is covered
  • The existing error message already referenced 65535, making the off-by-one visible

Change

- for port in range(start_port, 65535):
+ for port in range(start_port, 65536):

Test plan

  • Verify get_free_port(65535) now succeeds when port 65535 is free
  • Verify existing port allocation behavior is unchanged for normal port ranges

Closes #2023

@miclaldogan miclaldogan requested a review from a team as a code owner May 22, 2026 23:36
@miclaldogan miclaldogan requested review from VibhuJawa and removed request for a team May 22, 2026 23:36
@copy-pr-bot
Copy link
Copy Markdown

copy-pr-bot Bot commented May 22, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps Bot commented May 22, 2026

Greptile Summary

This PR fixes a classic off-by-one error in get_free_port() where range(start_port, 65535) silently excluded port 65535 — the last valid TCP port. The error message already referenced 65535 as the upper bound, making the inconsistency visible.

  • Off-by-one fix: Upper bound changed from 65535 to 65536 so range() now covers the full valid port space through 65535.
  • Error message consistency: The existing "No free port found between {start_port} and 65535" message on line 118 remains accurate since the scan now correctly reaches 65535.

Confidence Score: 5/5

Minimal, correct single-line fix with no side effects on the surrounding logic.

The change is a single range upper-bound adjustment that makes the function's behavior consistent with its own error message. All existing port-scanning logic, error handling, and Ray-reserved-port exclusion remain untouched.

No files require special attention.

Important Files Changed

Filename Overview
nemo_curator/core/utils.py One-line fix to include port 65535 in the scan range by changing upper bound from 65535 to 65536; no other logic changes.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A["get_free_port(start_port)"] --> B["for port in range(start_port, 65536)"]
    B --> C{Port in Ray reserved range?}
    C -- Yes --> B
    C -- No --> D["Try socket.bind(localhost, port)"]
    D -- Success --> E["Return port"]
    D -- OSError --> F{get_next_free_port and port == start_port?}
    F -- "False (strict mode)" --> G["Raise RuntimeError: Port in use"]
    F -- True --> B
    B -- "Exhausted (was: stopped at 65534, now: stops at 65535)" --> H["Raise RuntimeError: No free port found"]
Loading

Reviews (6): Last reviewed commit: "Merge branch 'main' into fix/get-free-po..." | Re-trigger Greptile

range(start_port, 65535) excluded the last valid TCP port (65535).
Changed upper bound to 65536 so the loop covers the full valid range
0-65535, consistent with the error message that already referenced 65535.

Closes NVIDIA-NeMo#2023

Signed-off-by: miclaldogan <m.iclaldogan@gmail.com>
@miclaldogan miclaldogan force-pushed the fix/get-free-port-off-by-one branch from 78f48e1 to 14e3deb Compare May 22, 2026 23:38
@svcnvidia-nemo-ci svcnvidia-nemo-ci added the waiting-on-maintainers Waiting on maintainers to respond label May 25, 2026
Copy link
Copy Markdown
Contributor

@VibhuJawa VibhuJawa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@svcnvidia-nemo-ci svcnvidia-nemo-ci added waiting-on-customer Waiting on the original author to respond waiting-on-maintainers Waiting on maintainers to respond and removed waiting-on-maintainers Waiting on maintainers to respond waiting-on-customer Waiting on the original author to respond labels May 28, 2026
@sarahyurick
Copy link
Copy Markdown
Contributor

/ok to test 6ecacb8

@sarahyurick
Copy link
Copy Markdown
Contributor

/ok to test 954a7eb

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

community-request waiting-on-customer Waiting on the original author to respond

Projects

None yet

Development

Successfully merging this pull request may close these issues.

fix: get_free_port() off-by-one — port 65535 never checked despite appearing in error message

4 participants