Skip to content

Add IP Address Management#4046

Draft
VishalDalwadi wants to merge 1 commit into
developfrom
feat/ipam
Draft

Add IP Address Management#4046
VishalDalwadi wants to merge 1 commit into
developfrom
feat/ipam

Conversation

@VishalDalwadi

Copy link
Copy Markdown
Collaborator

Describe your changes

Provide Issue ticket number if applicable/not in title

Provide testing steps

Checklist before requesting a review

  • My changes affect only 10 files or less.
  • I have performed a self-review of my code and tested it.
  • If it is a new feature, I have added thorough tests, my code is <= 1450 lines.
  • If it is a bugfix, my code is <= 200 lines.
  • My functions are <= 80 lines.
  • I have had my code reviewed by a peer.
  • My unit tests pass locally.
  • Netmaker is awesome.

@tenki-reviewer

tenki-reviewer Bot commented Jun 5, 2026

Copy link
Copy Markdown
Contributor

Review Complete

Files Reviewed: 2
Findings: 1

By Severity:

  • 🟠 High: 1

The new IPAM package introduces a territory-boundary bug in both IPv4 and IPv6 cursor-advance loops: when an IP from one device-type territory is deallocated and the opposite-type cursor advances far enough to reach it, the freed IP is silently allocated to the wrong device type, causing the same address to be handed to both a node and an extclient simultaneously.

Files Reviewed (2 files)
ipam/ipam.go
ipam/ipam_test.go

@tenki-reviewer tenki-reviewer Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Risk: 🟠 High (70/100) — 1 high finding · 1277 LOC across 2 files


Overview

This PR introduces a new ipam package providing stateful, dual-territory IPv4 and IPv6 address allocators. The overall design is sound (two cursors approaching from opposite ends, a shared sorted freelist with territory guards) and the test coverage is broad. However, there is a high-severity logic bug in both the IPv4 and IPv6 cursor-advance loops that breaks the core territory-isolation guarantee.

Bug: Cursor Walk Ignores Opposite-Territory Boundary

Affected code

  • ipam/ipam.goIPv4Allocator.AllocateNode (lines 160–176)
  • ipam/ipam.goIPv4Allocator.AllocateExtclient (lines 197–213)
  • ipam/ipam.goIPv6Allocator.AllocateNode (lines 383–407)
  • ipam/ipam.goIPv6Allocator.AllocateExtclient (lines 430–454)

Root Cause

The freelist recycling path correctly uses the cursor as a territory guard (!a.nodeCursor.Less(a.freelist[0]) etc.). However, the cursor-advance (walk) path only checks a.allocated[addr] — it does not check whether addr has crossed into the opposite territory.

When Deallocate is called, the IP is removed from a.allocated and inserted into the freelist. If the advancing cursor subsequently reaches that freed IP, it is allocated to the wrong device type — the freelist guard only fires at the start of the method based on the front/back of the list, not during the cursor walk.

Concrete Reproduction (IPv4, 10.0.0.0/28)

  1. Allocate nodes .1.7 (nodeCursor=.7); allocate extclients .14.8 (extCursor=.8).
  2. Deallocate extclient .13.13 removed from allocated; freelist = [.13].
  3. Call AllocateNode:
    • Freelist guard: freelist[0]=.13 > nodeCursor=.7skips freelist (correct).
    • Cursor walk: .8 (allocated), .9, .10, .11, .12 (all allocated), .13not in allocated → returned as a node IP. extCursor stays at .8.
  4. .13 is now assigned to a node; the next AllocateExtclient will also recycle .13 from the freelist, creating a duplicate assignment.

The symmetric bug hits AllocateExtclient when node IPs are freed and the extclient cursor retreats past them. Both IPv4 and IPv6 are affected identically.

Fix

Add a territory-boundary guard inside each cursor-advance loop before the allocated lookup:

// In AllocateNode — stop if addr has reached extclient territory
if !addr.Less(a.extCursor) {
    return netip.Addr{}, ErrExhausted
}

// In AllocateExtclient — stop if addr has retreated into node territory
if !a.nodeCursor.Less(addr) {
    return netip.Addr{}, ErrExhausted
}

Why Tests Miss This

TestIPv4CursorsMeetFreelistRecycle frees node IPs .1, .2 and extclient IPs .14, .13, then calls AllocateNode exactly twice — both calls recycle via the freelist (.1 and .2 satisfy <= nodeCursor=.7). The test never calls AllocateNode a third time with the extclient IPs still in the freelist, which is where the bug manifests. A regression test should be added.

Comment thread ipam/ipam.go
Comment on lines +160 to +176
if errors.Is(err, iplib.ErrAddressOutOfRange) || errors.Is(err, iplib.ErrBroadcastAddress) {
return netip.Addr{}, ErrExhausted
}
cursor = next
addr := netIPv4ToAddr(cursor)
if _, taken := a.allocated[addr]; !taken {
a.nodeCursor = addr
a.allocated[addr] = struct{}{}
return addr, nil
}
}
}

// AllocateExtclient picks the next available IP from the high end of the
// subnet and marks it as allocated. It first attempts to reuse a freed IP
// from the back of the free list, but only if that IP lies within the
// extclient territory (i.e., it is not below the current extclient cursor).

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟠 Cursor-advance loop allocates freed opposite-territory IPs to the wrong device type (bug)

In both IPv4Allocator and IPv6Allocator, the AllocateNode and AllocateExtclient cursor-advance loops skip IPs that are present in a.allocated. When Deallocate removes an IP from allocated and places it on the freelist, the cursor walk can reach and allocate that IP regardless of which territory originally owned it.

IPv4 example (10.0.0.0/28): allocate nodes .1.7 and extclients .14.8. Deallocate extclient .13 (removed from allocated, freelist = [.13]). Call AllocateNode: freelist guard fires (freelist[0]=.13 > nodeCursor=.7) and correctly skips the freelist. Cursor walk then visits .8.12 (all in allocated), hits .13 (not in allocated), and returns it as a node IP. extCursor remains .8, so .13 is now simultaneously a live extclient-territory address and an allocated node address — a duplicate assignment.

The symmetric bug exists in AllocateExtclient when node IPs are freed and the extclient cursor retreats past them. Both the IPv4 and IPv6 variants have identical cursor-walk code without this guard.

💡 Suggestion: Add a territory-boundary check inside each cursor-advance loop, immediately after computing addr and before the allocated lookup:

  • In AllocateNode (IPv4 and IPv6): if !addr.Less(a.extCursor) { return netip.Addr{}, ErrExhausted }
  • In AllocateExtclient (IPv4 and IPv6): if !a.nodeCursor.Less(addr) { return netip.Addr{}, ErrExhausted }

This ensures neither cursor can walk into the other's territory even when IPs from that territory have been freed (and are thus absent from allocated).

📋 Prompt for AI Agents

In ipam/ipam.go, the cursor-advance loops in AllocateNode and AllocateExtclient do not enforce the territory boundary, causing freed opposite-territory IPs to be allocated to the wrong device type. Fix by adding boundary guards in four places:

  1. IPv4Allocator.AllocateNode (around line 164-165): after addr := netIPv4ToAddr(cursor), add before the allocated-map check:
if !addr.Less(a.extCursor) {
    return netip.Addr{}, ErrExhausted
}
  1. IPv4Allocator.AllocateExtclient (around line 206-207): after addr := netIPv4ToAddr(cursor), add:
if !a.nodeCursor.Less(addr) {
    return netip.Addr{}, ErrExhausted
}
  1. IPv6Allocator.AllocateNode (around line 391-393): after addr := netIPv6ToAddr(cursor) and before the addr == a.subnetMax check, add:
if !addr.Less(a.extCursor) {
    return netip.Addr{}, ErrExhausted
}
  1. IPv6Allocator.AllocateExtclient (around line 444-446): after addr := netIPv6ToAddr(cursor) and before the addr == a.networkAddr check, add:
if !a.nodeCursor.Less(addr) {
    return netip.Addr{}, ErrExhausted
}

Also add a regression test in ipam_test.go: on 10.0.0.0/28, allocate nodes .1-.7 (nodeCursor=.7) and extclients .14-.8 (extCursor=.8), then deallocate extclient .13, then call AllocateNode and assert it returns ErrExhausted rather than .13.

@abhishek9686 abhishek9686 marked this pull request as draft June 6, 2026 05:53
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant