feat: add CompactVec collection with auto-spill support by nberlette · Pull Request #4 · nberlette/moos

nberlette · 2026-01-02T06:16:35Z

No description provided.

Copilot

Pull request overview

This PR adds a new CompactVec<T, N> collection type that optimizes memory usage by storing up to N elements inline before automatically spilling to heap-allocated storage. This complements the existing string-focused types (CowStr and InlineStr) by providing a similar optimization pattern for generic vector collections.

Key changes:

Implements CompactVec<T, N> with automatic inline-to-heap transition
Provides comprehensive trait implementations (Clone, Debug, Index, Iterator, etc.)
Includes Serde support when the feature is enabled

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 11 comments.

File	Description
src/lib.rs	Adds the new `compact_vec` module and re-exports its types
src/compact_vec.rs	Complete implementation of `CompactVec` with 739 lines including documentation, core logic, trait implementations, and comprehensive tests

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copilot · 2026-01-02T06:20:24Z

src/compact_vec.rs

+
+impl<T: fmt::Debug, const N: usize> fmt::Debug for CompactVec<T, N> {
+  fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
+    write!(f, "CompactVec<{N}> {s:?}", s = self.as_slice())


The Debug format string uses a named parameter s but also includes the const generic N directly in the format string using the inline constant syntax. While this works, it would be more consistent to format the capacity value explicitly, especially since it's available via self.capacity(). Consider using a more standard Debug format that shows whether the data is inline or heap-allocated, which would be more informative for debugging.

Suggested change

write!(f, "CompactVec<{N}> {s:?}", s = self.as_slice())

let storage = if self.heap.is_some() { "heap" } else { "inline" };

f.debug_struct("CompactVec")

.field("len", &self.len)

.field("capacity", &self.capacity())

.field("storage", &storage)

.field("elements", &self.as_slice())

.finish()

Copilot · 2026-01-02T06:20:24Z

src/compact_vec.rs

+        fn visit_seq<A>(self, mut seq: A) -> Result<Self::Value, A::Error>
+        where
+          A: SeqAccess<'de>,
+        {
+          let mut vec = CompactVec::new();
+          while let Some(value) = seq.next_element::<T>()? {
+            vec.push(value);
+          }
+          Ok(vec)
+        }


The Deserialize implementation doesn't attempt to use the size hint from the deserializer to pre-allocate capacity. When deserializing sequences with known lengths (like JSON arrays), this could be used to avoid multiple reallocations. Consider using seq.size_hint() to create the CompactVec with appropriate initial capacity via with_capacity().

Copilot · 2026-01-02T06:20:24Z

src/compact_vec.rs

+      Some(ref mut heap) => {
+        // Dropping the Vec will drop its contents automatically.
+        heap.clear();


The explicit call to heap.clear() is redundant. When the Vec is dropped (which happens automatically at the end of the function), it will drop all its contents. This extra clear() call adds unnecessary overhead without any benefit.

Suggested change

Some(ref mut heap) => {

// Dropping the Vec will drop its contents automatically.

heap.clear();

Some(_) => {

// Dropping the Vec will drop its contents automatically; no need to clear.

Copilot · 2026-01-02T06:20:25Z

src/compact_vec.rs

+      Some(heap) => heap,
+      None => {
+        let mut vec = Vec::with_capacity(self.len);
+        for i in 0..self.len {
+          unsafe {
+            vec.push(self.inline[i].assume_init_read());
+          }
+        }


After taking the heap Vec or reading inline elements, the Drop implementation will still run on self. Since heap is taken via take() (leaving it as None), the Drop impl will attempt to drop inline elements based on self.len. However, those elements have already been moved via assume_init_read(). This could lead to double-free or use-after-free. Consider using mem::forget(self) after moving the elements to prevent the Drop implementation from running, or set self.len = 0 before returning.

Suggested change

Some(heap) => heap,

None => {

let mut vec = Vec::with_capacity(self.len);

for i in 0..self.len {

unsafe {

vec.push(self.inline[i].assume_init_read());

}

}

Some(heap) => {

// All elements live in `heap`. After taking it, ensure `Drop`

// sees an empty inline vector.

self.len = 0;

heap

}

None => {

// All elements currently live inline; move them into a Vec.

let len = self.len;

let mut vec = Vec::with_capacity(len);

for i in 0..len {

unsafe {

vec.push(self.inline[i].assume_init_read());

}

}

// We've moved out all initialized inline elements; prevent

// `Drop` from attempting to drop them again.

self.len = 0;

Copilot · 2026-01-02T06:20:25Z

src/compact_vec.rs

+impl<T, I: Into<usize>, const N: usize> Index<I> for CompactVec<T, N> {
+  type Output = T;
+  fn index(&self, index: I) -> &Self::Output {
+    let index = index.into();
+    match self.heap {
+      Some(ref heap) => &heap[index],
+      None => {
+        assert!(index < self.len, "index out of bounds");
+        // SAFETY: index < len ensures the element is initialized.
+        unsafe { &*self.inline[index].as_ptr() }
+      }
+    }
+  }
+}


The Index and IndexMut trait implementations accept I: Into<usize> as a generic bound, which is broader than the standard library's approach. This allows types like u8, u16, u32, etc. to be used as indices. However, this is inconsistent with how standard Rust collections work (which implement Index for specific types like usize, Range<usize>, etc.). This could lead to confusion and unexpected behavior. Consider following the standard library pattern by implementing Index for specific index types.

Copilot · 2026-01-02T06:20:26Z

src/compact_vec.rs

+  pub fn extend<I: IntoIterator<Item = T>>(&mut self, iter: I) {
+    for item in iter {
+      self.push(item);
+    }
+  }


The extend method pushes items one at a time without pre-allocating space when the iterator provides a size hint. This can lead to multiple reallocations when extending with a large iterator. Consider checking if the iterator implements ExactSizeIterator or provides a lower bound via size_hint(), and pre-allocating capacity accordingly to improve performance.

Copilot · 2026-01-02T06:20:26Z

src/compact_vec.rs

+  pub const fn len(&self) -> usize {
+    match &self.heap {
+      Some(heap) => heap.len(),
+      None => self.len,
+    }
+  }


The len() method is marked as const, but it calls heap.len() which requires the Vec's len() method to be const. While this works in recent Rust versions (1.39+), the as_slice() and as_mut_slice() methods are also marked const and use more complex unsafe operations. Verify that all const operations here are actually valid in const contexts, as const support for these patterns may be limited in older Rust versions that this crate might target.

Copilot · 2026-01-02T06:20:26Z

src/compact_vec.rs

+  pub fn with_capacity(capacity: usize) -> Self {
+    if capacity <= N {
+      Self::new()
+    } else {
+      Self {
+        // SAFETY: uninitialized array is valid for inline storage.
+        inline: unsafe {
+          MaybeUninit::<[MaybeUninit<T>; N]>::uninit().assume_init()
+        },
+        len:    0,
+        heap:   Some(Vec::with_capacity(capacity)),
+      }
+    }
+  }


Missing test coverage for the with_capacity method when creating a heap-backed instance. While line 620-622 tests that with_capacity(12) creates a heap-backed instance, there's no test that verifies pushing elements into such a pre-allocated heap-backed vector works correctly, or that the capacity is actually used (no intermediate reallocations occur).

Copilot · 2026-01-02T06:20:26Z

src/compact_vec.rs

+          // previous capacity for amortized growth.
+          let mut vec = Vec::with_capacity(N * 2 + 1);


The spill capacity calculation uses N * 2 + 1 which is reasonable for most cases, but for N = 0, this results in a capacity of 1, which means immediate reallocation on the next push. While this works, consider using a more standard growth strategy like max(N * 2, 4) or max(N * 2, 8) to avoid frequent reallocations in the edge case where N is very small.

Suggested change

// previous capacity for amortized growth.

let mut vec = Vec::with_capacity(N * 2 + 1);

// previous capacity for amortized growth. Use a small

// minimum to avoid tiny allocations when N is very small.

let spill_capacity = if N == 0 { 4 } else { N * 2 };

let mut vec = Vec::with_capacity(spill_capacity);

Copilot · 2026-01-02T06:20:27Z

src/compact_vec.rs

+impl<T, const N: usize> CompactVec<T, N> {
+  /// Creates a new empty `CompactVec` with the specified inline
+  /// capacity. No heap allocation occurs until more than `N` elements
+  /// are pushed.
+  pub fn new() -> Self {
+    // SAFETY: An uninitialized array of `MaybeUninit<T>` is valid.
+    let inline =
+      unsafe { MaybeUninit::<[MaybeUninit<T>; N]>::uninit().assume_init() };
+    Self {
+      inline,
+      len: 0,
+      heap: None,
+    }
+  }
+
+  /// Creates a new `CompactVec` with enough capacity to hold at least
+  /// `capacity` elements without reallocation. If `capacity` is less
+  /// than or equal to `N`, no heap allocation will occur. Otherwise,
+  /// a `Vec` of the requested capacity will be allocated.
+  pub fn with_capacity(capacity: usize) -> Self {
+    if capacity <= N {
+      Self::new()
+    } else {
+      Self {
+        // SAFETY: uninitialized array is valid for inline storage.
+        inline: unsafe {
+          MaybeUninit::<[MaybeUninit<T>; N]>::uninit().assume_init()
+        },
+        len:    0,
+        heap:   Some(Vec::with_capacity(capacity)),
+      }
+    }
+  }
+
+  /// Returns the number of elements in the vector.
+  pub const fn len(&self) -> usize {
+    match &self.heap {
+      Some(heap) => heap.len(),
+      None => self.len,
+    }
+  }
+
+  /// Returns `true` if the vector contains no elements.
+  pub const fn is_empty(&self) -> bool {
+    self.len() == 0
+  }
+
+  /// Returns the total capacity of the vector. When stored inline,
+  /// this equals `N`; when stored on the heap, this delegates to the
+  /// internal `Vec`'s capacity.
+  pub fn capacity(&self) -> usize {
+    match &self.heap {
+      Some(heap) => heap.capacity(),
+      None => N,
+    }
+  }
+
+  /// Returns `true` if the data is currently stored inline (i.e.,
+  /// `len() <= N` and `heap` is `None`).
+  pub fn is_inline(&self) -> bool {
+    self.heap.is_none()
+  }
+
+  /// Provides an immutable slice of all elements in the vector.
+  pub const fn as_slice(&self) -> &[T] {
+    match &self.heap {
+      Some(heap) => heap.as_slice(),
+      None => {
+        // SAFETY: The first `len` elements of `inline` are
+        // initialized. We create a slice of that many elements.
+        unsafe {
+          core::slice::from_raw_parts(
+            self.inline.as_ptr() as *const T,
+            self.len,
+          )
+        }
+      }
+    }
+  }
+
+  /// Provides a mutable slice of all elements in the vector.
+  pub const fn as_mut_slice(&mut self) -> &mut [T] {
+    match &mut self.heap {
+      Some(heap) => heap.as_mut_slice(),
+      None => {
+        // SAFETY: The first `len` elements of `inline` are
+        // initialized. We create a mutable slice of that many
+        // elements. No aliasing occurs because either `heap` is
+        // `None` (so no other references exist) or we go into the
+        // `Some` branch above.
+        unsafe {
+          core::slice::from_raw_parts_mut(
+            self.inline.as_mut_ptr() as *mut T,
+            self.len,
+          )
+        }
+      }
+    }
+  }
+
+  /// Pushes a value onto the end of the vector. If the inline
+  /// storage is full, all existing elements are moved into a new
+  /// `Vec` and subsequent pushes are delegated to that vector.
+  pub fn push(&mut self, value: T) {
+    match self.heap {
+      Some(ref mut heap) => {
+        heap.push(value);
+      }
+      None => {
+        if self.len < N {
+          // SAFETY: We have capacity in `inline` at index `len`.
+          unsafe {
+            self.inline[self.len].as_mut_ptr().write(value);
+          }
+          self.len += 1;
+        } else {
+          // Spill to heap: allocate a new Vec with double the
+          // previous capacity for amortized growth.
+          let mut vec = Vec::with_capacity(N * 2 + 1);
+          // Move the existing inline elements into the Vec.
+          for i in 0..self.len {
+            // SAFETY: `i < len` so inline[i] is initialized.
+            unsafe {
+              vec.push(self.inline[i].assume_init_read());
+            }
+          }
+          vec.push(value);
+          self.heap = Some(vec);
+          // We no longer use the inline storage, so reset len.
+          self.len = 0;
+        }
+      }
+    }
+  }
+
+  /// Removes the last element from the vector and returns it, or
+  /// `None` if it is empty. If popping from a heap‑backed vector
+  /// results in a length that can be stored inline, the data is
+  /// automatically moved back into the inline storage to free the
+  /// heap allocation.
+  pub fn pop(&mut self) -> Option<T> {
+    match self.heap {
+      Some(ref mut heap) => {
+        let value = heap.pop();
+        if let Some(v) = value {
+          // If the remaining length fits into inline storage, move
+          // back onto the stack.
+          if heap.len() <= N {
+            let mut new_len = 0;
+            for elem in heap.drain(..) {
+              // SAFETY: We have ensured that `heap.len()`
+              // is less than or equal to `N`, so there is
+              // enough space in `inline` to store all
+              // remaining elements.
+              unsafe {
+                self.inline[new_len].as_mut_ptr().write(elem);
+              }
+              new_len += 1;
+            }
+            self.heap = None;
+            self.len = new_len;
+          }
+          Some(v)
+        } else {
+          None
+        }
+      }
+      None => {
+        if self.len == 0 {
+          None
+        } else {
+          self.len -= 1;
+          // SAFETY: `len` has been decremented, so the element
+          // at index `len` is initialized and can be read. After
+          // reading, we leave the memory uninitialized.
+          Some(unsafe { self.inline[self.len].assume_init_read() })
+        }
+      }
+    }
+  }
+
+  /// Clears the vector, removing all values. This resets the vector
+  /// back to an empty inline state, deallocating any heap storage.
+  pub fn clear(&mut self) {
+    match self.heap {
+      Some(ref mut heap) => {
+        heap.clear();
+        self.heap = None;
+        self.len = 0;
+      }
+      None => {
+        // Drop all inline elements
+        for i in 0..self.len {
+          unsafe {
+            self.inline[i].assume_init_drop();
+          }
+        }
+        self.len = 0;
+      }
+    }
+  }
+
+  /// Returns an iterator over the slice.
+  pub fn iter(&self) -> core::slice::Iter<'_, T> {
+    self.as_slice().iter()
+  }
+
+  /// Returns a mutable iterator over the slice.
+  pub fn iter_mut(&mut self) -> core::slice::IterMut<'_, T> {
+    self.as_mut_slice().iter_mut()
+  }
+
+  /// Extends the vector with the contents of an iterator. Items are
+  /// pushed individually, potentially causing a spill from inline to
+  /// heap if the total number of elements exceeds the inline capacity.
+  pub fn extend<I: IntoIterator<Item = T>>(&mut self, iter: I) {
+    for item in iter {
+      self.push(item);
+    }
+  }
+
+  /// Consumes the `CompactVec` and returns a standard `Vec<T>` with
+  /// identical contents. This performs at most one allocation and
+  /// moves all elements out of the inline storage if necessary.
+  pub fn into_vec(mut self) -> Vec<T> {
+    match self.heap.take() {
+      Some(heap) => heap,
+      None => {
+        let mut vec = Vec::with_capacity(self.len);
+        for i in 0..self.len {
+          unsafe {
+            vec.push(self.inline[i].assume_init_read());
+          }
+        }
+        vec
+      }
+    }
+  }
+}


The CompactVec API is missing common Vec methods like reserve(), reserve_exact(), shrink_to_fit(), truncate(), insert(), remove(), etc. While the current API provides basic functionality, users might expect these standard collection methods to be available. Consider documenting which Vec methods are intentionally omitted and why, or implementing them for API completeness.

feat: add CompactVec collection with auto-spill support

13a5522

Copilot AI review requested due to automatic review settings January 2, 2026 06:16

nberlette added the enhancement New feature or request label Jan 2, 2026

nberlette self-assigned this Jan 2, 2026

Copilot started reviewing on behalf of nberlette January 2, 2026 06:17 View session

Copilot AI reviewed Jan 2, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

feat: add CompactVec collection with auto-spill support#4

feat: add CompactVec collection with auto-spill support#4
nberlette wants to merge 1 commit intomainfrom
feat/compact-vec

nberlette commented Jan 2, 2026

Uh oh!

Copilot AI left a comment

Uh oh!

Copilot AI Jan 2, 2026

Uh oh!

Copilot AI Jan 2, 2026

Uh oh!

Copilot AI Jan 2, 2026

Uh oh!

Copilot AI Jan 2, 2026

Uh oh!

Copilot AI Jan 2, 2026

Uh oh!

Copilot AI Jan 2, 2026

Uh oh!

Copilot AI Jan 2, 2026

Uh oh!

Copilot AI Jan 2, 2026

Uh oh!

Copilot AI Jan 2, 2026

Uh oh!

Copilot AI Jan 2, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

-    write!(f, "CompactVec<{N}> {s:?}", s = self.as_slice())
+    let storage = if self.heap.is_some() { "heap" } else { "inline" };
+    f.debug_struct("CompactVec")
+      .field("len", &self.len)
+      .field("capacity", &self.capacity())
+      .field("storage", &storage)
+      .field("elements", &self.as_slice())
+      .finish()

		// previous capacity for amortized growth.
		let mut vec = Vec::with_capacity(N * 2 + 1);

-          // previous capacity for amortized growth.
-          let mut vec = Vec::with_capacity(N * 2 + 1);
+          // previous capacity for amortized growth. Use a small
+          // minimum to avoid tiny allocations when N is very small.
+          let spill_capacity = if N == 0 { 4 } else { N * 2 };
+          let mut vec = Vec::with_capacity(spill_capacity);

Uh oh!

Conversation

nberlette commented Jan 2, 2026

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Copilot AI Jan 2, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Jan 2, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Jan 2, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Jan 2, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Jan 2, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Jan 2, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Jan 2, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Jan 2, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Jan 2, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Jan 2, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants