Skip to content

feat: add CompactVec collection with auto-spill support#4

Open
nberlette wants to merge 1 commit intomainfrom
feat/compact-vec
Open

feat: add CompactVec collection with auto-spill support#4
nberlette wants to merge 1 commit intomainfrom
feat/compact-vec

Conversation

@nberlette
Copy link
Copy Markdown
Owner

No description provided.

Copilot AI review requested due to automatic review settings January 2, 2026 06:16
@nberlette nberlette added the enhancement New feature or request label Jan 2, 2026
@nberlette nberlette self-assigned this Jan 2, 2026
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR adds a new CompactVec<T, N> collection type that optimizes memory usage by storing up to N elements inline before automatically spilling to heap-allocated storage. This complements the existing string-focused types (CowStr and InlineStr) by providing a similar optimization pattern for generic vector collections.

Key changes:

  • Implements CompactVec<T, N> with automatic inline-to-heap transition
  • Provides comprehensive trait implementations (Clone, Debug, Index, Iterator, etc.)
  • Includes Serde support when the feature is enabled

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 11 comments.

File Description
src/lib.rs Adds the new compact_vec module and re-exports its types
src/compact_vec.rs Complete implementation of CompactVec with 739 lines including documentation, core logic, trait implementations, and comprehensive tests

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.


impl<T: fmt::Debug, const N: usize> fmt::Debug for CompactVec<T, N> {
fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
write!(f, "CompactVec<{N}> {s:?}", s = self.as_slice())
Copy link

Copilot AI Jan 2, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Debug format string uses a named parameter s but also includes the const generic N directly in the format string using the inline constant syntax. While this works, it would be more consistent to format the capacity value explicitly, especially since it's available via self.capacity(). Consider using a more standard Debug format that shows whether the data is inline or heap-allocated, which would be more informative for debugging.

Suggested change
write!(f, "CompactVec<{N}> {s:?}", s = self.as_slice())
let storage = if self.heap.is_some() { "heap" } else { "inline" };
f.debug_struct("CompactVec")
.field("len", &self.len)
.field("capacity", &self.capacity())
.field("storage", &storage)
.field("elements", &self.as_slice())
.finish()

Copilot uses AI. Check for mistakes.
Comment on lines +523 to +532
fn visit_seq<A>(self, mut seq: A) -> Result<Self::Value, A::Error>
where
A: SeqAccess<'de>,
{
let mut vec = CompactVec::new();
while let Some(value) = seq.next_element::<T>()? {
vec.push(value);
}
Ok(vec)
}
Copy link

Copilot AI Jan 2, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Deserialize implementation doesn't attempt to use the size hint from the deserializer to pre-allocate capacity. When deserializing sequences with known lengths (like JSON arrays), this could be used to avoid multiple reallocations. Consider using seq.size_hint() to create the CompactVec with appropriate initial capacity via with_capacity().

Copilot uses AI. Check for mistakes.
Comment on lines +327 to +329
Some(ref mut heap) => {
// Dropping the Vec will drop its contents automatically.
heap.clear();
Copy link

Copilot AI Jan 2, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The explicit call to heap.clear() is redundant. When the Vec is dropped (which happens automatically at the end of the function), it will drop all its contents. This extra clear() call adds unnecessary overhead without any benefit.

Suggested change
Some(ref mut heap) => {
// Dropping the Vec will drop its contents automatically.
heap.clear();
Some(_) => {
// Dropping the Vec will drop its contents automatically; no need to clear.

Copilot uses AI. Check for mistakes.
Comment on lines +310 to +317
Some(heap) => heap,
None => {
let mut vec = Vec::with_capacity(self.len);
for i in 0..self.len {
unsafe {
vec.push(self.inline[i].assume_init_read());
}
}
Copy link

Copilot AI Jan 2, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After taking the heap Vec or reading inline elements, the Drop implementation will still run on self. Since heap is taken via take() (leaving it as None), the Drop impl will attempt to drop inline elements based on self.len. However, those elements have already been moved via assume_init_read(). This could lead to double-free or use-after-free. Consider using mem::forget(self) after moving the elements to prevent the Drop implementation from running, or set self.len = 0 before returning.

Suggested change
Some(heap) => heap,
None => {
let mut vec = Vec::with_capacity(self.len);
for i in 0..self.len {
unsafe {
vec.push(self.inline[i].assume_init_read());
}
}
Some(heap) => {
// All elements live in `heap`. After taking it, ensure `Drop`
// sees an empty inline vector.
self.len = 0;
heap
}
None => {
// All elements currently live inline; move them into a Vec.
let len = self.len;
let mut vec = Vec::with_capacity(len);
for i in 0..len {
unsafe {
vec.push(self.inline[i].assume_init_read());
}
}
// We've moved out all initialized inline elements; prevent
// `Drop` from attempting to drop them again.
self.len = 0;

Copilot uses AI. Check for mistakes.
Comment on lines +349 to +362
impl<T, I: Into<usize>, const N: usize> Index<I> for CompactVec<T, N> {
type Output = T;
fn index(&self, index: I) -> &Self::Output {
let index = index.into();
match self.heap {
Some(ref heap) => &heap[index],
None => {
assert!(index < self.len, "index out of bounds");
// SAFETY: index < len ensures the element is initialized.
unsafe { &*self.inline[index].as_ptr() }
}
}
}
}
Copy link

Copilot AI Jan 2, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Index and IndexMut trait implementations accept I: Into<usize> as a generic bound, which is broader than the standard library's approach. This allows types like u8, u16, u32, etc. to be used as indices. However, this is inconsistent with how standard Rust collections work (which implement Index for specific types like usize, Range<usize>, etc.). This could lead to confusion and unexpected behavior. Consider following the standard library pattern by implementing Index for specific index types.

Copilot uses AI. Check for mistakes.
Comment on lines +299 to +303
pub fn extend<I: IntoIterator<Item = T>>(&mut self, iter: I) {
for item in iter {
self.push(item);
}
}
Copy link

Copilot AI Jan 2, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The extend method pushes items one at a time without pre-allocating space when the iterator provides a size hint. This can lead to multiple reallocations when extending with a large iterator. Consider checking if the iterator implements ExactSizeIterator or provides a lower bound via size_hint(), and pre-allocating capacity accordingly to improve performance.

Copilot uses AI. Check for mistakes.
Comment on lines +119 to +124
pub const fn len(&self) -> usize {
match &self.heap {
Some(heap) => heap.len(),
None => self.len,
}
}
Copy link

Copilot AI Jan 2, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The len() method is marked as const, but it calls heap.len() which requires the Vec's len() method to be const. While this works in recent Rust versions (1.39+), the as_slice() and as_mut_slice() methods are also marked const and use more complex unsafe operations. Verify that all const operations here are actually valid in const contexts, as const support for these patterns may be limited in older Rust versions that this crate might target.

Copilot uses AI. Check for mistakes.
Comment on lines +103 to +116
pub fn with_capacity(capacity: usize) -> Self {
if capacity <= N {
Self::new()
} else {
Self {
// SAFETY: uninitialized array is valid for inline storage.
inline: unsafe {
MaybeUninit::<[MaybeUninit<T>; N]>::uninit().assume_init()
},
len: 0,
heap: Some(Vec::with_capacity(capacity)),
}
}
}
Copy link

Copilot AI Jan 2, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing test coverage for the with_capacity method when creating a heap-backed instance. While line 620-622 tests that with_capacity(12) creates a heap-backed instance, there's no test that verifies pushing elements into such a pre-allocated heap-backed vector works correctly, or that the capacity is actually used (no intermediate reallocations occur).

Copilot uses AI. Check for mistakes.
Comment on lines +201 to +202
// previous capacity for amortized growth.
let mut vec = Vec::with_capacity(N * 2 + 1);
Copy link

Copilot AI Jan 2, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The spill capacity calculation uses N * 2 + 1 which is reasonable for most cases, but for N = 0, this results in a capacity of 1, which means immediate reallocation on the next push. While this works, consider using a more standard growth strategy like max(N * 2, 4) or max(N * 2, 8) to avoid frequent reallocations in the edge case where N is very small.

Suggested change
// previous capacity for amortized growth.
let mut vec = Vec::with_capacity(N * 2 + 1);
// previous capacity for amortized growth. Use a small
// minimum to avoid tiny allocations when N is very small.
let spill_capacity = if N == 0 { 4 } else { N * 2 };
let mut vec = Vec::with_capacity(spill_capacity);

Copilot uses AI. Check for mistakes.
Comment on lines +84 to +322
impl<T, const N: usize> CompactVec<T, N> {
/// Creates a new empty `CompactVec` with the specified inline
/// capacity. No heap allocation occurs until more than `N` elements
/// are pushed.
pub fn new() -> Self {
// SAFETY: An uninitialized array of `MaybeUninit<T>` is valid.
let inline =
unsafe { MaybeUninit::<[MaybeUninit<T>; N]>::uninit().assume_init() };
Self {
inline,
len: 0,
heap: None,
}
}

/// Creates a new `CompactVec` with enough capacity to hold at least
/// `capacity` elements without reallocation. If `capacity` is less
/// than or equal to `N`, no heap allocation will occur. Otherwise,
/// a `Vec` of the requested capacity will be allocated.
pub fn with_capacity(capacity: usize) -> Self {
if capacity <= N {
Self::new()
} else {
Self {
// SAFETY: uninitialized array is valid for inline storage.
inline: unsafe {
MaybeUninit::<[MaybeUninit<T>; N]>::uninit().assume_init()
},
len: 0,
heap: Some(Vec::with_capacity(capacity)),
}
}
}

/// Returns the number of elements in the vector.
pub const fn len(&self) -> usize {
match &self.heap {
Some(heap) => heap.len(),
None => self.len,
}
}

/// Returns `true` if the vector contains no elements.
pub const fn is_empty(&self) -> bool {
self.len() == 0
}

/// Returns the total capacity of the vector. When stored inline,
/// this equals `N`; when stored on the heap, this delegates to the
/// internal `Vec`'s capacity.
pub fn capacity(&self) -> usize {
match &self.heap {
Some(heap) => heap.capacity(),
None => N,
}
}

/// Returns `true` if the data is currently stored inline (i.e.,
/// `len() <= N` and `heap` is `None`).
pub fn is_inline(&self) -> bool {
self.heap.is_none()
}

/// Provides an immutable slice of all elements in the vector.
pub const fn as_slice(&self) -> &[T] {
match &self.heap {
Some(heap) => heap.as_slice(),
None => {
// SAFETY: The first `len` elements of `inline` are
// initialized. We create a slice of that many elements.
unsafe {
core::slice::from_raw_parts(
self.inline.as_ptr() as *const T,
self.len,
)
}
}
}
}

/// Provides a mutable slice of all elements in the vector.
pub const fn as_mut_slice(&mut self) -> &mut [T] {
match &mut self.heap {
Some(heap) => heap.as_mut_slice(),
None => {
// SAFETY: The first `len` elements of `inline` are
// initialized. We create a mutable slice of that many
// elements. No aliasing occurs because either `heap` is
// `None` (so no other references exist) or we go into the
// `Some` branch above.
unsafe {
core::slice::from_raw_parts_mut(
self.inline.as_mut_ptr() as *mut T,
self.len,
)
}
}
}
}

/// Pushes a value onto the end of the vector. If the inline
/// storage is full, all existing elements are moved into a new
/// `Vec` and subsequent pushes are delegated to that vector.
pub fn push(&mut self, value: T) {
match self.heap {
Some(ref mut heap) => {
heap.push(value);
}
None => {
if self.len < N {
// SAFETY: We have capacity in `inline` at index `len`.
unsafe {
self.inline[self.len].as_mut_ptr().write(value);
}
self.len += 1;
} else {
// Spill to heap: allocate a new Vec with double the
// previous capacity for amortized growth.
let mut vec = Vec::with_capacity(N * 2 + 1);
// Move the existing inline elements into the Vec.
for i in 0..self.len {
// SAFETY: `i < len` so inline[i] is initialized.
unsafe {
vec.push(self.inline[i].assume_init_read());
}
}
vec.push(value);
self.heap = Some(vec);
// We no longer use the inline storage, so reset len.
self.len = 0;
}
}
}
}

/// Removes the last element from the vector and returns it, or
/// `None` if it is empty. If popping from a heap‑backed vector
/// results in a length that can be stored inline, the data is
/// automatically moved back into the inline storage to free the
/// heap allocation.
pub fn pop(&mut self) -> Option<T> {
match self.heap {
Some(ref mut heap) => {
let value = heap.pop();
if let Some(v) = value {
// If the remaining length fits into inline storage, move
// back onto the stack.
if heap.len() <= N {
let mut new_len = 0;
for elem in heap.drain(..) {
// SAFETY: We have ensured that `heap.len()`
// is less than or equal to `N`, so there is
// enough space in `inline` to store all
// remaining elements.
unsafe {
self.inline[new_len].as_mut_ptr().write(elem);
}
new_len += 1;
}
self.heap = None;
self.len = new_len;
}
Some(v)
} else {
None
}
}
None => {
if self.len == 0 {
None
} else {
self.len -= 1;
// SAFETY: `len` has been decremented, so the element
// at index `len` is initialized and can be read. After
// reading, we leave the memory uninitialized.
Some(unsafe { self.inline[self.len].assume_init_read() })
}
}
}
}

/// Clears the vector, removing all values. This resets the vector
/// back to an empty inline state, deallocating any heap storage.
pub fn clear(&mut self) {
match self.heap {
Some(ref mut heap) => {
heap.clear();
self.heap = None;
self.len = 0;
}
None => {
// Drop all inline elements
for i in 0..self.len {
unsafe {
self.inline[i].assume_init_drop();
}
}
self.len = 0;
}
}
}

/// Returns an iterator over the slice.
pub fn iter(&self) -> core::slice::Iter<'_, T> {
self.as_slice().iter()
}

/// Returns a mutable iterator over the slice.
pub fn iter_mut(&mut self) -> core::slice::IterMut<'_, T> {
self.as_mut_slice().iter_mut()
}

/// Extends the vector with the contents of an iterator. Items are
/// pushed individually, potentially causing a spill from inline to
/// heap if the total number of elements exceeds the inline capacity.
pub fn extend<I: IntoIterator<Item = T>>(&mut self, iter: I) {
for item in iter {
self.push(item);
}
}

/// Consumes the `CompactVec` and returns a standard `Vec<T>` with
/// identical contents. This performs at most one allocation and
/// moves all elements out of the inline storage if necessary.
pub fn into_vec(mut self) -> Vec<T> {
match self.heap.take() {
Some(heap) => heap,
None => {
let mut vec = Vec::with_capacity(self.len);
for i in 0..self.len {
unsafe {
vec.push(self.inline[i].assume_init_read());
}
}
vec
}
}
}
}
Copy link

Copilot AI Jan 2, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The CompactVec API is missing common Vec methods like reserve(), reserve_exact(), shrink_to_fit(), truncate(), insert(), remove(), etc. While the current API provides basic functionality, users might expect these standard collection methods to be available. Consider documenting which Vec methods are intentionally omitted and why, or implementing them for API completeness.

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants