feat: disk-backed fallback when memory exceeds 80% RSS #3
Open
StrongWind1 wants to merge 6 commits into
Add automatic memory pressure detection via MemMonitor (80% RSS threshold). When triggered, MessageStore and PmkidStore flush their in-memory data to temp files and switch to disk-backed mode for the remainder of the run.

MessageStore disk mode:
- Binary serialization format for EapolMessage (99-byte fixed header + optional 57-byte FtFields + variable eapol_frame)
- flush_to_disk() serializes all groups, replaces Vec<EapolMessage> with Vec<MessageRef> (8 bytes per message vs ~228)
- New messages route directly to disk via add_to_disk()
- group_keys() + load_group() for lazy Phase 4 iteration
- canonicalize_pairs() rewrites index keys without loading data

PmkidStore disk mode:
- Same pattern: binary serialization, flush, disk-backed add/iter
- iter() returns owned PmkidEntry values (Box<dyn Iterator>)

Pairing engine (pair/mod.rs):
- Disk mode: single-threaded iteration via group_keys() + load_group()
- Memory mode: unchanged rayon parallel path
- estimate_total_cost() returns 0 in disk mode (skip dedup pre-sizing)

Output pipeline:
- PerSinkDedup::reserve() now takes active_sinks mask, only pre-sizes HashSets for configured sinks (fixes 9x over-allocation bug)

Main integration:
- MemMonitor checks every 50K packets + every file transition
- Flush both stores when disk_mode activates
- Cleanup temp files on shutdown
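The flush-then-lazy-load pattern described for MessageStore can be sketched roughly as follows. This is a minimal illustration, not the PR's code: the commit only states that a MessageRef is 8 bytes, so the `u32` offset plus `u32` length layout, the `DiskGroup` type, and the raw byte records are all assumptions made for the example.

```rust
use std::fs::File;
use std::io::{self, BufWriter, Read, Seek, SeekFrom, Write};
use std::path::{Path, PathBuf};

/// Hypothetical 8-byte reference to one serialized record on disk.
/// The u32/u32 split is an assumption; the PR only states the size.
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct MessageRef {
    pub offset: u32, // byte offset of the record in the temp file
    pub len: u32,    // record length in bytes
}

/// Toy disk-backed group: payloads live in a temp file, only refs stay in RAM.
pub struct DiskGroup {
    path: PathBuf,
    writer: BufWriter<File>,
    next_offset: u32,
    refs: Vec<MessageRef>,
}

impl DiskGroup {
    pub fn create(path: &Path) -> io::Result<Self> {
        let file = File::create(path)?;
        Ok(Self {
            path: path.to_path_buf(),
            writer: BufWriter::new(file),
            next_offset: 0,
            refs: Vec::new(),
        })
    }

    /// Append one already-serialized record and remember where it landed.
    pub fn add_to_disk(&mut self, record: &[u8]) -> io::Result<()> {
        let too_big = || io::Error::new(io::ErrorKind::InvalidInput, "exceeds u32 range");
        if record.len() > u32::MAX as usize {
            return Err(too_big());
        }
        let len = record.len() as u32;
        let end = self.next_offset.checked_add(len).ok_or_else(too_big)?;
        self.writer.write_all(record)?;
        self.refs.push(MessageRef { offset: self.next_offset, len });
        self.next_offset = end;
        Ok(())
    }

    /// Flush buffered writes, then read every record back through its ref.
    pub fn load_group(&mut self) -> io::Result<Vec<Vec<u8>>> {
        self.writer.flush()?;
        let mut file = File::open(&self.path)?;
        let mut out = Vec::with_capacity(self.refs.len());
        for r in &self.refs {
            file.seek(SeekFrom::Start(u64::from(r.offset)))?;
            let mut buf = vec![0u8; r.len as usize];
            file.read_exact(&mut buf)?;
            out.push(buf);
        }
        Ok(out)
    }
}
```

Only the refs stay resident between calls, so memory use scales with the number of messages rather than their serialized size.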
Add memory pressure check before PerSinkDedup::reserve(). When the estimated allocation (12 bytes × estimated_hashes × active_sink_count) would push RSS past the 80% threshold, skip pre-sizing entirely and let sets grow incrementally. This prevents the single-shot allocation OOM that occurs when the estimated cost is in the billions.

Pass MemMonitor through OutputContext::emit() and run_output() so the output pipeline can check memory pressure before pre-sizing.

Fix non-ASCII em-dashes in mem_monitor.rs and disk_messages.rs.
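The pre-sizing guard amounts to a small piece of arithmetic, sketched below. The function names, parameter lists, and the `u16` mask type are hypothetical; only the 12-byte factor and the "skip when it would cross the threshold" rule come from the commit message.

```rust
/// Bytes assumed per dedup entry, taken from the estimate in the commit message.
const BYTES_PER_HASH: u64 = 12;

/// Estimated size of the up-front reservation across all active sinks.
pub fn reserve_cost(estimated_hashes: u64, active_sink_count: u32) -> u64 {
    BYTES_PER_HASH
        .saturating_mul(estimated_hashes)
        .saturating_mul(u64::from(active_sink_count))
}

/// Hypothetical stand-in for the monitor's prediction: would allocating
/// `extra_bytes` on top of the current RSS pass `threshold_pct` of total RAM?
pub fn would_exceed(
    rss_bytes: u64,
    total_ram_bytes: u64,
    threshold_pct: u64,
    extra_bytes: u64,
) -> bool {
    // u128 avoids overflow for very large estimates.
    let limit = u128::from(total_ram_bytes) * u128::from(threshold_pct) / 100;
    u128::from(rss_bytes) + u128::from(extra_bytes) > limit
}

/// Pre-size only when the reservation fits under the threshold.
/// `active_sinks_mask` has one bit per configured sink (assumed layout).
pub fn should_presize(
    rss_bytes: u64,
    total_ram_bytes: u64,
    threshold_pct: u64,
    estimated_hashes: u64,
    active_sinks_mask: u16,
) -> bool {
    let sinks = active_sinks_mask.count_ones();
    let cost = reserve_cost(estimated_hashes, sinks);
    !would_exceed(rss_bytes, total_ram_bytes, threshold_pct, cost)
}
```

With 16 GiB of RAM and 4 GiB resident, an estimate of two billion hashes across three sinks costs 72 GB and is skipped, while one million hashes (36 MB) is pre-sized as before.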
When the dedup HashSet would exceed the 80% RSS threshold, switch to write-through mode: hash lines go directly to output files (accepting temporary duplicates) while fingerprints are recorded in 256 partitioned bucket files per sink (fingerprint % 256, 16 bytes per record).

After emission completes, a cleaning pass processes buckets one at a time: sort by fingerprint, identify runs with count > 1, collect line numbers to remove (keep first occurrence), then rewrite each output file without the duplicate lines.

Mid-emission switchover is supported: if memory pressure activates during Phase 4 output, the in-memory HashSet is flushed to bucket files with sentinel line numbers (u64::MAX), the HashSet is drained to free memory, and emission continues in write-through mode. The cleaning pass handles the mixed state correctly.

New module: src/output/disk_dedup.rs
- DiskDedup: coordinator with per-sink bucket state
- DiskDedupSink: 256 bucket file writers per sink
- build_removal_set(): sort-based duplicate detection
- rewrite_without_lines(): line-number-based output filter
- Drop impl ensures bucket files are cleaned up
- 5 unit tests covering no-dups, dups, sentinels, cleanup

Modified: src/output/dedup.rs
- SinkId::from_index() for index-to-enum conversion
- PerSinkDedup::flush_to_buckets() for mid-emission switchover
- PerSinkDedup::drain() to free HashSet memory

Modified: src/output/mod.rs
- fan_out() accepts Option<DiskDedup> for write-through mode
- OutputContext holds disk_dedup state
- Cleaning pass runs in finalize() before auxiliary outputs
- HashSinks::path() accessor for cleaning pass
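A sort-based cleaning pass over one bucket might look roughly like the sketch below. It operates on in-memory records rather than bucket files, and two things are assumptions: the 16-byte record is modelled as a `u64` fingerprint plus a `u64` line number, and a sentinel entry is taken to mean "this line was already written before the switchover", so every numbered line sharing its fingerprint is treated as a duplicate. The commit does not spell out either detail.

```rust
use std::collections::BTreeSet;

/// Line number used for entries flushed from the in-memory set.
pub const SENTINEL_LINE: u64 = u64::MAX;
pub const BUCKET_COUNT: u64 = 256;

/// Assumed record layout: 8-byte fingerprint + 8-byte line number.
#[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord)]
pub struct Record {
    pub fingerprint: u64,
    pub line_no: u64,
}

/// Bucket selection as described in the commit: fingerprint % 256.
pub fn bucket_of(fingerprint: u64) -> usize {
    (fingerprint % BUCKET_COUNT) as usize
}

/// Returns the output line numbers that should be dropped for one bucket.
pub fn build_removal_set(mut records: Vec<Record>) -> BTreeSet<u64> {
    // Derived Ord sorts by fingerprint, then line number, so each
    // fingerprint forms a contiguous run with any sentinel entry last.
    records.sort_unstable();
    let mut remove = BTreeSet::new();
    let mut i = 0;
    while i < records.len() {
        let mut j = i;
        while j < records.len() && records[j].fingerprint == records[i].fingerprint {
            j += 1;
        }
        let run = &records[i..j];
        // Assumption: a sentinel means the first copy predates line tracking,
        // so no numbered line in this run is kept.
        let has_sentinel = run.iter().any(|r| r.line_no == SENTINEL_LINE);
        let keep = if has_sentinel { 0 } else { 1 };
        for r in run.iter().filter(|r| r.line_no != SENTINEL_LINE).skip(keep) {
            remove.insert(r.line_no);
        }
        i = j;
    }
    remove
}
```

Because each bucket is processed independently, peak memory during cleaning is bounded by the largest bucket rather than the full fingerprint set.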
Two bugs found during forced disk-mode testing (WPAWOLF_MEM_THRESHOLD=1):

1. BufWriter flush: add_to_disk() writes through a BufWriter, but load_group()/iter() opens the file for reading independently. Records still in the BufWriter buffer were invisible to the reader, causing 124 PMKID lines to be silently lost. Fix: add flush_disk_writer() methods to both MessageStore and PmkidStore, called before Phase 4.

2. PMKID dedup in disk mode: add_to_disk() skipped the per-pair byte-equality check, allowing duplicate PMKIDs through. Fix: add a disk_seen HashMap<MacPair, HashSet<[u8;16]>> that tracks seen PMKID values per pair. Populated during flush_to_disk() for already-stored entries. Costs ~20 bytes per unique PMKID.

Also adds WPAWOLF_MEM_THRESHOLD env var override (integer percent) for testing disk fallback without needing a machine at 80% RSS.

Verified: WPAWOLF_MEM_THRESHOLD=1 produces sorted-content-identical output to the in-memory path (SHA-256 match, 0 diff lines).
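The first bug is easy to reproduce in isolation. The helper below is purely illustrative (its name and shape are not from the PR): it writes a small record through a `BufWriter` and reports the file length as seen through an independent metadata lookup, before and after `flush()`.

```rust
use std::fs::{self, File};
use std::io::{self, BufWriter, Write};
use std::path::Path;

/// Returns (bytes on disk before flush, bytes on disk after flush)
/// for a record smaller than the writer's buffer.
pub fn visible_before_and_after_flush(path: &Path, record: &[u8]) -> io::Result<(u64, u64)> {
    // Explicit capacity so the record is guaranteed to fit in the buffer.
    let mut writer = BufWriter::with_capacity(8192, File::create(path)?);
    writer.write_all(record)?;

    // An independent reader sees only what has reached the file.
    let before = fs::metadata(path)?.len();

    // This is the step that was missing before Phase 4.
    writer.flush()?;
    let after = fs::metadata(path)?.len();

    Ok((before, after))
}
```

For a 16-byte record this yields 0 bytes visible before the flush and 16 after, which matches the symptom of trailing records disappearing from the reader's view.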
The disk-backed fallback makes --per-file unnecessary -- memory pressure is handled automatically by spilling stores to disk. Remove the flag, all per-file code paths (per-file emit loop, per-file WDS resolve, per-file MLD canonicalization), and the per-file integration test. --strict now bundles 4 filters instead of 5: --eapoltimeout=5, --rc-drift=8, --dedup-hash-combos, --nc-dedup.
When process RSS reaches 80% of system RAM, automatically switch heavy stores to disk-backed mode. Small captures stay fully in-memory; large ones spill automatically. No new dependencies.
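A minimal sketch of the threshold logic, assuming the monitor is handed the current RSS and total RAM by its caller (how the PR actually reads those values is not shown here). The struct fields, `parse_threshold`, and the 1..=100 validation are assumptions for illustration; the 80% default, the integer-percent `WPAWOLF_MEM_THRESHOLD` override, and the one-way switch to disk mode come from the PR text.

```rust
const DEFAULT_THRESHOLD_PCT: u64 = 80;

/// Parse an integer-percent override, falling back to the default
/// when the value is absent, malformed, or out of range (assumed policy).
pub fn parse_threshold(raw: Option<&str>) -> u64 {
    raw.and_then(|s| s.trim().parse::<u64>().ok())
        .filter(|p| (1..=100).contains(p))
        .unwrap_or(DEFAULT_THRESHOLD_PCT)
}

pub struct MemMonitor {
    threshold_pct: u64,
    total_ram_bytes: u64,
    disk_mode: bool,
}

impl MemMonitor {
    pub fn new(total_ram_bytes: u64, threshold_pct: u64) -> Self {
        Self { threshold_pct, total_ram_bytes, disk_mode: false }
    }

    /// Build a monitor honouring the WPAWOLF_MEM_THRESHOLD override.
    pub fn from_env(total_ram_bytes: u64) -> Self {
        let raw = std::env::var("WPAWOLF_MEM_THRESHOLD").ok();
        Self::new(total_ram_bytes, parse_threshold(raw.as_deref()))
    }

    /// Record a new RSS sample. Once the threshold is reached the flag
    /// stays set for the rest of the run, even if RSS drops afterwards.
    pub fn check(&mut self, rss_bytes: u64) -> bool {
        let used = u128::from(rss_bytes) * 100;
        let limit = u128::from(self.total_ram_bytes) * u128::from(self.threshold_pct);
        if used >= limit {
            self.disk_mode = true;
        }
        self.disk_mode
    }

    pub fn disk_mode(&self) -> bool {
        self.disk_mode
    }
}
```

Keeping the flag one-way avoids flapping between in-memory and disk-backed stores when RSS falls right after a flush.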
Changes

Memory monitor (src/mem_monitor.rs)
- disk_mode flag
- would_exceed() predicts whether a large allocation would cross the threshold
- WPAWOLF_MEM_THRESHOLD env var override for testing (integer percent)

MessageStore disk fallback (src/store/messages.rs, src/store/disk_messages.rs)
- Binary serialization format for EapolMessage (99-byte fixed header + optional 57-byte FtFields + variable frame)
- flush_to_disk() serializes all groups to temp file, replaces Vec<EapolMessage> with lightweight Vec<MessageRef> (8 bytes vs ~228)
- New messages route directly to disk via add_to_disk()
- group_keys() + load_group() for lazy Phase 4 iteration (one group in memory at a time)
- canonicalize_pairs() rewrites index keys without loading message data

PmkidStore disk fallback (src/store/pmkid.rs)
- HashSet<[u8;16]> tracks seen PMKID values in disk mode (prevents duplicate insertion without scanning the disk file)
- Populated during flush_to_disk() so entries serialized pre-flush are tracked

Phase 4 disk-mode pairing (src/pair/mod.rs)
- pair_all_groups_disk(): single-threaded iteration via group_keys() + load_group()
- estimate_total_cost() returns 0 in disk mode (dedup pre-sizing skipped)

PerSinkDedup reserve() fix (src/output/dedup.rs)
- reserve() now takes an active_sinks mask, only pre-sizes HashSets for configured sinks

Disk-backed dedup (src/output/disk_dedup.rs)
- 256 partitioned bucket files per sink (fingerprint % 256, 16 bytes per record)
- Drop impl ensures bucket files are cleaned up

Main integration (src/main.rs)
- MemMonitor checks every 50K packets + every file transition
- Flush both stores when disk_mode activates

Testing
- make clean && make check-all passes clean on both branches
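For reference, the per-pair PMKID tracking used in disk mode can be sketched as a map of sets. The `MacPair` alias and the `DiskSeen` wrapper are hypothetical stand-ins; the PR only names the `HashMap<MacPair, HashSet<[u8;16]>>` shape.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical key type: two 6-byte MAC addresses.
pub type MacPair = ([u8; 6], [u8; 6]);

/// Tracks which PMKID values have already been written for each pair,
/// so duplicates can be rejected without scanning the disk file.
#[derive(Default)]
pub struct DiskSeen {
    seen: HashMap<MacPair, HashSet<[u8; 16]>>,
}

impl DiskSeen {
    pub fn new() -> Self {
        Self::default()
    }

    /// Returns true if this PMKID is new for the pair and should be stored.
    pub fn first_sighting(&mut self, pair: MacPair, pmkid: [u8; 16]) -> bool {
        self.seen.entry(pair).or_default().insert(pmkid)
    }
}
```

Seeding the same structure while flushing existing entries keeps pre-flush and post-flush insertions consistent.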