AWK getline allows unbounded file caching in memory #988

@chaliy

Description

Summary

The `getline var < file` action in AWK reads files from the VFS and caches them fully in memory via `ensure_file_loaded()`. There is no limit on the number of distinct files that can be opened, nor on the amount buffered per file, so an AWK program can open hundreds of large files simultaneously and exhaust memory.

Severity: Medium
Category: Denial of Service / Memory Exhaustion (TM-DOS)

Affected Files

  • crates/bashkit/src/builtins/awk.rs lines 3180-3208

Steps to Reproduce

# Create many files, then read them all via getline
for i in $(seq 1 1000); do dd if=/dev/urandom bs=1024 count=100 of="/tmp/file_$i" 2>/dev/null; done
echo "" | awk 'BEGIN { for(i=1;i<=1000;i++) { f="/tmp/file_"i; while((getline line < f) > 0) {} } }'
# Caches 1000 × 100KB = 100MB in memory

Impact

Unbounded memory consumption via opening many large files through getline.

Acceptance Criteria

  • Limit number of concurrently cached files in file_inputs (e.g., 100)
  • Enforce per-file size limits consistent with FsLimits::max_file_size
  • Close/evict least-recently-used files when limit is exceeded
  • Test: Opening 200 files via getline hits limit and returns error
  • Test: Opening 50 files via getline works correctly

Metadata

Labels

security (Security vulnerability or hardening)
