diff --git a/CLAUDE.md b/CLAUDE.md
index 588f369..ae979f6 100644
--- a/CLAUDE.md
+++ b/CLAUDE.md
@@ -6,13 +6,13 @@

 ## Project

-`noisekit` is a `uvx`-compatible Python CLI that generates degraded speech datasets from clean HuggingFace corpora. It simulates seven atomic audio degradation scenarios — telecom (G.711 calls), low-bitrate codec compression, noisy environments (real ambient noise), far-field reverb, transmission dropout, and clipping distortion — plus compound multi-condition scenarios built by chaining atomic presets. Designed for ASR noise-robustness benchmarking. A `clean_reference` control completes the catalog.
+`noisekit` is a `uvx`-compatible Python CLI that generates degraded speech datasets from clean HuggingFace corpora. It simulates six atomic audio degradation scenarios — telecom (G.711 calls), low-bitrate codec compression, noisy environments (real ambient noise), far-field reverb, and clipping distortion — plus compound multi-condition scenarios built by chaining atomic presets. Designed for ASR noise-robustness benchmarking. A `clean_reference` control completes the catalog.

 ## Package Management

 Use **UV** for everything: `uv add`, `uv run`, `uv sync`. Never use pip directly.

-Key runtime dependencies: `audiomentations>=0.38`, `lameenc>=1.4` (pure-Python MP3 encoder used by `Mp3Compression` in `telecom` and `low_bitrate`; no system ffmpeg needed), `torchmetrics>=1.7.0` (NISQA scoring — downloads ~50 MB model weights to `~/.torchmetrics/NISQA/` on first use), `pyroomacoustics` (room acoustics simulation for `reverb_far_field` — now a core dependency, no extra install needed).
+Key runtime dependencies: `audiomentations>=0.38`, `lameenc>=1.4` (pure-Python MP3 encoder used by `Mp3Compression` in `telecom` and `low_bitrate`; no system ffmpeg needed), `torchmetrics>=1.7.0` (NISQA scoring — downloads ~50 MB model weights to `~/.torchmetrics/NISQA/` on first use), `pyroomacoustics` (room acoustics simulation for `reverb` — now a core dependency, no extra install needed).

 ## Architecture

@@ -23,7 +23,7 @@ noisekit/
 ├── dataset.py # HuggingFace dataset loading (soundfile decoder, no torchcodec)
 ├── transforms.py # Preset loading; returns PresetTransforms(full, scoring, scoring_sr)
 ├── scoring.py # PESQ + SNR + NISQA; PESQ NB at 8 kHz for telephony presets
-├── noise_cache.py # Auto-downloads MUSAN music+noise for noisy_environment
+├── noise_cache.py # Auto-downloads MUSAN music+noise for noise
 └── presets/ # YAML preset files bundled with the package
 ```

@@ -32,7 +32,7 @@ noisekit/

 ```bash
 noisekit generate --dataset --samples N --presets P1 P2 --output ./out --seed 42
-noisekit generate ... --presets noisy_environment --noise-dir /path/to/noise_wavs
+noisekit generate ... --presets noise --noise-dir /path/to/noise_wavs
 noisekit generate ... --no-nisqa # skip NISQA (no model download, faster)
 noisekit score ./audio_dir [--reference-dir ./ref] [--output scores.json]
 noisekit score ./audio_dir --no-nisqa # skip NISQA for standalone scoring
@@ -41,7 +41,7 @@ noisekit list-presets [--verbose]

 Custom presets: `--preset-file ./my_preset.yaml`

-The `noisy_environment` preset uses a directory of background-noise WAVs. If `--noise-dir` is omitted, noisekit auto-downloads a small MUSAN **noise-only** subset (~20 files, ~120 MB) from `Aynursusuz/musan-audio-dataset` on HuggingFace to `~/.cache/noisekit/noise/musan_ambient/` on first use. Both `speech` and `music` classes are excluded: speech pollutes ASR/PESQ scoring; music sounds artificial as a background and is indistinguishable from white noise at low levels. 
Only label 2 (`noise` — wind, rain, traffic, machinery) is downloaded.
+The `noise` preset uses a directory of background-noise WAVs. If `--noise-dir` is omitted, noisekit auto-downloads a small MUSAN **noise-only** subset (~20 files, ~120 MB) from `Aynursusuz/musan-audio-dataset` on HuggingFace to `~/.cache/noisekit/noise/musan_ambient/` on first use. Both `speech` and `music` classes are excluded: speech pollutes ASR/PESQ scoring; music sounds artificial as a background and is indistinguishable from white noise at low levels. Only label 2 (`noise` — wind, rain, traffic, machinery) is downloaded.

 Pass `--noise-dir /path/to/wavs` to use your own corpus (e.g. MUSAN, DEMAND, FSD50K) instead. Inside a preset YAML, use the literal string `${NOISE_DIR}` as a parameter value and `transforms.load_preset` substitutes the resolved path at load time. Auto-download is wired in `pipeline.run_generate` via `noise_cache.ensure_default_noise_dir()`, gated by `transforms.preset_requires_noise_dir()`.

@@ -70,10 +70,9 @@ Built-in presets:
 | `clean_reference` | Minimal gain normalization (PESQ ceiling) | full | WB 16 kHz | 4.0-4.5 |
 | `telecom` | G.711 call + low-bitrate MP3 codec artifacts | 300-3400 Hz @ 8 kHz | NB 8 kHz | 2.0-3.5 |
 | `low_bitrate` | Wideband low-bitrate MP3 compression (16-32 kbps) | 80-7500 Hz @ 16 kHz | WB 16 kHz | 1.5-2.5 |
-| `noisy_environment` | Real ambient noise via `AddBackgroundNoise` | up to 8-12 kHz | WB 16 kHz | 2.0-3.5 |
-| `clipping_distortion` | Microphone overload / ADC saturation (`ClippingDistortion` 10-25%) | full | WB 16 kHz | 2.0-3.5 |
-| `transmission_dropout` | VoIP packet loss: 1-3 silent dropout windows | full | WB 16 kHz | 1.5-3.0 |
-| `reverb_far_field` | Far-field reverberant room via `RoomSimulator` | full | WB 16 kHz | 2.0-3.5 |
+| `noise` | Real ambient noise via `AddBackgroundNoise` | up to 8-12 kHz | WB 16 kHz | 2.0-3.5 |
+| `clipping` | Microphone overload / ADC saturation (`ClippingDistortion` 10-25%) | full | WB 16 kHz | 
2.0-3.5 |
+| `reverb` | Far-field reverberant room via `RoomSimulator` | full | WB 16 kHz | 2.0-3.5 |

 `telecom` and any compound preset ending with `telecom` use the 8 kHz PESQ NB scoring split (see below). All other presets score in PESQ WB at 16 kHz.

@@ -83,9 +82,9 @@ Compound presets chain two or more atomic presets together. Noise is added first

 | Preset | Chain | Requires | PESQ mode | Target MOS |
 | ------------------ | ----------------------------------------- | ------------- | --------- | ---------- |
-| `noisy_telecom` | `noisy_environment` → `telecom` | `--noise-dir` | NB 8 kHz | 1.5-2.5 |
-| `reverb_noisy` | `reverb_far_field` → `noisy_environment` | `--noise-dir` | WB 16 kHz | 1.0-2.5 |
-| `clipping_telecom` | `clipping_distortion` → `telecom` | — | NB 8 kHz | 1.0-2.5 |
+| `noise_telecom` | `noise` → `telecom` | `--noise-dir` | NB 8 kHz | 1.5-2.5 |
+| `noise_reverb` | `noise` → `reverb` | `--noise-dir` | WB 16 kHz | 1.0-2.5 |
+| `clipping_telecom` | `clipping` → `telecom` | — | NB 8 kHz | 1.0-2.5 |

 ### Compound Preset YAML Format

@@ -103,14 +102,14 @@ Rules:
 - `chain` and `transforms` are mutually exclusive.
 - Chained entries must be names of built-in atomic presets (no nesting chains).
 - `${NOISE_DIR}` resolution and the PESQ NB scoring split are detected automatically across the full concatenated chain.
-- `reverb_far_field` uses `pyroomacoustics` (bundled as a core dependency — no extra install needed).
+- `reverb` uses `pyroomacoustics` (bundled as a core dependency — no extra install needed).

 ### Why no white noise

 The catalog deliberately avoids `AddGaussianSNR` — white Gaussian noise sounds artificial and doesn't reflect real production audio. Instead:

 - `telecom` and `low_bitrate` rely on `Mp3Compression` at 16-32 kbps for realistic codec smearing/pre-echo.
-- `noisy_environment` uses `AddBackgroundNoise` over a user-supplied WAV corpus (MUSAN/DEMAND/FSD50K), so the noise floor matches the real environment you care about.
+- `noise` uses `AddBackgroundNoise` over a user-supplied WAV corpus (MUSAN/DEMAND/FSD50K), so the noise floor matches the real environment you care about.

 ## PESQ Scoring — Important Design Decision

@@ -134,7 +133,7 @@ if peak > 1e-9:

 **Safety:** The same normalized `ref_16k` is used as both the transform input and the PESQ/SNR reference, so all quality metrics remain valid relative comparisons. The mid-chain `Normalize` inside `telecom.yaml` (before `BitCrush`) is still needed separately — the bandpass filter removes energy and that step re-normalizes before quantization.

-**`noisy_environment` also pre-normalizes:** `noisy_environment.yaml` adds a `Normalize` as its first transform. This handles the `reverb_noisy` compound case: `RoomSimulator` can attenuate the signal by ~10× at large mic distances; without the mid-chain normalize, `AddBackgroundNoise` would see the attenuated level and mix noise too quietly. All compound presets using `noisy_environment` inherit this fix automatically.
+**`noise` also pre-normalizes:** `noise.yaml` adds a `Normalize` as its first transform. This handles the `noise_reverb` compound case: `RoomSimulator` can attenuate the signal by ~10× at large mic distances; without the mid-chain normalize, `AddBackgroundNoise` would see the attenuated level and mix noise too quietly. All compound presets using `noise` inherit this fix automatically.

 `transforms.py` auto-detects this split: if the last transform is `Resample(16000)`, it creates a `scoring` Compose (all-but-last) alongside the `full` Compose.

@@ -184,19 +183,19 @@ cat test_out/metadata.jsonl
 # New atomic presets — no external dependencies
 uv run noisekit generate \
   --dataset google/fleurs --config en_us --split test \
-  --samples 3 --presets clipping_distortion transmission_dropout \
+  --samples 3 --presets clipping \
   --no-nisqa --output ./test_atomic --seed 42

-# noisy_environment — auto-downloads MUSAN noise-only clips on first run
+# noise — auto-downloads MUSAN noise-only clips on first run
 uv run noisekit generate \
   --dataset google/fleurs --config en_us --split test \
-  --samples 3 --presets noisy_environment \
+  --samples 3 --presets noise \
   --output ./test_noise --seed 42

 # Compound presets (auto-downloads MUSAN noise on first run)
 uv run noisekit generate \
   --dataset google/fleurs --config en_us --split test \
-  --samples 3 --presets noisy_telecom \
+  --samples 3 --presets noise_telecom \
   --no-nisqa --output ./test_compound --seed 42

 # clipping_telecom — no noise dir needed
@@ -208,19 +207,19 @@ uv run noisekit generate \
 # Far-field reverb
 uv run noisekit generate \
   --dataset google/fleurs --config en_us --split test \
-  --samples 3 --presets reverb_far_field reverb_noisy \
+  --samples 3 --presets reverb noise_reverb \
   --no-nisqa --output ./test_reverb --seed 42

-# noisy_environment with your own noise corpus (skips auto-download)
+# noise with your own noise corpus (skips auto-download)
 uv run noisekit generate \
   --dataset google/fleurs --config en_us --split test \
-  --samples 3 --presets noisy_environment \
+  --samples 3 --presets noise \
   --noise-dir ~/datasets/musan/noise \
   --output ./test_noise --seed 42
 ```

-Expected PESQ spread: clean ~4.6, telecom ~2.5-3.5 (NB), low_bitrate ~1.5-2.5 (WB), noisy_environment ~1.0-2.5 (WB), clipping_distortion ~2.0-3.5 (WB), transmission_dropout ~1.5-3.0 (WB), reverb_far_field ~2.0-3.5 (WB).
+Expected PESQ spread: clean ~4.6, telecom ~2.5-3.5 (NB), low_bitrate ~1.5-2.5 (WB), noise ~1.0-2.5 (WB), clipping ~2.0-3.5 (WB), reverb ~2.0-3.5 (WB).

-Compound preset PESQ: noisy_telecom ~1.5-2.5 (NB), clipping_telecom ~1.0-2.5 (NB), reverb_noisy ~1.0-2.5 (WB).
+Compound preset PESQ: noise_telecom ~1.5-2.5 (NB), clipping_telecom ~1.0-2.5 (NB), noise_reverb ~1.0-2.5 (WB).

 Expected NISQA spread: clean ~4.0-4.5, degraded presets ~1.5-3.0. NISQA model weights (~50 MB) are downloaded on first run.
diff --git a/README.md b/README.md
index dc1002e..d9c3f42 100644
--- a/README.md
+++ b/README.md
@@ -14,7 +14,7 @@

 Generate degraded speech datasets for noise-robust ASR benchmarking. Takes a clean HuggingFace speech dataset, applies real-world degradation presets via [audiomentations](https://github.com/iver56/audiomentations), and scores each output with PESQ, SNR, and NISQA, producing a JSONL manifest ready for noise-robustness benchmarking.

-Seven atomic degradation scenarios are built in: telephony (G.711 + low-bitrate codec), wideband codec compression, ambient noise, clipping distortion, transmission dropout, and far-field reverb. Atomic presets compose into compound multi-condition scenarios.
+Six atomic degradation scenarios are built in: telephony (G.711 + low-bitrate codec), wideband codec compression, ambient noise, clipping distortion, and far-field reverb. Atomic presets compose into compound multi-condition scenarios.

 > [!NOTE]
 > Degradations are programmatically simulated. Scores may not generalize to genuine production recordings; validate final benchmarks on annotated real-world data.
@@ -61,12 +61,12 @@ uvx noisekit generate \
   --seed 42
 ```

-For `noisy_environment`, supply a directory of real noise WAVs (e.g. [MUSAN](https://www.openslr.org/17/), [DEMAND](https://zenodo.org/record/1227121), or [FSD50K](https://zenodo.org/record/4060432)):
+For `noise`, you can supply your own background-noise WAVs with `--noise-dir` (e.g. 
[MUSAN](https://www.openslr.org/17/), [DEMAND](https://zenodo.org/record/1227121), or [FSD50K](https://zenodo.org/record/4060432)):

 ```bash
 uvx noisekit generate \
   --dataset google/fleurs --config en_us --split test \
-  --samples 300 --presets noisy_environment \
+  --samples 300 --presets noise \
   --noise-dir ~/datasets/musan/noise \
   --output ./benchmark_dataset --seed 42
 ```
@@ -131,7 +131,7 @@ uvx noisekit list-presets --verbose # show full transform stack

 ## Presets

-Ten built-in presets: seven atomic scenarios, three compound multi-condition presets, and a clean reference control. None use synthetic white noise; codec artifacts, real ambient recordings, and room simulation produce the degradation instead.
+Nine built-in presets: six atomic scenarios, three compound multi-condition presets, and a clean reference control. None use synthetic white noise; codec artifacts, real ambient recordings, and room simulation produce the degradation instead.

 ### Atomic presets

@@ -140,26 +140,25 @@ Ten built-in presets: seven atomic scenarios, three compound multi-condition pre
 | `clean_reference` | Minimal processing (PESQ ceiling / control) | 4.0-4.5 |
 | `telecom` | G.711-style call: 8 kHz bandpass + 8-bit BitCrush + 16-32 kbps MP3 codec | NB 2.0-3.5 |
 | `low_bitrate` | Wideband audio crushed by 16-32 kbps MP3 compression | WB 1.5-2.5 |
-| `noisy_environment` | Real ambient noise from `--noise-dir` mixed in at SNR 5-15 dB | WB 1.0-2.5 |
-| `clipping_distortion` | Microphone overload: clips the loudest 10-25% of samples | WB 2.0-3.5 |
-| `transmission_dropout` | VoIP packet loss: 1-3 silent dropout windows (60-180 ms each) | WB 1.5-3.0 |
-| `reverb_far_field` | Far-field room reverb at 1-3 m mic distance | WB 2.0-3.5 |
+| `noise` | Real ambient noise from `--noise-dir` mixed in at SNR 5-15 dB | WB 1.0-2.5 |
+| `clipping` | Microphone overload: clips the loudest 10-25% of samples | WB 2.0-3.5 |
+| `reverb` | Far-field room reverb at 1-3 m mic distance | WB 2.0-3.5 |

 `telecom` 
is scored with PESQ narrowband at 8 kHz (before the final upsample); all other presets are scored wideband at 16 kHz.

-All atomic presets require no noise corpus. All dependencies, including `pyroomacoustics` (used by `reverb_far_field`), are bundled with no extra install needed.
+All dependencies, including `pyroomacoustics` (used by `reverb`), are bundled with no extra install needed.

-`noisy_environment` requires `--noise-dir` pointing at a directory of background-noise WAVs (e.g. MUSAN, DEMAND, FSD50K). If omitted, noisekit auto-downloads a small MUSAN noise-only subset (~120 MB) from HuggingFace on first use.
+`noise` accepts a `--noise-dir` pointing at a directory of background-noise WAVs (e.g. MUSAN, DEMAND, FSD50K). If omitted, noisekit auto-downloads a small MUSAN noise-only subset (~20 files, ~120 MB) to `~/.cache/noisekit/noise/musan_ambient/` on first use.

 ### Compound presets

 Compound presets chain two atomic presets together. Noise is applied first (acoustic environment), then codec or dropout (digital processing on the already-degraded signal).

-| Preset | Chain | Requires | PESQ |
-| ------------------ | ---------------------------------------- | ------------- | ---------- |
-| `noisy_telecom` | `noisy_environment` → `telecom` | `--noise-dir` | NB 1.5-2.5 |
-| `clipping_telecom` | `clipping_distortion` → `telecom` | (none) | NB 1.0-2.5 |
-| `reverb_noisy` | `reverb_far_field` → `noisy_environment` | `--noise-dir` | WB 1.0-2.5 |
+| Preset | Chain | Noise source | PESQ |
+| ------------------ | ---------------------------------------- | ------------------------------ | ---------- |
+| `noise_telecom` | `noise` → `telecom` | `--noise-dir` or auto-download | NB 1.5-2.5 |
+| `clipping_telecom` | `clipping` → `telecom` | (none) | NB 1.0-2.5 |
+| `noise_reverb` | `noise` → `reverb` | `--noise-dir` or auto-download | WB 1.0-2.5 |

 You can also define your own compound preset with a `chain:` key in a YAML file:

@@ -167,7 +166,7 @@ You can also define your own compound preset with a `chain:` key in a YAML file:
 name: my_compound
 description: "Noisy environment then telephony codec"
 chain:
-  - noisy_environment
+  - noise
   - telecom
 ```

diff --git a/noisekit/cli.py b/noisekit/cli.py
index bb9c011..3ed524d 100644
--- a/noisekit/cli.py
+++ b/noisekit/cli.py
@@ -34,8 +34,9 @@ def generate(
             "--noise-dir",
             help=(
                 "Directory of background-noise WAVs (e.g. MUSAN, DEMAND, FSD50K). "
-                "Used by noisy_environment. If omitted, a small MUSAN music+noise "
-                "subset is auto-downloaded to ~/.cache/noisekit/ on first use."
+                "Used by the noise preset and compound noise presets. "
+                "If omitted, a small MUSAN noise-only subset (~20 files, ~120 MB) "
+                "is auto-downloaded to ~/.cache/noisekit/noise/musan_ambient/ on first use."
             ),
         ),
     ] = None,
diff --git a/noisekit/noise_cache.py b/noisekit/noise_cache.py
index 4d3fc71..20ea88d 100644
--- a/noisekit/noise_cache.py
+++ b/noisekit/noise_cache.py
@@ -1,4 +1,4 @@
-"""Auto-download and cache MUSAN noise-only clips for noisy_environment.
+"""Auto-download and cache MUSAN noise-only clips for noise.

 Only the `noise` class (wind, rain, traffic, machinery…) is downloaded.
 Speech and music are both excluded: speech pollutes ASR/PESQ scoring;
@@ -28,7 +28,7 @@ def get_default_noise_cache_dir() -> Path:


 def ensure_default_noise_dir(num_samples: int = DEFAULT_NOISE_NUM_SAMPLES) -> Path:
-    """Return a directory of MUSAN music+noise WAVs, downloading on first use."""
+    """Return a directory of MUSAN noise-only WAVs, downloading on first use."""
     cache_dir = get_default_noise_cache_dir()
     cache_dir.mkdir(parents=True, exist_ok=True)

diff --git a/noisekit/presets/clipping_distortion.yaml b/noisekit/presets/clipping.yaml
similarity index 92%
rename from noisekit/presets/clipping_distortion.yaml
rename to noisekit/presets/clipping.yaml
index 98285b6..9be9522 100644
--- a/noisekit/presets/clipping_distortion.yaml
+++ b/noisekit/presets/clipping.yaml
@@ -1,4 +1,4 @@
-name: clipping_distortion
+name: clipping
 description: "Amplitude clipping simulating microphone overload or ADC saturation (10–25% of peak samples)."
 transforms:
   - type: ClippingDistortion
diff --git a/noisekit/presets/clipping_telecom.yaml b/noisekit/presets/clipping_telecom.yaml
index b1eebb1..8cfbc86 100644
--- a/noisekit/presets/clipping_telecom.yaml
+++ b/noisekit/presets/clipping_telecom.yaml
@@ -1,5 +1,5 @@
 name: clipping_telecom
 description: "Amplitude clipping over a telephone channel: ADC saturation then G.711-style narrowband codec."
 chain:
-  - clipping_distortion
+  - clipping
   - telecom
diff --git a/noisekit/presets/noisy_environment.yaml b/noisekit/presets/noise.yaml
similarity index 79%
rename from noisekit/presets/noisy_environment.yaml
rename to noisekit/presets/noise.yaml
index 7c7f4cd..cc85acb 100644
--- a/noisekit/presets/noisy_environment.yaml
+++ b/noisekit/presets/noise.yaml
@@ -1,5 +1,5 @@
-name: noisy_environment
-description: "Real-world ambient background noise mixed at variable SNR (5–15 dB). Requires --noise-dir."
+name: noise
+description: "Real-world ambient background noise mixed at variable SNR (5-15 dB). Uses --noise-dir or auto-downloads MUSAN noise on first use."
 transforms:
   - type: Normalize
     parameters: {}
diff --git a/noisekit/presets/noise_reverb.yaml b/noisekit/presets/noise_reverb.yaml
new file mode 100644
index 0000000..5216e53
--- /dev/null
+++ b/noisekit/presets/noise_reverb.yaml
@@ -0,0 +1,5 @@
+name: noise_reverb
+description: "Far-field reverberant room with superimposed ambient background noise. Uses --noise-dir or auto-downloads MUSAN noise on first use."
+chain:
+  - noise
+  - reverb
diff --git a/noisekit/presets/noise_telecom.yaml b/noisekit/presets/noise_telecom.yaml
new file mode 100644
index 0000000..3ea8127
--- /dev/null
+++ b/noisekit/presets/noise_telecom.yaml
@@ -0,0 +1,5 @@
+name: noise_telecom
+description: "G.711 telephone channel with ambient background noise: background noise then narrowband codec. Uses --noise-dir or auto-downloads MUSAN noise on first use."
+chain:
+  - noise
+  - telecom
diff --git a/noisekit/presets/noisy_telecom.yaml b/noisekit/presets/noisy_telecom.yaml
deleted file mode 100644
index eb7b79d..0000000
--- a/noisekit/presets/noisy_telecom.yaml
+++ /dev/null
@@ -1,5 +0,0 @@
-name: noisy_telecom
-description: "G.711 telephone channel with ambient background noise: background noise then narrowband codec. Requires --noise-dir."
-chain:
-  - noisy_environment
-  - telecom
diff --git a/noisekit/presets/reverb_far_field.yaml b/noisekit/presets/reverb.yaml
similarity index 95%
rename from noisekit/presets/reverb_far_field.yaml
rename to noisekit/presets/reverb.yaml
index 83b09e6..72f0f7f 100644
--- a/noisekit/presets/reverb_far_field.yaml
+++ b/noisekit/presets/reverb.yaml
@@ -1,4 +1,4 @@
-name: reverb_far_field
+name: reverb
 description: "Far-field recording in a reverberant room via acoustic room simulation (1–3 m mic distance)."
 transforms:
   - type: RoomSimulator
diff --git a/noisekit/presets/reverb_noisy.yaml b/noisekit/presets/reverb_noisy.yaml
deleted file mode 100644
index 56d9d6d..0000000
--- a/noisekit/presets/reverb_noisy.yaml
+++ /dev/null
@@ -1,5 +0,0 @@
-name: reverb_noisy
-description: "Far-field reverberant room with superimposed ambient background noise. Requires --noise-dir."
-chain:
-  - reverb_far_field
-  - noisy_environment
diff --git a/noisekit/presets/transmission_dropout.yaml b/noisekit/presets/transmission_dropout.yaml
deleted file mode 100644
index 88c8583..0000000
--- a/noisekit/presets/transmission_dropout.yaml
+++ /dev/null
@@ -1,21 +0,0 @@
-name: transmission_dropout
-description: "Simulated packet loss: 1–3 silent dropout windows per utterance."
-transforms:
-  - type: TimeMask
-    parameters:
-      min_band_part: 0.02
-      max_band_part: 0.06
-      fade_duration: 0.002
-      p: 1.0
-  - type: TimeMask
-    parameters:
-      min_band_part: 0.02
-      max_band_part: 0.06
-      fade_duration: 0.002
-      p: 0.8
-  - type: TimeMask
-    parameters:
-      min_band_part: 0.02
-      max_band_part: 0.06
-      fade_duration: 0.002
-      p: 0.5
diff --git a/tests/test_smoke.py b/tests/test_smoke.py
index f77bd2b..bde5813 100644
--- a/tests/test_smoke.py
+++ b/tests/test_smoke.py
@@ -17,17 +17,16 @@ def test_list_builtin_presets() -> None:
     from noisekit.transforms import list_builtin_presets

     presets = list_builtin_presets()
-    assert len(presets) == 10
+    assert len(presets) == 9
     names = {p["name"] for p in presets}
     assert "clean_reference" in names
     assert "telecom" in names
     assert "low_bitrate" in names
-    assert "noisy_environment" in names
-    assert "clipping_distortion" in names
-    assert "transmission_dropout" in names
-    assert "reverb_far_field" in names
-    assert "noisy_telecom" in names
-    assert "reverb_noisy" in names
+    assert "noise" in names
+    assert "clipping" in names
+    assert "reverb" in names
+    assert "noise_telecom" in names
+    assert "noise_reverb" in names
     assert "clipping_telecom" in names


@@ -40,9 +39,9 @@ def 
test_load_compound_preset_scoring_split(tmp_path) -> None:
     # AddBackgroundNoise scans sounds_path at construction — write a minimal WAV.
     sf.write(tmp_path / "noise.wav", np.zeros(16000, dtype=np.float32), 16000)

-    # noisy_telecom chains noisy_environment → telecom.
+    # noise_telecom chains noise → telecom.
     # The concatenated transform list ends with Resample(16000), so the NB 8 kHz
     # scoring split should be detected automatically.
-    pt = load_preset("noisy_telecom", noise_dir=tmp_path)
-    assert pt.scoring is not None, "noisy_telecom should inherit telecom's NB scoring split"
+    pt = load_preset("noise_telecom", noise_dir=tmp_path)
+    assert pt.scoring is not None, "noise_telecom should inherit telecom's NB scoring split"
     assert pt.scoring_sr == 8000, "scoring_sr should be 8000 from telecom's Resample"