Add rescale to MatrixEntry/ArrayEntry and rename scale_array → rescale_array #104
Merged
…gh to scale_array resources
Extends the high-level entry-point API to expose the scale_array support added in #89.
- MatrixEntry gains `rescale: float = 1.0` (multiplicative factor, distinct from the existing `scale` distribution parameter).
- ArrayEntry gains `scale: Optional[np.ndarray] = None` with shape and float-coercion validation.
- dictionary_formatter and resolve_dict_iterator now carry a `rescale` field so the value is sorted exactly once alongside row/col/amount; scale_array (or None when all values are 1.0) is returned as a fifth element.
- add_persistent_vector_from_iterator forwards the extracted scale_array to add_persistent_vector, keeping add_entries a simple one-liner.
- add_array_entries passes scale_array=entry.scale to add_persistent_array.
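The "sorted exactly once" behaviour described above can be sketched with a NumPy structured array. This is only an illustration under stated assumptions: `ENTRY_DTYPE` and `sort_entries` are hypothetical names, the field layout is invented for the example, and the return value is simplified to four elements rather than the five the real helper returns.

```python
import numpy as np

# Hypothetical record layout for illustration; not the library's actual dtype.
ENTRY_DTYPE = np.dtype(
    [
        ("row", np.int64),
        ("col", np.int64),
        ("amount", np.float32),
        ("rescale", np.float32),
    ]
)


def sort_entries(entries):
    """Sort entries by (row, col) and split out the rescale factors.

    Returns (rows, cols, amounts, rescale), where rescale is None when
    every factor equals the default of 1.0.
    """
    records = np.array(
        [(e["row"], e["col"], e["amount"], e.get("rescale", 1.0)) for e in entries],
        dtype=ENTRY_DTYPE,
    )
    # A single in-place sort keeps all four fields aligned with each other.
    records.sort(order=["row", "col"])
    rescale = records["rescale"]
    if np.all(rescale == 1.0):
        rescale = None  # nothing to store when no entry is rescaled
    return records["row"], records["col"], records["amount"], rescale
```

Carrying the factor inside the same record as row/col/amount means no second sort or index lookup is needed to keep it aligned.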
… uncertainty scale
The stats_arrays `scale` parameter (standard deviation / distribution
scale) already owns the name in this codebase. Using the same word for
the per-exchange multiplicative factor caused silent ambiguity.
Renames everywhere — parameters, helper method, resource kind and file
suffix, local variables, docstrings, and tests:
- `scale_array` → `rescale_array` on all four `add_persistent/dynamic_*`
methods, `add_persistent_vector_from_iterator`, and `_add_rescale_array_resource`
- Resource kind `"scale"` → `"rescale"`, file suffix `.scale` → `.rescale`
- `kind in ("flip", "scale", …)` guard in `write_modified` → `"rescale"`
- `ArrayEntry.scale` field → `ArrayEntry.rescale`
- `utils.py` local variable and returned tuple element
- All test assertions and variable names updated accordingly
Distribution-scale references (`UNCERTAINTY_DTYPE["scale"]`,
`MatrixEntry.scale`, `dictionary_formatter` `row.get("scale")`) are
intentionally left unchanged.
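To make the distinction between the two names concrete, here is a minimal sketch. It assumes, purely for illustration, a normal distribution and that the multiplicative factor is applied after sampling; `PARAMS_DTYPE` and `rescaled_samples` are hypothetical and stand in for neither `UNCERTAINTY_DTYPE` nor any function in this codebase.

```python
import numpy as np

# Stand-in for the distribution parameters named in the PR (loc/scale/shape).
# Hypothetical: the real UNCERTAINTY_DTYPE may define other fields as well.
PARAMS_DTYPE = np.dtype(
    [("loc", np.float32), ("scale", np.float32), ("shape", np.float32)]
)


def rescaled_samples(params, rescale, rng):
    """Draw one normal sample per exchange, then apply the rescale factor.

    params["scale"] is the spread of the distribution (standard deviation);
    rescale is an unrelated per-exchange multiplier applied to the draw.
    """
    draws = rng.normal(loc=params["loc"], scale=params["scale"])
    return draws * rescale


params = np.array([(10.0, 0.0, 0.0), (4.0, 0.0, 0.0)], dtype=PARAMS_DTYPE)
rescale = np.array([0.5, 1.0], dtype=np.float32)
result = rescaled_samples(params, rescale, np.random.default_rng(0))
```

With a distribution `scale` of zero the draws equal `loc`, so the only change in `result` comes from `rescale`.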
- Correct add_array_entries docstring: scale→rescale and kind="scale"→kind="rescale"
- Note in MatrixEntry.rescale docstring that the Python float is downcast to numpy.float32 on write
- Add test_rescale_resource_written_when_only_some_entries_rescaled covering the mixed case (one entry at default 1.0, one non-default)
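The float32 downcast noted in the docstring matters mostly when comparing stored values against the original Python floats. The snippet below uses only plain NumPy and illustrates the general effect; it does not call any function from this PR.

```python
import numpy as np

rescale = 0.1                  # Python float, 64-bit
stored = np.float32(rescale)   # the value as it would sit in a float32 array

# 0.1 has no exact binary representation, so the two precisions differ.
exact_match = float(stored) == rescale

# A tolerance-based comparison is the safer check in tests.
close_match = bool(np.isclose(stored, rescale))

# Values that are exact in binary, such as 0.5, survive the downcast unchanged.
half_exact = float(np.float32(0.5)) == 0.5
```

Here `exact_match` is `False` while `close_match` and `half_exact` are `True`, which is why assertions on rescale values read back from disk are better written with a tolerance.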
Summary
- Extends the high-level entry-point API (`MatrixEntry`, `ArrayEntry`, `add_entries`, `add_array_entries`) to expose the `rescale_array` support added in #89 (Add scale_array support for per-exchange multiplicative rescaling).
- Renames `scale_array` → `rescale_array` throughout (parameters, helper method, resource `kind`, file suffix, local variables, docstrings, and tests) to eliminate the ambiguity with the stats_arrays distribution `scale` parameter (`loc`/`scale`/`shape` in `UNCERTAINTY_DTYPE`).

New fields:
- `MatrixEntry.rescale: float = 1.0`: per-exchange multiplicative factor; flows through `dictionary_formatter` and `resolve_dict_iterator` so it is sorted once alongside the other arrays, then stored as a `kind="rescale"` resource when any value ≠ 1.0.
- `ArrayEntry.rescale: Optional[np.ndarray] = None`: same concept as an array; passed directly to `add_persistent_array`.

Unchanged:
- `UNCERTAINTY_DTYPE["scale"]`, `MatrixEntry.scale` (distribution scale parameter), `dictionary_formatter`'s `row.get("scale")`.

Test plan
- `pytest tests/`: 247 passed, 1 skipped
- `kind="rescale"` resources are written and round-trip correctly through the Parquet path (covered by existing `test_scale_array_parquet_roundtrip`, now renamed `test_rescale_array_parquet_roundtrip`)
- `MatrixEntry.rescale` values are aligned with sorted indices/data (new `test_rescale_sorted_with_data`)
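The commit message describes `ArrayEntry.rescale` as validated for shape and float coercion. A rough sketch of what such a check could look like follows. The helper name `validate_rescale` and the rule that the array holds one factor per row are assumptions made for this example; the actual shape rule is not stated in the PR text.

```python
from typing import Optional

import numpy as np


def validate_rescale(rescale, n_rows: int) -> Optional[np.ndarray]:
    """Coerce an optional rescale array to float32 and check its shape.

    Hypothetical rule: exactly one factor per row of the data array.
    """
    if rescale is None:
        return None  # no rescaling requested
    # np.asarray raises ValueError or TypeError for non-numeric input.
    arr = np.asarray(rescale, dtype=np.float32)
    if arr.shape != (n_rows,):
        raise ValueError(
            f"rescale has shape {arr.shape}, expected ({n_rows},)"
        )
    return arr
```

Calling `validate_rescale([1, 2], n_rows=2)` returns a float32 array, while a length mismatch raises `ValueError` before anything is written.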