add load-legacy-tests and load-legacy-runs commands #61

Draft
willr3 wants to merge 1 commit into Hyperfoil:main from willr3:legacy_tests

Conversation


@willr3 willr3 commented Apr 21, 2026

This adds two CLI commands that can be used to migrate data from the Horreum model to h5m.


stalep commented Apr 22, 2026

Legacy Import Testing - Bugs Found and Fixed

Tested load-legacy-tests and load-legacy-runs against a real Horreum backup (162 tests, ~50K runs). Three tests were used for detailed validation: Hyperfoil Hot Rod 100R (10 runs), Hot Rod 80R/20W (20 runs), and rhivos-perf-comprehensive (1,327 runs).

My fixes: https://github.com/stalep/h5m/tree/legacy_tests

Bugs Fixed

  1. folder.group not initialized — FolderEntity was created without a NodeGroupEntity, causing an NPE when building the node tree. Fixed by adding folder.group = new NodeGroupEntity(test.name).

  2. FolderEntity.persist() outside transaction context — Picocli's Callable.call() runs outside CDI, so direct Panache persist() throws ContextNotActiveException. Fixed by routing through FolderService.create(FolderEntity) which has @Transactional.

  3. Dataset jq expression splits objects incorrectly — The hardcoded .[] on the transformer output iterates object values when the transformer returns a single object (e.g., {info, stats}), producing N datasets (one per key) instead of 1. Horreum treats non-array transformer output as a single dataset. Fixed with if type == "array" then .[] else . end.

  4. JS functions fail on string-encoded numbers — Horreum data sometimes stores numbers as JSON strings ("759660"). Label functions like v => v.reduce((a,b) => a+b) fail with TypeError. Fixed by adding wrapWithNumberCoercion() in LoadLegacyTests that rewrites the JS function to coerce parameters at the node operation level (not in h5m's generic JS engine).

  5. Fingerprint ambiguity from transformer/label name collisions — Transformer extractors and target schema labels can share names (e.g., tag). Both become nodes, causing ambiguous fingerprint lookups. Fixed by de-duplicating: when a target schema label is created, existing extractor nodes with the same name are removed via NodeTracking.removeNodes().

  6. Multi-transformer tests skipped entirely — Tests with >1 transformer were skipped even when all transformers target the same schema (just handling different source versions). Fixed by selecting the transformer with the most extractors when all share the same target URI.

  7. OOM on large test imports — LoadLegacyRuns loads all run data in a single query, causing OOM for tests with many/large runs. Fixed by setting connection.setAutoCommit(false) and ps.setFetchSize(1) for cursor-based streaming.
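As a rough illustration of fix 4, here is a minimal sketch of what a coercion wrapper could look like. This is not the actual wrapWithNumberCoercion from the branch — the real method name is the only thing taken from the fix description; the wrapping strategy, helper names (__c, __v), and the single-parameter arrow-function assumption are all hypothetical:

```java
// Hypothetical sketch: rewrite a JS label function so its input is passed
// through Number() when it is a numeric string ("759660" -> 759660) before
// the original body runs. Assumes the simple `v => ...` single-parameter
// shape; real code would need to handle other parameter lists.
public class NumberCoercionSketch {
    static String wrapWithNumberCoercion(String jsFunction) {
        // __c recursively coerces numeric strings, descending into arrays
        String coerce =
            "const __c = x => Array.isArray(x) ? x.map(__c)"
            + " : (typeof x === 'string' && x !== '' && !isNaN(x) ? Number(x) : x);";
        // wrap the original function so it receives the coerced value
        return "__v => { " + coerce + " return (" + jsFunction + ")(__c(__v)); }";
    }

    public static void main(String[] args) {
        String original = "v => v.reduce((a,b) => a+b)";
        System.out.println(wrapWithNumberCoercion(original));
    }
}
```

The point is that the rewrite happens at the node level during import, so h5m's generic JS engine stays untouched.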

Validation

Fingerprints — Compared fingerprint values between Horreum and h5m for all three tests:

  • Hot Rod 100R: 10 distinct fingerprints — identical values and structure
  • Hot Rod 80R/20W: 10 distinct fingerprints — identical. Count differences are due to Horreum duplicate datasets (8 runs have 3 identical datasets each from re-processing)
  • rhivos-perf-comprehensive: 4 fingerprint labels (RHIVOS Target, RHIVOS OS ID, RHIVOS Mode, Run ID) plus fingerprint filter — all correctly imported. Fingerprint values match Horreum exactly

Dataset counts — All 10 imported rhivos runs produce exactly 10 datasets each, matching Horreum.

Label values — Compared all scalar and computed label values for run 91582 (10 datasets), matched by Sample UUID:

  • All scalar labels (Description, Hostname, Kernel, Run ID, UUID, RHIVOS fields, etc.) — exact match
  • All numeric extraction labels (CoreMark-PRO Scaling, Stress-ng latencies) — exact match
  • All JS-computed labels (FIO Mean Read/Write IOPS, Bandwidth, Latency) — exact match (e.g., 52.392, 6647.724, 29648486.4)
  • Multi-extractor JS labels (Autobench) — 0 values in h5m. The function value => { if (value["workload"] != "autobench") return null; ... } expects a named-property object {workload, results} (Horreum's behavior for multi-extractor labels with a single parameter). h5m doesn't construct this object — the createNodesFromLabel fallback collects sources but doesn't map them to named properties.
  • Null handling: Horreum stores explicit nulls, h5m deletes null values (existing sqlpath behavior)


stalep commented Apr 22, 2026

Legacy Import Performance

Benchmarked run import performance with rhivos-perf-comprehensive (112 nodes per folder, 30 of which are change detection nodes):

Configuration                              Sec/run  vs baseline
50 threads, no indexes                     ~720s    baseline
50 threads + indexes                       ~295s    2.4x
5 threads + indexes                        ~27s     27x
5 threads + indexes, no detection nodes    ~18s     40x

Findings

  • value_edge(parent_id) index is missing — the recursive CTE that finds descendant values joins on parent_id, but only child_id is indexed. Adding the index gives 2.4x.
  • Thread pool contention dominates — 50 threads all running recursive CTEs simultaneously cause massive DB contention. Reducing to 5 threads gives 10x on top of the index improvement.
  • Change detection nodes are wasted work during import — the 30 rd nodes (which have no domain node) queue work but produce 0 values. Removing them saves another 1.5x.
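For the first finding, the missing index would be something along these lines — the table and column names are taken from the finding above, but the exact index name and any h5m schema prefix are assumptions:

```sql
-- Assumed index name; table/column names from the finding above.
-- Covers the parent_id join used by the recursive descendant-value CTE
-- (only child_id was indexed before).
CREATE INDEX idx_value_edge_parent_id ON value_edge (parent_id);
```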

At the best configuration (~18s/run), the full rhivos import would take ~6.5 hours. A --skip-detection flag or deferred detection pass would be a natural improvement.


stalep commented Apr 22, 2026

Multi-extractor JS labels (Autobench) — analysis

During import validation of rhivos-perf-comprehensive, all scalar labels, single-extractor labels, and single-extractor JS labels (e.g. FIO) produce correct values matching Horreum. However, multi-extractor JS labels produce 0 values. This affects labels like the Autobench workload labels that have multiple extractors and a JS function with a single parameter:

value => { if (value["workload"] != "autobench") return null; return value["results"].map(...); }

Root cause

The issue is in how h5m handles sqlpath no-match vs how Horreum does it:

  • Horreum: When a jsonpath extractor doesn't match the data, it stores null as the extracted value. The downstream JS function still receives a complete object like {workload: null, results: null, autobench_results: [...]} and can make filtering decisions (e.g. return null for non-autobench datasets).

  • h5m: When sqlpath returns null, the value is deleted (NodeService.java:654-655). The downstream JS node then has missing entries in its sourceValues map. This causes either no computation (empty combinations) or an incomplete object, producing "unable to find parameter value for X" errors and TypeError crashes.

What already works

JsNode.createParameters (lines 120-128) already has fallback logic that bundles multiple source values into a named-property ObjectNode when the function parameter doesn't match any source name. So the argument-passing mechanism itself correctly handles the Horreum pattern — the problem is that source values are missing upstream.
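The fallback described above can be sketched in plain Java — this is an illustration of the pattern, not h5m's actual JsNode code; the method shape and Map-based value model are assumptions:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Sketch of the createParameters fallback pattern: when the JS function's
// single parameter name doesn't match any source name, bundle all source
// values into one named-property object — the shape a Horreum-style
// `value => value["workload"] ...` function expects.
public class CreateParametersSketch {
    static Object createParameter(String paramName, Map<String, Object> sourceValues) {
        if (sourceValues.containsKey(paramName)) {
            return sourceValues.get(paramName); // direct match: pass the single value
        }
        // fallback: expose every source as a property of one object
        return new LinkedHashMap<>(sourceValues);
    }

    public static void main(String[] args) {
        Map<String, Object> sources = new LinkedHashMap<>();
        sources.put("workload", "autobench");
        sources.put("results", java.util.List.of(1, 2));
        // "value" matches no source name, so all sources are bundled
        System.out.println(createParameter("value", sources));
    }
}
```

With this mechanism in place, the remaining gap really is just the missing upstream entries.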

Possible fixes

This can't be fixed in the import code — the import only creates the node structure. The deletion happens at runtime during value calculation.

Two options in h5m core:

Option A — Store null values instead of deleting them
Change NodeService.calculateSqlJsonpathValuesFirstOrAll (line 654) to keep the value with data = null instead of calling valueService.delete(). This matches Horreum semantics and ensures downstream JS nodes always have complete source maps. Risk: any existing code that assumes "no value = no match" would need to handle explicit nulls.

Option B — Allow JS calculation with partial sources
Modify calculateSourceValuePermutations to include null/placeholder entries for missing sources, and update createParameters to put null in the ObjectNode for missing values. More targeted but adds complexity to the permutation logic.
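A minimal sketch of the Option B idea, assuming a Map-based source model (this is hypothetical illustration, not the actual calculateSourceValuePermutations code):

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Option B sketch: when assembling the source map for a JS node, fill in an
// explicit null for any declared source that produced no value, so the
// downstream function sees a complete object like
// {workload: null, results: null, autobench_results: [...]}
// instead of missing entries.
public class PartialSourcesSketch {
    static Map<String, Object> completeSources(List<String> sourceNames,
                                               Map<String, Object> available) {
        Map<String, Object> complete = new LinkedHashMap<>();
        for (String name : sourceNames) {
            // get() yields null when the extractor didn't match — matching
            // Horreum's "store null on no match" semantics at the JS boundary
            complete.put(name, available.get(name));
        }
        return complete;
    }

    public static void main(String[] args) {
        Map<String, Object> available = Map.of("autobench_results", List.of(1, 2, 3));
        System.out.println(completeSources(
            List.of("workload", "results", "autobench_results"), available));
    }
}
```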

Both are relatively small changes (~10-20 lines) but touch core value calculation semantics, so worth discussing the tradeoffs before implementing.

public record Extractor(String name,String jsonpath,boolean isArray){};
public record Transformer(long id,String name,String function,String targetUri, List<Extractor> extractors){
@Override
public boolean equals(Object o) {

Will this method ever return true?

}

@Transactional
public boolean functionalyEquivalent(NodeEntity a,NodeEntity b){

should be functionallyEquivalent?

@willr3 willr3 marked this pull request as draft April 24, 2026 13:36
@willr3 willr3 force-pushed the legacy_tests branch 3 times, most recently from de7b73b to 5b56d0a on April 30, 2026 02:28

stalep commented Apr 30, 2026

Test Results: rhivos-perf-comprehensive (test 339)

Tested the latest commit (5b56d0a) against rhivos-perf-comprehensive, which has 2 transformers targeting the same schema (schema version evolution — old format $.autobench_workload[*].results vs new format $.autobench_workload.data[*].results).

Result: Import fails

The multi-transformer case logs MORE THAN 1 TRANSFORMER FOR and skips all transformer processing. This means:

  • No transformer or dataset nodes created — the transformer→dataset→label pipeline is never built
  • All 15 variables FAILED — FAILED TO MAKE VARIABLE ... Found count for X is 0, because labels were never created
  • Fingerprint FAILED — missing node RHIVOS Target/OS ID/Mode/Run ID
  • All 30 change detection nodes FAILED — cascading from missing variables

Nodes created

Type     Count
sqlall   31
sql      8
ecma     2
fp       1
jq       1

Only 43 nodes from the no-transform path (schema labels from run_schema_paths). No transformer, dataset, or change detection nodes.

Root cause

The multi-transformer handling in the current code simply logs and skips:

if(transformers.size() > 1){
    log("MORE THAN 1 TRANSFORMER FOR "+test);
}else {
    // only processes single-transformer case
}

In Horreum, both transformers run independently — the one matching the run's data format produces output, the other returns nulls. For the import, the common approach is to merge extractors from all same-target transformers (picking one function) or run all transformers and combine results.
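The "pick the best match" variant (choosing the transformer with the most extractors when all share a target, as in bug fix 6 above) can be sketched like this — the record shapes mirror the Extractor/Transformer records shown later in this PR, but the method itself and the example URI are hypothetical:

```java
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

// Sketch: when every transformer targets the same schema URI, keep the one
// with the most extractors (assumed to be the newest/most complete format);
// when targets genuinely differ, don't guess.
public class PickTransformerSketch {
    record Extractor(String name, String jsonpath, boolean isArray) {}
    record Transformer(long id, String name, String targetUri, List<Extractor> extractors) {}

    static Optional<Transformer> pickBest(List<Transformer> transformers) {
        long distinctTargets = transformers.stream()
            .map(Transformer::targetUri).distinct().count();
        if (distinctTargets != 1) {
            return Optional.empty(); // different target schemas: ambiguous, skip
        }
        return transformers.stream()
            .max(Comparator.comparingInt(t -> t.extractors().size()));
    }

    public static void main(String[] args) {
        // jsonpaths from the rhivos schema-evolution case; the URI is made up
        Transformer oldFmt = new Transformer(1, "old", "urn:example:rhivos",
            List.of(new Extractor("results", "$.autobench_workload[*].results", true)));
        Transformer newFmt = new Transformer(2, "new", "urn:example:rhivos",
            List.of(new Extractor("results", "$.autobench_workload.data[*].results", true),
                    new Extractor("workload", "$.workload", false)));
        System.out.println(pickBest(List.of(oldFmt, newFmt)).get().name());
    }
}
```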

Additional issue: no-transform label loop

The no-transform path has an empty loop body at lines 771-774:

for(String labelName : labelsByName.keys()){
    log(6,"label="+labelName);
    Set<Label> labels = labelsByName.get(labelName);
    // body is empty — labels loaded but never wired to nodes
}

This means tests without transformers that have per-jsonpath label groups also won't produce label nodes.

Suggestions

  1. Handle multi-transformer with same target schema — either merge extractors or pick the best match (most extractors)
  2. Complete the no-transform label loop body to wire labels to nodes
  3. Consider using if type == "array" then .[] else . end for the dataset JQ node instead of .[] to handle non-array transformer output


stalep commented Apr 30, 2026

Suggested minimal fixes for multi-transformer support

Tested 3 small changes against rhivos-perf-comprehensive. Together they make the import work for tests with multiple transformers targeting the same schema (schema version evolution).

1. Run all transformers instead of skipping (lines 668-670)

 if(transformers.size() > 1){
-    log("MORE THAN 1 TRANSFORMER FOR "+test);
-}else {
+    log(2, "multiple transformers (" + transformers.size() + ") for same target, creating pipeline for each");
+}
+{

Add a suffix to disambiguate transformer/dataset names inside createFolder():

String transformerSuffix = test.transformers().size() > 1 ? "_" + transformer.id() : "";
// Use: "transformer_" + name + transformerSuffix
// Use: "dataset" + transformerSuffix

This creates one pipeline per transformer. The non-matching transformer's extractors return empty arrays → transformer JS produces [] → dataset JQ produces nothing → labels produce nothing. Only the matching transformer's pipeline produces values.

2. Dataset JQ expression (line 329)

-new JqNode("dataset",".[]",List.of(transform));
+new JqNode("dataset" + transformerSuffix,"if type == \"array\" then .[] else . end",List.of(transform));

Handles non-array transformer output gracefully.
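The semantics of that jq change can be illustrated in plain Java (this is just a model of the behavior, not h5m code): `.[]` on an object iterates its values and yields N datasets, while the conditional form splits only arrays and passes a single object through whole.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Model of `if type == "array" then .[] else . end`: arrays split into one
// dataset per element; an object (or scalar) becomes exactly one dataset,
// matching Horreum's treatment of non-array transformer output.
public class DatasetSplitSketch {
    static List<Object> split(Object transformerOutput) {
        if (transformerOutput instanceof List<?> arr) {
            return new ArrayList<>(arr);   // array: one dataset per element
        }
        return List.of(transformerOutput); // single object: one dataset
    }

    public static void main(String[] args) {
        Map<String, Object> single = Map.of("info", Map.of(), "stats", Map.of());
        System.out.println(split(single).size());                  // one dataset
        System.out.println(split(List.of(single, single)).size()); // two datasets
    }
}
```

With plain `.[]`, an `{info, stats}` object would instead produce two datasets, one per key — the bug fixed earlier in this thread.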

3. Handle multiple label matches in variable/fingerprint lookup

With multiple transformers, the same label exists multiple times (one per transformer's dataset). The variable lookup at line 413 and fingerprint lookup at line 452 should accept the first match instead of failing:

-if(found.size()==1){
+if(found.size()>=1){
     variableIdToNode.put(variable.id(),found.get(0));
-}else {
-    ...
-    if(found.size()>1){
-        System.exit(1);
-    }
+    if(found.size()>1){
+        log(4,"variable "+variable.name()+" matched "+found.size()+" label nodes, using first");
+    }
+}else {
+    // found.size() == 0
+    ...
 }

Same pattern for fingerprint lookup:

-if (foundNodes.size() == 1) {
+if (foundNodes.size() >= 1) {
     fingerprintNodes.add(foundNodes.get(0));
-} else {
-    // report the ambiguity?
 }

Test results

After these 3 changes, rhivos-perf-comprehensive imports successfully:

  • 129 nodes created (2 transformer pipelines, 30 rd change detection nodes, fingerprint)
  • Upload of run 198223 (new .data format): matching transformer produces 10 dataset splits, all labels produce values
  • Non-matching transformer logs one expected error (harmless)
