Adding New Algorithms

Adding New Reordering Algorithms

This guide explains how to add a new vertex reordering algorithm to GraphBrew.

Overview

Adding a new algorithm involves:

Adding an enum value
Implementing the reordering function
Updating the switch statement
(Optional) Adding perceptron weights

Step 1: Add Enum Value

Location

bench/include/external/gapbs/util.h

Find the Enum

enum ReorderingAlgo {
    ORIGINAL = 0,
    Random = 1,
    Sort = 2,
    HubSort = 3,
    HubCluster = 4,
    DBG = 5,
    HubSortDBG = 6,
    HubClusterDBG = 7,
    RabbitOrder = 8,        // Has variants (see RABBITORDER_VARIANTS in core/utils.py)
    GOrder = 9,
    COrder = 10,
    RCMOrder = 11,
    GraphBrewOrder = 12,    // Has variants (see GRAPHBREW_VARIANTS in core/utils.py)
    MAP = 13,               // Load reordering from file
    AdaptiveOrder = 14,     // ML-based perceptron selector
    // Leiden algorithms (15) - grouped together for easier sweeping
    LeidenOrder = 15,       // Format: 15:resolution (GVE-Leiden baseline)
    // LeidenCSR (16) deprecated — GraphBrew (12) subsumes it
    // ADD YOUR ALGORITHM HERE (next available ID = 16)
    MyNewOrder = 16,
};

// Note: Variant lists are defined in scripts/lib/core/utils.py for Python integration
// (RABBITORDER_VARIANTS, GRAPHBREW_VARIANTS, RCM_VARIANTS)

Naming Convention

Use PascalCase (e.g., MyNewOrder, LocalitySensitiveOrder)
Be descriptive
Add comment with brief description

Step 2: Implement the Reordering Function

Function Signature

template <typename NodeID>
pvector<NodeID> MyNewReorder(const CSRGraph<NodeID>& g) {
  // Returns mapping: new_id[old_id] = new_vertex_id
  pvector<NodeID> new_ids(g.num_nodes());
  
  // Your algorithm here
  
  return new_ids;
}

Understanding the Return Value

The function returns a permutation where:

new_ids[old_vertex_id] = new vertex ID
This is used to relabel vertices

Example: Simple Degree-Based Ordering

template <typename NodeID>
pvector<NodeID> DegreeOrder(const CSRGraph<NodeID>& g) {
  NodeID num_nodes = g.num_nodes();
  pvector<NodeID> new_ids(num_nodes);
  
  // Create (degree, vertex) pairs
  vector<pair<int64_t, NodeID>> degree_pairs(num_nodes);
  #pragma omp parallel for
  for (NodeID n = 0; n < num_nodes; n++) {
    degree_pairs[n] = {g.out_degree(n), n};
  }
  
  // Sort by degree (descending)
  sort(degree_pairs.begin(), degree_pairs.end(), greater<pair<int64_t, NodeID>>());
  
  // Assign new IDs
  #pragma omp parallel for
  for (NodeID i = 0; i < num_nodes; i++) {
    new_ids[degree_pairs[i].second] = i;
  }
  
  return new_ids;
}

Example: Using Community Information

template <typename NodeID>
pvector<NodeID> CommunityAwareOrder(const CSRGraph<NodeID>& g) {
  NodeID num_nodes = g.num_nodes();
  pvector<NodeID> new_ids(num_nodes);
  
  // Detect communities using Leiden
  auto communities = RunLeidenCommunityDetection(g);
  
  // Group vertices by community, then order within
  NodeID next_id = 0;
  for (const auto& community : communities) {
    for (NodeID v : community.members) {
      new_ids[v] = next_id++;
    }
  }
  
  return new_ids;
}

Step 3: Update the Switch Statements

Location

The main dispatcher is in bench/include/external/gapbs/builder.h. The GenerateMapping function delegates to standalone implementations in the reorder/*.h headers.

1. The `GenerateMapping` Function

Find the GenerateMapping function in builder.h and add your case:

void GenerateMapping(CSRGraph<NodeID_, DestID_, invert> &g,
                     pvector<NodeID_> &new_ids,
                     ReorderingAlgo reordering_algo, bool useOutdeg,
                     std::vector<std::string> reordering_options)
{
    switch (reordering_algo)
    {
    case HubSort:
        GenerateHubSortMapping(g, new_ids, useOutdeg);
        break;
    // ... other cases ...
    
    // ADD YOUR CASE HERE
    case MyNewOrder:
        MyNewReorder(g, new_ids);
        break;
    
    default:
        cerr << "Unknown reordering algorithm: " << reordering_algo << endl;
        exit(1);
    }
}

2. The `ReorderingAlgoStr` Function

Add your algorithm name for display:

const std::string ReorderingAlgoStr(ReorderingAlgo type)
{
    switch (type)
    {
    // ... existing cases ...
    case MyNewOrder:
        return "MyNewOrder";
    default:
        std::cerr << "Unknown Reordering Algorithm type: " << type << std::endl;
        abort();
    }
}

Step 4: Add Algorithm Name Mapping

For Perceptron Weights (in reorder_types.h)

In reorder_types.h, find the getAlgorithmNameMap() function and add your algorithm's UPPERCASE base name. This map contains ~16 UPPERCASE base-name entries with case-insensitive lookup via lookupAlgorithm().

Note: Variant names (e.g., MyNewOrder_v1) do NOT go in this map. They are auto-discovered by ParseWeightsFromJSON() via VARIANT_PREFIXES prefix scanning, and resolved by ResolveVariantSelection() at runtime.

inline const std::map<std::string, ReorderingAlgo>& getAlgorithmNameMap() {
    static const std::map<std::string, ReorderingAlgo> name_to_algo = {
        {"ORIGINAL", ORIGINAL}, {"RANDOM", Random}, {"SORT", Sort},
        {"HUBSORT", HubSort}, {"HUBCLUSTER", HubCluster},
        // ... ~16 UPPERCASE base-name entries ...
        {"MYNEWORDER", MyNewOrder},  // ADD YOUR ALGORITHM HERE (UPPERCASE)
    };
    return name_to_algo;
}

If your algorithm has variants, add the prefix to the VARIANT_PREFIXES array and add parsing logic in ResolveVariantSelection().

For Python Scripts

Edit scripts/lib/core/utils.py and add to the ALGORITHMS dict:

ALGORITHMS = {
    0: "ORIGINAL",
    # ... existing entries ...
    15: "LeidenOrder",
    16: "MyNewOrder",  # ADD YOUR ALGORITHM HERE
}

If your algorithm has variants, add them to the appropriate *_VARIANTS tuple in the VARIANT REGISTRY section of core/utils.py. The get_all_algorithm_variant_names() function auto-derives the full canonical name list from the registry — no manual editing needed.

Naming rule: All code that needs an algorithm name as a dict key, filename component, or JSON field must call canonical_algo_key(algo_id, variant) from core/utils.py. This guarantees variant-aware naming everywhere (e.g., "RABBITORDER_csr" instead of bare "RABBITORDER"). Use the paired algo_converter_opt(algo_id, variant) when building C++ -o arguments. See Configuration-Files#unified-algorithm-naming-scriptslibutilspy.

Step 5: Add Perceptron Weights (Optional)

If you want AdaptiveOrder to consider your algorithm:

Add Weights to Model Store

When adding a new algorithm, weights are automatically generated by --train. The trained weights are stored in results/data/adaptive_models.json under the perceptron.weights key.

Example Weight Entry

{
  "MyNewOrder": {
    "bias": 0.5,
    "w_modularity": 0.1,
    "w_log_nodes": 0.0,
    "w_log_edges": 0.0,
    "w_density": 0.0,
    "w_avg_degree": 0.0,
    "w_degree_variance": 0.0,
    "w_hub_concentration": 0.0,
    "w_clustering_coeff": 0.0,
    "w_avg_path_length": 0.0,
    "w_diameter": 0.0,
    "w_community_count": 0.0,
    "w_packing_factor": 0.0,
    "w_forward_edge_fraction": 0.0,
    "w_working_set_ratio": 0.0,
    "w_dv_x_hub": 0.0,
    "w_mod_x_logn": 0.0,
    "w_pf_x_wsr": 0.0,
    "w_fef_convergence": 0.0,
    "w_reorder_time": 0.0,
    "cache_l1_impact": 0.0,
    "cache_l2_impact": 0.0,
    "cache_l3_impact": 0.0,
    "cache_dram_penalty": 0.0,
    "benchmark_weights": {
      "pr": 1.0,
      "pr_spmv": 1.0,
      "bfs": 1.0,
      "cc": 1.0,
      "cc_sv": 1.0,
      "sssp": 1.0,
      "bc": 1.0,
      "tc": 1.0
    }
  }
}

Automatic Training

After adding the algorithm, run --train to train weights from benchmarks:

python3 scripts/graphbrew_experiment.py --train --size small

This will:

Benchmark your algorithm on test graphs
Compute appropriate weights based on performance
Generate per-type weight files automatically

Weight Guidelines

Weight	Positive means...
`bias`	Generally preferred (0.3-1.0)
`w_modularity`	Better on modular graphs
`w_log_nodes`	Better on larger graphs
`w_density`	Better on denser graphs
`w_hub_concentration`	Better when hubs dominate

Complete Example: Locality-Sensitive Ordering

1. Add Enum (in util.h)

enum ReorderingAlgo {
    // ...existing...
    LocalitySensitiveOrder = 16,  // Next available ID after LeidenOrder (15)
};

2. Implement Algorithm

// Locality-sensitive ordering: keeps connected vertices close
template <typename NodeID>
pvector<NodeID> LocalitySensitiveReorder(const CSRGraph<NodeID>& g) {
  NodeID num_nodes = g.num_nodes();
  pvector<NodeID> new_ids(num_nodes, -1);
  pvector<bool> visited(num_nodes, false);
  
  NodeID next_id = 0;
  queue<NodeID> frontier;
  
  // Start from highest degree vertex
  NodeID start = 0;
  int64_t max_deg = g.out_degree(0);
  for (NodeID n = 1; n < num_nodes; n++) {
    if (g.out_degree(n) > max_deg) {
      max_deg = g.out_degree(n);
      start = n;
    }
  }
  
  // BFS traversal for locality
  frontier.push(start);
  visited[start] = true;
  
  while (!frontier.empty() || next_id < num_nodes) {
    if (frontier.empty()) {
      // Find unvisited vertex
      for (NodeID n = 0; n < num_nodes; n++) {
        if (!visited[n]) {
          frontier.push(n);
          visited[n] = true;
          break;
        }
      }
    }
    
    NodeID u = frontier.front();
    frontier.pop();
    new_ids[u] = next_id++;
    
    // Add neighbors in degree order
    vector<pair<int64_t, NodeID>> neighbors;
    for (NodeID v : g.out_neigh(u)) {
      if (!visited[v]) {
        neighbors.push_back({g.out_degree(v), v});
      }
    }
    sort(neighbors.rbegin(), neighbors.rend());
    
    for (auto& [deg, v] : neighbors) {
      if (!visited[v]) {
        visited[v] = true;
        frontier.push(v);
      }
    }
  }
  
  return new_ids;
}

3. Add to Switch

case LocalitySensitiveOrder:
  new_ids = LocalitySensitiveReorder(g);
  break;

4. Add Weights

{
  "LocalitySensitiveOrder": {
    "bias": 0.6,
    "w_modularity": 0.15,
    "w_log_nodes": 0.05,
    "w_log_edges": 0.05,
    "w_density": -0.1,
    "w_avg_degree": 0.1,
    "w_degree_variance": 0.1,
    "w_hub_concentration": 0.1
  }
}

5. Add to Python Scripts

Edit scripts/lib/core/utils.py:

ALGORITHMS = {
    # ... existing ...
    16: "LocalitySensitiveOrder",
}

6. Test

make clean && make all
# Test your new algorithm (ID 17 = next available after existing 0-16)
./bench/bin/pr -f scripts/test/graphs/tiny/tiny.el -s -o 17 -n 3

Best Practices

Performance

Use OpenMP: Parallelize where possible

#pragma omp parallel for
for (NodeID n = 0; n < num_nodes; n++) { ... }

Avoid allocations in loops: Pre-allocate vectors
Cache-friendly access: Process vertices sequentially when possible

Correctness

Return valid permutation: Every vertex must have a new ID
Handle disconnected graphs: Don't assume connectivity
Test with small graphs first: Use scripts/test/graphs/tiny/tiny.el

Integration

Consistent naming: Match enum name, function name, JSON key
Document complexity: Add comments about time/space complexity
Add tests: Create test cases for your algorithm

Testing Your Algorithm

Unit Test

# Test on small graph
./bench/bin/pr -f scripts/test/graphs/tiny/tiny.el -s -o 16 -n 3

# Verify ordering is valid
./bench/bin/pr -f scripts/test/graphs/tiny/tiny.el -s -o 16 -n 1 2>&1 | grep -i error

Performance Test

# Compare with baseline
./bench/bin/pr -f large_graph.el -s -o 0 -n 5   # Baseline
./bench/bin/pr -f large_graph.el -s -o 16 -n 5  # Your algorithm

Memory Test

# Check for leaks
valgrind ./bench/bin/pr -f scripts/test/graphs/tiny/tiny.el -s -o 16 -n 1

Debugging Tips

Print Debug Info

template <typename NodeID>
pvector<NodeID> MyNewReorder(const CSRGraph<NodeID>& g) {
  #ifdef DEBUG
  cout << "MyNewReorder: num_nodes=" << g.num_nodes() << endl;
  #endif
  // ...
}

Build with Debug

make clean
make DEBUG=1

Check Permutation Validity

bool ValidPermutation(const pvector<NodeID>& perm, NodeID n) {
  vector<bool> seen(n, false);
  for (NodeID i = 0; i < n; i++) {
    if (perm[i] < 0 || perm[i] >= n || seen[perm[i]]) {
      return false;
    }
    seen[perm[i]] = true;
  }
  return true;
}

Next Steps

Adding-New-Benchmarks - Add new graph algorithms
Code-Architecture - Understand the codebase
AdaptiveOrder-ML - Train perceptron for your algorithm

← Back to Home | Code Architecture →

Adding New Algorithms

Adding New Reordering Algorithms

Overview

Step 1: Add Enum Value

Location

Find the Enum

Naming Convention

Step 2: Implement the Reordering Function

Function Signature

Understanding the Return Value

Example: Simple Degree-Based Ordering

Example: Using Community Information

Step 3: Update the Switch Statements

Location

1. The GenerateMapping Function

2. The ReorderingAlgoStr Function

Step 4: Add Algorithm Name Mapping

For Perceptron Weights (in reorder_types.h)

For Python Scripts

Step 5: Add Perceptron Weights (Optional)

Add Weights to Model Store

Example Weight Entry

Automatic Training

Weight Guidelines

Complete Example: Locality-Sensitive Ordering

1. Add Enum (in util.h)

2. Implement Algorithm

3. Add to Switch

4. Add Weights

5. Add to Python Scripts

6. Test

Best Practices

Performance

Correctness

Integration

Testing Your Algorithm

Unit Test

Performance Test

Memory Test

Debugging Tips

Print Debug Info

Build with Debug

Check Permutation Validity

Next Steps

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Clone this wiki locally

1. The `GenerateMapping` Function

2. The `ReorderingAlgoStr` Function