-
Notifications
You must be signed in to change notification settings - Fork 2
Adding New Algorithms
This guide explains how to add a new vertex reordering algorithm to GraphBrew.
Adding a new algorithm involves:
- Adding an enum value
- Implementing the reordering function
- Updating the switch statement
- (Optional) Adding perceptron weights
bench/include/external/gapbs/util.h
enum ReorderingAlgo {
ORIGINAL = 0,
Random = 1,
Sort = 2,
HubSort = 3,
HubCluster = 4,
DBG = 5,
HubSortDBG = 6,
HubClusterDBG = 7,
RabbitOrder = 8, // Has variants (see RABBITORDER_VARIANTS in core/utils.py)
GOrder = 9,
COrder = 10,
RCMOrder = 11,
GraphBrewOrder = 12, // Has variants (see GRAPHBREW_VARIANTS in core/utils.py)
MAP = 13, // Load reordering from file
AdaptiveOrder = 14, // ML-based perceptron selector
// Leiden algorithms (15) - grouped together for easier sweeping
LeidenOrder = 15, // Format: 15:resolution (GVE-Leiden baseline)
// LeidenCSR (16) deprecated — GraphBrew (12) subsumes it
// ADD YOUR ALGORITHM HERE (next available ID = 16)
MyNewOrder = 16,
};
// Note: Variant lists are defined in scripts/lib/core/utils.py for Python integration
// (RABBITORDER_VARIANTS, GRAPHBREW_VARIANTS, RCM_VARIANTS)- Use PascalCase (e.g.,
MyNewOrder,LocalitySensitiveOrder) - Be descriptive
- Add comment with brief description
template <typename NodeID>
pvector<NodeID> MyNewReorder(const CSRGraph<NodeID>& g) {
// Returns mapping: new_id[old_id] = new_vertex_id
pvector<NodeID> new_ids(g.num_nodes());
// Your algorithm here
return new_ids;
}The function returns a permutation where:
-
new_ids[old_vertex_id]= new vertex ID - This is used to relabel vertices
template <typename NodeID>
pvector<NodeID> DegreeOrder(const CSRGraph<NodeID>& g) {
NodeID num_nodes = g.num_nodes();
pvector<NodeID> new_ids(num_nodes);
// Create (degree, vertex) pairs
vector<pair<int64_t, NodeID>> degree_pairs(num_nodes);
#pragma omp parallel for
for (NodeID n = 0; n < num_nodes; n++) {
degree_pairs[n] = {g.out_degree(n), n};
}
// Sort by degree (descending)
sort(degree_pairs.begin(), degree_pairs.end(), greater<pair<int64_t, NodeID>>());
// Assign new IDs
#pragma omp parallel for
for (NodeID i = 0; i < num_nodes; i++) {
new_ids[degree_pairs[i].second] = i;
}
return new_ids;
}template <typename NodeID>
pvector<NodeID> CommunityAwareOrder(const CSRGraph<NodeID>& g) {
NodeID num_nodes = g.num_nodes();
pvector<NodeID> new_ids(num_nodes);
// Detect communities using Leiden
auto communities = RunLeidenCommunityDetection(g);
// Group vertices by community, then order within
NodeID next_id = 0;
for (const auto& community : communities) {
for (NodeID v : community.members) {
new_ids[v] = next_id++;
}
}
return new_ids;
}The main dispatcher is in bench/include/external/gapbs/builder.h. The GenerateMapping function delegates to standalone implementations in the reorder/*.h headers.
Find the GenerateMapping function in builder.h and add your case:
void GenerateMapping(CSRGraph<NodeID_, DestID_, invert> &g,
pvector<NodeID_> &new_ids,
ReorderingAlgo reordering_algo, bool useOutdeg,
std::vector<std::string> reordering_options)
{
switch (reordering_algo)
{
case HubSort:
GenerateHubSortMapping(g, new_ids, useOutdeg);
break;
// ... other cases ...
// ADD YOUR CASE HERE
case MyNewOrder:
MyNewReorder(g, new_ids);
break;
default:
cerr << "Unknown reordering algorithm: " << reordering_algo << endl;
exit(1);
}
}Add your algorithm name for display:
const std::string ReorderingAlgoStr(ReorderingAlgo type)
{
switch (type)
{
// ... existing cases ...
case MyNewOrder:
return "MyNewOrder";
default:
std::cerr << "Unknown Reordering Algorithm type: " << type << std::endl;
abort();
}
}In reorder_types.h, find the getAlgorithmNameMap() function and add your algorithm's UPPERCASE base name. This map contains ~16 UPPERCASE base-name entries with case-insensitive lookup via lookupAlgorithm().
Note: Variant names (e.g.,
MyNewOrder_v1) do NOT go in this map. They are auto-discovered byParseWeightsFromJSON()viaVARIANT_PREFIXESprefix scanning, and resolved byResolveVariantSelection()at runtime.
inline const std::map<std::string, ReorderingAlgo>& getAlgorithmNameMap() {
static const std::map<std::string, ReorderingAlgo> name_to_algo = {
{"ORIGINAL", ORIGINAL}, {"RANDOM", Random}, {"SORT", Sort},
{"HUBSORT", HubSort}, {"HUBCLUSTER", HubCluster},
// ... ~16 UPPERCASE base-name entries ...
{"MYNEWORDER", MyNewOrder}, // ADD YOUR ALGORITHM HERE (UPPERCASE)
};
return name_to_algo;
}If your algorithm has variants, add the prefix to the VARIANT_PREFIXES array
and add parsing logic in ResolveVariantSelection().
Edit scripts/lib/core/utils.py and add to the ALGORITHMS dict:
ALGORITHMS = {
0: "ORIGINAL",
# ... existing entries ...
15: "LeidenOrder",
16: "MyNewOrder", # ADD YOUR ALGORITHM HERE
}If your algorithm has variants, add them to the appropriate *_VARIANTS tuple
in the VARIANT REGISTRY section of core/utils.py. The get_all_algorithm_variant_names()
function auto-derives the full canonical name list from the registry — no manual
editing needed.
Naming rule: All code that needs an algorithm name as a dict key, filename component, or JSON field must call
canonical_algo_key(algo_id, variant)fromcore/utils.py. This guarantees variant-aware naming everywhere (e.g.,"RABBITORDER_csr"instead of bare"RABBITORDER"). Use the pairedalgo_converter_opt(algo_id, variant)when building C++-oarguments. See Configuration-Files#unified-algorithm-naming-scriptslibutilspy.
If you want AdaptiveOrder to consider your algorithm:
When adding a new algorithm, weights are automatically generated by --train. The trained weights are stored in results/data/adaptive_models.json under the perceptron.weights key.
{
"MyNewOrder": {
"bias": 0.5,
"w_modularity": 0.1,
"w_log_nodes": 0.0,
"w_log_edges": 0.0,
"w_density": 0.0,
"w_avg_degree": 0.0,
"w_degree_variance": 0.0,
"w_hub_concentration": 0.0,
"w_clustering_coeff": 0.0,
"w_avg_path_length": 0.0,
"w_diameter": 0.0,
"w_community_count": 0.0,
"w_packing_factor": 0.0,
"w_forward_edge_fraction": 0.0,
"w_working_set_ratio": 0.0,
"w_dv_x_hub": 0.0,
"w_mod_x_logn": 0.0,
"w_pf_x_wsr": 0.0,
"w_fef_convergence": 0.0,
"w_reorder_time": 0.0,
"cache_l1_impact": 0.0,
"cache_l2_impact": 0.0,
"cache_l3_impact": 0.0,
"cache_dram_penalty": 0.0,
"benchmark_weights": {
"pr": 1.0,
"pr_spmv": 1.0,
"bfs": 1.0,
"cc": 1.0,
"cc_sv": 1.0,
"sssp": 1.0,
"bc": 1.0,
"tc": 1.0
}
}
}After adding the algorithm, run --train to train weights from benchmarks:
python3 scripts/graphbrew_experiment.py --train --size smallThis will:
- Benchmark your algorithm on test graphs
- Compute appropriate weights based on performance
- Generate per-type weight files automatically
| Weight | Positive means... |
|---|---|
bias |
Generally preferred (0.3-1.0) |
w_modularity |
Better on modular graphs |
w_log_nodes |
Better on larger graphs |
w_density |
Better on denser graphs |
w_hub_concentration |
Better when hubs dominate |
enum ReorderingAlgo {
// ...existing...
LocalitySensitiveOrder = 16, // Next available ID after LeidenOrder (15)
};// Locality-sensitive ordering: keeps connected vertices close
template <typename NodeID>
pvector<NodeID> LocalitySensitiveReorder(const CSRGraph<NodeID>& g) {
NodeID num_nodes = g.num_nodes();
pvector<NodeID> new_ids(num_nodes, -1);
pvector<bool> visited(num_nodes, false);
NodeID next_id = 0;
queue<NodeID> frontier;
// Start from highest degree vertex
NodeID start = 0;
int64_t max_deg = g.out_degree(0);
for (NodeID n = 1; n < num_nodes; n++) {
if (g.out_degree(n) > max_deg) {
max_deg = g.out_degree(n);
start = n;
}
}
// BFS traversal for locality
frontier.push(start);
visited[start] = true;
while (!frontier.empty() || next_id < num_nodes) {
if (frontier.empty()) {
// Find unvisited vertex
for (NodeID n = 0; n < num_nodes; n++) {
if (!visited[n]) {
frontier.push(n);
visited[n] = true;
break;
}
}
}
NodeID u = frontier.front();
frontier.pop();
new_ids[u] = next_id++;
// Add neighbors in degree order
vector<pair<int64_t, NodeID>> neighbors;
for (NodeID v : g.out_neigh(u)) {
if (!visited[v]) {
neighbors.push_back({g.out_degree(v), v});
}
}
sort(neighbors.rbegin(), neighbors.rend());
for (auto& [deg, v] : neighbors) {
if (!visited[v]) {
visited[v] = true;
frontier.push(v);
}
}
}
return new_ids;
}case LocalitySensitiveOrder:
new_ids = LocalitySensitiveReorder(g);
break;{
"LocalitySensitiveOrder": {
"bias": 0.6,
"w_modularity": 0.15,
"w_log_nodes": 0.05,
"w_log_edges": 0.05,
"w_density": -0.1,
"w_avg_degree": 0.1,
"w_degree_variance": 0.1,
"w_hub_concentration": 0.1
}
}Edit scripts/lib/core/utils.py:
ALGORITHMS = {
# ... existing ...
16: "LocalitySensitiveOrder",
}make clean && make all
# Test your new algorithm (ID 17 = next available after existing 0-16)
./bench/bin/pr -f scripts/test/graphs/tiny/tiny.el -s -o 17 -n 3-
Use OpenMP: Parallelize where possible
#pragma omp parallel for for (NodeID n = 0; n < num_nodes; n++) { ... }
-
Avoid allocations in loops: Pre-allocate vectors
-
Cache-friendly access: Process vertices sequentially when possible
- Return valid permutation: Every vertex must have a new ID
- Handle disconnected graphs: Don't assume connectivity
- Test with small graphs first: Use scripts/test/graphs/tiny/tiny.el
- Consistent naming: Match enum name, function name, JSON key
- Document complexity: Add comments about time/space complexity
- Add tests: Create test cases for your algorithm
# Test on small graph
./bench/bin/pr -f scripts/test/graphs/tiny/tiny.el -s -o 16 -n 3
# Verify ordering is valid
./bench/bin/pr -f scripts/test/graphs/tiny/tiny.el -s -o 16 -n 1 2>&1 | grep -i error# Compare with baseline
./bench/bin/pr -f large_graph.el -s -o 0 -n 5 # Baseline
./bench/bin/pr -f large_graph.el -s -o 16 -n 5 # Your algorithm# Check for leaks
valgrind ./bench/bin/pr -f scripts/test/graphs/tiny/tiny.el -s -o 16 -n 1template <typename NodeID>
pvector<NodeID> MyNewReorder(const CSRGraph<NodeID>& g) {
#ifdef DEBUG
cout << "MyNewReorder: num_nodes=" << g.num_nodes() << endl;
#endif
// ...
}make clean
make DEBUG=1bool ValidPermutation(const pvector<NodeID>& perm, NodeID n) {
vector<bool> seen(n, false);
for (NodeID i = 0; i < n; i++) {
if (perm[i] < 0 || perm[i] >= n || seen[perm[i]]) {
return false;
}
seen[perm[i]] = true;
}
return true;
}- Adding-New-Benchmarks - Add new graph algorithms
- Code-Architecture - Understand the codebase
- AdaptiveOrder-ML - Train perceptron for your algorithm