Load Balancing Partition Generation¶
Initial feasible partition generators and helpers.
- asunder.load_balancing.utils.partition_generation.assign_from_order_with_links_range(order_idx, N, K, R, R_bounds=None, must_link=None, cannot_link=None, max_K_increase=50, max_restarts=8, branch=6, max_attempts=100, seed=42, w_contig=1.0, w_switch=0.25, w_target=0.15)¶
Assign nodes to balanced clusters from an ordered node sequence.
Must-link constraints are first contracted into components, and cannot-link constraints are enforced as hard conflicts between components. The assignment is built from the component order induced by
order_idxwhile searching for a feasible number of used clusters and respecting the size bounds implied byKandR.- Parameters:
order_idx (array_like of int) – Ordered node indices used to induce the component assignment order. Entries are interpreted as node ids in
{0, ..., N - 1}.N (int) – Number of nodes.
K (int) – Target number of clusters. Larger feasible values may be considered up to
K + max_K_increase.R (int) – Width of the allowed cluster-size range. For a selected cluster count, the lower and upper bounds are computed from the corresponding balanced range rule.
R_bounds (tuple[int, int] | None) – Minimum and maximum number of nodes per community (community size constraint).
must_link (iterable of tuple of int, optional) – Pairs of node indices that must be assigned to the same cluster.
cannot_link (iterable of tuple of int, optional) – Pairs of node indices that must not be assigned to the same cluster.
max_K_increase (int, default=50) – Maximum increase above
Kallowed when searching for a feasible cluster count.max_restarts (int, default=8) – Maximum number of randomized restarts used during assignment.
branch (int, default=6) – Maximum number of candidate partial assignments retained or explored at each branching step.
max_attempts (int, default=100) – Maximum number of assignment attempts before declaring failure.
seed (int, default=None) – Random seed used for restart and tie-breaking behavior.
w_contig (float, default=1.0) – Weight for preserving contiguity with respect to the supplied order.
w_switch (float, default=0.25) – Weight for discouraging unnecessary cluster-label switching.
w_target (float, default=0.15) – Weight for favoring assignments close to target cluster sizes.
- Returns:
On success, returns
(gvec, meta).gvecis an integer array of shape(N,)whose entrygvec[i]is the cluster label assigned to nodei.metacontains"r_min","r_max", and"K_used". ReturnsNoneif no assignment satisfying the hard link constraints and size bounds is found.- Return type:
tuple of numpy.ndarray and dict[str, int] or None
Notes
This function returns the node-label assignment vector, not the co-clustering matrix. Convert
gvecseparately when aZcolumn is required.
- asunder.load_balancing.utils.partition_generation.make_partitions(G, K, R, R_bounds=None, must_link=None, cannot_link=None, n_cols=15, seed=42, nodes=None)¶
Generate graph-informed feasible partition columns.
Candidate node orderings are constructed from the graph and converted into balanced feasible partitions using must-link and cannot-link constraints. Each successful partition is returned as a binary co-clustering matrix.
- Parameters:
G (networkx.Graph) – Graph whose nodes define the partitioning problem and whose structure is used to generate candidate orderings.
K (int) – Target number of clusters.
R (int) – Width of the allowed cluster-size range.
R_bounds (tuple[int, int] | None) – Minimum and maximum number of nodes per community (community size constraint).
must_link (iterable of tuple, optional) – Pairs of node indices that must be assigned to the same cluster.
cannot_link (iterable of tuple, optional) – Pairs of node indices that must be assigned to different clusters.
n_cols (int, default=15) – Maximum number of feasible partition columns to generate.
seed (int, default=None) – Random seed used for randomized or noisy graph orderings.
nodes (sequence, optional) – Node ordering used to map graph node labels to integer indices. If
None, the graph’s node iteration order is used.
- Returns:
Z_star – List of binary
N x Nco-clustering matrices. For each matrixZ,Z[i, j] = 1if nodesiandjare assigned to the same cluster, and0otherwise.- Return type:
list of numpy.ndarray
- asunder.load_balancing.utils.partition_generation.make_one_feasible_partition_mrv_restarts(N, K, R, R_bounds=None, must_link=None, cannot_link=None, seed=42, max_tries=200, jitter_top=2, max_K_increase=50, return_Z=True)¶
Construct one feasible partition using MRV search with restarts.
Must-link constraints are compressed into components and cannot-link constraints are enforced between components. The search repeatedly chooses the unassigned component with the fewest feasible cluster choices and assigns it using a randomized best-fit rule. Failed partial assignments are discarded and retried until a feasible partition is found or the restart limit is reached.
- Parameters:
N (int) – Number of nodes.
K (int) – Target number of clusters.
R (int) – Width of the allowed cluster-size range.
R_bounds (tuple[int, int] | None) – Minimum and maximum number of nodes per community (community size constraint).
must_link (iterable of tuple of int, optional) – Pairs of node indices that must be assigned to the same cluster.
cannot_link (iterable of tuple of int, optional) – Pairs of node indices that must be assigned to different clusters.
seed (int, default=None) – Random seed used for tie-breaking and restart randomness.
max_tries (int, default=200) – Maximum number of restart attempts.
jitter_top (int, default=2) – Number of top-ranked feasible groups considered during randomized best-fit assignment. Larger values increase randomness.
max_K_increase (int, default=50) – Maximum increase above
Kallowed when searching for a feasible cluster count.return_Z (bool, default=True) – If
True, return a binary co-clustering matrix. IfFalse, return an integer assignment vector.
- Returns:
A feasible partition, or
Noneif no feasible assignment is found. Ifreturn_ZisTrue, the result is anN x Nbinary matrixZwithZ[i, j] = 1when nodesiandjare co-clustered. Ifreturn_ZisFalse, the result is an integer vectorgof shape(N,).- Return type:
numpy.ndarray or None
- asunder.load_balancing.utils.partition_generation.make_partitions_random(N, K, R, R_bounds=None, must_link=None, cannot_link=None, seed=42, return_Z=True, max_K_increase=50)¶
Generate random feasible partitions subject to link and size constraints.
Must-link constraints are contracted into components, cannot-link constraints are enforced as hard incompatibilities, and feasible balanced assignments are attempted for
Kand larger candidate values up toK + max_K_increase.- Parameters:
N (int) – Number of nodes.
K (int) – Target number of clusters.
R (int) – Width of the allowed cluster-size range.
R_bounds (tuple[int, int] | None) – Minimum and maximum number of nodes per community (community size constraint).
must_link (iterable of tuple of int, optional) – Pairs of node indices that must be assigned to the same cluster.
cannot_link (iterable of tuple of int, optional) – Pairs of node indices that must be assigned to different clusters.
seed (int, default=None) – Random seed used for shuffled component orders.
return_Z (bool, default=True) – If
True, return binary co-clustering matrices. IfFalse, return integer assignment vectors.max_K_increase (int, default=50) – Maximum increase above
Kallowed when searching for a feasible cluster count.
- Returns:
Feasible partitions. If
return_ZisTrue, each entry is anN x Nbinary matrixZwithZ[i, j] = 1when nodesiandjare co-clustered. Ifreturn_ZisFalse, each entry is an integer assignment vectorgof shape(N,).Returns an empty list if no feasible partition is found.
- Return type:
list of numpy.ndarray
- asunder.load_balancing.utils.partition_generation.check_balance(Z, K, R, R_bounds=None)¶
Check whether a co-clustering matrix satisfies balance bounds.
The row sums of
Zare interpreted as cluster sizes. A partition is balanced if every row sum lies betweenR_minandR_max, whereR_min = max(1, floor(N / K - R / 2 + 1 / 2))andR_max = R_min + R.- Parameters:
Z (numpy.ndarray) – Binary
N x Nco-clustering matrix.Z[i, j] = 1indicates that nodesiandjare assigned to the same cluster.K (int) – Target number of clusters.
R (int) – Width of the allowed cluster-size range.
- Returns:
min_size (scalar) – Minimum row sum of
Z.max_size (scalar) – Maximum row sum of
Z.is_balanced (bool) –
Trueif all row sums are within the computed balance bounds; otherwiseFalse.