Load Balancing Partition Generation

Initial feasible partition generators and helpers.

Assign nodes to balanced clusters from an ordered node sequence.

Must-link constraints are first contracted into components, and cannot-link constraints are enforced as hard conflicts between components. The assignment is built from the component order induced by order_idx while searching for a feasible number of used clusters and respecting the size bounds implied by K and R.

Parameters:
  • order_idx (array_like of int) – Ordered node indices used to induce the component assignment order. Entries are interpreted as node ids in {0, ..., N - 1}.

  • N (int) – Number of nodes.

  • K (int) – Target number of clusters. Larger feasible values may be considered up to K + max_K_increase.

  • R (int) – Width of the allowed cluster-size range. For a selected cluster count, the lower and upper bounds are computed from the corresponding balanced range rule.

  • R_bounds (tuple[int, int] | None) – Minimum and maximum number of nodes per community (community size constraint).

  • must_link (iterable of tuple of int, optional) – Pairs of node indices that must be assigned to the same cluster.

  • cannot_link (iterable of tuple of int, optional) – Pairs of node indices that must not be assigned to the same cluster.

  • max_K_increase (int, default=50) – Maximum increase above K allowed when searching for a feasible cluster count.

  • max_restarts (int, default=8) – Maximum number of randomized restarts used during assignment.

  • branch (int, default=6) – Maximum number of candidate partial assignments retained or explored at each branching step.

  • max_attempts (int, default=100) – Maximum number of assignment attempts before declaring failure.

  • seed (int, default=None) – Random seed used for restart and tie-breaking behavior.

  • w_contig (float, default=1.0) – Weight for preserving contiguity with respect to the supplied order.

  • w_switch (float, default=0.25) – Weight for discouraging unnecessary cluster-label switching.

  • w_target (float, default=0.15) – Weight for favoring assignments close to target cluster sizes.

Returns:

On success, returns (gvec, meta). gvec is an integer array of shape (N,) whose entry gvec[i] is the cluster label assigned to node i. meta contains "r_min", "r_max", and "K_used". Returns None if no assignment satisfying the hard link constraints and size bounds is found.

Return type:

tuple of numpy.ndarray and dict[str, int] or None

Notes

This function returns the node-label assignment vector, not the co-clustering matrix. Convert gvec separately when a Z column is required.

asunder.load_balancing.utils.partition_generation.make_partitions(G, K, R, R_bounds=None, must_link=None, cannot_link=None, n_cols=15, seed=42, nodes=None)

Generate graph-informed feasible partition columns.

Candidate node orderings are constructed from the graph and converted into balanced feasible partitions using must-link and cannot-link constraints. Each successful partition is returned as a binary co-clustering matrix.

Parameters:
  • G (networkx.Graph) – Graph whose nodes define the partitioning problem and whose structure is used to generate candidate orderings.

  • K (int) – Target number of clusters.

  • R (int) – Width of the allowed cluster-size range.

  • R_bounds (tuple[int, int] | None) – Minimum and maximum number of nodes per community (community size constraint).

  • must_link (iterable of tuple, optional) – Pairs of node indices that must be assigned to the same cluster.

  • cannot_link (iterable of tuple, optional) – Pairs of node indices that must be assigned to different clusters.

  • n_cols (int, default=15) – Maximum number of feasible partition columns to generate.

  • seed (int, default=None) – Random seed used for randomized or noisy graph orderings.

  • nodes (sequence, optional) – Node ordering used to map graph node labels to integer indices. If None, the graph’s node iteration order is used.

Returns:

Z_star – List of binary N x N co-clustering matrices. For each matrix Z, Z[i, j] = 1 if nodes i and j are assigned to the same cluster, and 0 otherwise.

Return type:

list of numpy.ndarray

asunder.load_balancing.utils.partition_generation.make_one_feasible_partition_mrv_restarts(N, K, R, R_bounds=None, must_link=None, cannot_link=None, seed=42, max_tries=200, jitter_top=2, max_K_increase=50, return_Z=True)

Construct one feasible partition using MRV search with restarts.

Must-link constraints are compressed into components and cannot-link constraints are enforced between components. The search repeatedly chooses the unassigned component with the fewest feasible cluster choices and assigns it using a randomized best-fit rule. Failed partial assignments are discarded and retried until a feasible partition is found or the restart limit is reached.

Parameters:
  • N (int) – Number of nodes.

  • K (int) – Target number of clusters.

  • R (int) – Width of the allowed cluster-size range.

  • R_bounds (tuple[int, int] | None) – Minimum and maximum number of nodes per community (community size constraint).

  • must_link (iterable of tuple of int, optional) – Pairs of node indices that must be assigned to the same cluster.

  • cannot_link (iterable of tuple of int, optional) – Pairs of node indices that must be assigned to different clusters.

  • seed (int, default=None) – Random seed used for tie-breaking and restart randomness.

  • max_tries (int, default=200) – Maximum number of restart attempts.

  • jitter_top (int, default=2) – Number of top-ranked feasible groups considered during randomized best-fit assignment. Larger values increase randomness.

  • max_K_increase (int, default=50) – Maximum increase above K allowed when searching for a feasible cluster count.

  • return_Z (bool, default=True) – If True, return a binary co-clustering matrix. If False, return an integer assignment vector.

Returns:

A feasible partition, or None if no feasible assignment is found. If return_Z is True, the result is an N x N binary matrix Z with Z[i, j] = 1 when nodes i and j are co-clustered. If return_Z is False, the result is an integer vector g of shape (N,).

Return type:

numpy.ndarray or None

asunder.load_balancing.utils.partition_generation.make_partitions_random(N, K, R, R_bounds=None, must_link=None, cannot_link=None, seed=42, return_Z=True, max_K_increase=50)

Generate random feasible partitions subject to link and size constraints.

Must-link constraints are contracted into components, cannot-link constraints are enforced as hard incompatibilities, and feasible balanced assignments are attempted for K and larger candidate values up to K + max_K_increase.

Parameters:
  • N (int) – Number of nodes.

  • K (int) – Target number of clusters.

  • R (int) – Width of the allowed cluster-size range.

  • R_bounds (tuple[int, int] | None) – Minimum and maximum number of nodes per community (community size constraint).

  • must_link (iterable of tuple of int, optional) – Pairs of node indices that must be assigned to the same cluster.

  • cannot_link (iterable of tuple of int, optional) – Pairs of node indices that must be assigned to different clusters.

  • seed (int, default=None) – Random seed used for shuffled component orders.

  • return_Z (bool, default=True) – If True, return binary co-clustering matrices. If False, return integer assignment vectors.

  • max_K_increase (int, default=50) – Maximum increase above K allowed when searching for a feasible cluster count.

Returns:

Feasible partitions. If return_Z is True, each entry is an N x N binary matrix Z with Z[i, j] = 1 when nodes i and j are co-clustered. If return_Z is False, each entry is an integer assignment vector g of shape (N,).

Returns an empty list if no feasible partition is found.

Return type:

list of numpy.ndarray

asunder.load_balancing.utils.partition_generation.check_balance(Z, K, R, R_bounds=None)

Check whether a co-clustering matrix satisfies balance bounds.

The row sums of Z are interpreted as cluster sizes. A partition is balanced if every row sum lies between R_min and R_max, where R_min = max(1, floor(N / K - R / 2 + 1 / 2)) and R_max = R_min + R.

Parameters:
  • Z (numpy.ndarray) – Binary N x N co-clustering matrix. Z[i, j] = 1 indicates that nodes i and j are assigned to the same cluster.

  • K (int) – Target number of clusters.

  • R (int) – Width of the allowed cluster-size range.

Returns:

  • min_size (scalar) – Minimum row sum of Z.

  • max_size (scalar) – Maximum row sum of Z.

  • is_balanced (bool) – True if all row sums are within the computed balance bounds; otherwise False.