Reduced Cost Community Search¶
Reduced Cost Community Search: A greedy and local search heuristic for finding commiunities that maximize the reduced cost.
- asunder.base.algorithms.RCCS._canonicalize_labels(labels)¶
Relabel a partition vector to contiguous integer community IDs.
The relabeling preserves community membership but replaces the original labels with
0, 1, ..., K-1in order of first appearance. This is useful for normalizing equivalent partitions that use different label names.- Parameters:
labels (ndarray or list of int) – One-dimensional array-like of community labels.
- Returns:
Canonicalized label vector with contiguous integer labels.
- Return type:
ndarray of int
- Raises:
ValueError – If
labelsis not one-dimensional.
Notes
Two label vectors that induce the same partition may differ only by label names. This function removes that ambiguity.
- asunder.base.algorithms.RCCS._validate_adjacency(adjacency, symmetrize_inputs)¶
Validate and optionally symmetrize an adjacency matrix.
The matrix must be square, finite, and have strictly positive total weight. When
symmetrize_inputsisTrue, the returned matrix is replaced by0.5 * (A + A.T).- Parameters:
adjacency (ndarray, shape (N, N)) – Candidate adjacency matrix.
symmetrize_inputs (bool) – Whether to replace the input by its symmetric average.
- Returns:
Validated adjacency matrix.
- Return type:
ndarray of float
- Raises:
ValueError – If the matrix is not square, contains non-finite values, is not symmetric when required, or has non-positive total weight.
- asunder.base.algorithms.RCCS._parse_duals_to_matrix_and_constant(duals, n, symmetrize_inputs)¶
Convert heterogeneous dual inputs into a pairwise matrix and a constant.
Scalar dual values contribute only to the constant term. One-dimensional dual arrays are interpreted as node-wise quantities and converted into a symmetric pairwise contribution via
0.5 * (d_i + d_j).Two-dimensional dual arrays are interpreted directly as pairwise terms.
- Parameters:
duals (dict[str, object]) – Mapping from dual names to scalar, vector, matrix, or
Nonevalues.n (int) – Expected problem dimension.
symmetrize_inputs (bool) – Whether to symmetrize two-dimensional dual arrays.
- Return type:
tuple[ndarray,float]- Returns:
dualW (ndarray of float) – Aggregated pairwise dual contribution matrix of shape
(n, n).constant_terms (float) – Sum of all scalar-valued constant dual contributions.
- Raises:
ValueError – If a vector has length different from
n, a matrix has shape different from(n, n), a matrix is not symmetric when required, or any array contains non-finite values.
- asunder.base.algorithms.RCCS._build_objective_matrix_W(adjacency, duals, symmetrize_inputs)¶
Build the reduced-cost objective matrix used by the search heuristic.
The returned matrix is
W = modularity_base - dualW,where
modularity_baseis the dense modularity kernelA / m - (a a^T) / m^2.- Parameters:
adjacency (ndarray) – Problem adjacency matrix.
duals (dict[str, object]) – Dual information to be folded into the reduced-cost objective.
symmetrize_inputs (bool) – Whether to symmetrize adjacency and matrix-valued dual inputs.
- Return type:
tuple[ndarray,ndarray,ndarray,float,float]- Returns:
W (ndarray of float) – Dense reduced-cost objective matrix.
modularity_base (ndarray of float) – Modularity contribution before subtracting dual terms.
dualW (ndarray of float) – Dense pairwise matrix derived from dual inputs.
constant_terms (float) – Scalar dual contribution independent of the partition.
m (float) – Total adjacency weight.
- asunder.base.algorithms.RCCS._score_labels_from_W(labels, W, constant_terms)¶
Score a partition exactly from a precomputed objective matrix.
The score is the sum of all within-community entries of
Wminus the provided constant term.- Parameters:
labels (ndarray) – Community label vector.
W (ndarray) – Dense reduced-cost objective matrix.
constant_terms (float) – Partition-independent scalar contribution to subtract.
- Returns:
Exact reduced-cost value for the given partition.
- Return type:
float
- asunder.base.algorithms.RCCS._build_members_and_S(labels, W)¶
Build community membership sets and node-to-community score sums.
For each community
c, the matrixSstoresS[i, c] = sum_{j in c} W[i, j].This structure supports efficient move-gain evaluation.
- Parameters:
labels (ndarray) – Community label vector.
W (ndarray) – Dense reduced-cost objective matrix.
- Return type:
tuple[list[set[int]],ndarray]- Returns:
members (list of set of int) – Community membership sets indexed by community ID.
S (ndarray of float) – Node-to-community score-sum matrix of shape
(N, K).
- asunder.base.algorithms.RCCS._compute_move_gain(node, from_comm, to_comm, S, W_diag)¶
Compute the exact objective gain of moving one node between communities.
The move gain is evaluated from the cached node-to-community sums in
Sand the diagonal ofWwithout recomputing the full objective.- Parameters:
node (int) – Node index to move.
from_comm (int) – Current community of the node.
to_comm (int) – Target community of the node. This may refer to an existing community or to a newly created singleton community when handled upstream.
S (ndarray) – Node-to-community score-sum matrix.
W_diag (ndarray) – Diagonal of the dense objective matrix
W.
- Returns:
Exact score change obtained by performing the move.
- Return type:
float
- asunder.base.algorithms.RCCS._rebuild_structures_after_empty_comm(labels, W)¶
Rebuild community data structures after a community becomes empty.
This function canonicalizes labels, reconstructs membership sets, and recomputes the cached node-to-community score sums.
- Parameters:
labels (ndarray) – Current label vector, possibly with a missing community index.
W (ndarray) – Dense reduced-cost objective matrix.
- Return type:
tuple[ndarray,list[set[int]],ndarray]- Returns:
labels (ndarray) – Canonicalized label vector.
members (list of set of int) – Rebuilt community membership sets.
S (ndarray) – Rebuilt node-to-community score-sum matrix.
- asunder.base.algorithms.RCCS._tabu_sweep(labels, current_score, W, constant_terms, *, max_tabu_steps, tabu_tenure, allow_non_improving_every, rng)¶
Run one tabu-search sweep over single-node moves.
At each step, the method evaluates moves from every node to every existing community, plus a candidate new singleton community. Improving moves are preferred. Periodic non-improving moves may also be accepted, subject to the tabu restrictions and aspiration rule.
- Parameters:
labels (ndarray) – Initial partition for the sweep.
current_score (float) – Exact reduced-cost score of
labels.W (ndarray) – Dense reduced-cost objective matrix.
constant_terms (float) – Partition-independent scalar contribution.
max_tabu_steps (int) – Maximum number of tabu iterations in the sweep.
tabu_tenure (int) – Number of steps for which the reverse move is tabu.
allow_non_improving_every (int) – Frequency at which the best available non-improving move may be taken. If zero, non-improving moves are never forced.
rng (numpy.random.Generator) – Random number generator.
- Return type:
tuple[ndarray,float,dict[str,object]]- Returns:
best_labels (ndarray) – Best partition encountered during the sweep.
best_score (float) – Best score encountered during the sweep.
meta (dict[str, object]) – Sweep diagnostics, including number of steps, moves, non-improving moves, and best score reached.
Notes
The running score is periodically corrected against an exact recomputation to guard against numerical drift.
- asunder.base.algorithms.RCCS._greedy_merge_phase(labels, current_score, W, constant_terms)¶
Greedily merge communities while the merge gain is positive.
For each pair of communities, the merge gain is computed from the sum of cross-community weights. The best positive merge is applied repeatedly until no improving merge remains.
- Parameters:
labels (ndarray) – Current partition.
current_score (float) – Current exact reduced-cost score.
W (ndarray) – Dense reduced-cost objective matrix.
constant_terms (float) – Partition-independent scalar contribution.
- Return type:
tuple[ndarray,float,int]- Returns:
labels (ndarray) – Partition after greedy merging.
score (float) – Exact score after the final merge sequence.
merge_count (int) – Number of merges applied.
- asunder.base.algorithms.RCCS._attempt_split_weakest_community(labels, current_score, W, constant_terms, rng)¶
Attempt an improving split of the weakest community.
The weakest community is chosen by lowest average internal cohesion. A two-way split is then seeded by the least compatible node pair inside that community and completed greedily. The split is accepted only if it strictly improves the exact reduced-cost score.
- Parameters:
labels (ndarray) – Current partition.
current_score (float) – Current exact reduced-cost score.
W (ndarray) – Dense reduced-cost objective matrix.
constant_terms (float) – Partition-independent scalar contribution.
rng (numpy.random.Generator) – Random number generator for tie-breaking and ordering.
- Return type:
tuple[ndarray,float,bool,dict[str,object]]- Returns:
labels (ndarray) – Updated partition if the split is accepted, otherwise the input partition.
score (float) – Updated score if the split is accepted, otherwise the input score.
split_applied (bool) – Whether an improving split was found and applied.
meta (dict[str, object]) – Diagnostic information describing the accepted split or the reason no split was applied.
- asunder.base.algorithms.RCCS._shake_partition(labels, W, rng, shake_fraction)¶
Randomly perturb weakly attached nodes to diversify the search.
Nodes are ranked by their internal attachment to their current community. A fraction of the weakest nodes is reassigned uniformly at random to a different existing community.
- Parameters:
labels (ndarray) – Current partition.
W (ndarray) – Dense reduced-cost objective matrix.
rng (numpy.random.Generator) – Random number generator.
shake_fraction (float) – Fraction of nodes to perturb.
- Returns:
Canonicalized perturbed partition.
- Return type:
ndarray of int
- asunder.base.algorithms.RCCS.compute_modularity_reduced_cost(adjacency, duals, labels, *, symmetrize_inputs=True, return_details=False)¶
Compute the exact modularity-based reduced cost of a partition.
The reduced cost is evaluated as the modularity contribution of the induced co-clustering matrix minus the pairwise dual contribution and minus all partition-independent constant dual terms.
- Parameters:
adjacency (ndarray) – Adjacency matrix of the graph.
duals (dict[str, object]) – Dual information represented as scalars, vectors, matrices, or
Nonevalues.labels (ndarray or list of int) – Community label vector defining the partition.
symmetrize_inputs (bool, default=True) – Whether to symmetrize adjacency and matrix-valued dual inputs.
return_details (bool, default=False) – Whether to return a diagnostics dictionary instead of only the reduced cost value.
- Returns:
If
return_detailsisFalse, the exact reduced cost. Otherwise, a dictionary containing the reduced cost, modularity term, dual partition term, constant terms, total adjacency weight, and the aggregated pairwise dual matrix.- Return type:
float or dict[str, object]
- Raises:
ValueError – If the adjacency matrix is invalid or if its dimension is inconsistent with the label vector.
Notes
This function is the exact scorer and can be used independently of the search heuristic.
- asunder.base.algorithms.RCCS.search_partition_by_reduced_cost(adjacency, duals, current_labels=None, *, n_restarts=12, max_local_passes=25, max_tabu_steps=5000, tabu_tenure=7, allow_non_improving_every=20, shake_rounds=3, shake_fraction=0.06, split_trigger_no_improve_passes=3, random_seed=42)¶
Heuristically search for a partition with high reduced cost.
The search uses a multistart strategy with a warm start from the supplied partition, followed by repeated rounds of tabu-based node moves, greedy community merges, occasional community splitting, and shake-based diversification.
- Parameters:
adjacency (ndarray, shape (N, N)) – Adjacency matrix of the graph.
duals (dict[str, object]) – Dual information represented as scalars, vectors, matrices, or
Nonevalues.current_labels (ndarray or list of int) – Initial partition used both as the initial incumbent and as a seed for some restarts.
n_restarts (int, default=12) – Number of restart trajectories.
max_local_passes (int, default=25) – Maximum number of local search passes per shake round.
max_tabu_steps (int, default=5000) – Maximum number of tabu iterations per tabu sweep.
tabu_tenure (int, default=7) – Tabu tenure used in the node-move search.
allow_non_improving_every (int, default=20) – Frequency at which the best non-improving move may be accepted.
shake_rounds (int, default=3) – Number of diversification rounds after the initial round.
shake_fraction (float, default=0.06) – Fraction of nodes perturbed in each shake step.
split_trigger_no_improve_passes (int, default=3) – Number of consecutive non-improving passes required before attempting a split.
random_seed (int, default=0) – Seed for the internal random number generator.
- Returns:
Dictionary containing the best label vector found, its reduced cost, the initial reduced cost, the improvement, detailed restart history, the number of restarts, and the effective search configuration.
- Return type:
dict[str, object]
- Raises:
ValueError – If any search parameter is outside its allowed range or if the initial labels are inconsistent with the adjacency dimension.
Notes
The objective is fully defined by
adjacencyandduals. Thecurrent_labelsargument serves only as a warm start and baseline for the heuristic search.