Very Fortunate Descent

Modular Very Fortunate Descent (VFD): A local search algorithm for partition refinement under pairwise constraints and optionally, balance constraints.

class asunder.base.algorithms.modular_VFD._Feas(*, r_min, r_max, use_bitmask, require_nonempty_groups, gsz, members_b, in_mask, in_set, b_size, b_mask=None, b_forb=None, b_forb_sets=None, b_comp_set=None)

Bases: object

Incremental feasibility checker for block-to-community operations.

Parameters:
  • r_min (int) – Minimum allowed community size.

  • r_max (int) – Maximum allowed community size.

  • use_bitmask (bool) – If True, use bitmask-based cannot-link checks. Otherwise, use sets.

  • require_nonempty_groups (bool) – If True, disallow operations that empty a community.

  • gsz (numpy.ndarray) – Current community sizes.

  • members_b (list of list of int) – Block indices currently assigned to each community.

  • in_mask (list of int or None) – Component-membership bitmask per community when bitmask mode is used.

  • in_set (list of set of int or None) – Component-membership set per community when set mode is used.

  • b_size (list of int) – Size of each block.

  • b_mask (list of int or None) – Component-membership bitmask per block when bitmask mode is used.

  • b_forb (list of int or None) – Cannot-link bitmask per block when bitmask mode is used.

  • b_forb_sets (list of set of int or None) – Cannot-link sets per block when set mode is used.

  • b_comp_set (list of set of int or None) – Component-membership sets per block when set mode is used.

can_remove(bi, g)

Check whether block bi can be removed from community g.

Parameters:
  • bi (int) – Block index.

  • g (int) – Community index.

Returns:

True if removal is feasible.

Return type:

bool

can_add(bi, g)

Check whether block bi can be added to community g.

Parameters:
  • bi (int) – Block index.

  • g (int) – Community index.

Returns:

True if insertion is feasible.

Return type:

bool

can_add_after_removal(bi, g, bj_remove)

Check whether block bi can be added to community g after removing block bj_remove from that same community.

Parameters:
  • bi (int) – Candidate entering block.

  • g (int) – Community index.

  • bj_remove (int) – Candidate leaving block.

Returns:

True if the modified community remains feasible.

Return type:

bool

asunder.base.algorithms.modular_VFD._symmetrize_unitdiag(M)

Symmetrize a matrix and set its diagonal entries to one.

Parameters:

M (numpy.ndarray) – Square input matrix.

Returns:

Symmetric matrix equal to 0.5 * (M + M.T) with unit diagonal.

Return type:

numpy.ndarray

asunder.base.algorithms.modular_VFD._normalize_pair(i, j)

Return an ordered pair with the smaller index first.

Parameters:
  • i (int) – First index.

  • j (int) – Second index.

Returns:

Pair (min(i, j), max(i, j)).

Return type:

tuple of int

asunder.base.algorithms.modular_VFD._build_components(N, must_link, cannot_link, bitmask_C_max=4096, dense_deg_threshold=64)

Build must-link components and component-level cannot-link structure.

Parameters:
  • N (int) – Number of original nodes.

  • must_link (list of tuple of int) – Pairwise constraints requiring the linked nodes to belong to the same component.

  • cannot_link (list of tuple of int) – Pairwise constraints requiring the linked nodes to belong to different components.

  • bitmask_C_max (int, optional) – Maximum number of components for which bitmask-based cannot-link storage is used unconditionally.

  • dense_deg_threshold (int, optional) – Average component-level cannot-link degree threshold above which bitmask storage is preferred.

Returns:

Component data with keys such as "C", "cid", "comps", "csz", "use_bitmask", "forb_mask", and "comp_bit". Returns None if the link constraints are infeasible.

Return type:

dict or None

Notes

Must-link edges are compressed with union-find. Cannot-link edges are then lifted to the component level.

asunder.base.algorithms.modular_VFD._build_coassociation_matrix(sym_wz, n_components_list, seeds, methods=('kmeans', 'gmm', 'spectral'))

Estimate a co-association matrix from repeated clustering runs.

Parameters:
  • sym_wz (numpy.ndarray) – Symmetric similarity or affinity matrix.

  • n_components_list (sequence of int) – Candidate numbers of clusters to try.

  • seeds (sequence of int) – Random seeds for repeated runs.

  • methods (sequence of str, optional) – Clustering methods to use. Supported values are "kmeans", "gmm", and "spectral".

Returns:

Matrix C in [0, 1] where C[i, j] is the fraction of successful runs in which nodes i and j are co-clustered. Returns None if scikit-learn is unavailable or no run succeeds.

Return type:

numpy.ndarray or None

asunder.base.algorithms.modular_VFD._component_matrices_from_node_matrix(M, comp)

Aggregate a node-level matrix to the component level by block averaging.

Parameters:
  • M (numpy.ndarray) – Node-level square matrix.

  • comp (dict) – Component structure returned by _build_components.

Returns:

Component-level matrix whose entries are averages over node pairs between components. The diagonal is set to one.

Return type:

numpy.ndarray

asunder.base.algorithms.modular_VFD._component_sum_matrix_B(A, a, m, comp)

Aggregate the modularity-style matrix to the component level by summation.

Parameters:
  • A (numpy.ndarray) – Node-level adjacency or weight matrix of shape (N, N).

  • a (numpy.ndarray) – Node weight vector of shape (N,).

  • m (float) – Normalizing scalar used in A - aa^T / m.

  • comp (dict) – Component structure returned by _build_components.

Returns:

Component-level summed matrix P.T @ B @ P where B = A - aa^T / m.

Return type:

numpy.ndarray

Raises:

ValueError – If the input shapes are inconsistent or if m is zero.

asunder.base.algorithms.modular_VFD._fingerprint_blocks_from_rounded_rows(C_comp, comp, fingerprint_decimals, r_max)

Form component blocks from rounded co-association fingerprints.

Parameters:
  • C_comp (numpy.ndarray) – Component-level co-association matrix.

  • comp (dict) – Component structure returned by _build_components.

  • fingerprint_decimals (int) – Number of decimals used to round component rows before grouping.

  • r_max (Any) – Maximum allowed total node count per block. If None, no effective size cap is imposed.

Returns:

Blocks of component IDs with matching rounded fingerprints, split further to avoid cannot-link conflicts and to respect r_max.

Return type:

list of list of int

asunder.base.algorithms.modular_VFD._ensure_at_least_K_blocks(blocks, K_used, csz)

Split large blocks until at least K_used blocks are available.

Parameters:
  • blocks (list of list of int) – Current blocks of component IDs.

  • K_used (int) – Required minimum number of blocks.

  • csz (numpy.ndarray) – Component sizes indexed by component ID.

Returns:

Updated block list. If too few splittable blocks exist, the result may still contain fewer than K_used blocks.

Return type:

list of list of int

asunder.base.algorithms.modular_VFD._range_bounds_from_KR(N, K, R)

Compute lower and upper block-size bounds from (N, K, R).

Parameters:
  • N (int) – Number of nodes.

  • K (int) – Number of partitions.

  • R (int) – Allowed size range width.

Returns:

Pair (r_min, r_max) with r_min = floor(N / K - R / 2 + 1/2) and r_max = r_min + R.

Return type:

tuple of int

Raises:

ValueError – If K is not positive.

asunder.base.algorithms.modular_VFD._lb_bounds(N, K, R)

Compute load-balance bounds implied by (N, K, R).

Parameters:
  • N (int) – Number of nodes.

  • K (int) – Number of partitions.

  • R (int) – Allowed size range width.

Returns:

Pair (r_min, r_max) used for load-balance checks.

Return type:

tuple of int

asunder.base.algorithms.modular_VFD._feasible_K_range(N, r_min, r_max)

Compute the feasible range of partition counts under size bounds.

Parameters:
  • N (int) – Number of nodes.

  • r_min (int) – Minimum allowed partition size.

  • r_max (int) – Maximum allowed partition size.

Returns:

Pair (k_lo, k_hi) such that any feasible K must satisfy k_lo <= K <= k_hi.

Return type:

tuple of int

asunder.base.algorithms.modular_VFD._target_sizes_from_bounds(N, K_used, r_min, r_max)

Construct a balanced target size vector within prescribed bounds.

Parameters:
  • N (int) – Total number of nodes.

  • K_used (int) – Number of partitions.

  • r_min (int) – Minimum target size per partition.

  • r_max (int) – Maximum target size per partition.

Returns:

Integer vector of length K_used whose entries sum to N and lie in [r_min, r_max].

Return type:

numpy.ndarray

Raises:

ValueError – If the bounds are infeasible for the given N and K_used.

Notes

This target vector is intended for scoring or guidance, not as a hard partition-equality requirement.

asunder.base.algorithms.modular_VFD._greedy_split_block_by_cannot(block, *, use_bitmask, forb, comp_bit=None)

Split a block so that no resulting sub-block contains an internal cannot-link conflict.

Parameters:
  • block (list of int) – Component indices in the block.

  • use_bitmask (bool) – If True, use bitmask logic. Otherwise, use sets.

  • forb (object) – Cannot-link representation from the component structure.

  • comp_bit (object, optional) – Component bitmasks used in bitmask mode.

Returns:

Conflict-free sub-blocks.

Return type:

list of list of int

asunder.base.algorithms.modular_VFD._make_blocks_conflict_free(blocks, *, use_bitmask, forb, comp_bit=None)

Apply cannot-link-safe splitting to every block.

Parameters:
  • blocks (list of list of int) – Candidate fingerprint blocks.

  • use_bitmask (bool) – If True, use bitmask logic. Otherwise, use sets.

  • forb (object) – Cannot-link representation from the component structure.

  • comp_bit (object, optional) – Component bitmasks used in bitmask mode.

Returns:

Conflict-free blocks.

Return type:

list of list of int

asunder.base.algorithms.modular_VFD._resolve_k_control(*, N, Cn, K, R, use_K_constraint, max_K_increase, clustering_Ks, candidate_Ks)

Resolve size bounds and the list of fixed-K subproblems to evaluate.

Parameters:
  • N (int) – Number of original nodes.

  • Cn (int) – Number of must-link components.

  • K (int or None) – Baseline number of communities.

  • R (int or None) – Width of the allowed cluster-size range. Also corresponds to the load balance tightness (smaller R implies tighter load balance). For a selected cluster count, the lower and upper bounds are computed from the corresponding balanced range rule. Used when the K-constraint is active.

  • use_K_constraint (bool) – If True, use K/R-derived balance bounds and only test K-neighborhood values. If False, remove K-derived balance bounds and instead test candidate_Ks.

  • max_K_increase (int) – Maximum allowed increase over the baseline K when the K-constraint is active.

  • clustering_Ks (sequence of int) – Clustering sizes used to build co-association information.

  • candidate_Ks (sequence of int or None) – Explicit K values to test when the K-constraint is inactive.

Return type:

Tuple[int, int, List[int]]

Returns:

  • r_min (int) – Minimum allowed community size.

  • r_max (int) – Maximum allowed community size.

  • K_values (list of int) – K values to test in the outer loop.

asunder.base.algorithms.modular_VFD._objective_B_from_comp_assignment(W_B, comp2g, K_used)

Recompute the modularity-style objective from a component-to-community assignment.

Parameters:
  • W_B (numpy.ndarray) – Component-level modularity matrix.

  • comp2g (numpy.ndarray) – Component-to-community assignment.

  • K_used (int) – Number of active communities.

Returns:

Sum of within-community entries of W_B.

Return type:

float

asunder.base.algorithms.modular_VFD.modular_very_fortunate_descent(wz, A, a, m, K, R, must_link, cannot_link, seed=0, fingerprint_decimals=6, allow_block_splitting=True, max_K_increase=0, use_K_constraint=True, candidate_Ks=None, restarts=6, local_iters=60, w_coassoc=0.05, clustering_Ks=None, clustering_seeds=(0, 1, 2), clustering_methods=('kmeans', 'gmm', 'spectral'), wz_is_C_node=False, tabu_max_steps=60, shake_rounds=3, orbit_fallback=False)

Modular function for building a feasible decomposition column from co-association structure and local search.

Parameters:
  • wz (numpy.ndarray) – Input node-level matrix.

  • A (numpy.ndarray) – Original adjacency matrix.

  • a (numpy.ndarray) – Degree-like vector used in the modularity construction.

  • m (float) – Modularity scaling constant.

  • K (int or None) – Baseline number of communities. Used directly only when use_K_constraint=True. When use_K_constraint=False, it is treated only as an optional search hint.

  • R (int or None) – Width of the allowed cluster-size range. Also corresponds to the load balance tightness (smaller R implies tighter load balance). For a selected cluster count, the lower and upper bounds are computed from the corresponding balanced range rule. Used when use_K_constraint=True.

  • must_link (sequence of tuple of int) – Must-link pairs.

  • cannot_link (sequence of tuple of int) – Cannot-link pairs.

  • seed (int, default=0) – Random seed.

  • fingerprint_decimals (int, default=6) – Decimal rounding used to form fingerprint blocks.

  • allow_block_splitting (bool, default=True) – If True, allow refinement of coarse fingerprint blocks.

  • max_K_increase (int, default=0) – Maximum increase above the baseline K when the K-constraint is active.

  • use_K_constraint (bool, default=True) – If True, enforce K/R-derived balance bounds. If False, ignore K/R-derived balance bounds and search over candidate_Ks.

  • candidate_Ks (sequence of int or None, default=None) – K values to test when use_K_constraint=False.

  • restarts (int, default=6) – Number of constructive restarts.

  • local_iters (int, default=60) – Base local-search iteration parameter.

  • w_coassoc (float, default=0.05) – Weight of the co-association cohesion term in local decisions.

  • clustering_Ks (sequence of int or None, default=None) – Community counts used to build the co-association matrix.

  • clustering_seeds (sequence of int, default=(0, 1, 2)) – Seeds used during co-association construction.

  • clustering_methods (sequence of str, default=("kmeans", "gmm", "spectral")) – Clustering methods used during co-association construction.

  • wz_is_C_node (bool, default=False) – If True, treat wz directly as the node-level co-association matrix.

  • tabu_max_steps (int, default=60) – Maximum tabu-search steps.

  • shake_rounds (int, default=3) – Number of perturb-and-improve rounds.

  • orbit_fallback (bool, default=False) – If True, build a fallback co-association proxy from orbits when needed.

Returns:

A pair (Z_col, meta) if a feasible column is found, else None.

Return type:

tuple of (numpy.ndarray, dict) or None