Skip to content

[Rule] Graph 3-Colorability to Clustering #924

@isPANN

Description

@isPANN

Source: Graph 3-Colorability (KColoring with K=3)
Target: Clustering

Motivation: Establishes NP-completeness of CLUSTERING by encoding a graph coloring instance as a distance-based clustering problem. The key insight is that adjacency in the graph translates to distance 1 (must be in different clusters), and non-adjacency translates to distance 0 (can be in the same cluster). With K=3 clusters and diameter bound B=0, a valid 3-coloring corresponds exactly to a valid clustering.
Reference: Garey & Johnson, Computers and Intractability, A1.2 MS9; Brucker 1978

GJ Source Entry

[MS9] CLUSTERING
INSTANCE: Finite set X, "distance" d(x,y) ∈ Z₀⁺ for each pair x,y ∈ X, positive integers K and B.
QUESTION: Can X be partitioned into k ≤ K disjoint sets X_1, X_2, ..., X_k such that, for 1 ≤ i ≤ k, d(x,y) ≤ B for all x,y ∈ X_i?
Reference: [Brucker, 1978]. Transformation from GRAPH 3-COLORABILITY.
Comment: NP-complete even for K = 3 and d restricted to {0,1}. Polynomial for K = 2.

Reduction Algorithm

Summary:
Given a Graph 3-Colorability instance G = (V, E), construct a Clustering instance as follows:

  1. Elements: Set X = V (one element per vertex).

  2. Distance function: For each pair of vertices u, v ∈ V:

    • d(u, v) = 1 if {u, v} ∈ E (adjacent vertices)
    • d(u, v) = 0 if {u, v} ∉ E (non-adjacent vertices)
    • d(u, u) = 0 for all u
  3. Parameters: Set K = 3 (number of clusters) and B = 0 (diameter bound).

  4. Correctness (forward): If G has a proper 3-coloring c: V → {1,2,3}, then the color classes X_i = {v : c(v) = i} form a valid clustering. Within each color class, all vertices are non-adjacent (by definition of proper coloring), so all pairwise distances are 0 ≤ B = 0.

  5. Correctness (reverse): If X can be partitioned into ≤ 3 clusters with all intra-cluster distances ≤ 0, then within each cluster all pairs must have distance 0, meaning no two vertices in the same cluster are adjacent. This is exactly a proper 3-coloring.

  6. Solution extraction: Given a valid clustering X_1, X_2, X_3, assign color i to all vertices in X_i.

Key invariant: d(u,v) = 0 iff {u,v} ∉ E, so "distance ≤ 0 within cluster" is equivalent to "independent set in G."

Time complexity of reduction: O(|V|²) to construct the distance matrix.

Size Overhead

Symbols:

  • n = num_vertices of source graph G
  • m = num_edges of source graph G
Target metric (code name) Polynomial (using symbols above)
num_elements num_vertices

Derivation: Each vertex becomes an element. The distance matrix is n×n with entries in {0,1}. The parameters K=3 and B=0 are constants.

Validation Method

  • Closed-loop test: construct a KColoring instance with K=3, reduce to Clustering, solve target by brute-force enumeration of all 3-partitions, verify the clustering corresponds to a valid 3-coloring of the original graph.
  • Test with a known 3-colorable graph: cycle C_6 (6-cycle is 2-colorable, hence 3-colorable). Verify the reduction produces a YES instance.
  • Test with a known non-3-colorable graph: complete graph K_4 requires 4 colors. Verify the reduction produces a NO instance (no valid clustering with K=3, B=0).
  • Verify distance matrix correctness: for each edge, distance is 1; for each non-edge, distance is 0.

Example

Source instance (Graph 3-Colorability):
Graph G with 5 vertices {0, 1, 2, 3, 4} and 5 edges forming a 5-cycle:

  • Edges: {0,1}, {1,2}, {2,3}, {3,4}, {4,0}
  • C_5 is 3-colorable (odd cycle needs 3 colors)
  • Valid 3-coloring: c(0)=1, c(1)=2, c(2)=1, c(3)=2, c(4)=3

Constructed target instance (Clustering):

  • Elements: X = {0, 1, 2, 3, 4}
  • Distance matrix (symmetric, only upper triangle):
    • d(0,1)=1, d(0,2)=0, d(0,3)=0, d(0,4)=1
    • d(1,2)=1, d(1,3)=0, d(1,4)=0
    • d(2,3)=1, d(2,4)=0
    • d(3,4)=1
  • K = 3, B = 0

Solution mapping:

  • Clustering: X_1 = {0, 2} (distance d(0,2)=0 ≤ 0 ✓), X_2 = {1, 3} (distance d(1,3)=0 ≤ 0 ✓), X_3 = {4} (singleton ✓)
  • Extracted 3-coloring: c(0)=1, c(2)=1, c(1)=2, c(3)=2, c(4)=3
  • Verification: edge {0,1}: c(0)=1 ≠ c(1)=2 ✓, edge {1,2}: c(1)=2 ≠ c(2)=1 ✓, edge {2,3}: c(2)=1 ≠ c(3)=2 ✓, edge {3,4}: c(3)=2 ≠ c(4)=3 ✓, edge {4,0}: c(4)=3 ≠ c(0)=1 ✓

References

  • [Brucker, 1978]: P. Brucker (1978). "On the complexity of clustering problems." In Optimization and Operations Research, Lecture Notes in Economics and Mathematical Systems, vol. 157, pp. 45-54, Springer.
  • [Garey and Johnson, 1979]: M. R. Garey and D. S. Johnson (1979). Computers and Intractability: A Guide to the Theory of NP-Completeness. W. H. Freeman, pp. 226 (MS9).

Metadata

Metadata

Assignees

No one assigned

    Labels

    ruleA new reduction rule to be added.

    Type

    No type

    Projects

    Status

    Done

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions