Source: Graph 3-Colorability (KColoring with K=3)
Target: Clustering
Motivation: Establishes NP-completeness of CLUSTERING by encoding a graph coloring instance as a distance-based clustering problem. The key insight is that adjacency in the graph translates to distance 1 (must be in different clusters), and non-adjacency translates to distance 0 (can be in the same cluster). With K=3 clusters and diameter bound B=0, a valid 3-coloring corresponds exactly to a valid clustering.
Reference: Garey & Johnson, Computers and Intractability, A1.2 MS9; Brucker 1978
GJ Source Entry
[MS9] CLUSTERING
INSTANCE: Finite set X, "distance" d(x,y) ∈ Z₀⁺ for each pair x,y ∈ X, positive integers K and B.
QUESTION: Can X be partitioned into k ≤ K disjoint sets X_1, X_2, ..., X_k such that, for 1 ≤ i ≤ k, d(x,y) ≤ B for all x,y ∈ X_i?
Reference: [Brucker, 1978]. Transformation from GRAPH 3-COLORABILITY.
Comment: NP-complete even for K = 3 and d restricted to {0,1}. Polynomial for K = 2.
Reduction Algorithm
Summary:
Given a Graph 3-Colorability instance G = (V, E), construct a Clustering instance as follows:
-
Elements: Set X = V (one element per vertex).
-
Distance function: For each pair of vertices u, v ∈ V:
- d(u, v) = 1 if {u, v} ∈ E (adjacent vertices)
- d(u, v) = 0 if {u, v} ∉ E (non-adjacent vertices)
- d(u, u) = 0 for all u
-
Parameters: Set K = 3 (number of clusters) and B = 0 (diameter bound).
-
Correctness (forward): If G has a proper 3-coloring c: V → {1,2,3}, then the color classes X_i = {v : c(v) = i} form a valid clustering. Within each color class, all vertices are non-adjacent (by definition of proper coloring), so all pairwise distances are 0 ≤ B = 0.
-
Correctness (reverse): If X can be partitioned into ≤ 3 clusters with all intra-cluster distances ≤ 0, then within each cluster all pairs must have distance 0, meaning no two vertices in the same cluster are adjacent. This is exactly a proper 3-coloring.
-
Solution extraction: Given a valid clustering X_1, X_2, X_3, assign color i to all vertices in X_i.
Key invariant: d(u,v) = 0 iff {u,v} ∉ E, so "distance ≤ 0 within cluster" is equivalent to "independent set in G."
Time complexity of reduction: O(|V|²) to construct the distance matrix.
Size Overhead
Symbols:
- n =
num_vertices of source graph G
- m =
num_edges of source graph G
| Target metric (code name) |
Polynomial (using symbols above) |
num_elements |
num_vertices |
Derivation: Each vertex becomes an element. The distance matrix is n×n with entries in {0,1}. The parameters K=3 and B=0 are constants.
Validation Method
- Closed-loop test: construct a KColoring instance with K=3, reduce to Clustering, solve target by brute-force enumeration of all 3-partitions, verify the clustering corresponds to a valid 3-coloring of the original graph.
- Test with a known 3-colorable graph: cycle C_6 (6-cycle is 2-colorable, hence 3-colorable). Verify the reduction produces a YES instance.
- Test with a known non-3-colorable graph: complete graph K_4 requires 4 colors. Verify the reduction produces a NO instance (no valid clustering with K=3, B=0).
- Verify distance matrix correctness: for each edge, distance is 1; for each non-edge, distance is 0.
Example
Source instance (Graph 3-Colorability):
Graph G with 5 vertices {0, 1, 2, 3, 4} and 5 edges forming a 5-cycle:
- Edges: {0,1}, {1,2}, {2,3}, {3,4}, {4,0}
- C_5 is 3-colorable (odd cycle needs 3 colors)
- Valid 3-coloring: c(0)=1, c(1)=2, c(2)=1, c(3)=2, c(4)=3
Constructed target instance (Clustering):
- Elements: X = {0, 1, 2, 3, 4}
- Distance matrix (symmetric, only upper triangle):
- d(0,1)=1, d(0,2)=0, d(0,3)=0, d(0,4)=1
- d(1,2)=1, d(1,3)=0, d(1,4)=0
- d(2,3)=1, d(2,4)=0
- d(3,4)=1
- K = 3, B = 0
Solution mapping:
- Clustering: X_1 = {0, 2} (distance d(0,2)=0 ≤ 0 ✓), X_2 = {1, 3} (distance d(1,3)=0 ≤ 0 ✓), X_3 = {4} (singleton ✓)
- Extracted 3-coloring: c(0)=1, c(2)=1, c(1)=2, c(3)=2, c(4)=3
- Verification: edge {0,1}: c(0)=1 ≠ c(1)=2 ✓, edge {1,2}: c(1)=2 ≠ c(2)=1 ✓, edge {2,3}: c(2)=1 ≠ c(3)=2 ✓, edge {3,4}: c(3)=2 ≠ c(4)=3 ✓, edge {4,0}: c(4)=3 ≠ c(0)=1 ✓
References
- [Brucker, 1978]: P. Brucker (1978). "On the complexity of clustering problems." In Optimization and Operations Research, Lecture Notes in Economics and Mathematical Systems, vol. 157, pp. 45-54, Springer.
- [Garey and Johnson, 1979]: M. R. Garey and D. S. Johnson (1979). Computers and Intractability: A Guide to the Theory of NP-Completeness. W. H. Freeman, pp. 226 (MS9).
Source: Graph 3-Colorability (KColoring with K=3)
Target: Clustering
Motivation: Establishes NP-completeness of CLUSTERING by encoding a graph coloring instance as a distance-based clustering problem. The key insight is that adjacency in the graph translates to distance 1 (must be in different clusters), and non-adjacency translates to distance 0 (can be in the same cluster). With K=3 clusters and diameter bound B=0, a valid 3-coloring corresponds exactly to a valid clustering.
Reference: Garey & Johnson, Computers and Intractability, A1.2 MS9; Brucker 1978
GJ Source Entry
Reduction Algorithm
Summary:
Given a Graph 3-Colorability instance G = (V, E), construct a Clustering instance as follows:
Elements: Set X = V (one element per vertex).
Distance function: For each pair of vertices u, v ∈ V:
Parameters: Set K = 3 (number of clusters) and B = 0 (diameter bound).
Correctness (forward): If G has a proper 3-coloring c: V → {1,2,3}, then the color classes X_i = {v : c(v) = i} form a valid clustering. Within each color class, all vertices are non-adjacent (by definition of proper coloring), so all pairwise distances are 0 ≤ B = 0.
Correctness (reverse): If X can be partitioned into ≤ 3 clusters with all intra-cluster distances ≤ 0, then within each cluster all pairs must have distance 0, meaning no two vertices in the same cluster are adjacent. This is exactly a proper 3-coloring.
Solution extraction: Given a valid clustering X_1, X_2, X_3, assign color i to all vertices in X_i.
Key invariant: d(u,v) = 0 iff {u,v} ∉ E, so "distance ≤ 0 within cluster" is equivalent to "independent set in G."
Time complexity of reduction: O(|V|²) to construct the distance matrix.
Size Overhead
Symbols:
num_verticesof source graph Gnum_edgesof source graph Gnum_elementsnum_verticesDerivation: Each vertex becomes an element. The distance matrix is n×n with entries in {0,1}. The parameters K=3 and B=0 are constants.
Validation Method
Example
Source instance (Graph 3-Colorability):
Graph G with 5 vertices {0, 1, 2, 3, 4} and 5 edges forming a 5-cycle:
Constructed target instance (Clustering):
Solution mapping:
References