Source: FEEDBACK VERTEX SET (FeedbackVertexSet)
Target: CODE GENERATION WITH UNLIMITED REGISTERS (MinimumCodeGenerationUnlimitedRegisters)
Motivation: The previous body was only a placeholder. This issue gives the actual chain gadget behind the Aho-Johnson-Ullman reduction.
Reference: A. V. Aho, R. Sethi, "How Hard Is Compiler Code Generation?" (1977), summarizing the Aho-Johnson-Ullman construction for code generation with common subexpressions and its unlimited-register variant.
Target Machine Model
Use the standard two-address unlimited-register model:
-
leaves are values already stored in named registers
-
every internal node is a binary operation
-
if internal node u has left child a and right child b, then one instruction
r <- r op s
computes u, where r currently holds the value of a and s currently holds the value of b
-
the left operand register is overwritten
-
copy instructions
r' <- r
are allowed and cost one instruction
Because registers are unlimited, the only optimization issue is how many copy instructions are needed.
Reduction Algorithm
Input:
- a directed graph
G = (V, E)
- an integer
k
Assume
V = {x_1, ..., x_n}.
For each vertex x in V, let its outgoing edges be ordered arbitrarily as
(x, y_1), ..., (x, y_d),
where d = outdeg(x).
Construct a DAG D = (N, A) with left/right arc partition (L, R) as follows.
Leaves
For each source vertex x in V, create one leaf
x^0
whose value is initially stored in a distinct register
R_x.
Internal chain nodes
For each source vertex x with outgoing neighbors y_1, ..., y_d, create internal nodes
x^1, x^2, ..., x^d.
For each i = 1..d, add:
- left arc
(x^i, x^(i-1)) to L
- right arc
(x^i, y_i^0) to R
So each x^i has exactly two outgoing arcs.
Roots
For each vertex x with outdeg(x) > 0, the root corresponding to x is x^d.
Vertices with outdeg(x) = 0 create only the leaf x^0; no internal node is needed for them.
Budget
Let
m = |E|.
Set
K = m + k.
Output the instance
(D, L, R, K).
Interpretation of the Gadget
- evaluating the chain for vertex
x destroys the original register R_x, because the first instruction for x^1 uses R_x as a left operand
- every incoming edge
(u, x) of the source graph creates an occurrence of leaf x^0 as a right child in the chain for u
- therefore, if
x is evaluated before some predecessor u, then u can still use x^0 only if R_x was copied first
Thus copies correspond to source vertices whose values are preserved across cyclic dependencies.
Forward Witness Map
Suppose F subseteq V is a feedback vertex set of G with |F| <= k.
Define
E_F = { (u, v) in E : v notin F }.
Because F intersects every directed cycle, the digraph
(V, E_F)
is acyclic.
Choose any topological ordering
<_F
of (V, E_F).
Now construct a target program as follows.
Step 1: copy the feedback vertices
For each x in F, emit one copy instruction
S_x <- R_x
where S_x is a fresh register.
Step 2: evaluate the chains in topological order
Process the source vertices in the order <_F.
For a fixed vertex x with outgoing neighbors y_1, ..., y_d:
where
T_i = S_(y_i) if y_i in F
T_i = R_(y_i) if y_i notin F
This computes the chain
x^1, x^2, ..., x^d
using exactly d operation instructions.
Why this is valid
If y_i notin F, then edge (x, y_i) belongs to E_F, so x <_F y_i.
Hence x is processed before y_i, and the original register R_(y_i) has not yet been destroyed when x needs it as a right operand.
If y_i in F, then x uses the preserved copy S_(y_i) instead.
Instruction count
There are exactly m internal nodes in D, so exactly m operation instructions are emitted.
There are exactly |F| copy instructions.
Hence the program length is
m + |F| <= m + k = K.
Extraction Pass
Suppose there is a target program of length at most K = m + k that computes all roots of D.
Normalization lemma
Every internal node of D has in-degree 1.
Hence no internal value is ever needed by two different parents.
Therefore, in a minimum-length program:
- every internal node is computed exactly once
- copying an internal value is useless
- every extra instruction beyond the mandatory
m operation instructions can be assumed to be a copy of a leaf register R_x
Let
F = { x in V : some copy of R_x is made }.
Since the total program length is at most m + k, we get
|F| <= k.
Why F is a feedback vertex set
Assume for contradiction that F is not a feedback vertex set.
Then there is a directed cycle
x_1 -> x_2 -> ... -> x_t -> x_1
in G with no vertex in F.
Fix one edge (x_i, x_(i+1)) on the cycle.
Because x_(i+1) notin F, the leaf value R_(x_(i+1)) is never copied.
The first instruction of the chain for x_(i+1) destroys R_(x_(i+1)).
So the chain for x_i, which needs x_(i+1)^0 as a right child, must be executed before the chain for x_(i+1) begins.
Thus for every i,
x_i must be evaluated before x_(i+1).
Applying this around the whole cycle yields
x_1 < x_2 < ... < x_t < x_1,
impossible.
So F intersects every directed cycle.
Therefore F is a feedback vertex set of G with |F| <= k.
Correctness
G has a feedback vertex set of size at most k
iff
D can be evaluated on an unlimited-register two-address machine using at most
|E| + k
instructions.
Size Overhead
Let
Then:
- number of leaves =
n
- number of internal nodes =
m
- total nodes =
n + m
- total arcs =
2m
- every internal node has out-degree
2
- the only shared nodes are leaves
So the construction is polynomial and matches the restricted form stated in the hardness result.
Example
Source instance:
- vertices
{a, b, c}
- edges
{(a,b), (b,c), (c,a)}
k = 1
This is a directed 3-cycle, so a feedback vertex set of size 1 exists, e.g. {b}.
Constructed DAG:
- leaves
a^0, b^0, c^0 labeled by registers R_a, R_b, R_c
- chain nodes
a^1 with left child a^0 and right child b^0
b^1 with left child b^0 and right child c^0
c^1 with left child c^0 and right child a^0
Budget:
K = |E| + k = 3 + 1 = 4.
A valid target program is:
S_b <- R_b
R_b <- R_b op R_c
R_c <- R_c op R_a
R_a <- R_a op S_b
This computes all three roots using exactly one copy instruction.
Implementation Notes
- The current placeholder
K = |V| + k is wrong for this gadget; the correct bound is K = |E| + k.
- The gadget is a left-chain construction, not a one-node-per-vertex toy graph.
- The source graph is directed.
- Only leaves are shared; this is essential for the converse proof.
Source: FEEDBACK VERTEX SET (FeedbackVertexSet)
Target: CODE GENERATION WITH UNLIMITED REGISTERS (MinimumCodeGenerationUnlimitedRegisters)
Motivation: The previous body was only a placeholder. This issue gives the actual chain gadget behind the Aho-Johnson-Ullman reduction.
Reference: A. V. Aho, R. Sethi, "How Hard Is Compiler Code Generation?" (1977), summarizing the Aho-Johnson-Ullman construction for code generation with common subexpressions and its unlimited-register variant.
Target Machine Model
Use the standard two-address unlimited-register model:
leaves are values already stored in named registers
every internal node is a binary operation
if internal node
uhas left childaand right childb, then one instructionr <- r op scomputes
u, wherercurrently holds the value ofaandscurrently holds the value ofbthe left operand register is overwritten
copy instructions
r' <- rare allowed and cost one instruction
Because registers are unlimited, the only optimization issue is how many copy instructions are needed.
Reduction Algorithm
Input:
G = (V, E)kAssume
V = {x_1, ..., x_n}.For each vertex
x in V, let its outgoing edges be ordered arbitrarily as(x, y_1), ..., (x, y_d),where
d = outdeg(x).Construct a DAG
D = (N, A)with left/right arc partition(L, R)as follows.Leaves
For each source vertex
x in V, create one leafx^0whose value is initially stored in a distinct register
R_x.Internal chain nodes
For each source vertex
xwith outgoing neighborsy_1, ..., y_d, create internal nodesx^1, x^2, ..., x^d.For each
i = 1..d, add:(x^i, x^(i-1))toL(x^i, y_i^0)toRSo each
x^ihas exactly two outgoing arcs.Roots
For each vertex
xwithoutdeg(x) > 0, the root corresponding toxisx^d.Vertices with
outdeg(x) = 0create only the leafx^0; no internal node is needed for them.Budget
Let
m = |E|.Set
K = m + k.Output the instance
(D, L, R, K).Interpretation of the Gadget
xdestroys the original registerR_x, because the first instruction forx^1usesR_xas a left operand(u, x)of the source graph creates an occurrence of leafx^0as a right child in the chain foruxis evaluated before some predecessoru, thenucan still usex^0only ifR_xwas copied firstThus copies correspond to source vertices whose values are preserved across cyclic dependencies.
Forward Witness Map
Suppose
F subseteq Vis a feedback vertex set ofGwith|F| <= k.Define
E_F = { (u, v) in E : v notin F }.Because
Fintersects every directed cycle, the digraph(V, E_F)is acyclic.
Choose any topological ordering
<_Fof
(V, E_F).Now construct a target program as follows.
Step 1: copy the feedback vertices
For each
x in F, emit one copy instructionS_x <- R_xwhere
S_xis a fresh register.Step 2: evaluate the chains in topological order
Process the source vertices in the order
<_F.For a fixed vertex
xwith outgoing neighborsy_1, ..., y_d:if
d = 0, do nothingotherwise emit the
dinstructionsR_x <- R_x op T_1R_x <- R_x op T_2...
R_x <- R_x op T_dwhere
T_i = S_(y_i)ify_i in FT_i = R_(y_i)ify_i notin FThis computes the chain
x^1, x^2, ..., x^dusing exactly
doperation instructions.Why this is valid
If
y_i notin F, then edge(x, y_i)belongs toE_F, sox <_F y_i.Hence
xis processed beforey_i, and the original registerR_(y_i)has not yet been destroyed whenxneeds it as a right operand.If
y_i in F, thenxuses the preserved copyS_(y_i)instead.Instruction count
There are exactly
minternal nodes inD, so exactlymoperation instructions are emitted.There are exactly
|F|copy instructions.Hence the program length is
m + |F| <= m + k = K.Extraction Pass
Suppose there is a target program of length at most
K = m + kthat computes all roots ofD.Normalization lemma
Every internal node of
Dhas in-degree1.Hence no internal value is ever needed by two different parents.
Therefore, in a minimum-length program:
moperation instructions can be assumed to be a copy of a leaf registerR_xLet
F = { x in V : some copy of R_x is made }.Since the total program length is at most
m + k, we get|F| <= k.Why
Fis a feedback vertex setAssume for contradiction that
Fis not a feedback vertex set.Then there is a directed cycle
x_1 -> x_2 -> ... -> x_t -> x_1in
Gwith no vertex inF.Fix one edge
(x_i, x_(i+1))on the cycle.Because
x_(i+1) notin F, the leaf valueR_(x_(i+1))is never copied.The first instruction of the chain for
x_(i+1)destroysR_(x_(i+1)).So the chain for
x_i, which needsx_(i+1)^0as a right child, must be executed before the chain forx_(i+1)begins.Thus for every
i,x_imust be evaluated beforex_(i+1).Applying this around the whole cycle yields
x_1 < x_2 < ... < x_t < x_1,impossible.
So
Fintersects every directed cycle.Therefore
Fis a feedback vertex set ofGwith|F| <= k.Correctness
Ghas a feedback vertex set of size at mostkiff
Dcan be evaluated on an unlimited-register two-address machine using at most|E| + kinstructions.
Size Overhead
Let
n = |V|m = |E|Then:
nmn + m2m2So the construction is polynomial and matches the restricted form stated in the hardness result.
Example
Source instance:
{a, b, c}{(a,b), (b,c), (c,a)}k = 1This is a directed 3-cycle, so a feedback vertex set of size
1exists, e.g.{b}.Constructed DAG:
a^0, b^0, c^0labeled by registersR_a, R_b, R_ca^1with left childa^0and right childb^0b^1with left childb^0and right childc^0c^1with left childc^0and right childa^0Budget:
K = |E| + k = 3 + 1 = 4.A valid target program is:
S_b <- R_bR_b <- R_b op R_cR_c <- R_c op R_aR_a <- R_a op S_bThis computes all three roots using exactly one copy instruction.
Implementation Notes
K = |V| + kis wrong for this gadget; the correct bound isK = |E| + k.