Skip to content

[Rule] FEEDBACK VERTEX SET to MinimumCodeGenerationUnlimitedRegisters #908

@isPANN

Description

@isPANN

Source: FEEDBACK VERTEX SET (FeedbackVertexSet)
Target: CODE GENERATION WITH UNLIMITED REGISTERS (MinimumCodeGenerationUnlimitedRegisters)

Motivation: The previous body was only a placeholder. This issue gives the actual chain gadget behind the Aho-Johnson-Ullman reduction.

Reference: A. V. Aho, R. Sethi, "How Hard Is Compiler Code Generation?" (1977), summarizing the Aho-Johnson-Ullman construction for code generation with common subexpressions and its unlimited-register variant.

Target Machine Model

Use the standard two-address unlimited-register model:

  • leaves are values already stored in named registers

  • every internal node is a binary operation

  • if internal node u has left child a and right child b, then one instruction

    r <- r op s

    computes u, where r currently holds the value of a and s currently holds the value of b

  • the left operand register is overwritten

  • copy instructions

    r' <- r

    are allowed and cost one instruction

Because registers are unlimited, the only optimization issue is how many copy instructions are needed.

Reduction Algorithm

Input:

  • a directed graph G = (V, E)
  • an integer k

Assume
V = {x_1, ..., x_n}.

For each vertex x in V, let its outgoing edges be ordered arbitrarily as
(x, y_1), ..., (x, y_d),
where d = outdeg(x).

Construct a DAG D = (N, A) with left/right arc partition (L, R) as follows.

Leaves

For each source vertex x in V, create one leaf
x^0
whose value is initially stored in a distinct register
R_x.

Internal chain nodes

For each source vertex x with outgoing neighbors y_1, ..., y_d, create internal nodes
x^1, x^2, ..., x^d.

For each i = 1..d, add:

  • left arc (x^i, x^(i-1)) to L
  • right arc (x^i, y_i^0) to R

So each x^i has exactly two outgoing arcs.

Roots

For each vertex x with outdeg(x) > 0, the root corresponding to x is x^d.

Vertices with outdeg(x) = 0 create only the leaf x^0; no internal node is needed for them.

Budget

Let
m = |E|.

Set
K = m + k.

Output the instance
(D, L, R, K).

Interpretation of the Gadget

  • evaluating the chain for vertex x destroys the original register R_x, because the first instruction for x^1 uses R_x as a left operand
  • every incoming edge (u, x) of the source graph creates an occurrence of leaf x^0 as a right child in the chain for u
  • therefore, if x is evaluated before some predecessor u, then u can still use x^0 only if R_x was copied first

Thus copies correspond to source vertices whose values are preserved across cyclic dependencies.

Forward Witness Map

Suppose F subseteq V is a feedback vertex set of G with |F| <= k.

Define
E_F = { (u, v) in E : v notin F }.

Because F intersects every directed cycle, the digraph
(V, E_F)
is acyclic.

Choose any topological ordering
<_F
of (V, E_F).

Now construct a target program as follows.

Step 1: copy the feedback vertices

For each x in F, emit one copy instruction

S_x <- R_x

where S_x is a fresh register.

Step 2: evaluate the chains in topological order

Process the source vertices in the order <_F.

For a fixed vertex x with outgoing neighbors y_1, ..., y_d:

  • if d = 0, do nothing

  • otherwise emit the d instructions

    R_x <- R_x op T_1
    R_x <- R_x op T_2
    ...
    R_x <- R_x op T_d

where

  • T_i = S_(y_i) if y_i in F
  • T_i = R_(y_i) if y_i notin F

This computes the chain
x^1, x^2, ..., x^d
using exactly d operation instructions.

Why this is valid

If y_i notin F, then edge (x, y_i) belongs to E_F, so x <_F y_i.
Hence x is processed before y_i, and the original register R_(y_i) has not yet been destroyed when x needs it as a right operand.

If y_i in F, then x uses the preserved copy S_(y_i) instead.

Instruction count

There are exactly m internal nodes in D, so exactly m operation instructions are emitted.
There are exactly |F| copy instructions.

Hence the program length is
m + |F| <= m + k = K.

Extraction Pass

Suppose there is a target program of length at most K = m + k that computes all roots of D.

Normalization lemma

Every internal node of D has in-degree 1.
Hence no internal value is ever needed by two different parents.

Therefore, in a minimum-length program:

  • every internal node is computed exactly once
  • copying an internal value is useless
  • every extra instruction beyond the mandatory m operation instructions can be assumed to be a copy of a leaf register R_x

Let
F = { x in V : some copy of R_x is made }.

Since the total program length is at most m + k, we get
|F| <= k.

Why F is a feedback vertex set

Assume for contradiction that F is not a feedback vertex set.
Then there is a directed cycle

x_1 -> x_2 -> ... -> x_t -> x_1

in G with no vertex in F.

Fix one edge (x_i, x_(i+1)) on the cycle.
Because x_(i+1) notin F, the leaf value R_(x_(i+1)) is never copied.
The first instruction of the chain for x_(i+1) destroys R_(x_(i+1)).
So the chain for x_i, which needs x_(i+1)^0 as a right child, must be executed before the chain for x_(i+1) begins.

Thus for every i,
x_i must be evaluated before x_(i+1).

Applying this around the whole cycle yields

x_1 < x_2 < ... < x_t < x_1,

impossible.

So F intersects every directed cycle.
Therefore F is a feedback vertex set of G with |F| <= k.

Correctness

G has a feedback vertex set of size at most k
iff
D can be evaluated on an unlimited-register two-address machine using at most
|E| + k
instructions.

Size Overhead

Let

  • n = |V|
  • m = |E|

Then:

  • number of leaves = n
  • number of internal nodes = m
  • total nodes = n + m
  • total arcs = 2m
  • every internal node has out-degree 2
  • the only shared nodes are leaves

So the construction is polynomial and matches the restricted form stated in the hardness result.

Example

Source instance:

  • vertices {a, b, c}
  • edges {(a,b), (b,c), (c,a)}
  • k = 1

This is a directed 3-cycle, so a feedback vertex set of size 1 exists, e.g. {b}.

Constructed DAG:

  • leaves a^0, b^0, c^0 labeled by registers R_a, R_b, R_c
  • chain nodes
    • a^1 with left child a^0 and right child b^0
    • b^1 with left child b^0 and right child c^0
    • c^1 with left child c^0 and right child a^0

Budget:
K = |E| + k = 3 + 1 = 4.

A valid target program is:

  1. S_b <- R_b
  2. R_b <- R_b op R_c
  3. R_c <- R_c op R_a
  4. R_a <- R_a op S_b

This computes all three roots using exactly one copy instruction.

Implementation Notes

  1. The current placeholder K = |V| + k is wrong for this gadget; the correct bound is K = |E| + k.
  2. The gadget is a left-chain construction, not a one-node-per-vertex toy graph.
  3. The source graph is directed.
  4. Only leaves are shared; this is essential for the converse proof.

Metadata

Metadata

Assignees

No one assigned

    Labels

    ruleA new reduction rule to be added.

    Type

    No type

    Projects

    Status

    Done

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions