[Rule] FEEDBACK VERTEX SET to MinimumCodeGenerationUnlimitedRegisters

Source: FEEDBACK VERTEX SET (FeedbackVertexSet)
Target: CODE GENERATION WITH UNLIMITED REGISTERS (MinimumCodeGenerationUnlimitedRegisters)

Motivation: The previous body was only a placeholder. This issue gives the actual chain gadget behind the Aho-Johnson-Ullman reduction.

Reference: A. V. Aho, R. Sethi, "How Hard Is Compiler Code Generation?" (1977), summarizing the Aho-Johnson-Ullman construction for code generation with common subexpressions and its unlimited-register variant.

## Target Machine Model

Use the standard two-address unlimited-register model:

- leaves are values already stored in named registers
- every internal node is a binary operation
- if internal node `u` has left child `a` and right child `b`, then one instruction

  `r <- r op s`

  computes `u`, where `r` currently holds the value of `a` and `s` currently holds the value of `b`
- the left operand register is overwritten
- copy instructions

  `r' <- r`

  are allowed and cost one instruction

Because registers are unlimited, the only optimization issue is how many copy instructions are needed.

## Reduction Algorithm

Input:
- a directed graph `G = (V, E)`
- an integer `k`

Assume
`V = {x_1, ..., x_n}`.

For each vertex `x in V`, let its outgoing edges be ordered arbitrarily as
`(x, y_1), ..., (x, y_d)`,
where `d = outdeg(x)`.

Construct a DAG `D = (N, A)` with left/right arc partition `(L, R)` as follows.

### Leaves

For each source vertex `x in V`, create one leaf
`x^0`
whose value is initially stored in a distinct register
`R_x`.

### Internal chain nodes

For each source vertex `x` with outgoing neighbors `y_1, ..., y_d`, create internal nodes
`x^1, x^2, ..., x^d`.

For each `i = 1..d`, add:
- left arc `(x^i, x^(i-1))` to `L`
- right arc `(x^i, y_i^0)` to `R`

So each `x^i` has exactly two outgoing arcs.

### Roots

For each vertex `x` with `outdeg(x) > 0`, the root corresponding to `x` is `x^d`.

Vertices with `outdeg(x) = 0` create only the leaf `x^0`; no internal node is needed for them.

### Budget

Let
`m = |E|`.

Set
`K = m + k`.

Output the instance
`(D, L, R, K)`.

## Interpretation of the Gadget

- evaluating the chain for vertex `x` destroys the original register `R_x`, because the first instruction for `x^1` uses `R_x` as a left operand
- every incoming edge `(u, x)` of the source graph creates an occurrence of leaf `x^0` as a right child in the chain for `u`
- therefore, if `x` is evaluated before some predecessor `u`, then `u` can still use `x^0` only if `R_x` was copied first

Thus copies correspond to source vertices whose values are preserved across cyclic dependencies.

## Forward Witness Map

Suppose `F subseteq V` is a feedback vertex set of `G` with `|F| <= k`.

Define
`E_F = { (u, v) in E : v notin F }`.

Because `F` intersects every directed cycle, the digraph
`(V, E_F)`
is acyclic.

Choose any topological ordering
`<_F`
of `(V, E_F)`.

Now construct a target program as follows.

### Step 1: copy the feedback vertices

For each `x in F`, emit one copy instruction

`S_x <- R_x`

where `S_x` is a fresh register.

### Step 2: evaluate the chains in topological order

Process the source vertices in the order `<_F`.

For a fixed vertex `x` with outgoing neighbors `y_1, ..., y_d`:
- if `d = 0`, do nothing
- otherwise emit the `d` instructions

  `R_x <- R_x op T_1`
  `R_x <- R_x op T_2`
  ...
  `R_x <- R_x op T_d`

where
- `T_i = S_(y_i)` if `y_i in F`
- `T_i = R_(y_i)` if `y_i notin F`

This computes the chain
`x^1, x^2, ..., x^d`
using exactly `d` operation instructions.

### Why this is valid

If `y_i notin F`, then edge `(x, y_i)` belongs to `E_F`, so `x <_F y_i`.
Hence `x` is processed before `y_i`, and the original register `R_(y_i)` has not yet been destroyed when `x` needs it as a right operand.

If `y_i in F`, then `x` uses the preserved copy `S_(y_i)` instead.

### Instruction count

There are exactly `m` internal nodes in `D`, so exactly `m` operation instructions are emitted.
There are exactly `|F|` copy instructions.

Hence the program length is
`m + |F| <= m + k = K`.

## Extraction Pass

Suppose there is a target program of length at most `K = m + k` that computes all roots of `D`.

### Normalization lemma

Every internal node of `D` has in-degree `1`.
Hence no internal value is ever needed by two different parents.

Therefore, in a minimum-length program:
- every internal node is computed exactly once
- copying an internal value is useless
- every extra instruction beyond the mandatory `m` operation instructions can be assumed to be a copy of a leaf register `R_x`

Let
`F = { x in V : some copy of R_x is made }`.

Since the total program length is at most `m + k`, we get
`|F| <= k`.

### Why `F` is a feedback vertex set

Assume for contradiction that `F` is not a feedback vertex set.
Then there is a directed cycle

`x_1 -> x_2 -> ... -> x_t -> x_1`

in `G` with no vertex in `F`.

Fix one edge `(x_i, x_(i+1))` on the cycle.
Because `x_(i+1) notin F`, the leaf value `R_(x_(i+1))` is never copied.
The first instruction of the chain for `x_(i+1)` destroys `R_(x_(i+1))`.
So the chain for `x_i`, which needs `x_(i+1)^0` as a right child, must be executed before the chain for `x_(i+1)` begins.

Thus for every `i`,
`x_i` must be evaluated before `x_(i+1)`.

Applying this around the whole cycle yields

`x_1 < x_2 < ... < x_t < x_1`,

impossible.

So `F` intersects every directed cycle.
Therefore `F` is a feedback vertex set of `G` with `|F| <= k`.

## Correctness

`G` has a feedback vertex set of size at most `k`
iff
`D` can be evaluated on an unlimited-register two-address machine using at most
`|E| + k`
instructions.

## Size Overhead

Let
- `n = |V|`
- `m = |E|`

Then:
- number of leaves = `n`
- number of internal nodes = `m`
- total nodes = `n + m`
- total arcs = `2m`
- every internal node has out-degree `2`
- the only shared nodes are leaves

So the construction is polynomial and matches the restricted form stated in the hardness result.

## Example

Source instance:
- vertices `{a, b, c}`
- edges `{(a,b), (b,c), (c,a)}`
- `k = 1`

This is a directed 3-cycle, so a feedback vertex set of size `1` exists, e.g. `{b}`.

Constructed DAG:
- leaves `a^0, b^0, c^0` labeled by registers `R_a, R_b, R_c`
- chain nodes
  - `a^1` with left child `a^0` and right child `b^0`
  - `b^1` with left child `b^0` and right child `c^0`
  - `c^1` with left child `c^0` and right child `a^0`

Budget:
`K = |E| + k = 3 + 1 = 4`.

A valid target program is:

1. `S_b <- R_b`
2. `R_b <- R_b op R_c`
3. `R_c <- R_c op R_a`
4. `R_a <- R_a op S_b`

This computes all three roots using exactly one copy instruction.

## Implementation Notes

1. The current placeholder `K = |V| + k` is wrong for this gadget; the correct bound is `K = |E| + k`.
2. The gadget is a left-chain construction, not a one-node-per-vertex toy graph.
3. The source graph is directed.
4. Only leaves are shared; this is essential for the converse proof.


Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[Rule] FEEDBACK VERTEX SET to MinimumCodeGenerationUnlimitedRegisters #908

Target Machine Model

Reduction Algorithm

Leaves

Internal chain nodes

Roots

Budget

Interpretation of the Gadget

Forward Witness Map

Step 1: copy the feedback vertices

Step 2: evaluate the chains in topological order

Why this is valid

Instruction count

Extraction Pass

Normalization lemma

Why `F` is a feedback vertex set

Correctness

Size Overhead

Example

Implementation Notes

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

[Rule] FEEDBACK VERTEX SET to MinimumCodeGenerationUnlimitedRegisters #908

Description

Target Machine Model

Reduction Algorithm

Leaves

Internal chain nodes

Roots

Budget

Interpretation of the Gadget

Forward Witness Map

Step 1: copy the feedback vertices

Step 2: evaluate the chains in topological order

Why this is valid

Instruction count

Extraction Pass

Normalization lemma

Why F is a feedback vertex set

Correctness

Size Overhead

Example

Implementation Notes

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions

Why `F` is a feedback vertex set