Merged
58 changes: 58 additions & 0 deletions .github/workflows/publish.yml
@@ -0,0 +1,58 @@
name: publish

on:
  push:
    tags: ["v*"]

jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - uses: actions/setup-python@v5
        with:
          python-version: "3.12"

      - name: Tag matches pyproject version
        run: |
          tag="${GITHUB_REF_NAME#v}"
          version="$(python -c 'import tomllib; print(tomllib.load(open("pyproject.toml","rb"))["project"]["version"])')"
          if [ "$tag" != "$version" ]; then
            echo "tag v$tag != pyproject version $version" >&2
            exit 1
          fi

      - name: Build sdist + wheel
        run: |
          python -m pip install --upgrade build
          python -m build

      - name: Smoke-test the wheel from a temp dir
        run: |
          python -m venv /tmp/smoke
          /tmp/smoke/bin/pip install --no-index dist/*.whl
          cd /tmp
          /tmp/smoke/bin/loop-engineer --version
          /tmp/smoke/bin/loop scaffold /tmp/smoke-loop
          /tmp/smoke/bin/loop doctor /tmp/smoke-loop
          /tmp/smoke/bin/loop inspect /tmp/smoke-loop || true

P2: Let the inspect smoke test fail the release

In the tag publish workflow, the trailing `|| true` masks every `loop inspect` failure, including the exact missing-bundled-script and import errors the wheel smoke test is meant to catch. The scaffolded contract currently produces a successful `inspect` exit, so on a `v*` tag this `|| true` can let a broken wheel artifact upload and publish instead of stopping the release.
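
A minimal illustration of the masking, not taken from this PR. It assumes a POSIX `sh` on `PATH` and uses `false` as a stand-in for a failing `loop inspect`; `step_status` is a hypothetical helper name.

```python
import subprocess


def step_status(script: str) -> int:
    """Run a one-line shell script, roughly as a workflow `run:` step would,
    and return the exit status the runner sees."""
    return subprocess.run(["sh", "-c", script]).returncode


# `false` stands in for a `loop inspect` that fails (for example on a
# missing bundled script); it is a placeholder, not the real command.
masked = step_status("false || true")  # failure swallowed by `|| true`
strict = step_status("false")          # failure reaches the runner

print(f"with '|| true': {masked}, without: {strict}")
```

Dropping `|| true` from the last smoke-test line would let a non-zero `inspect` exit fail the step, so the later upload step does not run.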

      - uses: actions/upload-artifact@v4
        with:
          name: dist
          path: dist/

  publish:
    needs: build
    runs-on: ubuntu-latest
    environment: pypi
    permissions:
      id-token: write
    steps:
      - uses: actions/download-artifact@v4
        with:
          name: dist
          path: dist/

      - uses: pypa/gh-action-pypi-publish@release/v1
30 changes: 30 additions & 0 deletions CHANGELOG.md
@@ -14,6 +14,36 @@ All notable changes to `loop-engineer` are documented here.
`WORKFLOW.md` and `README.md` are reworded to describe the mechanism; the 0.3.4
history is left intact.

## Unreleased

**PyPI substrate.** `loop-engineer` becomes a self-contained wheel that runs from
any directory — the CLI no longer depends on being executed from a source
checkout — and ships to PyPI on a version tag through trusted publishing, with no
token or secret stored in the repo.

### Added
- **Self-contained wheel** — the schemas, contract templates, and CLI-needed tool
scripts the loop reads at runtime are bundled into the wheel under
`loop/_bundle/` (via `[tool.hatch.build.targets.wheel.force-include]`) and
resolved through an `importlib.resources`-first resolver (`loop/_resources.py`)
that falls back to the repo-relative layout for editable installs / source
checkouts. `loop` invocations no longer break when run outside the repo tree.
- **`loop-engineer` console script** — a second `[project.scripts]` entry point
alongside `loop` (both map to `loop.__main__:main`), so `uvx loop-engineer`
funnels straight to the CLI under the PyPI project name.
- **Wheel self-containment acceptance test**
(`scripts/test_wheel_selfcontained.py`) — builds the wheel and asserts its zip
manifest carries the bundled `schemas/`, `templates/`, and `tools/` resources,
so a regression that drops a runtime resource from the wheel fails the suite
(env-guarded: skips when `pip`/`build` are unavailable locally, hard-fails the
build under CI).
- **Tag-triggered PyPI publish workflow** (`.github/workflows/publish.yml`) — on a
`v*` tag push it guards that the tag matches the `pyproject` version, builds the
sdist + wheel, smoke-tests the wheel from a throwaway venv (`loop-engineer
--version`, then `loop scaffold`/`doctor`/`inspect`), and publishes via PyPI
**trusted publishing** (`id-token: write`, the `pypi` environment,
`pypa/gh-action-pypi-publish`) — no API token or secret anywhere in the repo.

## 0.6.0 — 2026-07-03

"Metrics real": false-completion-rate (FCR) and repair-productivity (RP) graduate
10 changes: 7 additions & 3 deletions loop/__main__.py
@@ -77,7 +77,7 @@ def _print_json(report: dict) -> int:

 def _run_metrics(argv: list[str]) -> int:
     """`metrics [--baseline] <target>` — parses its own flag, then delegates to
-    scripts/metrics.py (imported repo-relative, the QW8 editable-install path)."""
+    scripts/metrics.py (resolved bundle-first, repo-relative fallback)."""
     unknown = [a for a in argv if a.startswith("-") and a != "--baseline"]
     if unknown:
         print(f"metrics: unknown option: {unknown[0]}", file=sys.stderr)
@@ -96,7 +96,9 @@ def _run_metrics(argv: list[str]) -> int:
             file=sys.stderr,
         )
         return 2
-    scripts_dir = Path(__file__).resolve().parent.parent / "scripts"
+    from ._resources import tools_dir
+
+    scripts_dir = tools_dir()

P2: Write baselines relative to the user checkout

When the wheel path is selected here, the imported `metrics.py` lives under `site-packages/loop/_bundle/tools`. That script still sets `_REPO_ROOT = Path(__file__).resolve().parent.parent`, and `--baseline` writes `_REPO_ROOT / docs/metrics-baseline.json`. For uvx/wheel users running `loop metrics --baseline <workspace>`, the command therefore either reports success while writing into the installed package bundle or fails on a read-only site-packages, instead of updating the checkout's `docs/metrics-baseline.json`.
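
One possible shape for a fix, sketched under assumptions: `baseline_path` and `write_baseline` are hypothetical names (not functions known to exist in `metrics.py`), and anchoring to the `<workspace>` argument rather than the current directory is a guess at the intended behaviour.

```python
import json
import tempfile
from pathlib import Path


def baseline_path(workspace: Path) -> Path:
    """Hypothetical helper: derive the baseline location from the workspace
    the user passed on the command line, never from this module's __file__."""
    return workspace.resolve() / "docs" / "metrics-baseline.json"


def write_baseline(workspace: Path, metrics: dict) -> Path:
    """Write the metrics under <workspace>/docs/ and return the file path."""
    dest = baseline_path(workspace)
    dest.parent.mkdir(parents=True, exist_ok=True)
    dest.write_text(json.dumps(metrics, indent=2) + "\n", encoding="utf-8")
    return dest


with tempfile.TemporaryDirectory() as tmp:
    written = write_baseline(Path(tmp), {"fcr": 0.0})  # placeholder payload
    written_ok = written.is_file()
    inside_workspace = Path(tmp).resolve() in written.parents
```

With the destination derived from a caller-supplied path, the result no longer depends on whether `metrics.py` was imported from the repo checkout or from the installed bundle.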

     sys.path.insert(0, str(scripts_dir))
     import metrics  # type: ignore

@@ -159,7 +161,9 @@ def main(argv: list[str] | None = None) -> int:
     # command == "inspect": keep the historical inspector script as the scoring
     # UI over the same contract artifacts; import lazily to avoid making
     # scripts/ a package.
-    scripts_dir = Path(__file__).resolve().parent.parent / "scripts"
+    from ._resources import tools_dir
+
+    scripts_dir = tools_dir()
     sys.path.insert(0, str(scripts_dir))
     import inspect_loop  # type: ignore

43 changes: 43 additions & 0 deletions loop/_resources.py
@@ -0,0 +1,43 @@
"""Bundle-first resource resolution (S0).

A built wheel carries schemas/, templates/, and the CLI-needed tool scripts as
package data under loop/_bundle/ (see [tool.hatch.build.targets.wheel.force-include]
in pyproject.toml). An editable install / repo checkout has no _bundle, so each
resolver falls back to the historical repo-relative layout. Wheels install as
real directories, so Traversable -> Path is safe here.
"""

from __future__ import annotations

from importlib import resources
from pathlib import Path

_REPO_FALLBACKS = {"schemas": "schemas", "templates": "templates", "tools": "scripts"}


def _bundle_root() -> Path | None:
    try:
        return Path(str(resources.files("loop") / "_bundle"))
    except Exception:
        return None


def _data_dir(kind: str) -> Path:
    root = _bundle_root()
    if root is not None:
        bundled = root / kind
        if bundled.is_dir():
            return bundled
    return Path(__file__).resolve().parent.parent / _REPO_FALLBACKS[kind]


def schemas_dir() -> Path:
    return _data_dir("schemas")


def templates_dir() -> Path:
    return _data_dir("templates")


def tools_dir() -> Path:
    return _data_dir("tools")
7 changes: 4 additions & 3 deletions loop/contract.py
@@ -264,9 +264,10 @@ def _check_stub_verify_scripts(paths: LoopPaths, issues: list[dict]) -> None:


 def _schemas_dir() -> Path:
-    # Resolve schemas/ relative to the loop package's repo root, the same pattern
-    # the pyproject comment describes for scripts/ (editable install is supported).
-    return Path(__file__).resolve().parent.parent / "schemas"
+    # Bundle-first (wheel package data), repo-relative editable-install fallback.
+    from ._resources import schemas_dir
+
+    return schemas_dir()


 def _load_schema(name: str) -> dict[str, Any]:
13 changes: 8 additions & 5 deletions loop/scaffold.py
@@ -5,9 +5,12 @@
 from pathlib import Path
 from typing import Any

-# Resolve the bundled templates/ relative to the repo root, the same way
-# __main__ resolves scripts/ — so scaffold works from an editable install.
-_TEMPLATES_DIR = Path(__file__).resolve().parent.parent / "templates"
+from ._resources import templates_dir
+
+
+def _templates_dir() -> Path:
+    return templates_dir()
+

 _PLACEHOLDER_RE = re.compile(r"\{\{[A-Z0-9_]+\}\}")

@@ -114,14 +117,14 @@ def scaffold(target: str | Path) -> dict[str, Any]:
     for template_name, rel in _FILLED_FILES.items():
         dest = target / rel
         dest.parent.mkdir(parents=True, exist_ok=True)
-        text = (_TEMPLATES_DIR / template_name).read_text(encoding="utf-8")
+        text = (_templates_dir() / template_name).read_text(encoding="utf-8")
         dest.write_text(_fill(text, mapping), encoding="utf-8")
         written.append(rel)

     for template_name, rel in _VERIFY_SCRIPTS.items():
         dest = target / rel
         dest.parent.mkdir(parents=True, exist_ok=True)
-        shutil.copyfile(_TEMPLATES_DIR / template_name, dest)
+        shutil.copyfile(_templates_dir() / template_name, dest)
         dest.chmod(0o755)
         written.append(rel)

24 changes: 16 additions & 8 deletions pyproject.toml
@@ -22,20 +22,28 @@ keywords = ["agent", "loop", "agentic", "verification", "harness", "orchestratio
 yaml = ["pyyaml>=6"]
 schemas = ["jsonschema>=4"]

-# Console entry point. `loop.__main__:main` resolves the bundled scripts/ dir
-# relative to its own __file__, so this is EDITABLE-INSTALL ONLY: `pip install -e .`
-# keeps loop/ pointing at the repo (where scripts/ lives). A non-editable wheel
-# does not ship scripts/, so `inspect`/`metrics` would not resolve — see the
-# [tool.hatch.build.targets.wheel] note below.
+# Console entry points. Both install modes work: `loop.__main__:main` resolves
+# schemas/templates/tool-scripts bundle-first via loop/_resources.py (wheel
+# package data under loop/_bundle/), with the repo checkout as the editable
+# fallback. `loop-engineer` is the uvx-visible alias matching the PyPI project.
 [project.scripts]
 loop = "loop.__main__:main"
+loop-engineer = "loop.__main__:main"

 [project.urls]
 Homepage = "https://github.com/SollanSystems/loop-engineer"
 Repository = "https://github.com/SollanSystems/loop-engineer"

-# Editable install is the supported mode: `python3 -m loop inspect` resolves the
-# bundled scripts/ dir relative to the repo, so install with `pip install -e .`
-# to run the CLI from any directory.
+# The wheel is self-contained: schemas/, templates/, and the CLI-needed tool
+# scripts ship as package data under loop/_bundle/ (resolved by loop/_resources.py,
+# importlib.resources-first with the repo checkout as editable-install fallback).
 [tool.hatch.build.targets.wheel]
 packages = ["loop"]
+
+[tool.hatch.build.targets.wheel.force-include]
+"schemas" = "loop/_bundle/schemas"
+"templates" = "loop/_bundle/templates"
+"scripts/inspect_loop.py" = "loop/_bundle/tools/inspect_loop.py"
+"scripts/metrics.py" = "loop/_bundle/tools/metrics.py"
+"scripts/holdout_gate.py" = "loop/_bundle/tools/holdout_gate.py"
+"scripts/anticheat_scan.py" = "loop/_bundle/tools/anticheat_scan.py"
34 changes: 34 additions & 0 deletions scripts/test_resources.py
@@ -0,0 +1,34 @@
"""PR1/S0: resource resolution must be importlib.resources-first (wheel) with the
repo-relative checkout as the editable-install fallback. In this checkout no
loop/_bundle exists, so every resolver must land on the repo directories."""

from pathlib import Path

REPO_ROOT = Path(__file__).resolve().parent.parent


def test_repo_checkout_resolves_to_repo_dirs():
    from loop import _resources

    assert _resources.schemas_dir() == REPO_ROOT / "schemas"
    assert _resources.templates_dir() == REPO_ROOT / "templates"
    assert _resources.tools_dir() == REPO_ROOT / "scripts"


def test_resolved_dirs_hold_the_expected_artifacts():
    from loop import _resources

    assert (_resources.schemas_dir() / "terminal.schema.json").is_file()
    assert (_resources.templates_dir() / "manifest.yaml.tmpl").is_file()
    for tool in ("inspect_loop.py", "metrics.py", "holdout_gate.py", "anticheat_scan.py"):
        assert (_resources.tools_dir() / tool).is_file()


def test_bundle_wins_when_present(tmp_path, monkeypatch):
    """When loop/_bundle/<kind> exists (the wheel layout), it wins over the repo path."""
    from loop import _resources

    bundle = tmp_path / "_bundle" / "schemas"
    bundle.mkdir(parents=True)
    monkeypatch.setattr(_resources, "_bundle_root", lambda: tmp_path / "_bundle")
    assert _resources.schemas_dir() == bundle
94 changes: 94 additions & 0 deletions scripts/test_wheel_selfcontained.py
@@ -0,0 +1,94 @@
# scripts/test_wheel_selfcontained.py
"""S0 acceptance: a built wheel must be self-contained. Build the wheel, install
it into a fresh venv, and run scaffold/doctor/inspect from a temp cwd where the
repo checkout is not importable. Env-guarded: building needs pip + network for
the hatchling backend, so this skips in offline/pip-less local envs; in CI a
wheel build failure fails the test rather than skipping it.
"""

from __future__ import annotations

import json
import os
import subprocess
import sys
import venv
import zipfile
from pathlib import Path

import pytest

REPO_ROOT = Path(__file__).resolve().parent.parent


def _pip_available() -> bool:
    proc = subprocess.run(
        [sys.executable, "-m", "pip", "--version"], capture_output=True, text=True
    )
    return proc.returncode == 0


pytestmark = pytest.mark.skipif(
    not _pip_available(), reason="pip unavailable in this interpreter (wheel build env guard)"
)


@pytest.fixture(scope="module")
def wheel_env(tmp_path_factory):
    tmp = tmp_path_factory.mktemp("wheel")
    build = subprocess.run(
        [sys.executable, "-m", "pip", "wheel", "--no-deps", "-w", str(tmp), str(REPO_ROOT)],
        capture_output=True, text=True,
    )
    if build.returncode != 0:
        if os.environ.get("CI"):
            pytest.fail(f"wheel build failed in CI: {build.stderr[-1500:]}")
        pytest.skip(f"wheel build unavailable here (offline?): {build.stderr[-400:]}")
    wheel = next(tmp.glob("loop_engineer-*.whl"))

    names = zipfile.ZipFile(wheel).namelist()
    for expected in (
        "loop/_bundle/schemas/",
        "loop/_bundle/templates/",
        "loop/_bundle/tools/",
    ):
        assert any(n.startswith(expected) for n in names), f"wheel missing {expected}"

    venv_dir = tmp / "venv"
    venv.EnvBuilder(with_pip=True).create(venv_dir)
    py = venv_dir / ("Scripts" if sys.platform == "win32" else "bin") / "python"
    install = subprocess.run(
        [str(py), "-m", "pip", "install", "--no-index", str(wheel)],
        capture_output=True, text=True,
    )
    assert install.returncode == 0, install.stderr
    return py


def _run(py: Path, args: list[str], cwd: Path) -> subprocess.CompletedProcess[str]:
    # cwd is OUTSIDE the repo, so the checkout is absent from sys.path.
    return subprocess.run([str(py), "-m", "loop", *args], cwd=cwd, capture_output=True, text=True)


def test_scaffold_doctor_inspect_from_wheel_only(wheel_env, tmp_path):
    workspace = tmp_path / "fresh-loop"

    scaffolded = _run(wheel_env, ["scaffold", str(workspace)], cwd=tmp_path)
    assert scaffolded.returncode == 0, scaffolded.stderr

    doctored = _run(wheel_env, ["doctor", str(workspace)], cwd=tmp_path)
    assert doctored.returncode == 0, doctored.stdout + doctored.stderr
    assert json.loads(doctored.stdout)["ok"] is True

    inspected = _run(wheel_env, ["inspect", str(workspace)], cwd=tmp_path)
    report = json.loads(inspected.stdout)
    assert report["verdict"] in ("weak", "ok", "strong")


def test_both_console_scripts_are_installed(wheel_env, tmp_path):
    bindir = wheel_env.parent
    for name in ("loop", "loop-engineer"):
        exe = bindir / name
        proc = subprocess.run([str(exe), "--version"], cwd=tmp_path, capture_output=True, text=True)
        assert proc.returncode == 0, f"{name}: {proc.stderr}"
        assert proc.stdout.strip()