feat: add --check-env preflight flag for OOM risk detection before quantization by sotanengel · Pull Request #14 · FujitsuResearch/OneCompression

sotanengel · 2026-05-18T10:24:32Z

Background

While quantizing a large-scale LLM (70B+ parameters), the process crashed midway through with an out-of-memory error after running for several hours. There was no way to know in advance whether the available GPU VRAM was sufficient — the failure only surfaced deep into the quantization loop, wasting significant compute time.

This PR introduces a --check-env preflight flag that detects OOM risk before quantization starts, based on the physical characteristics of the execution environment.

Summary

- Adds a `--check-env` CLI flag (also available as `Runner.auto_run(check_env=True)` in the Python API)
- Loads the model architecture on a meta device (zero GPU/CPU memory) to count parameters
- Collects hardware info: GPU VRAM (total and free), CPU RAM (via optional `psutil`), disk space
- Estimates memory requirements at 2-bit, 4-bit, and 8-bit quantization using the existing `weight_memory_gb()`, plus a calibration overhead factor
- Classifies OOM risk as safe / warning / danger and prints a human-readable report
- On danger: exits with code 1 (CLI) or raises `RuntimeError` (library API), stopping quantization before it wastes GPU time
- On safe / warning: prints the report and continues with quantization as normal (preflight behavior)
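
The hardware-collection step can be sketched roughly as follows. This is an illustration only, not the code in `onecomp/utils/vram_estimator.py`: `HardwareSnapshot`, `collect_snapshot()`, and `format_gb()` are hypothetical names, and the GiB unit is an assumption. It shows how `psutil` and `torch` can both stay optional, with missing values falling back to `n/a`.

```python
import shutil
from dataclasses import dataclass
from typing import Optional

GIB = 1024 ** 3  # assumption: the report's "GB" figures are binary gigabytes


@dataclass
class HardwareSnapshot:  # hypothetical stand-in, not the PR's EnvironmentSnapshot
    disk_avail_gb: float
    cpu_ram_total_gb: Optional[float] = None
    cpu_ram_avail_gb: Optional[float] = None
    gpu_total_gb: Optional[float] = None
    gpu_free_gb: Optional[float] = None


def collect_snapshot(output_dir: str = ".") -> HardwareSnapshot:
    """Gather disk, CPU RAM, and GPU VRAM figures, tolerating missing extras."""
    snap = HardwareSnapshot(disk_avail_gb=shutil.disk_usage(output_dir).free / GIB)

    try:  # psutil is an optional dependency
        import psutil

        vm = psutil.virtual_memory()
        snap.cpu_ram_total_gb = vm.total / GIB
        snap.cpu_ram_avail_gb = vm.available / GIB
    except ImportError:
        pass  # CPU RAM stays None and is reported as "n/a"

    try:  # torch may be absent, or present without CUDA
        import torch

        if torch.cuda.is_available():
            free_bytes, total_bytes = torch.cuda.mem_get_info(0)
            snap.gpu_free_gb = free_bytes / GIB
            snap.gpu_total_gb = total_bytes / GIB
    except ImportError:
        pass

    return snap


def format_gb(value: Optional[float]) -> str:
    """Render a size for the report, using 'n/a' when the value is unknown."""
    return "n/a" if value is None else f"{value:.1f} GB"
```

Keeping every optional probe inside its own `try` block means a CPU-only machine or a missing `psutil` degrades the report instead of aborting the preflight.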

Example output

============================================================
  OneComp Environment Check
============================================================

Hardware
  GPU count              : 1
  GPU name               : NVIDIA A100 80GB PCIe
  GPU VRAM (total)       : 80.0 GB
  GPU VRAM (free)        : 78.3 GB
  CPU RAM (total)        : 251.6 GB
  CPU RAM (avail)        : 230.1 GB
  Disk (avail)           : 320.4 GB  [/home/user/output]

Model: meta-llama/Llama-2-7b-hf
  Parameters             : 6,738,415,616
  FP16 footprint         : 12.54 GB

Memory Estimates
  2-bit quantized        :  1.96 GB
  4-bit quantized        :  3.77 GB
  8-bit quantized        :  7.28 GB
  Calib. overhead        :  1.88 GB  (15% of FP16)
  4-bit + overhead       :  5.65 GB

OOM Risk Assessment
  Risk level             : WARNING
  Detail                 : Free VRAM (78.3 GB) fits 4-bit quantized
                           weights but is tight (calibration overhead included).

  Recommended wbits      :  3.84  (VRAM-estimated)
============================================================
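
Two figures in the report above can be cross-checked by hand. The sketch below is an independent sanity check, not the PR's implementation (the PR counts parameters by loading the architecture on a meta device). It assumes the published Llama-2-7b dimensions (vocabulary 32000, hidden size 4096, MLP size 11008, 32 layers, untied embeddings), and `llama_param_count()` is a hypothetical helper.

```python
def llama_param_count(vocab: int, hidden: int, intermediate: int, layers: int) -> int:
    """Count parameters of a Llama-style decoder with untied embeddings."""
    embed = vocab * hidden            # input embedding table
    lm_head = vocab * hidden          # output projection (not tied to embed)
    attn = 4 * hidden * hidden        # q, k, v, o projections, no biases
    mlp = 3 * hidden * intermediate   # gate, up, down projections
    norms = 2 * hidden                # two RMSNorm weight vectors per layer
    final_norm = hidden
    return embed + lm_head + layers * (attn + mlp + norms) + final_norm


params = llama_param_count(vocab=32000, hidden=4096, intermediate=11008, layers=32)
print(f"{params:,}")  # 6,738,415,616, matching the "Parameters" line

# Calibration overhead is reported as 15% of the FP16 footprint.
fp16_gb = 12.54       # value taken from the report
four_bit_gb = 3.77    # value taken from the report
overhead_gb = round(fp16_gb * 0.15, 2)
print(overhead_gb)                           # 1.88
print(round(four_bit_gb + overhead_gb, 2))   # 5.65
```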

Risk thresholds

| Level | Condition | Action |
| --- | --- | --- |
| `safe` | `free_vram ≥ fp16_size × 1.2` | Report + continue |
| `warning` | `free_vram ≥ 4-bit size + calib overhead` | Report + continue |
| `danger` | otherwise | Report + stop (exit 1 / RuntimeError) |
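
Read top to bottom, the thresholds amount to a two-step comparison. A minimal sketch, assuming the 15% calibration overhead from the example report; `classify_oom_risk()` and its parameter names are hypothetical, not the PR's API:

```python
def classify_oom_risk(
    free_vram_gb: float,
    fp16_gb: float,
    four_bit_gb: float,
    overhead_factor: float = 0.15,  # assumption: 15% of FP16, as in the example report
    safe_margin: float = 1.2,
) -> str:
    """Return 'safe', 'warning', or 'danger' following the threshold table."""
    calib_overhead_gb = fp16_gb * overhead_factor
    if free_vram_gb >= fp16_gb * safe_margin:
        return "safe"
    if free_vram_gb >= four_bit_gb + calib_overhead_gb:
        return "warning"
    return "danger"


# Llama-2-7b figures from the example report: FP16 12.54 GB, 4-bit 3.77 GB.
print(classify_oom_risk(16.0, 12.54, 3.77))  # safe
print(classify_oom_risk(6.0, 12.54, 3.77))   # warning
print(classify_oom_risk(1.0, 12.54, 3.77))   # danger
```

The 6.0 GB and 1.0 GB cases correspond to the `--total-vram-gb` values used in the test plan.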

Changed files

| File | Change |
| --- | --- |
| `onecomp/utils/vram_estimator.py` | New dataclasses (`EnvironmentSnapshot`, `ModelMemoryProfile`, `EnvCheckResult`) + `check_environment()` + `print_env_report()` |
| `onecomp/utils/__init__.py` | Export 5 new public symbols |
| `onecomp/cli.py` | `--check-env` argparse flag + preflight invocation |
| `onecomp/runner.py` | `check_env: bool = False` kwarg in `auto_run()` |
| `pyproject.toml` | Optional extras: `pip install ".[check-env]"` adds `psutil` for CPU RAM info |

Test plan

- `onecomp <model_id> --check-env --no-eval`: verify the report is printed and quantization continues
- `onecomp <model_id> --check-env --total-vram-gb 1.0`: verify danger triggers exit code 1
- `onecomp <model_id> --check-env --total-vram-gb 6.0 --no-eval`: verify warning continues
- CPU-only environment with `--total-vram-gb`: verify graceful handling without CUDA
- Without `psutil` installed: verify the n/a fallback message appears
- `Runner.auto_run(..., check_env=True)`: verify `RuntimeError` is raised on danger
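
The two danger-path checks in the plan differ only in how the stop is signalled. A minimal sketch of that split, where `enforce_preflight()` and its `from_cli` parameter are hypothetical and do not describe how `cli.py` and `runner.py` actually divide the work:

```python
import sys


def enforce_preflight(risk: str, *, from_cli: bool) -> None:
    """Stop on 'danger'; let 'safe' and 'warning' fall through to quantization."""
    if risk != "danger":
        return
    if from_cli:
        sys.exit(1)  # the CLI reports failure through the process exit code
    raise RuntimeError("OOM risk is 'danger'; aborting before quantization")


# Library path: the caller can catch the error and react.
try:
    enforce_preflight("danger", from_cli=False)
except RuntimeError as exc:
    print(f"blocked: {exc}")

# CLI path: sys.exit raises SystemExit carrying the exit code.
try:
    enforce_preflight("danger", from_cli=True)
except SystemExit as exc:
    print(f"exit code: {exc.code}")
```

Raising in the library path rather than exiting leaves the decision with the calling program, which may want to retry with a smaller model or a different device.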

🤖 Generated with Claude Code

Add QuantizationProgressTracker and wire it through calibration, chunked calibration, multi-GPU phase 2, QEP general and arch-aware paths. Runner gains quantization_progress flag (default on). Includes unit tests for ETA formatting and thread-safe stepping. Co-authored-by: Cursor <cursoragent@cursor.com>

sotanengel · 2026-05-18T10:27:19Z

This is not urgent at all, but I opened this PR because of something that bothered me while using the tool myself.
If it is not needed, please feel free to close it 🙇


Adds a --check-env CLI flag that collects physical hardware characteristics (GPU VRAM, CPU RAM, disk space) and model memory estimates before quantization starts, then classifies OOM risk as safe/warning/danger. Exits with code 1 on danger; otherwise prints a report and proceeds with quantization.

- onecomp/utils/vram_estimator.py: new EnvironmentSnapshot, ModelMemoryProfile, EnvCheckResult dataclasses; check_environment() and print_env_report() functions reusing existing weight_memory_gb() and estimate_target_bitwidth()
- onecomp/utils/__init__.py: export 5 new public symbols
- onecomp/cli.py: --check-env argparse flag with preflight invocation
- onecomp/runner.py: check_env=False kwarg in auto_run() for library API use
- pyproject.toml: optional extras [check-env] = ["psutil>=5.9"]

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

FKKimura and others added 2 commits May 7, 2026 21:20

Define v1.1.1

85cba91

FKKimura and others added 5 commits May 18, 2026 13:31

Raise clear error for unsupported QEP quantizers

ea5dd45

Merge branch 'lab/fix-jointq-qep' into 'export/v1-1-1'

c7f98a2

Raise clear error for unsupported QEP quantizers See merge request onecomp/onecomp-lab!71

Merge pull request FujitsuResearch#13 from sotanengel/feature/quantization-progress-eta

77da402

feat: quantization progress logs with ETA

Refactoring: quantization progress logs with ETA (FujitsuResearch#15)

6caa479

* refactoring : QuantizationProgressTracker
* update CHANGELOG.md

Co-authored-by: FKKimura <50981196+FKKimura@users.noreply.github.com>

sotanengel force-pushed the feature/check-env-preflight branch from 29a83f5 to 1eab710 Compare May 19, 2026 22:15

sotanengel wants to merge 7 commits into FujitsuResearch:main from sotanengel:feature/check-env-preflight


3 participants
