
Tom-Notch/USF


Unified Spherical Frontend:
Learning Rotation-Equivariant Representations of
Spherical Images from Any Camera

CVPR 2026 · Project Page · arXiv

Mukai (Tom Notch) Yu · Mosam Dabhi · Liuyue (Louise) Xie · Sebastian Scherer · László A. Jeni

Carnegie Mellon University, Robotics Institute

Headline Animation

Unified Spherical Frontend (USF) is a distortion-free, lens-agnostic, rotation-equivariant vision framework for modern perception.

Usage

  • Configure environment variables

    cp .env.example .env

    Edit .env and fill in your own values (WandB API key, entity, Docker username, etc.). This file is gitignored and will not be committed. It is automatically loaded into os.environ when you import usf, so Hydra configs can reference the variables via ${oc.env:VAR}.
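
    To illustrate that mechanism, here is a minimal sketch of a dotenv-style loader. This is not usf's actual loader (which may well use python-dotenv); the function name and parsing rules are assumptions for illustration only:

    ```python
    import os

    def load_env(path=".env"):
        """Hypothetical sketch of a .env loader: copies KEY=VALUE lines
        into os.environ so Hydra can resolve ${oc.env:KEY}.
        usf's real loader may differ (e.g. python-dotenv)."""
        with open(path) as f:
            for line in f:
                line = line.strip()
                # Skip blanks, comments, and lines without an assignment
                if not line or line.startswith("#") or "=" not in line:
                    continue
                key, _, value = line.partition("=")
                # setdefault: variables already in the environment win
                os.environ.setdefault(key.strip(), value.strip().strip('"'))
    ```

    With a variable loaded this way, a Hydra config line such as `entity: ${oc.env:WANDB_ENTITY}` resolves at runtime.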

  • Docker is available via docker compose up -d if you set DOCKER_USER=tomnotch in .env

    • scripts/ contains helper scripts to build/push/pull/run the Docker/Singularity images
  • Set up the environment locally (optional)

    1. Environment compartmentalization

      1. Install Conda/Miniconda/Mamba

      2. Create the conda environment (provides Python 3.12, uv, and system-level libraries):

        conda env create -f environment.yml
      3. Activate environment:

        conda activate usf
      4. Install Python packages with uv:

        uv pip install -e .            # core deps
        uv pip install -e '.[full]'    # optional deps (jupyter, pre-commit, etc.)

        Optional: Manim for notebook/manim/ is not part of [full] because its PyPI package requires native Cairo/Pango libraries. See the note in requirements-full.txt, then run uv pip install manim once those libraries are available on the host.

      5. Verify: pip check and python -c "import torch, torch_scatter, xformers, faiss"

        • If torch_scatter complains about GLIBC version mismatch, build from source:

          uv pip install --no-build-isolation --no-deps "git+https://github.com/rusty1s/pytorch_scatter.git@2.1.2"
  • Download the datasets, create a data/ folder, and symlink everything under it

    ln -s path/to/your/dataset/folder/* data/
    Click to see folder structure
    ❯ tree -dhl ./data
    ./data
    ├── [  28]  MNIST -> /home/your_user_name/dataset/MNIST
    │   ├── [4.0K]  t10k-images-idx3-ubyte
    │   ├── [4.0K]  t10k-labels-idx1-ubyte
    │   ├── [4.0K]  train-images-idx3-ubyte
    │   └── [4.0K]  train-labels-idx1-ubyte
    ├── [  30]  PANDORA -> /home/your_user_name/dataset/PANDORA
    │   ├── [4.0K]  annotations
    │   └── [ 92K]  images
    └── [  36]  stanford2D3DS -> /home/your_user_name/dataset/stanford2D3DS
        └── [4.0K]  area_3
            ├── [4.0K]  3d
            │   └── [ 12K]  rgb_textures
            ├── [4.0K]  data
            │   ├── [448K]  depth
            │   ├── [460K]  global_xyz
            │   ├── [464K]  normal
            │   ├── [460K]  pose
            │   ├── [436K]  rgb
            │   ├── [472K]  semantic
            │   └── [484K]  semantic_pretty
            ├── [4.0K]  pano
            │   ├── [ 16K]  depth
            │   ├── [ 16K]  global_xyz
            │   ├── [ 20K]  normal
            │   ├── [ 16K]  pose
            │   ├── [ 16K]  rgb
            │   ├── [ 16K]  semantic
            │   └── [ 20K]  semantic_pretty
            └── [376K]  raw
  • As examples, run the notebooks sampler.ipynb and network_layers.ipynb

  • To train the MNIST classification model, run this in your shell:

    train task=mnist
    • You might need to follow WandB's instructions to create or log in to your account and set up a project via WandB's web interface for the logging to work properly; otherwise, you can run mnist.ipynb for a local demo
    • Relevant config file: mnist.yaml
  • To train object detection

    train task=object_detection
  • To train semantic segmentation

    train task=semantic_segmentation
  • To visualize a spherical image file (or a batch of them)

    visualize_spherical_image -p path/to/image.npz -f desired_fps -s point_size
    • Path can be relative, e.g. I do -p data/output.npz all the time
    • FPS and point size are optional
    • You should see an interactive Open3D visualization window; press h to print the available operations in your shell
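  • The .npz key layout is not documented here, so before visualizing it can help to inspect a file with plain NumPy. A small sketch (the key names it reports are whatever your file actually contains, not a USF contract):

    ```python
    import numpy as np

    def inspect_npz(path):
        """Return {array_name: (shape, dtype)} for every array in an .npz file."""
        with np.load(path) as data:
            return {k: (data[k].shape, str(data[k].dtype)) for k in data.files}

    # Example (path is illustrative):
    # for name, (shape, dtype) in inspect_npz("data/output.npz").items():
    #     print(f"{name}: {shape} {dtype}")
    ```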
  • To generate a lens normal map for a given camera, make sure you have the camera config YAML file ready (see rgb_0.yaml for an example), then run the following command

    generate_lens_normal_map -c your/camera/config.yaml

    This will generate a lens normal map (.npz and .pdf) in a lens_normal_map folder next to your camera config file.
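    Conceptually, a lens normal map stores the unit ray direction on the sphere for each pixel. As a hedged sketch only, here is what that computation might look like for an ideal equidistant fisheye (r = f·θ); the function name, the projection model, and the array layout are all illustrative assumptions, not USF's actual camera models (those come from the config YAMLs):

    ```python
    import numpy as np

    def equidistant_normal_map(h, w, f):
        """Unit ray direction per pixel for an ideal equidistant fisheye
        (radius r = f * theta). Illustrative assumption, not USF's model."""
        cx, cy = (w - 1) / 2.0, (h - 1) / 2.0
        u, v = np.meshgrid(np.arange(w) - cx, np.arange(h) - cy)
        r = np.hypot(u, v)
        theta = r / f            # angle from the optical axis
        phi = np.arctan2(v, u)   # azimuth in the image plane
        # Spherical-to-Cartesian; every vector has unit length by construction
        return np.stack([
            np.sin(theta) * np.cos(phi),
            np.sin(theta) * np.sin(phi),
            np.cos(theta),
        ], axis=-1)              # shape (h, w, 3)

    # np.savez("lens_normal_map.npz", normals=equidistant_normal_map(480, 640, 300.0))
    ```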

  • For coordinate system conventions, read Spherical & Vector Convention.pdf

Development Environment Setup

Citation

If you find this work useful, please cite:

@inproceedings{yu2026usf,
  title     = {Unified Spherical Frontend: Learning Rotation-Equivariant Representations of Spherical Images from Any Camera},
  author    = {Yu, Mukai and Dabhi, Mosam and Xie, Liuyue and Scherer, Sebastian and Jeni, L{\'a}szl{\'o} A.},
  year      = {2026},
  month     = jun,
  booktitle = {IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
  publisher = {IEEE}
}