fix: Add NVIDIA Blackwell (RTX 50xx, sm_120) GPU support by Hasham-dev · Pull Request #4155 · lllyasviel/Fooocus

Hasham-dev · 2026-03-04T00:44:44Z

Summary

Minimal fixes to support NVIDIA Blackwell architecture GPUs (RTX 5050/5060/5070/5080/5090, compute capability sm_120) on Fooocus.

- Use bfloat16 as the UNet dtype on Blackwell GPUs (compute major >= 12), which have native bf16 tensor core support
- Skip manual_cast for bf16 weights to avoid unnecessary dtype casting overhead
- Fix numpy TypeError with bfloat16 tensors in modules/patch.py and extras/ip_adapter.py: numpy doesn't support bf16, so we convert to float32 before .numpy() calls
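
For reviewers who want the dtype rule spelled out, here is a minimal sketch of the selection logic described above. The function names are hypothetical and the manual-cast rule is a simplified assumption; the actual change lives in `ldm_patched/modules/model_management.py` and may be structured differently. Dtypes are plain strings so the sketch runs without PyTorch.

```python
from typing import Tuple

# Hypothetical helper names; the real logic lives in
# ldm_patched/modules/model_management.py and may differ.


def is_sm120_or_newer(capability: Tuple[int, int]) -> bool:
    """Apply the PR's rule: compute capability major >= 12."""
    major, _minor = capability
    return major >= 12


def pick_unet_dtype(capability: Tuple[int, int], fallback: str = "float16") -> str:
    """Return the name of the dtype to load UNet weights in."""
    return "bfloat16" if is_sm120_or_newer(capability) else fallback


def needs_manual_cast(weight_dtype: str, compute_dtype: str) -> bool:
    """Assumed rule: bf16 weights run as-is; otherwise cast only on a mismatch."""
    if weight_dtype == "bfloat16":
        return False
    return weight_dtype != compute_dtype


# With PyTorch installed, the capability tuple would come from
# torch.cuda.get_device_capability(0), e.g. (12, 0) for sm_120.
print(pick_unet_dtype((12, 0)))  # bfloat16
print(pick_unet_dtype((8, 9)))   # float16
```

Keeping the capability check separate from the dtype choice makes it easy to test the rule on machines without a Blackwell GPU.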

Changes (3 files, +13 -2 lines)

| File | Change |
| --- | --- |
| `ldm_patched/modules/model_management.py` | Auto-detect Blackwell GPUs and use bf16 dtype; skip manual_cast for bf16 |
| `modules/patch.py` | Fix bf16→numpy crash in `patched_unet_forward` |
| `extras/ip_adapter.py` | Fix bf16→numpy crash in IP-Adapter attention patcher |
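
The bf16→numpy fix follows a common pattern: upcast to float32 before leaving PyTorch. The sketch below illustrates that pattern with a hypothetical helper, `tensor_to_numpy`; it is not the code from `modules/patch.py` or `extras/ip_adapter.py`, where the conversion is presumably applied inline at each `.numpy()` call site.

```python
def tensor_to_numpy(tensor):
    """Convert a torch.Tensor to a numpy array, upcasting bf16 first.

    Hypothetical helper for illustration only. numpy has no bfloat16
    dtype, so calling .numpy() directly on a bf16 tensor raises a
    TypeError; float32 can represent every bf16 value exactly.
    """
    import torch  # imported lazily so this file loads without PyTorch

    if tensor.dtype == torch.bfloat16:
        tensor = tensor.to(torch.float32)
    # detach() drops autograd state; cpu() moves the data off the GPU
    return tensor.detach().cpu().numpy()
```

Tensors in other dtypes pass through unchanged, so the helper is safe to use on pre-Blackwell GPUs that still run the UNet in fp16 or fp32.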

Testing

- GPU: NVIDIA GeForce RTX 5070 (sm_120, 11.5 GB VRAM)
- CUDA: 12.8
- PyTorch: 2.12.0.dev (nightly, cu128)
- Result: Image generation works at ~3.2 it/s at 896x1152, including Image Prompt (IP-Adapter) mode
- VAE note: Users will also need the madebyollin/sdxl-vae-fp16-fix VAE for stable bf16 decoding (the default SDXL VAE overflows in bf16)

Fixes

- Fixes #3862 (RTX 5090 sm_120 not compatible)
- Fixes #4123 (RTX 5060 sm_120 not supported)
- Fixes #4141 (CUDA error on RTX 5050 sm_120)

Prerequisites

Users with Blackwell GPUs need PyTorch nightly with CUDA 12.8 support:

pip install --pre torch torchvision torchaudio --index-url https://download.pytorch.org/whl/nightly/cu128
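
After installing, it may help to confirm that the build actually targets sm_120. The sketch below is one way to check, assuming the build lists the architecture by its `sm_XY` name; `build_targets` and `check_current_install` are hypothetical names, and the check ignores PTX forward compatibility (`compute_XY` entries), so a negative result is a hint rather than proof.

```python
from typing import Iterable, Tuple


def build_targets(arch_list: Iterable[str], capability: Tuple[int, int]) -> bool:
    """Return True if an 'sm_XY' entry for the capability is in arch_list."""
    major, minor = capability
    return f"sm_{major}{minor}" in set(arch_list)


def check_current_install() -> bool:
    """Check device 0 against the installed PyTorch build (requires torch)."""
    import torch  # imported lazily so the pure helper works without PyTorch

    if not torch.cuda.is_available():
        return False
    return build_targets(
        torch.cuda.get_arch_list(),          # e.g. ['sm_80', ..., 'sm_120']
        torch.cuda.get_device_capability(0),  # e.g. (12, 0)
    )
```

On a machine with the nightly build installed, calling `check_current_install()` should return `True` for an RTX 50xx card; `False` suggests the wheel predates Blackwell support.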

- Use bfloat16 dtype for UNet on Blackwell GPUs (compute major >= 12) which have native bf16 tensor core support
- Skip manual_cast for bfloat16 weights to avoid unnecessary casting
- Fix numpy TypeError with bfloat16 tensors in patch.py and ip_adapter.py by converting to float32 before .numpy() calls

Tested on RTX 5070 (sm_120, CUDA 12.8) with PyTorch nightly (cu128). Generates images at ~3.2 it/s including Image Prompt (IP-Adapter) mode.

Fixes lllyasviel#3862, lllyasviel#4123, lllyasviel#4141

Hasham-dev requested a review from lllyasviel as a code owner March 4, 2026 00:44

peardox mentioned this pull request Mar 27, 2026

[Bug]: RuntimeError: CUDA error: no kernel image is available #4164

Hasham-dev wants to merge 1 commit into lllyasviel:main from Hasham-dev:fix/blackwell-rtx50xx-support
