Support TranslateGemma 4B text-only inference (Gemma 3 4B LM-only, no VIT) #888

@Selami79

Summary

TranslateGemma 4B (google/translategemma-4b-it) is Google's translation-specific model built on the Gemma 3 4B architecture. It is a text-only model (no vision encoder is needed for inference), but gemma.cpp currently supports only the VLM variant of Gemma 3 4B.

Problem

Getting TranslateGemma 4B to run with gemma.cpp currently fails at three points:

  1. SBS conversion: convert_from_safetensors.py assumes the PaliGemma VLM format and requires vision tower tensors that TranslateGemma doesn't have
  2. Loading: ConfigGemma3_4B() returns the VLM config with vit_config.image_size=896, which causes the error "Tensor enc_norm_bias is required but not found in file"
  3. No LM-only dispatch: ConfigGemma3_4B_LM() exists in code but is never used as the primary config

What We Did (Workarounds)

We successfully ran TranslateGemma on gemma.cpp with these changes:

1. Convert script modifications

  • Skip vision_tower.* and multi_modal_projector.* tensors during loading
  • Fix vocab_size: use 262144 instead of PaliGemma's 257152+64 trim
  • Add QK norm tensors (query_norm, key_norm) to layer config as BF16
  • Zero out vit_config in SBS metadata before writing
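
The tensor-skipping step above can be sketched roughly as follows. This is a minimal illustration, not the actual patch to convert_from_safetensors.py: the helper names (is_vision_tensor, drop_vision_tensors) are hypothetical, and it assumes the checkpoint is already available as a plain name-to-tensor mapping. The two prefixes are the ones listed above; matching them after a wrapper such as "model." is an assumption about how the checkpoint may nest its modules.

```python
# Hypothetical helpers for illustration; not part of convert_from_safetensors.py.
VISION_PREFIXES = ("vision_tower.", "multi_modal_projector.")


def is_vision_tensor(name: str) -> bool:
    """Return True if the tensor name belongs to the vision side of the model."""
    # Assumption: modules may be nested under a wrapper (e.g. "model."),
    # so accept the prefix at the start of the name or right after any dot.
    return any(name.startswith(p) or ("." + p) in name for p in VISION_PREFIXES)


def drop_vision_tensors(tensors: dict) -> dict:
    """Keep only the language-model tensors from a name -> tensor mapping."""
    return {name: t for name, t in tensors.items() if not is_vision_tensor(name)}
```

Filtering by name before any shape or dtype handling keeps the rest of the conversion path unaware of whether the source checkpoint ever had a vision tower.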

2. C++ changes needed

  • configs.cc: Dispatch GEMMA3_4B to ConfigGemma3_4B_LM() when no VIT tensors present
  • tensor_info.cc: Guard VIT tensor registration with if (config.vit_config.image_size > 0)
  • weights.h: Conditional VIT MatPtr initialization
  • python/configs.cc: Add missing Gemma 3 model enums (GEMMA3_1B, GEMMA3_4B, GEMMA3_12B, GEMMA3_27B)
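
The guard described for tensor_info.cc and weights.h amounts to making the vision tensors conditional on the config. The sketch below models that logic in Python rather than C++, purely to show the control flow; the dataclasses, the required_tensors function, and the per-layer placeholder names are all hypothetical. Only enc_norm_bias (taken from the loader error) and the image_size > 0 condition come from this issue.

```python
from dataclasses import dataclass, field


@dataclass
class VitConfig:
    image_size: int = 0  # 0 is treated as "no vision encoder"


@dataclass
class ModelConfig:
    num_layers: int
    vit_config: VitConfig = field(default_factory=VitConfig)


def required_tensors(config: ModelConfig) -> list[str]:
    """List the tensors a loader would insist on for this config."""
    names = []
    # Mirror of the proposed guard: register VIT tensors only for VLM configs.
    if config.vit_config.image_size > 0:
        names.append("enc_norm_bias")  # tensor named in the loader error
    # Placeholder per-layer entries; real gemma.cpp tensor names differ.
    names.extend(f"layer_{i}.placeholder" for i in range(config.num_layers))
    return names
```

With image_size left at 0, a 34-layer config never asks for enc_norm_bias, which is the behavior the workaround needs for an SBS file without vision weights.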

3. Result

  • TranslateGemma 4B runs successfully on CPU with SFP 8-bit format
  • The SBS file is 4.3 GB, and translation works across 55+ languages
  • All 34 layers + QK norms correctly loaded

Feature Request

  1. Auto-detect LM-only vs VLM: when VIT tensors are absent from the SBS file, use ConfigGemma3_*_LM() instead of the VLM config
  2. Update convert_from_safetensors.py to support text-only Gemma 3 models (not just PaliGemma)
  3. Add Gemma 3 4B/12B/27B to Python enum in python/configs.cc
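
One possible shape for the auto-detection in request 1, sketched in Python for brevity even though the real change would live in the C++ loader. The function pick_gemma3_4b_config and the choice of enc_norm_bias as the sentinel are assumptions made for illustration (it is the tensor the loader error reports as missing); the returned strings are the two config functions named in this issue.

```python
# Assumption: the presence of this tensor is a usable marker for "file has VIT weights".
VIT_SENTINEL = "enc_norm_bias"


def pick_gemma3_4b_config(tensor_names) -> str:
    """Return the name of the config function to dispatch to for a given file."""
    has_vit = VIT_SENTINEL in set(tensor_names)
    return "ConfigGemma3_4B" if has_vit else "ConfigGemma3_4B_LM"
```

Keying the decision on what the file actually contains means existing VLM files keep loading as before, while text-only files such as TranslateGemma fall through to the LM-only config without a new flag.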

Environment

  • gemma.cpp: latest main (April 2026)
  • CPU: AMD EPYC (AVX-512 VNNI)
  • OS: Ubuntu 24.04
  • Model: google/translategemma-4b-it
