Plugin for CodeProject.AI Server that uses Ollama for image and video description and visual question answering. Supports multiple vision models (default: Moondream) and video via frame sampling + summarization.
Author: Zlatko Lakisic · OmegaIT LLC · omega-it.solutions
License: MIT with Attribution — open source; derivative works must credit Zlatko Lakisic and OmegaIT LLC.
- CodeProject.AI Server installed and running.
- Ollama installed and running on the same machine (or reachable at
OLLAMA_HOST).- Linux / Docker: The installer can install Ollama into the module folder (per manual install) so it persists when the module lives on a volume. Binary:
module_dir/ollama/bin/ollama; models:module_dir/models. SetOLLAMA_MODELSto the module’smodelsfolder when starting Ollama at runtime. - Windows: The installer can install the standalone CLI (ollama-windows-amd64.zip) into the module folder (
module_dir\ollama); models go tomodule_dir\models. Otherwise install from ollama.com and ensureollamais in PATH.
- Linux / Docker: The installer can install Ollama into the module folder (per manual install) so it persists when the module lives on a volume. Binary:
- The moondream model pulled in Ollama (the installer does this automatically when Ollama is available).
For video, the same vision model is used to describe frames and to summarize; no second model is required. Optionally set OLLAMA_SUMMARY_MODEL to a different model (e.g.llama3.2) and runollama pull llama3.2if you prefer a separate summary model.
-
Copy this folder into the folder the server scans for modules.
On Windows (CodeProject.AI Server 2.9.x) the server uses modules, not “modules”:- Path:
C:\Program Files\CodeProject\AI\modules\OmegaOllamaMultiModalLLM\ - Copy the entire OmegaOllamaMultiModalLLM folder into that
modulesfolder. (There is no AnalysisLayer on many installs.)
- Path:
-
Run the CodeProject.AI setup so it installs the module:
- Windows: Open an elevated (Administrator) command prompt, go to the module folder, then run the setup that targets this module. For example:
(Or run the main Server setup from the AI folder if it installs all modules.)
cd "C:\Program Files\CodeProject\AI\modules\OmegaOllamaMultiModalLLM" ..\..\setup.bat
- Windows: Open an elevated (Administrator) command prompt, go to the module folder, then run the setup that targets this module. For example:
-
The installer will:
- Linux: If Ollama is not in the module folder, download and extract Ollama into
module_dir/ollama(manual install per docs.ollama.com/linux); models go tomodule_dir/models. Optional: setOLLAMA_USE_ROCM=1for AMD GPU before install. - Create a Python virtual environment for the module (if needed).
- Install dependencies from
requirements.txt(ollama, Pillow, CodeProject-AI-SDK). - Start Ollama if needed and run
ollama pull moondream(andollama pull llama3.2for video).
- Linux: If Ollama is not in the module folder, download and extract Ollama into
-
At runtime (Linux/Docker): If Ollama was installed into the module folder, start it with the module’s models dir so models persist:
OLLAMA_MODELS="$(pwd)/models" ./ollama/bin/ollama serve(Run from the module directory, or set
OLLAMA_MODELSto the module’smodelspath.) If Ollama is in system PATH, just ensure it is running andollama pull moondreamhas been run. -
Enable and start the Ollama MultiModal LLM by OmegaIT module in the CodeProject.AI dashboard.
To run changes directly without copying the folder each time, you can symlink this repo into CodeProject.AI’s modules folder:
- Windows (elevated/admin):
From the CodeProject.AImodulesfolder (e.g.C:\Program Files\CodeProject\AI\modules\), create a directory junction or symlink namedOmegaOllamaMultiModalLLMpointing to your clone of this folder (e.g.D:\Projects\ollama-integrations\OmegaOllamaMultiModalLLM).
CodeProject.AI will then load the module from your repo, so edits here take effect after restarting the module (or the server).
When the installer or dashboard asks for a self-test:
- Start the module in the CodeProject.AI dashboard (enable and start Ollama MultiModal LLM by OmegaIT if it isn’t already).
- Run the test in one of these ways:
- Explorer: Open the CodeProject.AI Explorer (e.g. from the dashboard), go to the Vision section, and use the Ollama MultiModal test if it appears there.
- Test page: In your browser go to
http://localhost:32168/explore/OmegaOllamaMultiModalLLM
(or the URL shown in the dashboard for this module’s explore page). Upload an image, optionally enter a prompt, and click Describe image. You should see the model’s description or an error message. - API: From a terminal:
curl -X POST "http://localhost:32168/v1/vision/omega-ollama/describe-image" -F "image=@path/to/image.jpg"
and check the JSON response for"success": trueand a"description"field.
If the module is running and Ollama has the moondream model, the self-test should return a short description of the image.
Module not showing in the dashboard? Side-loaded modules (like this one) may appear under Settings → Modules, or in an Installed / Local modules list rather than in the main “Install modules” catalog. Refresh the page, restart CodeProject.AI Server if needed, and confirm the folder OmegaOllamaMultiModalLLM is in the server’s modules directory (e.g. C:\Program Files\CodeProject\AI\modules\). Then restart the server and refresh the dashboard. The module appears as Ollama MultiModal LLM by OmegaIT (OmegaOllamaMultiModalLLM).
This module exposes an OpenAI-compatible endpoint so LLM Vision can use CodeProject.AI directly.
-
Ensure the module is running (Ollama MultiModal LLM by OmegaIT enabled and started in the CodeProject.AI dashboard).
-
In Home Assistant: Settings → Devices & services → LLM Vision → Add Entry (or Reconfigure). Choose OpenAI or Custom OpenAI (OpenAI-compatible). Set:
- Base URL: Your CodeProject.AI server URL, with no path (e.g.
http://codeproject-ai:32168orhttp://192.168.1.10:32168). LLM Vision will call/v1/chat/completionson that base. - API key: Leave empty or use any placeholder (this endpoint does not require a key).
- Model: Any label (e.g.
omega-ollama); the module uses your configured vision model.
- Base URL: Your CodeProject.AI server URL, with no path (e.g.
-
Use this provider in LLM Vision for camera snapshots, Frigate events, etc. Requests go to CodeProject.AI’s Ollama MultiModal LLM module.
Endpoint: POST /v1/chat/completions — accepts JSON in OpenAI chat format (e.g. messages with image_url and text) and returns OpenAI-style choices with the vision description.
- Image:
POST /v1/vision/omega-ollama/describe-image- Inputs:
image(file, required),prompt(string, optional).
- Inputs:
- OpenAI-compatible (LLM Vision):
POST /v1/chat/completions- Inputs: JSON body with
messages(array withimage_url+ text) and optionalmodel. Returns OpenAI-stylechoiceswith the vision description.
- Inputs: JSON body with
- Video:
POST /v1/vision/omega-ollama/describe-video- Inputs:
video(file, required),prompt(string, optional). - Outputs:
success(boolean),description(string),error(string when failed). Same shape as describe-image; processing is synchronous (may take 1–2 minutes for several frames).
- Inputs:
Example (image):
curl -X POST "http://localhost:32168/v1/vision/omega-ollama/describe-image" \
-F "image=@/path/to/image.jpg" \
-F "prompt=What is in this image?"Example (video):
curl -X POST "http://localhost:32168/v1/vision/omega-ollama/describe-video" \
-F "video=@/path/to/clip.mp4" \
-F "prompt=Describe this image in a few sentences."- Vision model (dashboard): In the module’s configuration in the CodeProject.AI dashboard, use the Vision model menu to choose which Ollama vision model to use. Default is Moondream (small, fast). Options include LLaVA 7B/13B/34B, Llama 3.2 Vision 11B/90B, and others. Ensure the chosen model is pulled (
ollama pull <model>). - Environment variable:
OLLAMA_VISION_MODEL– Ollama vision model name (default:moondream). Set in the module’s environment or inmodulesettings.jsonunderEnvironmentVariables; the dashboard menu overrides this when set. - OLLAMA_SUMMARY_MODEL – Optional. Model used to summarize video frame descriptions into one text. If unset, the vision model (e.g. Moondream) is used for both describing frames and summarizing, so one model is enough. Set to e.g.
llama3.2and runollama pull llama3.2if you want a separate summary model. - OLLAMA_VIDEO_INTERVAL_SEC – For video: sample one frame every N seconds (default:
5). Larger values = fewer frames and faster requests. - OLLAMA_VIDEO_MAX_FRAMES – Maximum number of frames to analyze per video (default:
6). Keep low so the request finishes within the server’s HTTP timeout (e.g. 4–6 frames).
Video: The module uses OpenCV (opencv-python-headless) to extract frames from video files. Supported formats include .mp4, .avi, .mov, .mkv, .webm. Frames are sampled at the configured interval, each frame is described with the vision model, then the text model summarizes. Processing is synchronous; use conservative frame count/interval to avoid server timeouts.
- Where inference runs: Image inference runs in the Ollama server process (a separate process from this module). Ollama uses the GPU automatically when CUDA is available; this module only sends requests to Ollama and does not run the model itself.
- Enable/Disable GPU button: The dashboard’s GPU toggle is respected: when Enable GPU is on and the system has an NVIDIA GPU (or PyTorch with CUDA), the module reports GPU as the inference device so the dashboard shows correct status. When the toggle is off, the module reports CPU. The actual GPU use is still determined by the Ollama server.
- Verifying GPU use: To confirm Ollama is using the GPU, run:
The PROCESSOR column should show e.g.
ollama ps
100% GPUwhen a model is loaded. The module’s status (in the dashboard or API) may also includeollamaGpuInUse: truewhen Ollama reports VRAM usage for loaded models.
- “Access is denied” / “Unable to install Python 3.10”: The CodeProject.AI installer needs to download Python and write to its install folder (e.g.
C:\Program Files\CodeProject\AI\). Fix it by either:- Run the installer as Administrator: Right‑click
setup.bat(or the shortcut that runs setup) → Run as administrator, then run module install again from the dashboard or setup. - Use a user-writable install path: Install CodeProject.AI Server to a location where your user has full rights (e.g.
C:\CodeProject\AIor a folder under your user profile) so the installer can createdownloads,runtimes, and module venvs without elevation.
- Run the installer as Administrator: Right‑click
- Module won’t start: Ensure Python 3.10+ and the module’s venv are set up (re-run setup as above). If you copied this folder from another path (e.g. after renaming from OllamaVision), the copied
binfolder may have an invalid venv; delete thebinfolder and run setup again so it creates a fresh virtual environment. - “Connection refused” or no response: Ensure Ollama is running (
ollama listshould work) and that the host/port match (defaulthttp://localhost:11434; useOLLAMA_HOSTif needed). - Model not found: Run
ollama pull moondream(and optionally setOLLAMA_VISION_MODELif you use a different model name). - Module not visible in dashboard: The server loads modules from modules, not from “modules”. Put the OmegaOllamaMultiModalLLM folder at
C:\Program Files\CodeProject\AI\modules\OmegaOllamaMultiModalLLM\, then restart the server and refresh the dashboard. See Module not showing in the dashboard? under Self-test above. The module appears as Ollama MultiModal LLM by OmegaIT (OmegaOllamaMultiModalLLM). - describe-video returns “The request timed out”: Video processing is synchronous and does several Ollama calls (one per frame plus summary). If the server’s HTTP timeout is short (e.g. 60–120s), reduce work: set
OLLAMA_VIDEO_MAX_FRAMES=4andOLLAMA_VIDEO_INTERVAL_SEC=8in the module’s environment. Alternatively increase the server’s request timeout if supported. Note: An async (long-running) mode was reverted because the current CodeProject.AI Server throws an unhandled exception when it receives a “Command is running in the background” response; video is therefore synchronous until the server supports that response shape.
