Summary
Enable users to include images in conversations for vision-capable models.
Why It's Needed
- UI/UX Feedback: "What's wrong with this screenshot?"
- Diagram Understanding: Analyze architecture diagrams
- Error Screenshots: Debug from error screenshots
- Design Implementation: "Implement this design mockup"
- Competitor Parity: Both OpenCode and Claude Code support images
Features
Image Input Methods
- File Path: @screenshot.png or --files image.jpg
- Paste from Clipboard: Ctrl+V in TUI
- Drag and Drop: Drop image file onto TUI
- URL: @https://example.com/image.png
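The `@` syntax above implies a small parsing step that separates image references from the rest of the prompt. A minimal sketch of that step follows; `extractImageRefs` and `ImageRef` are hypothetical names, and treating only tokens with a known image extension as images is an assumption, so that mentions such as `@alice` stay in the text.

```typescript
// Hypothetical helper: split a prompt into plain text and image references.
type ImageRef = { kind: "file" | "url"; value: string };

function extractImageRefs(prompt: string): { text: string; images: ImageRef[] } {
  const imagePattern = /\.(png|jpe?g|gif|webp)$/i;
  const images: ImageRef[] = [];
  const kept: string[] = [];

  for (const token of prompt.split(/\s+/).filter(Boolean)) {
    if (token.startsWith("@")) {
      const value = token.slice(1);
      // Ignore any query string or fragment when checking the extension.
      const pathPart = value.replace(/[?#].*$/, "");
      if (imagePattern.test(pathPart)) {
        const kind = /^https?:\/\//i.test(value) ? "url" : "file";
        images.push({ kind, value });
        continue;
      }
    }
    // Anything that is not an image reference stays in the prompt text.
    kept.push(token);
  }
  return { text: kept.join(" "), images };
}
```

Note that this sketch normalizes whitespace in the remaining text, which may or may not be acceptable for the real prompt pipeline.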
Supported Formats
- PNG, JPG, JPEG, GIF, WebP
- Max size: 20MB (configurable)
- Auto-resize if too large
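One way to enforce the format list is to check the file's leading bytes rather than trusting its extension. The sketch below assumes that approach; `sniffImageMime`, `needsResize`, and `MAX_SIZE_BYTES` are illustrative names that do not appear in the proposal. JPG and JPEG both map to `image/jpeg`.

```typescript
// Hypothetical validator for the formats listed above; names are illustrative.
const MAX_SIZE_BYTES = 20 * 1024 * 1024; // 20971520 bytes, the proposed default

type ImageMime = "image/png" | "image/jpeg" | "image/gif" | "image/webp";

function sniffImageMime(bytes: Uint8Array): ImageMime | null {
  // True when `sig` appears in `bytes` starting at `offset`.
  const startsWith = (sig: number[], offset = 0): boolean =>
    bytes.length >= offset + sig.length && sig.every((b, i) => bytes[offset + i] === b);

  if (startsWith([0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a])) return "image/png";
  if (startsWith([0xff, 0xd8, 0xff])) return "image/jpeg";
  // "GIF8" covers both GIF87a and GIF89a.
  if (startsWith([0x47, 0x49, 0x46, 0x38])) return "image/gif";
  // WebP is a RIFF container: "RIFF", a 4-byte size, then "WEBP".
  if (startsWith([0x52, 0x49, 0x46, 0x46]) && startsWith([0x57, 0x45, 0x42, 0x50], 8)) {
    return "image/webp";
  }
  return null; // Unsupported or unrecognized format
}

function needsResize(bytes: Uint8Array, maxSize: number = MAX_SIZE_BYTES): boolean {
  return bytes.length > maxSize;
}
```

Because Node's `Buffer` is a `Uint8Array`, both functions accept the result of `readFile` directly.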
Implementation
Message Format
interface ImageMessage {
  role: "user"
  content: [
    { type: "text", text: "What's in this image?" },
    {
      type: "image_url",
      image_url: {
        url: "data:image/png;base64,iVBORw0KGgo...",
        detail: "auto" // or "low" | "high"
      }
    }
  ]
}
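The shape above mixes example values into the type. A sketch of how the same structure could be typed generally and assembled at runtime is shown here; `buildImageMessage` and the part type names are assumptions made for illustration, not part of the proposal.

```typescript
// Hypothetical general types for the message shape shown above.
type Detail = "auto" | "low" | "high";
type TextPart = { type: "text"; text: string };
type ImagePart = { type: "image_url"; image_url: { url: string; detail?: Detail } };
type UserMessage = { role: "user"; content: Array<TextPart | ImagePart> };

// Build one user message from prompt text plus any number of image URLs
// (data URLs or remote URLs).
function buildImageMessage(text: string, imageUrls: string[], detail: Detail = "auto"): UserMessage {
  const imageParts = imageUrls.map((url): ImagePart => ({
    type: "image_url",
    image_url: { url, detail },
  }));
  return { role: "user", content: [{ type: "text", text }, ...imageParts] };
}
```

Keeping the text part first mirrors the example in the proposal; whether providers care about part order is not stated in the source.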
Image Processing
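import { readFile } from "node:fs/promises"

// MAX_SIZE, MAX_DIMENSIONS, resizeImage, and detectMimeType are helpers not shown here.
async function processImage(input: string): Promise<ImageContent> {
  let buffer: Buffer
  if (/^https?:\/\//.test(input)) {
    // Fetch from URL; fail early on HTTP errors instead of encoding an error page
    const response = await fetch(input)
    if (!response.ok) {
      throw new Error(`Failed to fetch image: ${response.status}`)
    }
    buffer = Buffer.from(await response.arrayBuffer())
  } else {
    // Read from file
    buffer = await readFile(input)
  }
  // Resize if too large
  if (buffer.length > MAX_SIZE) {
    buffer = await resizeImage(buffer, MAX_DIMENSIONS)
  }
  // Detect the MIME type after resizing, in case the resize step changed the format
  const base64 = buffer.toString("base64")
  const mimeType = detectMimeType(buffer)
  return {
    type: "image_url",
    image_url: {
      url: `data:${mimeType};base64,${base64}`
    }
  }
}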
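The resize step is triggered by byte size but bounded by pixel dimensions, so it needs a rule for choosing target dimensions. A minimal sketch of one such rule follows: scale the longer side down to the limit and preserve aspect ratio. `fitWithin` is a hypothetical helper, and treating `maxDimensions` as a cap on the longer side is an assumption.

```typescript
// Hypothetical helper: compute target dimensions so the longer side does not
// exceed maxDimension, preserving aspect ratio. Never upscales.
function fitWithin(
  width: number,
  height: number,
  maxDimension: number,
): { width: number; height: number } {
  const longest = Math.max(width, height);
  if (longest <= maxDimension) return { width, height };
  const scale = maxDimension / longest;
  return {
    // Clamp to at least 1px so extreme aspect ratios stay valid.
    width: Math.max(1, Math.round(width * scale)),
    height: Math.max(1, Math.round(height * scale)),
  };
}
```

With sharp, a comparable result should be obtainable from `resize` with `fit: "inside"` and `withoutEnlargement: true`. Shrinking dimensions does not guarantee the output falls under the byte limit, so a real implementation may need to check the size again after resizing.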
Clipboard Support (TUI)
import clipboard from "clipboardy"

// On Ctrl+V, check for image in clipboard.
// getClipboardImage, addImageToContext, and showImagePreview are helpers not shown here.
async function handlePaste() {
  // Check for image data
  const imageData = await getClipboardImage()
  if (imageData) {
    addImageToContext(imageData)
    showImagePreview(imageData)
  }
}
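The proposal leaves `getClipboardImage` undefined. One possible approach, sketched here as an assumption rather than the intended design, is to shell out to a platform clipboard tool that can emit PNG bytes on stdout. The sketch only selects the command; `clipboardImageCommand` is a hypothetical name, and `pngpaste` (macOS), `xclip`, and `wl-paste` (Linux) are external tools that would have to be installed.

```typescript
// Hypothetical helper: pick an external command that writes the clipboard
// image to stdout as PNG. Returns null when no known tool applies.
type ClipboardCommand = { cmd: string; args: string[] };

function clipboardImageCommand(platform: string, wayland = false): ClipboardCommand | null {
  switch (platform) {
    case "darwin":
      // pngpaste is a third-party tool; "-" sends the image to stdout.
      return { cmd: "pngpaste", args: ["-"] };
    case "linux":
      return wayland
        ? { cmd: "wl-paste", args: ["--type", "image/png"] }
        : { cmd: "xclip", args: ["-selection", "clipboard", "-t", "image/png", "-o"] };
    default:
      // Other platforms (including Windows) are not covered by this sketch.
      return null;
  }
}
```

A caller could pass `process.platform` and run the returned command with `child_process.execFile` using `encoding: "buffer"`, treating a non-zero exit or empty output as "no image in clipboard".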
Provider Support
| Provider | Vision Support |
|----------|----------------|
| Copilot (GPT-4o) | ✅ |
| Copilot (GPT-4) | ✅ |
| Copilot (Claude) | ✅ |
| Ollama (llava) | ✅ |
| Ollama (bakllava) | ✅ |
| Ollama (others) | ❌ |
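The table can be encoded as a capability check that runs before an image is attached, so that unsupported models produce a clear error instead of a failed request. In this sketch the provider keys and model-name patterns are assumptions; the real identifiers used by each provider may differ.

```typescript
// Hypothetical capability table mirroring the provider list above.
// Model-name prefixes are illustrative, not confirmed identifiers.
const VISION_MODELS: Record<string, RegExp[]> = {
  copilot: [/^gpt-4/i, /^claude/i],
  ollama: [/^llava/i, /^bakllava/i],
};

function supportsVision(provider: string, model: string): boolean {
  // Unknown providers default to "no vision support".
  const patterns = VISION_MODELS[provider.toLowerCase()] ?? [];
  return patterns.some((pattern) => pattern.test(model));
}
```

Prefix matching keeps tagged Ollama names such as `llava:13b` working, at the cost of also matching any future model whose name happens to start with the same prefix.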
TUI Display
┌─────────────────────────────────────┐
│ You:                                │
│ What's wrong with this UI?          │
│                                     │
│ ┌─────────────┐                     │
│ │ 📷 image.png │                     │
│ │ (256x128)   │                     │
│ └─────────────┘                     │
└─────────────────────────────────────┘
Configuration
{
  "images": {
    "enabled": true,
    "maxSize": 20971520,
    "autoResize": true,
    "maxDimensions": 2048,
    "detail": "auto"
  }
}
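A sketch of how the `images` block might be typed and merged with defaults is given below. The field names and default values come from the configuration above; `ImagesConfig`, `DEFAULT_IMAGES_CONFIG`, `resolveImagesConfig`, and the validation rules are assumptions.

```typescript
// Hypothetical typed view of the "images" config block.
interface ImagesConfig {
  enabled: boolean;
  maxSize: number; // bytes
  autoResize: boolean;
  maxDimensions: number; // pixels
  detail: "auto" | "low" | "high";
}

const DEFAULT_IMAGES_CONFIG: ImagesConfig = {
  enabled: true,
  maxSize: 20971520, // 20 * 1024 * 1024
  autoResize: true,
  maxDimensions: 2048,
  detail: "auto",
};

// Merge user overrides onto the defaults and reject obviously invalid values.
function resolveImagesConfig(user: Partial<ImagesConfig> = {}): ImagesConfig {
  const merged: ImagesConfig = { ...DEFAULT_IMAGES_CONFIG, ...user };
  if (!Number.isInteger(merged.maxSize) || merged.maxSize <= 0) {
    throw new RangeError("images.maxSize must be a positive integer (bytes)");
  }
  if (!Number.isInteger(merged.maxDimensions) || merged.maxDimensions <= 0) {
    throw new RangeError("images.maxDimensions must be a positive integer (pixels)");
  }
  if (!["auto", "low", "high"].includes(merged.detail)) {
    throw new RangeError('images.detail must be "auto", "low", or "high"');
  }
  return merged;
}
```

For example, `resolveImagesConfig({ maxSize: 5 * 1024 * 1024 })` would lower the size limit to 5 MiB while keeping every other default.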
Acceptance Criteria
Effort Estimate
3 days
Dependencies
- sharp (image processing)
- clipboardy (clipboard access)