Skein is a self-hosted S3 gateway that sits in front of multiple storage providers. Your app talks to one endpoint, and Skein decides where the data actually goes. It handles routing, redundancy, usage tracking, and provider management transparently.
| Without Skein | With Skein |
|---|---|
| Migrating providers = code changes | Swap providers in config, zero code changes |
| No redundancy without paying AWS prices | Write to 2 or more providers simultaneously, automatic failover |
| Storage costs fixed to one vendor | Route large files to cheap bulk, hot files to fast tier |
| Manual usage tracking | Per-account dashboard with live storage and egress metrics |
| Every read of a popular object incurs egress | Cache frequently read objects to cut egress |
git clone https://github.com/techgonia-devjio/skein
cd skein
cp example-config.yaml config/config.yaml

Edit config/config.yaml: set your gateway credentials and add at least one provider account. Then:

docker compose up -d

Point any S3 client at http://localhost:9000:
aws s3 mb s3://my-bucket \
--endpoint-url http://localhost:9000 \
--no-verify-ssl
aws s3 cp ./file.zip s3://my-bucket/file.zip \
--endpoint-url http://localhost:9000

The management dashboard is at http://localhost:9001. It shows live storage usage, egress, and account health, and it updates over SSE so there's no polling.
- S3-compatible API: works with anything that speaks S3 :P
- Multi-provider routing: six routing strategies to match different architectures
- Multi-redundancy: write every object to N providers simultaneously; reads fail over automatically
- Live dashboard: real-time usage, egress, and account health via Server-Sent Events
- Prometheus metrics: `skein_http_requests_total`, `skein_request_duration_seconds`, per-account byte counters
- Structured logging: every request logged with `slog` and an `X-Request-Id` trace header
- Database-agnostic: single-file SQLite for zero-ops single-server deployments; switch to Postgres or MySQL for multi-instance HA
- Disk/Redis cache: optional layered cache to reduce provider API calls and egress
- Diagnostics CLI: `skein diagnose` and `skein account test` to verify connectivity before going live
Skein has six routing strategies. You set one in the config and it applies to all writes.
| Strategy | What it does |
|---|---|
| `default` | Assigns each bucket to the account with the most free space. Simple and predictable. |
| `round-robin` | Spreads objects across all accounts in rotation. |
| `multi-redundancy` | Writes every object to N accounts simultaneously. Reads try each copy on failure. |
| `object-type` | Routes by Content-Type: images to one account, documents to another, etc. |
| `file-size` | Small files to fast/expensive storage, large files to cheap bulk. |
| `space-distribution` | Probabilistic routing weighted by available capacity. |
For most setups default or multi-redundancy is the right choice. The others are useful when you have a specific cost or compliance reason to separate data by type or size.
Example config for redundancy across two providers:
routing:
  strategy: "multi-redundancy"
  redundancy:
    copies: 2

If one provider goes down, reads fail over to the surviving copy. The client never sees the failure.
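The write fan-out and read failover described here can be sketched in Go as follows. This is an illustration only, not Skein's implementation: the `Store` interface, `memStore`, `writeAll`, and `readWithFailover` are hypothetical names, and treating any single failed write as an error is an assumption about policy.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// Store is a hypothetical provider abstraction (not Skein's real interface).
type Store interface {
	Name() string
	Put(key string, data []byte) error
	Get(key string) ([]byte, error)
}

// writeAll writes the object to every store concurrently.
// Assumption: the write is reported as failed if any copy fails.
func writeAll(stores []Store, key string, data []byte) error {
	errs := make([]error, len(stores))
	var wg sync.WaitGroup
	for i, s := range stores {
		wg.Add(1)
		go func(i int, s Store) {
			defer wg.Done()
			if err := s.Put(key, data); err != nil {
				errs[i] = fmt.Errorf("%s: %w", s.Name(), err)
			}
		}(i, s)
	}
	wg.Wait()
	return errors.Join(errs...) // nil when every write succeeded
}

// readWithFailover returns the first copy that can be read.
func readWithFailover(stores []Store, key string) ([]byte, error) {
	if len(stores) == 0 {
		return nil, errors.New("no stores configured")
	}
	var errs []error
	for _, s := range stores {
		data, err := s.Get(key)
		if err == nil {
			return data, nil
		}
		errs = append(errs, fmt.Errorf("%s: %w", s.Name(), err))
	}
	return nil, errors.Join(errs...)
}

// memStore is an in-memory stand-in for a real provider.
type memStore struct {
	name    string
	down    bool
	objects map[string][]byte
}

func (m *memStore) Name() string { return m.name }

func (m *memStore) Put(key string, data []byte) error {
	if m.down {
		return errors.New("provider unavailable")
	}
	m.objects[key] = data
	return nil
}

func (m *memStore) Get(key string) ([]byte, error) {
	if m.down {
		return nil, errors.New("provider unavailable")
	}
	data, ok := m.objects[key]
	if !ok {
		return nil, errors.New("object not found")
	}
	return data, nil
}

func main() {
	a := &memStore{name: "provider-a", objects: map[string][]byte{}}
	b := &memStore{name: "provider-b", objects: map[string][]byte{}}
	stores := []Store{a, b}

	if err := writeAll(stores, "file.zip", []byte("payload")); err != nil {
		fmt.Println("write failed:", err)
		return
	}
	a.down = true // simulate an outage on the first provider
	data, err := readWithFailover(stores, "file.zip")
	fmt.Println(string(data), err)
}
```

In this sketch the read still succeeds after `provider-a` goes down because the second copy is tried next; an error only surfaces when every copy is unreachable.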
Skein uses SQLite by default. For a single server it works fine and requires nothing else. If you need multiple Skein instances sharing state, switch to Postgres or MySQL:
database:
  driver: postgres
  dsn: "postgres://skein:pass@localhost:5432/skein?sslmode=disable"

The schema is small: four tables tracking buckets, usage, and object placement.
There's an optional local cache that sits in front of provider reads. On a cache hit the request never leaves your server, which cuts egress and latency:
cache:
  disk:
    enabled: true
    max_gb: 20

Redis is also supported as a faster L1 layer in front of disk. Both can be enabled at once; Skein will check Redis first, then disk, then the provider.
skein server start start the gateway
skein account list list configured accounts and their usage
skein account test [id] test connectivity and latency to provider(s)
skein bucket list list virtual buckets
skein bucket create <name> create a bucket
skein bucket delete <name> delete a bucket
skein diagnose run a health check against config, DB, and all accounts
skein status show running server status
skein config-init write a starter config file
skein diagnose is useful before going live — it checks that each configured account is reachable and writable, reports current usage, and flags anything that looks wrong.
Every request goes through a logging middleware that writes structured JSON logs (log/slog) and records Prometheus metrics. Metrics are exposed at http://localhost:9001/metrics. The main ones are skein_http_requests_total, skein_http_request_duration_seconds, and per-account upload/download byte counters.
Each response also gets an X-Request-Id header, which matches the corresponding log line.
Multipart upload is not implemented yet. AWS SDK clients and the AWS CLI switch to multipart automatically above a size threshold (8 MB by default in the AWS CLI), so large uploads through those clients will fail or time out once they cross the client's threshold. A direct single-request PUT works for any size. Multipart support is next on the to-do list.
# unit + integration + acceptance + e2e
go test ./...
# or inside the dev container
docker exec skein-dev go test ./... -count=1

The smoke tests run against a real provider and need credentials:
cp test/smoke/.env.smoke.example test/smoke/.env.smoke
# fill in your credentials
source test/smoke/.env.smoke
go test ./test/smoke/... -v -timeout 120s

See example-config.yaml for a fully commented config file covering all options. DOCS.md has the full REST API reference and provider setup notes.
MIT