3 changes: 3 additions & 0 deletions CHANGELOG.md
@@ -9,6 +9,9 @@
- **Gateway OTLP metrics in the per-pod sidecar**: when `spec.monitoring.enabled=true`, the OTel Collector sidecar now exposes an OTLP/gRPC receiver on `127.0.0.1:4317` and the documentdb-gateway is configured (via `OTEL_EXPORTER_OTLP_ENDPOINT` and `OTEL_METRICS_ENABLED`) to push its `db_client_*` metrics there. The sidecar's existing prometheus exporter re-exports them alongside the existing `documentdb.postgres.up` sqlquery output, with per-pod attribution added by the collector's resource processor. No new CRD fields; this turns on automatically wherever monitoring was already enabled.
- **Two-Phase Extension Upgrade**: New `spec.schemaVersion` field separates binary upgrades (`spec.documentDBVersion`) from irreversible schema migrations (`ALTER EXTENSION UPDATE`). The default behavior gives you a rollback-safe window — update the binary first, validate, then finalize the schema. Set `schemaVersion: "auto"` for single-step upgrades in development environments. See the [upgrade guide](docs/operator-public-documentation/preview/operations/upgrades.md) for details.

### Behavioral Changes
- **Sidecar memory & CPU isolation**: `spec.resource.memory` and `spec.resource.cpu` are now treated as the total pod resource envelope. The operator reserves resources for the documentdb-gateway sidecar (memory default 18.75% of the envelope, capped at 32Gi) and, when `spec.monitoring.enabled` is true, the OTel collector sidecar (default memory limit 128Mi, CPU request 50m / limit 200m), then gives PostgreSQL the remainder and recomputes its memory-aware parameters (`shared_buffers`, etc.) accordingly. This isolates a gateway or collector memory leak so that the leaking container is OOM-killed on its own instead of crowding out PostgreSQL. The split is configurable per component via `spec.resource.{gateway,database,otel}` and fleet-wide via operator Helm values. The envelope is **optional**: you may omit `spec.resource.memory`/`cpu` for a dimension when that dimension is set explicitly on both the gateway and the database (the effective envelope is then the sum); a partially specified dimension without an envelope is rejected by the validating webhook. Existing clusters adopt the new split, with a one-time rolling restart, on their next reconcile.

### Breaking Changes
- **CRD restructure into domain-grouped stanzas**: image, postgres and plugin fields have moved into dedicated groups. Migrate as follows: `spec.documentDBImage` → `spec.image.documentDB`, `spec.gatewayImage` → `spec.image.gateway`, `spec.postgresImage` → `spec.image.postgres`, `spec.sidecarInjectorPluginName` → `spec.plugins.sidecarInjectorName`. A new `spec.postgres` group exposes `uid`, `gid` and `postInitSQL` (the operator's mandatory bootstrap statements always run first; user statements are appended after). A new root-level `spec.imagePullSecrets` is propagated to the underlying CNPG cluster.
- **Validating webhook added**: A new `ValidatingWebhookConfiguration` enforces that `spec.schemaVersion` never exceeds the binary version and blocks `spec.documentDBVersion` rollbacks below the committed schema version. This requires [cert-manager](https://cert-manager.io/) to be installed in the cluster (it is already a prerequisite for the sidecar injector). Existing clusters upgrading to this release will have the webhook activated automatically via `helm upgrade`.
95 changes: 85 additions & 10 deletions docs/operator-public-documentation/postgresql-tuning.md
@@ -15,7 +15,7 @@ The operator manages PostgreSQL parameters through a layered merge system with c

## Resource Configuration

Configure CPU and memory for your DocumentDB pods using the `spec.resource` section:
Configure CPU and memory for your DocumentDB pods using the `spec.resource` section. The `memory` and `cpu` values are total pod envelopes, not only PostgreSQL resources:

```yaml
apiVersion: documentdb.io/preview
@@ -26,25 +26,100 @@ spec:
  resource:
    storage:
      pvcSize: "50Gi"
    memory: "8Gi"  # Pod memory limit (Guaranteed QoS)
    cpu: "4"       # Pod CPU limit (Guaranteed QoS)
    memory: "8Gi"  # Total pod memory envelope
    cpu: "4"       # Total pod CPU envelope (carved like memory)
```

When `memory` is set, the operator uses **Guaranteed QoS** (requests = limits), as recommended by CloudNative-PG for database workloads. This ensures predictable performance and stable memory for PostgreSQL buffer management.
When top-level `memory` or `cpu` is set, the operator allocates that envelope across the PostgreSQL container, the documentdb-gateway sidecar, and, when monitoring is enabled, the OTel Collector sidecar. Each component gets its own container resource settings, so each sidecar reserves its own memory and CPU, and a leaking sidecar is OOM-killed in its own container instead of crowding out PostgreSQL.

If `memory` is not specified (or set to `"0"`), no resource limits are applied and static fallback values are used for memory-sensitive parameters.
If neither an envelope nor any per-container value is specified for a dimension, no limits are applied for that dimension. When memory is unmanaged, static fallback values are used for the memory-sensitive parameters. See [The envelope is optional](#the-envelope-is-optional) below for omitting the envelope while still sizing containers.

!!! note
Changing `memory` (or `cpu`) triggers a rolling restart of the DocumentDB pods,
causing brief downtime. The pod is recreated with the new resource limits, and the
memory-aware PostgreSQL parameters (`shared_buffers`, `effective_cache_size`,
`work_mem`, `maintenance_work_mem`) are recomputed and applied at the same time.

## Sidecar Memory Isolation

The operator treats `spec.resource.memory` and `spec.resource.cpu` as total pod envelopes and carves out sidecar reservations before computing PostgreSQL settings:

- **documentdb-gateway**: by default, reserves 18.75% of the total memory envelope, capped at 32Gi; its configured CPU reservation is carved from the CPU envelope.
- **OTel Collector**: when `spec.monitoring.enabled` is true, defaults to a 48Mi memory request, a 128Mi memory limit, a 50m CPU request, and a 200m CPU limit (Burstable — the requests are reserved and the limits cap a telemetry burst).
- **PostgreSQL**: receives the remaining memory and CPU, and memory-aware parameters such as `shared_buffers` are recomputed from that database allocation.
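
The default memory split described in the list above can be sketched in a few lines of Go. This is an illustrative sketch only, not the operator's code: the names `gatewayReservation` and `splitMemory` are hypothetical, and it assumes that the OTel collector's memory request (48Mi by default) is the amount subtracted from the envelope.

```go
package main

import (
	"errors"
	"fmt"
)

const (
	mi int64 = 1 << 20
	gi int64 = 1 << 30
)

// gatewayReservation returns the default gateway share of the memory
// envelope: 18.75% (3/16) of the envelope, capped at 32Gi.
// Hypothetical helper, not an operator API.
func gatewayReservation(envelope int64) int64 {
	reserved := envelope * 3 / 16
	if reserved > 32*gi {
		return 32 * gi
	}
	return reserved
}

// splitMemory carves the sidecar reservations out of the envelope and
// returns what is left for PostgreSQL. otelReserved is the amount set
// aside for the collector (assumed here to be its memory request).
func splitMemory(envelope, otelReserved int64) (gateway, database int64, err error) {
	gateway = gatewayReservation(envelope)
	database = envelope - gateway - otelReserved
	if database <= 0 {
		return 0, 0, errors.New("sidecar reservations exhaust the memory envelope")
	}
	return gateway, database, nil
}

func main() {
	gateway, database, err := splitMemory(8*gi, 48*mi)
	if err != nil {
		fmt.Println("rejected:", err)
		return
	}
	// shared_buffers is 25% of the database allocation.
	fmt.Printf("gateway=%dMi database=%dMi shared_buffers=%dMi\n",
		gateway/mi, database/mi, database/4/mi)
	// prints "gateway=1536Mi database=6608Mi shared_buffers=1652Mi"
}
```

With an 8Gi envelope and monitoring enabled, PostgreSQL is sized from roughly 6.45Gi under these assumptions, not from the full 8Gi, which is why the memory-aware parameters come out smaller than a pod-level calculation would suggest.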

Override individual containers with `spec.resource.gateway`, `spec.resource.database`, and `spec.resource.otel` when a cluster needs explicit sizing:

```yaml
apiVersion: documentdb.io/preview
kind: DocumentDB
metadata:
  name: sized-cluster
spec:
  monitoring:
    enabled: true
  resource:
    memory: "8Gi"  # Total pod memory envelope
    cpu: "4"       # Total pod CPU envelope
    gateway:
      memory: "1Gi"
      cpu: "500m"
    database:
      memory: "6Gi"
      cpu: "3"
    otel:
      memory: "128Mi"
      cpu: "50m"
```

Each per-component value is a Kubernetes quantity string and, when set, overrides the automatic split for that container.

### The envelope is optional

`spec.resource.memory` and `spec.resource.cpu` (the pod envelope) are optional. For each dimension independently:

- **Set the envelope** and let the operator divide it (gateway and OTel reserved, PostgreSQL gets the remainder).
- **Omit the envelope** and instead set that dimension on **both** `spec.resource.gateway` and `spec.resource.database` — the effective envelope is the sum of the containers (the OTel collector uses its default if you do not set it). For example:

```yaml
spec:
  resource:
    storage:
      pvcSize: "50Gi"
    # no top-level memory/cpu: derived from the containers below
    gateway:
      memory: "1Gi"
      cpu: "500m"
    database:
      memory: "6Gi"
      cpu: "3"
```

- **Omit the envelope and all container values** for a dimension to leave it unmanaged (no limits).

If you omit the envelope but only partially specify the containers (for example, you set `gateway.memory` but not `database.memory`), the resource is **rejected** by the validating webhook, because the sidecar reservation and PostgreSQL remainder for that dimension cannot be derived without the envelope. Likewise, an explicit envelope that the sidecar reservations exhaust — or that explicit per-container values exceed — is rejected.
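
The per-dimension rules above amount to a small decision table. The Go sketch below is one way to express it, under stated assumptions: the `dimension` type and `effectiveEnvelope` function are hypothetical names, values are plain byte counts with `0` meaning "not set", and the OTel collector's share is left out for brevity. It is not the webhook's actual implementation.

```go
package main

import (
	"errors"
	"fmt"
)

const gib int64 = 1 << 30

// dimension holds one resource dimension (memory or CPU) in base units.
// A zero value means the field was not set. Hypothetical type.
type dimension struct {
	Envelope int64 // spec.resource.memory or spec.resource.cpu
	Gateway  int64 // spec.resource.gateway.<dimension>
	Database int64 // spec.resource.database.<dimension>
}

var (
	errPartial = errors.New("envelope omitted but only one of gateway/database is set")
	errExceeds = errors.New("explicit container values exceed the envelope")
)

// effectiveEnvelope returns the envelope used for one dimension,
// or 0 when the dimension is left unmanaged.
func effectiveEnvelope(d dimension) (int64, error) {
	switch {
	case d.Envelope > 0:
		// Explicit envelope: per-container overrides must fit inside it.
		if d.Gateway+d.Database > d.Envelope {
			return 0, errExceeds
		}
		return d.Envelope, nil
	case d.Gateway > 0 && d.Database > 0:
		// No envelope, both containers sized: the envelope is their sum.
		return d.Gateway + d.Database, nil
	case d.Gateway == 0 && d.Database == 0:
		// Nothing set: the dimension is unmanaged.
		return 0, nil
	default:
		// Only one container sized and no envelope to derive the other from.
		return 0, errPartial
	}
}

func main() {
	env, err := effectiveEnvelope(dimension{Gateway: 1 * gib, Database: 6 * gib})
	fmt.Println(env/gib, err) // prints "7 <nil>"

	_, err = effectiveEnvelope(dimension{Gateway: 1 * gib})
	fmt.Println(err) // prints the errPartial message
}
```

Evaluating each dimension independently, as the sketch does, mirrors the text: memory can be envelope-driven while CPU is derived from the containers, or the other way around.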

Cluster-wide defaults are configured with the operator Helm chart:

```yaml
operator:
  sidecarResources:
    gatewayMemoryFraction: "0.1875"
    gatewayMemoryCap: "32Gi"
    gatewayCpuLimit: ""          # optional; bounds gateway async worker threads
    otelMemoryRequest: "48Mi"
    otelMemoryLimit: "128Mi"
    otelCpuRequest: "50m"
    otelCpuLimit: "200m"         # ceiling on the collector's CPU burst
```

Use per-cluster `spec.resource` overrides for individual workload needs; use Helm values to change fleet-wide defaults for clusters managed by the operator.

## Memory-Aware Defaults

When a memory limit is configured, these parameters are automatically computed:
When PostgreSQL has an effective database memory allocation, these parameters are automatically computed from that allocation:

| Parameter | Formula | Example (8Gi) |
| Parameter | Formula | Example (8Gi database allocation) |
|-----------|---------|---------------|
| `shared_buffers` | 25% of memory | 2GB |
| `effective_cache_size` | 75% of memory | 6GB |
@@ -53,7 +128,7 @@ When a memory limit is configured, these parameters are automatically computed:

### Sizing Reference

| Pod Memory | shared_buffers | effective_cache_size | work_mem | maintenance_work_mem |
| Database Memory | shared_buffers | effective_cache_size | work_mem | maintenance_work_mem |
|-----------|----------------|---------------------|----------|---------------------|
| (not set) | 256MB | 512MB | 16MB | 128MB |
| 2Gi | 512MB | 1536MB | 4MB | 204MB |
@@ -138,8 +213,8 @@ spec:

This configuration will produce the following effective parameters (among others):

- `shared_buffers`: 4GB (auto-computed from 16Gi)
- `effective_cache_size`: 12GB (auto-computed)
- `shared_buffers`: auto-computed from the PostgreSQL memory remaining after sidecar reservations
- `effective_cache_size`: auto-computed from the same effective database allocation
- `max_connections`: 500 (user override)
- `wal_level`: logical (protected, from ChangeStreams gate)
- `cron.database_name`: postgres (protected)
@@ -12,16 +12,25 @@ import (
	"github.com/cloudnative-pg/cnpg-i-machinery/pkg/pluginhelper/validation"
	"github.com/cloudnative-pg/cnpg-i/pkg/operator"
	corev1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/api/resource"
)

const (
	labelsParameter                     = "labels"
	annotationParameter                 = "annotations"
	gatewayImageParameter               = "gatewayImage"
	gatewayImagePullPolicyParameter     = "gatewayImagePullPolicy"
	gatewayMemoryRequestParameter       = "gatewayMemoryRequest"
	gatewayMemoryLimitParameter         = "gatewayMemoryLimit"
	gatewayCPURequestParameter          = "gatewayCpuRequest"
	gatewayCPULimitParameter            = "gatewayCpuLimit"
	documentDbCredentialSecretParameter = "documentDbCredentialSecret"
	otelCollectorImageParameter         = "otelCollectorImage"
	otelConfigMapNameParameter          = "otelConfigMapName"
	otelMemoryRequestParameter          = "otelMemoryRequest"
	otelMemoryLimitParameter            = "otelMemoryLimit"
	otelCPURequestParameter             = "otelCpuRequest"
	otelCPULimitParameter               = "otelCpuLimit"
	prometheusPortParameter             = "prometheusPort"
)

@@ -31,9 +40,17 @@ type Configuration struct {
	Annotations                map[string]string
	GatewayImage               string
	GatewayImagePullPolicy     corev1.PullPolicy
	GatewayMemoryRequest       string
	GatewayMemoryLimit         string
	GatewayCPURequest          string
	GatewayCPULimit            string
	DocumentDbCredentialSecret string
	OtelCollectorImage         string
	OtelConfigMapName          string
	OTelMemoryRequest          string
	OTelMemoryLimit            string
	OTelCPURequest             string
	OTelCPULimit               string
	PrometheusPort             int32
}

@@ -67,6 +84,16 @@ func FromParameters(
	gatewayImage := helper.Parameters[gatewayImageParameter]
	credentialSecret := helper.Parameters[documentDbCredentialSecretParameter]
	pullPolicy := parsePullPolicy(helper.Parameters[gatewayImagePullPolicyParameter])
	validateQuantityParameters(helper, &validationErrors,
		gatewayMemoryRequestParameter,
		gatewayMemoryLimitParameter,
		gatewayCPURequestParameter,
		gatewayCPULimitParameter,
		otelMemoryRequestParameter,
		otelMemoryLimitParameter,
		otelCPURequestParameter,
		otelCPULimitParameter,
	)

	var prometheusPort int32
	if portStr := helper.Parameters[prometheusPortParameter]; portStr != "" {
@@ -86,9 +113,17 @@ func FromParameters(
		Annotations:                annotations,
		GatewayImage:               gatewayImage,
		GatewayImagePullPolicy:     pullPolicy,
		GatewayMemoryRequest:       helper.Parameters[gatewayMemoryRequestParameter],
		GatewayMemoryLimit:         helper.Parameters[gatewayMemoryLimitParameter],
		GatewayCPURequest:          helper.Parameters[gatewayCPURequestParameter],
		GatewayCPULimit:            helper.Parameters[gatewayCPULimitParameter],
		DocumentDbCredentialSecret: credentialSecret,
		OtelCollectorImage:         helper.Parameters[otelCollectorImageParameter],
		OtelConfigMapName:          helper.Parameters[otelConfigMapNameParameter],
		OTelMemoryRequest:          helper.Parameters[otelMemoryRequestParameter],
		OTelMemoryLimit:            helper.Parameters[otelMemoryLimitParameter],
		OTelCPURequest:             helper.Parameters[otelCPURequestParameter],
		OTelCPULimit:               helper.Parameters[otelCPULimitParameter],
		PrometheusPort:             prometheusPort,
	}

@@ -115,6 +150,25 @@ func ValidateChanges(
	return validationErrors
}

func validateQuantityParameters(
	helper *common.Plugin,
	validationErrors *[]*operator.ValidationError,
	parameters ...string,
) {
	for _, parameter := range parameters {
		value := helper.Parameters[parameter]
		if value == "" {
			continue
		}
		if _, err := resource.ParseQuantity(value); err != nil {
			*validationErrors = append(
				*validationErrors,
				validation.BuildErrorForParameter(helper, parameter, "invalid resource quantity: "+err.Error()),
			)
		}
	}
}

// applyDefaults fills the configuration with the defaults
func (config *Configuration) applyDefaults() {
	if len(config.Labels) == 0 {
@@ -166,7 +220,21 @@ func (config *Configuration) ToParameters() (map[string]string, error) {
	result[annotationParameter] = string(serializedAnnotations)
	result[gatewayImageParameter] = config.GatewayImage
	result[gatewayImagePullPolicyParameter] = string(config.GatewayImagePullPolicy)
	// Omit empty optional resource params to avoid noisy defaulting diffs.
	setIfNotEmpty := func(key, val string) {
		if val != "" {
			result[key] = val
		}
	}
	setIfNotEmpty(gatewayMemoryRequestParameter, config.GatewayMemoryRequest)
	setIfNotEmpty(gatewayMemoryLimitParameter, config.GatewayMemoryLimit)
	setIfNotEmpty(gatewayCPURequestParameter, config.GatewayCPURequest)
	setIfNotEmpty(gatewayCPULimitParameter, config.GatewayCPULimit)
	result[documentDbCredentialSecretParameter] = config.DocumentDbCredentialSecret
	setIfNotEmpty(otelMemoryRequestParameter, config.OTelMemoryRequest)
	setIfNotEmpty(otelMemoryLimitParameter, config.OTelMemoryLimit)
	setIfNotEmpty(otelCPURequestParameter, config.OTelCPURequest)
	setIfNotEmpty(otelCPULimitParameter, config.OTelCPULimit)

	return result, nil
}
@@ -83,12 +83,56 @@ func TestFromParameters(t *testing.T) {
			t.Errorf("GatewayImagePullPolicy = %q, want IfNotPresent", config.GatewayImagePullPolicy)
		}
	})

	t.Run("resource parameters from parameters", func(t *testing.T) {
		helper := &common.Plugin{Parameters: map[string]string{
			"gatewayMemoryRequest": "768Mi",
			"gatewayMemoryLimit":   "3Gi",
			"gatewayCpuRequest":    "500m",
			"gatewayCpuLimit":      "2",
			"otelMemoryRequest":    "64Mi",
			"otelMemoryLimit":      "128Mi",
			"otelCpuRequest":       "100m",
		}}
		config, errs := FromParameters(helper)
		if len(errs) != 0 {
			t.Fatalf("unexpected validation errors: %v", errs)
		}
		if config.GatewayMemoryRequest != "768Mi" {
			t.Errorf("GatewayMemoryRequest = %q, want 768Mi", config.GatewayMemoryRequest)
		}
		if config.GatewayMemoryLimit != "3Gi" {
			t.Errorf("GatewayMemoryLimit = %q, want 3Gi", config.GatewayMemoryLimit)
		}
		if config.GatewayCPURequest != "500m" {
			t.Errorf("GatewayCPURequest = %q, want 500m", config.GatewayCPURequest)
		}
		if config.GatewayCPULimit != "2" {
			t.Errorf("GatewayCPULimit = %q, want 2", config.GatewayCPULimit)
		}
		if config.OTelMemoryRequest != "64Mi" {
			t.Errorf("OTelMemoryRequest = %q, want 64Mi", config.OTelMemoryRequest)
		}
		if config.OTelMemoryLimit != "128Mi" {
			t.Errorf("OTelMemoryLimit = %q, want 128Mi", config.OTelMemoryLimit)
		}
		if config.OTelCPURequest != "100m" {
			t.Errorf("OTelCPURequest = %q, want 100m", config.OTelCPURequest)
		}
	})
}

func TestToParametersRoundTrip(t *testing.T) {
	original := &Configuration{
		GatewayImage:           "my-image:latest",
		GatewayImagePullPolicy: corev1.PullNever,
		GatewayMemoryRequest:   "768Mi",
		GatewayMemoryLimit:     "3Gi",
		GatewayCPURequest:      "500m",
		GatewayCPULimit:        "2",
		OTelMemoryRequest:      "64Mi",
		OTelMemoryLimit:        "128Mi",
		OTelCPURequest:         "100m",
	}
	original.applyDefaults()

@@ -108,4 +152,25 @@ func TestToParametersRoundTrip(t *testing.T) {
	if restored.GatewayImage != original.GatewayImage {
		t.Errorf("round-trip gateway image = %q, want %q", restored.GatewayImage, original.GatewayImage)
	}
	if restored.GatewayMemoryRequest != original.GatewayMemoryRequest {
		t.Errorf("round-trip gateway memory request = %q, want %q", restored.GatewayMemoryRequest, original.GatewayMemoryRequest)
	}
	if restored.GatewayMemoryLimit != original.GatewayMemoryLimit {
		t.Errorf("round-trip gateway memory limit = %q, want %q", restored.GatewayMemoryLimit, original.GatewayMemoryLimit)
	}
	if restored.GatewayCPURequest != original.GatewayCPURequest {
		t.Errorf("round-trip gateway cpu request = %q, want %q", restored.GatewayCPURequest, original.GatewayCPURequest)
	}
	if restored.GatewayCPULimit != original.GatewayCPULimit {
		t.Errorf("round-trip gateway cpu limit = %q, want %q", restored.GatewayCPULimit, original.GatewayCPULimit)
	}
	if restored.OTelMemoryRequest != original.OTelMemoryRequest {
		t.Errorf("round-trip otel memory request = %q, want %q", restored.OTelMemoryRequest, original.OTelMemoryRequest)
	}
	if restored.OTelMemoryLimit != original.OTelMemoryLimit {
		t.Errorf("round-trip otel memory limit = %q, want %q", restored.OTelMemoryLimit, original.OTelMemoryLimit)
	}
	if restored.OTelCPURequest != original.OTelCPURequest {
		t.Errorf("round-trip otel cpu request = %q, want %q", restored.OTelCPURequest, original.OTelCPURequest)
	}
}