Architecture

This page explains how the operator works internally so you can reason about its behavior when things go wrong and decide which CRD fits which job.

If you just want to use the operator, start with the Quick Start. Come back here when you need to understand why it behaves the way it does.

High-level picture

flowchart LR
    You[You apply a<br/>NextcloudInstance] --> Op[Nextcloud Operator<br/>builds Helm values,<br/>secrets, DB &amp; bucket]
    Op -->|HelmRelease| Flux[Flux CD<br/>installs the chart]
    Flux --> NC[Nextcloud<br/>pods · service ·<br/>ingress · storage]
    style Op fill:#6366f1,stroke:#4f46e5,color:#fff
    style Flux fill:#16a34a,stroke:#15803d,color:#fff

The operator does not install Nextcloud itself: it generates Helm values and hands them to Flux, which installs the upstream Nextcloud Helm chart. Because the chart is the unmodified upstream one, any chart feature the operator does not model directly remains reachable through spec.helm.values as an escape hatch.

The handler model

The operator is built on kopf and uses three categories of handlers:

Handler type	Purpose	Example
`@kopf.on.create` / `@kopf.on.update` / `@kopf.on.delete`	CRUD lifecycle	Create secrets and HelmRelease when `NextcloudInstance` is created
`@kopf.on.field`	React to specific spec changes or annotations	`k8s.bnerd.com/reconcile` annotation triggers a full reconcile
`@kopf.on.timer`	Periodic reconciliation	Every 30s: check HelmRelease status and sync it to `NextcloudInstance.status`

All handlers live in operator/handlers/ and delegate actual work to utility modules in operator/utils/. This keeps handlers thin and makes the business logic testable.

NextcloudInstance create flow

When you kubectl apply a NextcloudInstance, the operator runs roughly this sequence:

sequenceDiagram
    participant U as User
    participant K as K8s API
    participant O as Operator
    participant P as Percona PG<br/>Operator
    participant F as Flux<br/>HelmController

    U->>K: kubectl apply NextcloudInstance
    K->>O: @kopf.on.create
    O->>O: Validate spec
    O->>K: status.phase = Pending
    O->>K: Create secrets (db, admin, redis, s3, mail)
    alt database.managed = true
        O->>P: Create PerconaPGCluster
        P-->>O: (wait, up to 20min)
        O->>K: Read PG secret, create NC db secret
    end
    alt s3 auto-create needed
        O->>O: boto3 create_bucket()
    end
    O->>O: Resolve spec.version → chart version
    O->>O: Build Helm values (profile → spec → overrides)
    O->>K: Create HelmRepository + HelmRelease
    O->>K: status.phase = Ready (waiting for HR)
    F->>K: Installs Nextcloud chart → Pods, Svc, Ingress
    Note over O: Timer (every 30s) syncs<br/>HelmRelease readiness → NCI status

Key details:

Secrets are created first — the HelmRelease references them by name, so they must exist before the chart install.
Managed databases use a nested controller — the operator waits for PerconaPGCluster to go ready (max 20 min), then extracts the connection info into a Nextcloud-format database secret. If the PG operator isn't installed, the handler raises a PermanentError with a clear message.
Version resolution is explicit — status.versionResolution records which chart version was chosen and why. See Version Management.
4-layer value cascade — Final values = built-in profile → custom NextcloudProfile CRD → instance spec → spec.helm.values. Last wins. See Configuration Profiles.
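
The value cascade in the last point can be pictured as a recursive dictionary merge in which later layers override earlier ones. The sketch below is only an illustration, not the operator's actual code: `deep_merge`, `build_values`, and the sample keys are hypothetical, and it assumes nested mappings are merged key by key while scalars and lists are replaced wholesale.

```python
from copy import deepcopy
from functools import reduce
from typing import Any, Mapping


def deep_merge(base: Mapping[str, Any], override: Mapping[str, Any]) -> dict:
    """Return a new dict where `override` wins over `base`, merging nested dicts."""
    merged = deepcopy(dict(base))
    for key, value in override.items():
        if isinstance(value, Mapping) and isinstance(merged.get(key), Mapping):
            # Both sides are mappings: merge them instead of replacing.
            merged[key] = deep_merge(merged[key], value)
        else:
            # Scalars, lists, and type mismatches: the later layer replaces.
            merged[key] = deepcopy(value)
    return merged


def build_values(*layers: Mapping[str, Any]) -> dict:
    """Fold the layers left to right, so the last layer has the final say."""
    return reduce(deep_merge, layers, {})


# Hypothetical layers, ordered as in the cascade described above.
builtin_profile = {
    "replicaCount": 1,
    "resources": {"requests": {"cpu": "100m", "memory": "256Mi"}},
}
custom_profile = {"resources": {"requests": {"memory": "512Mi"}}}
instance_spec = {"replicaCount": 2}
helm_overrides = {"resources": {"requests": {"cpu": "250m"}}}

values = build_values(builtin_profile, custom_profile, instance_spec, helm_overrides)
```

With these inputs, `values` carries `replicaCount` from the instance spec, `memory` from the custom profile, and `cpu` from the final override layer, while the input dictionaries stay unmodified.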

Update flow

Updates are diff-based:

Compare old.spec vs. new.spec for database, redis, s3, mail, admin.
Only re-create the secrets whose source fields actually changed.
Rebuild Helm values and patch the HelmRelease.
Flux notices the HelmRelease changed and reconciles the chart.

This avoids unnecessary pod restarts when unrelated fields change.
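
The comparison step can be sketched as a per-section equality check. This is an assumption about the shape of the logic rather than the operator's real code: `SECRET_SECTIONS` and `changed_sections` are hypothetical names, and specs are treated as plain dictionaries.

```python
from typing import Any, Mapping

# Spec sections that feed a secret, as listed in the steps above.
SECRET_SECTIONS = ("database", "redis", "s3", "mail", "admin")


def changed_sections(old_spec: Mapping[str, Any], new_spec: Mapping[str, Any]) -> list[str]:
    """Return the secret-backed sections whose content differs between the specs."""
    return [
        section
        for section in SECRET_SECTIONS
        if old_spec.get(section) != new_spec.get(section)
    ]


# Only `mail` changed among the secret-backed sections; `replicas` is unrelated.
old = {"database": {"host": "pg"}, "mail": {"enabled": False}, "replicas": 1}
new = {"database": {"host": "pg"}, "mail": {"enabled": True}, "replicas": 3}

to_recreate = changed_sections(old, new)
```

Here only the mail secret would be re-created. The `replicas` change affects the Helm values but none of the secrets.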

Delete flow

@kopf.on.delete fires.
Delete HelmRelease and HelmRepository. Flux uninstalls the chart (pods, services, ingress).
Delete operator-managed secrets.
PVCs are retained by default — deliberate, so that accidental deletion doesn't lose data. Delete the namespace to remove everything.
The finalizer is released and Kubernetes removes the NextcloudInstance resource (the CRD itself stays installed).

Deletion errors are logged as warnings, not failures — the handler does not block deletion on cleanup issues. Use the k8s.bnerd.com/force-delete annotation to bypass remaining cleanup if needed.
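
The "warn, don't block" behavior described above can be illustrated with a small best-effort loop. Everything here is a hypothetical sketch: `best_effort_cleanup` and the two placeholder step functions stand in for whatever API calls the real handler makes.

```python
import logging
from typing import Callable, Iterable

logger = logging.getLogger("nextcloud-operator-sketch")


def best_effort_cleanup(steps: Iterable[tuple[str, Callable[[], None]]]) -> list[str]:
    """Run every cleanup step; log failures as warnings and keep going."""
    failed = []
    for name, step in steps:
        try:
            step()
        except Exception as exc:  # deliberately broad: cleanup must not block deletion
            logger.warning("cleanup step %r failed: %s", name, exc)
            failed.append(name)
    return failed


def delete_helm_release() -> None:
    """Placeholder for the real API call; succeeds silently."""


def delete_secrets() -> None:
    """Placeholder that simulates a failing API call."""
    raise RuntimeError("API server unavailable")


failed_steps = best_effort_cleanup([
    ("helmrelease", delete_helm_release),
    ("secrets", delete_secrets),
])
```

The function returns normally even though one step raised, which is what lets the finalizer complete. The list of failed step names could be surfaced in logs or events.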

Status state machine

stateDiagram-v2
    [*] --> Pending: created
    Pending --> Creating: validation passed
    Creating --> Ready: HelmRelease ready
    Creating --> Failed: PermanentError
    Ready --> Updating: spec changed
    Updating --> Ready: update complete
    Updating --> Failed: update failed
    Failed --> Creating: retry (TemporaryError) or fix
    Ready --> [*]: deleted

Status transitions are driven by:

Handler return values on create/update/delete
The 30-second timer that syncs downstream HelmRelease readiness into NextcloudInstance.status
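
As a reading aid, the diagram can be written down as a transition table, together with a check of the kind the timer might perform. This is a sketch under assumptions: `TRANSITIONS`, `can_transition`, and `helm_release_ready` are hypothetical names, and the readiness check assumes the HelmRelease exposes a standard `Ready` entry under `status.conditions`.

```python
# Allowed phase transitions, copied from the state diagram above.
TRANSITIONS = {
    "Pending": {"Creating"},
    "Creating": {"Ready", "Failed"},
    "Ready": {"Updating"},
    "Updating": {"Ready", "Failed"},
    "Failed": {"Creating"},
}


def can_transition(current: str, target: str) -> bool:
    """True if the diagram contains an edge from `current` to `target`."""
    return target in TRANSITIONS.get(current, set())


def helm_release_ready(helm_release: dict) -> bool:
    """True if the HelmRelease reports a Ready condition with status "True"."""
    conditions = helm_release.get("status", {}).get("conditions", [])
    return any(
        cond.get("type") == "Ready" and cond.get("status") == "True"
        for cond in conditions
    )


# Example HelmRelease fragment as it might be read from the API.
sample_hr = {"status": {"conditions": [{"type": "Ready", "status": "True"}]}}
ready_now = helm_release_ready(sample_hr)
```

A timer built this way would move an instance from Creating or Updating to Ready only once `helm_release_ready` returns true and `can_transition` allows the edge.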

Error model

Handlers signal failure with one of two kopf exceptions:

Error type	Meaning	Retries?
`kopf.TemporaryError(msg, delay=30)`	Transient — API down, managed DB not ready yet, network blip	Yes, after `delay`
`kopf.PermanentError(msg)`	Structural — invalid spec, missing required dep, unknown version	No — status goes `Failed`

When you see a log line with TemporaryError, the operator will retry automatically. When you see PermanentError, it will not self-heal — you must fix the underlying cause and either update the spec or annotate with k8s.bnerd.com/reconcile to force a fresh attempt.
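
The difference between the two error types can be simulated without a cluster. The classes below are local stand-ins for `kopf.TemporaryError` and `kopf.PermanentError` so the example runs without kopf installed. `run_with_retries` and the two sample handlers are hypothetical, and the attempt limit exists only to keep the simulation finite.

```python
class TemporaryError(Exception):
    """Stand-in for kopf.TemporaryError: the handler is retried after `delay` seconds."""

    def __init__(self, msg: str, delay: float = 30):
        super().__init__(msg)
        self.delay = delay


class PermanentError(Exception):
    """Stand-in for kopf.PermanentError: the handler is not retried."""


def run_with_retries(handler, max_attempts: int = 5) -> str:
    """Call `handler` until it succeeds, fails permanently, or attempts run out."""
    for attempt in range(1, max_attempts + 1):
        try:
            handler(attempt)
            return "succeeded"
        except TemporaryError as exc:
            # A real framework would wait exc.delay seconds before the next attempt.
            print(f"attempt {attempt}: temporary ({exc}), retry in {exc.delay}s")
        except PermanentError as exc:
            print(f"attempt {attempt}: permanent ({exc}), giving up")
            return "failed permanently"
    return "still retrying"


def wait_for_database(attempt: int) -> None:
    """Simulates a managed database that becomes ready on the third attempt."""
    if attempt < 3:
        raise TemporaryError("managed DB not ready yet", delay=30)


def validate_version(attempt: int) -> None:
    """Simulates a structural problem that no retry can fix."""
    raise PermanentError("unknown version")


db_result = run_with_retries(wait_for_database)
version_result = run_with_retries(validate_version)
```

The database handler recovers on its own after two retries, while the version handler stops at the first attempt and needs a spec change or the reconcile annotation before anything happens again.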

The 4-CRD cascade

flowchart LR
    Profile[NextcloudProfile<br/>cluster-scoped<br/>defaults]
    Pool[NextcloudPool<br/>cluster-scoped<br/>template.spec]
    NC[Nextcloud<br/>namespaced<br/>tenant spec]
    NCI[NextcloudInstance<br/>namespaced<br/>physical runtime]

    Profile -->|provides defaults| Pool
    Profile -->|provides defaults| NCI
    Pool -->|template for| NCI
    NC -->|assigned from pool<br/>or creates directly| NCI

    style NCI fill:#6366f1,stroke:#4f46e5,color:#fff

NextcloudInstance is always what actually runs. Every other CRD exists to produce or configure one.
NextcloudProfile provides reusable defaults (production, testing, development, or your own).
NextcloudPool pre-warms a set of unassigned NextcloudInstance resources for fast tenant onboarding.
Nextcloud is the tenant-facing façade. It either assigns an existing pool instance or creates a fresh one.

For the decision tree of which CRD to use when, see CRD Mental Model.
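
The "assign from pool or create directly" behavior of Nextcloud can be sketched as follows. This is a toy model, not the operator's logic: the `Instance` dataclass, `claim_instance`, and the name given to a freshly created instance are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Instance:
    """Toy model of a NextcloudInstance; `assigned_to` is None while unassigned."""
    name: str
    assigned_to: Optional[str] = None


def claim_instance(tenant: str, pool: list[Instance]) -> Instance:
    """Assign the first unassigned pool instance, or create a fresh one."""
    for instance in pool:
        if instance.assigned_to is None:
            instance.assigned_to = tenant
            return instance
    # Pool exhausted (or empty): fall back to direct creation.
    # The naming scheme here is hypothetical.
    return Instance(name=f"{tenant}-instance", assigned_to=tenant)


pool = [Instance("pool-0", assigned_to="acme"), Instance("pool-1")]
first = claim_instance("globex", pool)    # takes the pre-warmed pool-1
second = claim_instance("initech", pool)  # pool is exhausted, so a new one is created
```

The first tenant gets a pre-warmed instance immediately. The second has to wait for a full create flow, which is the trade-off the pool is meant to soften.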

Secret naming convention

Given a NextcloudInstance named my-nextcloud, the operator creates:

Secret	Condition
`my-nextcloud-nextcloud-db`	Always
`my-nextcloud-nextcloud-admin`	Always
`my-nextcloud-nextcloud-redis`	If `spec.redis.enabled`
`my-nextcloud-nextcloud-s3`	If `spec.s3.enabled`
`my-nextcloud-nextcloud-mail`	If `spec.mail.enabled`
`my-nextcloud-nextcloud-s3backup`	If `spec.backups.data.enabled`
`my-nextcloud-nextcloud-recording`	If `spec.spreed.recording.enabled`

Plus:

HelmRelease: my-nextcloud-nextcloud
HelmRepository: my-nextcloud-nextcloud-repo

When you reference external secrets via credentialsSecret, the operator reads them, copies the credentials into its own secret (with the naming above), and the HelmRelease references that. See Secret Management.
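
The naming table can be turned into a small lookup, which is handy when scripting checks against a cluster. The suffixes and conditions come from the table above, while `SECRET_RULES`, `flag`, and `resource_names` are hypothetical helpers and the spec is treated as a plain dictionary.

```python
from typing import Any, Mapping, Optional

# (secret suffix, dotted path of the enabling flag, or None for "always").
SECRET_RULES: list[tuple[str, Optional[str]]] = [
    ("db", None),
    ("admin", None),
    ("redis", "redis.enabled"),
    ("s3", "s3.enabled"),
    ("mail", "mail.enabled"),
    ("s3backup", "backups.data.enabled"),
    ("recording", "spreed.recording.enabled"),
]


def flag(spec: Mapping[str, Any], dotted_path: str) -> bool:
    """Look up a nested boolean such as `backups.data.enabled`; missing means False."""
    node: Any = spec
    for key in dotted_path.split("."):
        if not isinstance(node, Mapping):
            return False
        node = node.get(key)
    return bool(node)


def resource_names(instance_name: str, spec: Mapping[str, Any]) -> dict:
    """Return the names the operator would use for this instance, per the table."""
    base = f"{instance_name}-nextcloud"
    secrets = [
        f"{base}-{suffix}"
        for suffix, path in SECRET_RULES
        if path is None or flag(spec, path)
    ]
    return {"helm_release": base, "helm_repository": f"{base}-repo", "secrets": secrets}


names = resource_names(
    "my-nextcloud",
    {"redis": {"enabled": True}, "backups": {"data": {"enabled": True}}},
)
```

For this spec, `names["secrets"]` lists the db, admin, redis, and s3backup secrets, all prefixed with `my-nextcloud-nextcloud-`.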

Integration points

The operator talks to several external systems. Each is optional except where noted:

System	Use	Required?
Kubernetes API	CRDs, Secrets, Namespaces	Yes
Flux CD v2	`HelmRelease`, `HelmRepository`	Yes
Percona PG Operator	Managed PostgreSQL via `PerconaPGCluster`	Only for `database.managed: true`
S3 API (boto3)	Auto-create primary-storage bucket	Only for `s3.enabled` with auto-create
bnerd backup operator	Data backup via `S3Backup` CRD	Only for `backups.data.enabled`
HPB Signaling API	Backend registration for Talk HPB	Only when `SignalingServer` CRD is used
Recording Backend API	Backend registration for Talk recording	Only when `RecordingServer` CRD is used
Upstream Helm repo	Chart download (via Flux)	Yes