Application-aware backups: pre / post hooks #569

Open
majodev opened this issue Jan 23, 2024 · 0 comments

Labels
enhancement New feature or request

majodev commented Jan 23, 2024

Describe the feature you'd like to have.

Similar to what k8up ("application-aware") and velero ("hooks") provide, I propose adding a mechanism to run pre/post-snapshot commands via snapscheduler's operator. Let's simply call these commands hooks.

What is the value to the end user? (why is it a priority?)

For example, this would allow dumping a database straight onto the disk or running other WAL-flush / fs-freeze / fs-flush tooling before the CSI snapshot is actually triggered (as part of a pre hook). After a snapshot is completed, a post hook may run, e.g. to handle fs-unfreeze.

How will we know we have a good solution? (acceptance criteria)

General design

In my opinion, and as a starting point, this feature should be implemented by directly passing off commands to already running containers/pods (kubectl exec from the operator). The major advantage of this approach is immediate support for all PV access modes, especially ReadWriteOncePod and ReadWriteOnce (other approaches would have to mitigate these via proper affinity settings to guarantee same-host execution).

With this approach, there is no need to inject sidecar containers or run additional pods/containers (e.g. a proper k8s Job) for this feature. Pre and post hooks could be anything and only depend on the tools/shell available within the targeted running container where they are executed. In contrast to k8up/velero, these commands should not result in a stream / artifact that needs to be processed by the operator; only the exit code is inspected. Hooks could be anything, but will typically do something on a mounted disk.
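As an illustration only, here is a minimal sketch of how the operator could exec such a hook in an already running container via client-go (the function name runHook and the surrounding plumbing are assumptions for this proposal, not existing snapscheduler code):

// Minimal sketch: exec a hook command in a running container and only inspect the
// exit code (stdout/stderr are captured solely for error reporting, never treated
// as a backup artifact). All names here are illustrative.
package hooks

import (
	"bytes"
	"context"
	"fmt"

	corev1 "k8s.io/api/core/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/kubernetes/scheme"
	"k8s.io/client-go/rest"
	"k8s.io/client-go/tools/remotecommand"
)

func runHook(ctx context.Context, cfg *rest.Config, cs kubernetes.Interface,
	namespace, pod, container string, command []string) error {

	// Equivalent of `kubectl exec -n <namespace> <pod> -c <container> -- <command...>`.
	req := cs.CoreV1().RESTClient().Post().
		Resource("pods").
		Namespace(namespace).
		Name(pod).
		SubResource("exec").
		VersionedParams(&corev1.PodExecOptions{
			Container: container,
			Command:   command,
			Stdout:    true,
			Stderr:    true,
		}, scheme.ParameterCodec)

	exec, err := remotecommand.NewSPDYExecutor(cfg, "POST", req.URL())
	if err != nil {
		return err
	}

	var stdout, stderr bytes.Buffer
	// A non-zero exit code surfaces as an error here; the caller would fail the
	// snapshot (pre hook) or merely log it (post hook) accordingly.
	if err := exec.StreamWithContext(ctx, remotecommand.StreamOptions{
		Stdout: &stdout,
		Stderr: &stderr,
	}); err != nil {
		return fmt.Errorf("hook failed in %s/%s (container %q): %w, stderr: %s",
			namespace, pod, container, err, stderr.String())
	}
	return nil
}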

Out of scope
  • The obvious limitation is that pre and post hooks will not work if the pod/container the operator targets is not currently running. This is fine in my opinion. The option to mark their execution as optional might even be a nice-to-have feature (skipIfNoSelectorFound, see the sketch below).
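A tiny sketch of that skip semantics (the helper name selectRunningPod is hypothetical): with the default (false) a missing target fails the scheduled snapshot, with true the hook is simply skipped.

// Sketch of the proposed skipIfNoSelectorFound semantics; names are placeholders.
package hooks

import (
	"fmt"

	corev1 "k8s.io/api/core/v1"
)

// selectRunningPod picks the first running pod matched by the hook's podSelector.
// It returns (nil, true, nil) when the hook should be skipped, and an error when
// the missing target must fail the snapshot.
func selectRunningPod(matched []corev1.Pod, skipIfNoSelectorFound bool) (*corev1.Pod, bool, error) {
	for i := range matched {
		if matched[i].Status.Phase == corev1.PodRunning {
			return &matched[i], false, nil
		}
	}
	if skipIfNoSelectorFound {
		return nil, true, nil // optional hook: skip it, still take the snapshot
	}
	return nil, false, fmt.Errorf("no running pod matches the hook's podSelector")
}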
CRD changes

There are three viable approaches when it comes to defining pre/post hooks:

  • as a pod-level annotation,
  • as part of a new CRD,
  • as part of the SnapshotSchedule CRD.

I'm not a fan of having annotations on pods for this feature (this is how k8up and velero do it), as manipulating the StatefulSets / Deployments which typically manage those pods will cause restarts / service disruptions. In my opinion, this design makes it hard to experiment with already running applications and thus hard to introduce in preexisting services.

Having a new CRD for this feature is overkill in my opinion, so I would propose making this part of the SnapshotSchedule CRD.

We could do the following:

---
apiVersion: snapscheduler.backube/v1
kind: SnapshotSchedule
metadata:
  name: hourly
  namespace: my-ns
spec:
  # [...]
  # pre hooks are executed within the container of a target pod in serial order before the actual snapshot is triggered
  preHooks:

    # full example dumping a database
    - podSelector: # required (LabelSelector)
        matchLabels:
          app: my-database
      command: # required (a non-zero exit code will cause the snapshot to fail)
        - /bin/bash
        - -c
        - "pg_dump $DATABASE | gzip > /mnt/data-disk/dump.sql.gz"
      container: postgres # optional, defaults to first container in target pod
      namespace: my-ns # optional, defaults to same namespace as SnapshotSchedule
      timeoutSeconds: 60 # optional, defaults to 300 seconds (5 minutes)
      backoffLimit: 2 # optional, defaults to 0 (no retries, snapshot is immediately failed)
      skipIfNoSelectorFound: true # optional, defaults to false (by default, no snapshot is created if no pod matches the selector!)

    # minimal example freezing the fs
    - podSelector:
        matchLabels:
          app: nginx
      command: ["/sbin/fsfreeze", "--freeze", "/var/log/nginx"] 
      container: fsfreeze

  # post hooks are executed within the container of a target pod in serial order after the snapshot was triggered
  # Note that post hooks may fail, but this will not cause the already generated snapshot to vanish/fail!
  postHooks:

    # full example unfreezing the fs
    - podSelector:
        matchLabels:
          app: nginx
      command: ["/sbin/fsfreeze", "--unfreeze", "/var/log/nginx"] 
      container: fsfreeze
      namespace: my-ns # optional, defaults to same namespace as SnapshotSchedule
      # These fields are NOT supported for postHooks:
      # timeoutSeconds, backoffLimit, skipIfNoSelectorFound

This is just a rough first draft and still lacks some details (e.g. how failure states are represented), but I think it's a good starting point for discussion.
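To make the draft a bit more concrete on the Go side, here is a hedged sketch of how these fields might look as API types added to the SnapshotSchedule CRD (type and field names are placeholders for discussion, not existing snapscheduler API):

// Sketch only: possible Go API types backing the preHooks/postHooks YAML above.
package v1

import metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"

// SnapshotHook describes one command executed inside an already running container.
type SnapshotHook struct {
	// PodSelector selects the target pod (required).
	PodSelector metav1.LabelSelector `json:"podSelector"`
	// Command is executed as-is; a non-zero exit code marks the hook as failed (required).
	Command []string `json:"command"`
	// Container defaults to the first container of the target pod.
	Container string `json:"container,omitempty"`
	// Namespace defaults to the SnapshotSchedule's namespace.
	Namespace string `json:"namespace,omitempty"`
	// TimeoutSeconds defaults to 300; pre hooks only.
	TimeoutSeconds *int64 `json:"timeoutSeconds,omitempty"`
	// BackoffLimit defaults to 0 (no retries); pre hooks only.
	BackoffLimit *int32 `json:"backoffLimit,omitempty"`
	// SkipIfNoSelectorFound defaults to false; pre hooks only.
	SkipIfNoSelectorFound bool `json:"skipIfNoSelectorFound,omitempty"`
}

// The existing SnapshotScheduleSpec would gain these two fields:
//
//	PreHooks  []SnapshotHook `json:"preHooks,omitempty"`
//	PostHooks []SnapshotHook `json:"postHooks,omitempty"`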

Alternatives

An alternative approach would be to allow integrating full pod specifications (pre and post pods) into the SnapshotSchedule CRD, e.g. like k8up's pre backup pods. However, I really think this would get complicated quickly and would require a far more sophisticated job design.

Additional context

  • The fs freeze (fsfreeze -f [example-disk-location]) / unfreeze (fsfreeze -u [example-disk-location]) handling is a typical example from GCP's Linux application-consistent snapshots, which must be configured per host (of course unacceptable in a k8s environment).
  • I actually like that the hooks could target pods/containers in other namespaces (and I also love snapscheduler's SnapshotSchedule design with the CRD as a namespaced resource). This would also make it possible to pass off the actual hook handling to a centralized service in the cluster (in a specific namespace), therefore I think it's a good idea to include it.
  • I would need this at my day job (GKE clusters). While snapscheduler already integrates with GCP disk snapshots (well, it's CSI snapshots after all), it currently provides no major benefit compared to GCP disk snapshot schedules. Having support for pre/post hooks at the CSI snapshot level would be super useful and also a quite unique feature that's hard to replicate outside of the cluster in a managed solution.
majodev added the enhancement (New feature or request) label on Jan 23, 2024