Merge branch 'main' into taxonomy-of-failsafe-levels
josephineSei authored Sep 25, 2024
2 parents 2a52226 + 4582aec commit 1fbab3a
Showing 10 changed files with 490 additions and 45 deletions.
2 changes: 2 additions & 0 deletions .zuul.d/config.yaml
@@ -18,6 +18,7 @@
 - job:
     name: scs-check-adr-syntax
     parent: base
+    nodeset: pod-fedora-40
     pre-run: playbooks/pre.yaml
     run: playbooks/adr_syntax.yaml
 - job:
@@ -26,6 +27,7 @@
     secrets:
       - name: clouds_conf
         secret: SECRET_STANDARDS
+    nodeset: pod-fedora-40
     vars:
       preset: default
     pre-run:
22 changes: 15 additions & 7 deletions Standards/scs-0001-v1-sovereign-cloud-standards.md
@@ -107,7 +107,7 @@ embedded in the markdown header.
| Field name | Requirement | Description |
| --------------- | -------------------------------------------------------------------------- | ------------------------------------------------------------------------------------- |
| `type` | REQUIRED | one of `Procedural`, `Standard`, `Decision Record`, or `Supplement` |
-| `status` | REQUIRED | one of `Proposal`, `Draft`, `Stable`, `Deprecated`, or `Rejected` |
+| `status` | REQUIRED | one of `Draft`, `Stable`, `Deprecated`, or `Rejected` |
| `track` | REQUIRED | one of `Global`, `IaaS`, `KaaS`, `IAM`, `Ops` |
| `supplements` | REQUIRED precisely when `type` is `Supplement` | list of documents that are extended by this document (e.g., multiple major versions) |
| `deprecated_at` | REQUIRED if `status` is `Deprecated` | ISO formatted date indicating the date after which the deprecation is in effect |
@@ -167,11 +167,11 @@ In addition, the following OPTIONAL sections should be considered:
## Process

The lifecycle of an SCS document goes through the following phases:
-Proposal, Draft, Stable, Deprecated, and Rejected.
+Draft, Stable, Deprecated, and Rejected.

```mermaid
graph TD
-A[Proposal] -->|Pull Request| B[Draft]
+A["Draft (Proposal)"] -->|Pull Request| B[Draft]
B -->|Pull Request| D[Stable]
B -->|Pull Request| E[Rejected]
D -->|Pull Request| F[Deprecated]
@@ -195,8 +195,15 @@ Supplements may be kept in Draft state, because they are not authoritative.
To propose a new SCS document,
a community participant creates a pull request on GitHub
against the [standards repository in the SovereignCloudStack organisation][scs-standards-repo].

-The pull request MUST add exactly one SCS document,
+In the beginning, the pull request will contain a draft of an SCS document and
+the community participant should present it to the SCS community.
+They may refer to the [SCS Community page](https://docs.scs.community/community/)
+for an overview of applicable means of communication and online meetings
+to get in touch with the SCS community.
+Community participants are encouraged to present their proposal to the SCS community early on.
+Note that the proposal draft's content does not need to be finished in any way at this stage.
+
+The pull request for the proposal MUST add exactly one SCS document,
in the `Standards` folder.
In the proposal phase,
the document number MUST be replaced with `xxxx` in the file name,
@@ -209,15 +216,16 @@ for a Supplement of `scs-0100-v3-flavor-naming.md`,
the file name might be `scs-0100-w1-flavor-naming-implementation-testing.md` (note the `w1`!).

The metadata MUST indicate the intended `track` and `type` of the document,
-and the `status` MUST be set to `Proposal`;
+and the `status` MUST be set to `Draft`;
for a Supplement, the `supplements` field MUST be set
to a list of documents (usually containing one element).

Upon acceptance by the group of people identified by the `track`,
a number is assigned
(the next unused number)
and the proposer is asked
-to rename the file to replace the `xxxx` with that number.
+to rename the file to replace the `xxxx` with that number
+before the merge of the pull request.

**Note:**
Documents on the `Design Record` track MAY be proposed or accepted directly into `Stable` state,
97 changes: 97 additions & 0 deletions Standards/scs-0117-v1-volume-backup-service.md
@@ -0,0 +1,97 @@
---
title: Volume Backup Functionality
type: Standard
status: Draft
track: IaaS
---

## Introduction

OpenStack offers a variety of resources where users are able to transfer and store data in the infrastructure.
Volumes are a prime example of these resources: they are attached to virtual machines as virtual block storage devices.
As such, they carry potentially large amounts of user data, which is constantly changing at runtime.
It is important for users to have the ability to create backups of this data in a reliable and efficient manner.

## Terminology

| Term | Meaning |
|---|---|
| CSP | Cloud Service Provider, provider managing the OpenStack infrastructure |
| IaaS | Abbreviation for Infrastructure as a Service |
| Image | IaaS resource representing a snapshot of a block storage disk, can be used to create Volumes |
| Volume | IaaS resource representing a virtual block storage device that can be attached as a disk to virtual machines |

## Motivation

The [volume backup functionality of the Block Storage API](https://docs.openstack.org/cinder/latest/admin/volume-backups.html) is a feature that is not available by default in all clouds, e.g., in OpenStack deployments.
The feature requires a backend to be prepared and configured correctly before it can be used.
In the Block Storage service, the backup storage backend is usually configured separately from the storage backend of the general volume service and may not be mandatory.
Thus, an arbitrary cloud may or may not offer the backup feature in the Block Storage API.

This standard aims to make this functionality the default in SCS clouds so that customers can expect the feature to be usable.

## Design Considerations

The standard should make sure that the feature is available and usable but should not limit the exact implementation (e.g. choice of backend driver) any further than necessary.

### Options considered

#### Only recommend volume backup feature, use images as alternative

As an alternative to the volume backup feature of the Block Storage API, images can also be created based on volumes and act as a backup under certain circumstances.
As an option, this standard could keep the actual integration of the volume backup feature optional and guide users on how to use images as backup targets instead in case the feature is unavailable.

However, it is not guaranteed that the image backend storage is separate from the volume storage.
For instance, both could be using the same Ceph cluster.
In such a case, the images would not count as genuine backups.

Although users are able to download images and transfer them to a different storage location, this approach might also prove unfeasible depending on the image size and the existence (or lack) of appropriate target storage on the user side.

Furthermore, incremental backups are not possible when creating images from volumes either.
This results in time-consuming backup operations that fully copy a volume every time a backup is created.

#### Focus on feature availability, make feature mandatory

This option is pretty straightforward.
It would make the volume backup feature mandatory for SCS clouds.
This way users can expect the feature to be available and usable.

With this, users can leverage functionalities like incremental backups and benefit from optimized performance of the backup process due to the tight integration with the volume service.

However, it does not seem feasible to also mandate a separate storage backend for volume backups at the same time, because potential infrastructure limitations on the CSP side could make it hard or even impossible to offer.
As such, the actual benefit of backups in terms of reliability and security aspects would be questionable if a separate storage backend is not mandated and therefore not guaranteed.

This approach would focus on feature availability rather than backup reliability.

#### Focus on backup reliability, make separate backend mandatory

As an alternative, the volume backup feature availability could be made optional but in case a CSP chooses to offer it, the standard would mandate a separate storage backend to be used for volume backups.
This way, failures of the volume storage backend would not directly impact the availability and safety of volume backups, making them actually live up to their name.

In contrast to the above, this approach would focus on backup reliability rather than feature availability.

## Standard

This standard decides to go with the second option and makes the volume backup feature mandatory in the following way:

In an SCS cloud, the volume backup functionality MUST be configured properly and its API as defined by `/v3/{project_id}/backups` MUST be offered to customers.
If using Cinder, a suitable [backup driver](https://docs.openstack.org/cinder/latest/configuration/block-storage/backup-drivers.html) MUST be set up.

The volume backup target storage SHOULD be a separate storage system from the one used for volumes themselves.

## Related Documents

- [OpenStack Block Storage v3 Backup API reference](https://docs.openstack.org/api-ref/block-storage/v3/index.html#backups-backups)
- [OpenStack Volume Backup Drivers](https://docs.openstack.org/cinder/latest/configuration/block-storage/backup-drivers.html)

## Conformance Tests

Conformance tests include using the `/v3/{project_id}/backups` Block Storage API endpoint to create a volume and a backup of it as a non-admin user and subsequently restore the backup on a new volume while verifying the success of each operation.
These tests verify the mandatory part of the standard: providing the Volume Backup API.

There is a test suite in [`volume-backup-tester.py`](https://github.com/SovereignCloudStack/standards/blob/main/Tests/iaas/volume-backup/volume-backup-tester.py).
The test suite connects to the OpenStack API and executes basic operations using the volume backup API to verify that the functionality requested by the standard is available.
Please consult the associated [README.md](https://github.com/SovereignCloudStack/standards/blob/main/Tests/iaas/volume-backup/README.md) for detailed setup and testing instructions.

Note that these tests don't verify the optional part of the standard: providing a separate storage backend for Cinder volume backups.
This cannot be checked from outside of the infrastructure as it is an architectural property of the infrastructure itself and transparent to customers.
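
The create, back up, and restore sequence described above can be sketched roughly as follows.
This is a minimal illustration and not the code of `volume-backup-tester.py`: the function name `backup_roundtrip`, the `prefix` default, and the timeout are assumptions made for this example.
It expects `conn` to be an OpenStack SDK connection (for instance one obtained via `openstack.connect(cloud="mycloud")`) and only uses the SDK's block storage proxy calls `create_volume`, `create_backup`, `restore_backup`, and `wait_for_status`.

```python
def backup_roundtrip(conn, prefix="scs-test-", size_gb=1, timeout=300):
    """Create a volume, back it up, and restore the backup to a new volume.

    `conn` is assumed to be an OpenStack SDK connection object.
    The prefix, size, and timeout defaults are illustrative assumptions.
    """
    storage = conn.block_storage

    # Step 1: create a small volume and wait until it is usable.
    volume = storage.create_volume(name=f"{prefix}volume", size=size_gb)
    volume = storage.wait_for_status(volume, status="available", wait=timeout)

    # Step 2: create a backup of that volume and wait for completion.
    backup = storage.create_backup(name=f"{prefix}backup", volume_id=volume.id)
    backup = storage.wait_for_status(backup, status="available", wait=timeout)

    # Step 3: restore the backup. No target volume ID is passed, so the
    # service is asked to create a new volume with the given name.
    storage.restore_backup(backup, name=f"{prefix}restored")

    # The backup should return to "available" once the restore has finished.
    # A fuller test would also look up the restored volume and check it.
    backup = storage.wait_for_status(backup, status="available", wait=timeout)
    return volume, backup
```

`wait_for_status` raises an exception if the resource enters a failure state or the timeout expires, so any failing step surfaces as an error to the caller.
The sketch leaves the created resources in place; a real test run would delete them afterwards.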
70 changes: 70 additions & 0 deletions Tests/iaas/volume-backup/README.md
@@ -0,0 +1,70 @@
# Volume Backup API Test Suite

## Test Environment Setup

### Test Execution Environment

> **NOTE:** The test execution procedure does not require cloud admin rights.
To execute the test suite, a valid cloud configuration for the OpenStack SDK in the form of "`clouds.yaml`" is mandatory[^1].
**The file is expected to be located in the current working directory where the test script is executed unless configured otherwise.**

[^1]: [OpenStack Documentation: Configuring OpenStack SDK Applications](https://docs.openstack.org/openstacksdk/latest/user/config/configuration.html)

The test execution environment can be located on any system outside of the cloud infrastructure that has OpenStack API access.
Make sure that the API access is configured properly in "`clouds.yaml`".

It is recommended to use a Python virtual environment[^2].
Next, install the OpenStack SDK required by the test suite:

```bash
pip3 install openstacksdk
```

Within this environment execute the test suite.

[^2]: [Python 3 Documentation: Virtual Environments and Packages](https://docs.python.org/3/tutorial/venv.html)

## Test Execution

The test suite is executed as follows:

```bash
python3 volume-backup-tester.py --os-cloud mycloud
```

As an alternative to "`--os-cloud`", the "`OS_CLOUD`" environment variable may be specified instead.
The parameter is used to look up the correct cloud configuration in "`clouds.yaml`".
For the example command above, this file should contain a `clouds.mycloud` section like this:

```yaml
---
clouds:
  mycloud:
    auth:
      auth_url: ...
      ...
    ...
```
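
The precedence between "`--os-cloud`" and "`OS_CLOUD`" can be illustrated with a small standard-library sketch.
It is an assumption about how such a lookup might be written, not an excerpt from the test script; the helper name `resolve_cloud_name` is hypothetical.

```python
import argparse
import os


def resolve_cloud_name(argv=None, environ=None):
    """Return the cloud name from --os-cloud, falling back to OS_CLOUD.

    Hypothetical helper for illustration; `argv` and `environ` can be
    injected to make the behavior easy to test.
    """
    environ = os.environ if environ is None else environ
    parser = argparse.ArgumentParser(description="Volume backup API test (sketch)")
    parser.add_argument(
        "--os-cloud",
        default=environ.get("OS_CLOUD"),
        help="name of the cloud entry in clouds.yaml (default: $OS_CLOUD)",
    )
    args = parser.parse_args(argv)
    if not args.os_cloud:
        # parser.error() prints a usage message and exits with status 2.
        parser.error("no cloud specified: use --os-cloud or set OS_CLOUD")
    return args.os_cloud
```

The returned name would then be handed to the OpenStack SDK, for example as `openstack.connect(cloud=name)`, which looks up the matching `clouds.<name>` section.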

If the test suite fails and leaves test resources behind, the "`--cleanup-only`" flag may be used to delete those resources from the domains:

```bash
python3 volume-backup-tester.py --os-cloud mycloud --cleanup-only
```

For any further options consult the output of "`python3 volume-backup-tester.py --help`".

### Script Behavior & Test Results

> **NOTE:** Before any execution of test batches, the script will automatically perform a cleanup of volumes and volume backups matching a special prefix (see the "`--prefix`" flag).
> This cleanup behavior is identical to "`--cleanup-only`".
The script will print all cleanup actions and passed tests to `stdout`.
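
A prefix-based cleanup of this kind could look roughly like the sketch below.
It is illustrative only: the function name `cleanup_by_prefix` is hypothetical, `conn` is assumed to be an OpenStack SDK connection, and the actual script may differ, for example by waiting for each deletion to complete.

```python
def cleanup_by_prefix(conn, prefix):
    """Delete volume backups and volumes whose names start with `prefix`.

    Returns a list of the cleanup actions performed. Deletion requests are
    asynchronous on the service side; this sketch does not wait for them.
    """
    storage = conn.block_storage
    actions = []

    # Remove backups before the volumes they were created from.
    for backup in storage.backups():
        if backup.name and backup.name.startswith(prefix):
            print(f"deleting backup {backup.name}")
            storage.delete_backup(backup, ignore_missing=True)
            actions.append(f"backup {backup.name}")

    for volume in storage.volumes():
        if volume.name and volume.name.startswith(prefix):
            print(f"deleting volume {volume.name}")
            storage.delete_volume(volume, ignore_missing=True)
            actions.append(f"volume {volume.name}")

    return actions
```

Resources without a name or with a non-matching name are left untouched, which is why choosing a distinctive prefix matters on a shared project.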

If all tests pass, the script will return with an exit code of `0`.

If any test fails, the script will halt, print the exact error to `stderr` and return with a non-zero exit code.

In case of a failed test, cleanup is not performed automatically, allowing for manual inspection of the cloud state for debugging purposes.
Although unnecessary due to automatic cleanup upon next execution, you can manually trigger a cleanup using the "`--cleanup-only`" flag of this script.
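
The exit code convention described in this section can be summarized in a short sketch.
The runner below is a hypothetical illustration of the "halt on first failure" behavior, not the script's actual control flow; the name `run_tests` and the message format are assumptions.

```python
import sys


def run_tests(tests, out=None, err=None):
    """Run (name, callable) pairs in order and stop at the first failure.

    Returns 0 if every test passes and 1 otherwise. Passed tests are
    reported on `out`, the first error on `err`. No cleanup is attempted
    after a failure, so the cloud state stays available for inspection.
    """
    out = sys.stdout if out is None else out
    err = sys.stderr if err is None else err
    for name, test in tests:
        try:
            test()
        except Exception as exc:
            print(f"FAILED: {name}: {exc}", file=err)
            return 1
        print(f"PASSED: {name}", file=out)
    return 0
```

A script entry point would typically end with `sys.exit(run_tests(...))` so that calling tools, such as a CI job, can react to the exit code.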