Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Cosmos] Control-Plane Over Data-Plane APIs (Database/Container CRUD) #1853

Open
wants to merge 7 commits into
base: feature/track2
Choose a base branch
from

Conversation

analogrelay
Copy link
Member

@analogrelay analogrelay commented Oct 14, 2024

This PR adds the "control-plane over data-plane" APIs (basically Database and Container CRUD) to the Cosmos Rust SDK.

These APIs require that you authenticate with a key. Using Entra ID to call these APIs will result in an error (instead, you're supposed to use ARM management APIs if you want to use Entra ID). So, in order to clarify that, I've placed a warning in the API docs:

image

This warning comes from a single shared markdown file, so we can edit it in one place.

In addition, I've placed these APIs behind a feature flag, control_plane. Enabling this feature flag automatically enables the key_auth feature flag, because you have to use key-based auth and it wouldn't make sense to have one without the other.

If a user were referencing our crate from their app, they'd enable these APIs using this syntax in the Cargo.toml file:

[dependencies]
azure_data_cosmos = { version = "...", features = [ "control_plane" ] }

We're hoping to guide users towards Entra ID as much as possible, hence putting key-based auth behind a feature flag. Removing these feature flags later should be a non-breaking change, so I'd rather do it now and seek customer feedback rather than trying to remove/limit these APIs later.

You can test these APIs out with new commands on the cosmos "meta-example" (we provide a single example with multiple subcommands rather than separate examples. The create and delete commands are broken up in to item, database and container subcommands. However, because of the feature flags, to see the database and container ones, you'll need to run the example like this:

$ cargo run --example cosmos --features control_plane -- [endpoint] delete database [db name]
# etc..

There's no way, that I can see, to "enable" a feature flag in an example, so you have to do it manually when running the example. Not ideal, but I think the only alternative is to move the example to its own crate.

I've tested these APIs manually but we're waiting on getting Test Proxy integrated into this repo before we can build fully automated integration tests. Hopefully that's coming soon!

@@ -56,6 +56,7 @@ base64 = "0.22"
bytes = "1.0"
cargo_metadata = "0.18.1"
clap = { version = "4.4.16", features = ["derive"] }
cfg-if = "1.0.0"
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I used cfg-if to simplify some conditional compilation in azure_data_cosmos. It's a relatively low-impact crate, but I'm open to removing it and going back to a series of #[cfg(...)]-attributed definitions if that's preferred. (cc @heaths )


/// Creates a new item.
#[cfg(feature = "control_plane")]
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Annoyingly, it sees we need to conditionally-compile our use statements as well to avoid 'unused import' lints. I'd prefer it if clippy considered all features (maybe we just need to run it with --all-features 🤔 ) since extraneous imports don't impact the functionality, but alas.

sdk/cosmos/azure_data_cosmos/src/models/mod.rs Outdated Show resolved Hide resolved
@@ -9,6 +9,19 @@ mod query_options;
mod read_container_options;
mod read_database_options;

cfg_if::cfg_if! {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the usage of cfg_if. Without it, we'd have to repeat the #[cfg] attribute on each individual mod and pub use line.

@@ -21,7 +21,7 @@ use url::form_urlencoded;
#[cfg(feature = "key_auth")]
use azure_core::{credentials::Secret, hmac::hmac_sha256};

const AZURE_VERSION: &str = "2018-12-31";
const AZURE_VERSION: &str = "2020-07-15";
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Creating containers with hierarchical partition keys requires this version. I grabbed the string from the JS SDK.

@analogrelay analogrelay changed the title [WIP] [Cosmos] Control Plane-Over-Data Plane APIs (Database/Container CRUD) [Cosmos] Control Plane-Over-Data Plane APIs (Database/Container CRUD) Oct 15, 2024
@analogrelay analogrelay marked this pull request as ready for review October 15, 2024 21:44
@analogrelay analogrelay changed the title [Cosmos] Control Plane-Over-Data Plane APIs (Database/Container CRUD) [Cosmos] Control-Plane Over Data-Plane APIs (Database/Container CRUD) Oct 15, 2024
@@ -0,0 +1,5 @@
<div class="warning">

This is a control-plane API and requires that you authenticate using a key. To use Entra ID to perform this operation, you must use the [Azure Resource Manager APIs](https://learn.microsoft.com/en-us/azure/templates/microsoft.documentdb/databaseaccounts).
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this PR does the right thing for the point in time we're at, but just pointing out that I believe there is a plan to support these operations via Entra ID at some point in 2025, and also to integrate the dataplane RBAC with normal Azure RBAC.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, the intent is that this feature can be "removed" later. It will have to remain present, and continue to imply key_auth to avoid breaking users, but the #[cfg] attributes referencing it would go away and the methods related to it would just become available to everyone. At that point users would be able to remove the feature reference from their Cargo.toml (possibly replacing with key_auth if they still need that support)

/// Describes the properties used to create a new container.
#[derive(Clone, Debug, Serialize)]
#[serde(rename_all = "camelCase")]
pub struct ContainerDefinition {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Aren't there more things here - throughput options, VectorEmbeddingPolicy, etc?

Ok to start with this and add more if we can do it non-breaking though.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the first pass I was going off the public REST API docs but I'll start digging in to what other SDKs support here. If it's straightforward (most of the structures already exist for ContainerProperties) I'll add them in this PR!

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added the remaining things. Basically, I removed ContainerDefinition and am using ContainerProperties for both the Create request body and the Read response body. This mirrors what we do in Go.

The SystemProperties struct we use to contain things like _etag is still marked #[non_exhaustive] though, which means it's not constructable using struct syntax (SystemProperties {...} will produce an error outside the defining crate). However, the SystemProperties crate implements Default, so SystemProperties::default() can be used to construct an "empty" value.

I think this benefits us, since we don't want users to try to set those values anyway (even if they're ignored by the server). So, to construct a ContainerProperties, you have to use this syntax:

db_client.create_container(
    ContainerProperties {
        id: "MyContainer".to_string(),
        partition_key: "/partitionKey".into(),
        ..Default::default() // Set all remaining fields to their default
    }, None);

Using ..Default::default() is allowed, because it uses SystemProperties::default() (provided by it's own implementation of Default) to construct that object, which results in all the properties being None. We already want users to use this syntax to create all of our request models, so that we can safely add new properties later. So, this should work out fine.

I added a test which should serve to make sure we don't add non-optional properties to ContainerProperties in the future, and retain the same JSON serialization whenever a user uses the basic syntax above.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

throughput options

These are actually specified as headers and appear on the options type in other SDKs like Go, so they'll come later.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks great. One thing that occurred to me is that we might need to do something extra work with deserialization of ContainerProperties and its nested types. To support older SDKs talking to newer versions of the service, we might need to be able to roundtrip ContainerProperties for things we don't know about in the SDK. The scenario is that I want to modify some property, so I query the current properties, make some change, and then set the properties. If we don't searialize everything we get back, we might lose options.

AFAIK, .NET handles this with [JsonExtensionData]

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe serde does support a similar mechanism, I'll look in to it (but might merge this first, as you suggest in your review comment)

Copy link
Member

@Pilchie Pilchie left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See comment about forward compatibility, but okay to merge this while we design how to deal with that.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants