feat: additional tasks for artifact size metrics plugin #36

0marperez · 2024-04-17T16:41:14Z

Description of changes:
Adds two new tasks to the plugin:

collectDelegatedArtifactSizeMetrics: Gets artifact size metrics (ASM) from S3 after CodeBuild generates them
putArtifactSizeMetricsInCloudWatch: Puts ASM in CloudWatch

Some refactoring

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

lauzadis · 2024-04-17T16:47:05Z

build-plugins/build-support/build.gradle.kts

+publishing {
+    repositories {
+        mavenLocal()
+    }
+}
+


correctness: you shouldn't have to configure mavenLocal(), it's already done here I believe

You're right, I wasn't able to get local publishing working before. I must've misconfigured something but it works now

lauzadis · 2024-04-17T16:51:21Z

...lin/aws/sdk/kotlin/gradle/plugins/artifactsizemetrics/CollectDelegatedArtifactSizeMetrics.kt

+ */
+internal abstract class CollectDelegatedArtifactSizeMetrics : DefaultTask() {
+    /**
+     * The file where the artifact size metrics will be stored, defaults to /build/reports/metrics/artifact-size-metrics.csv


nit: remove the leading slash in /build because it's actually a relative path. The way it's written makes me think the file will literally be written to /build/...

lauzadis · 2024-04-17T16:52:24Z

...lin/aws/sdk/kotlin/gradle/plugins/artifactsizemetrics/CollectDelegatedArtifactSizeMetrics.kt

+ * This task should typically be run after codebuild gathers metrics and puts them in S3 during CI but can also be used to
+ * query the metrics bucket if you modify the file key filter.
+ */
+internal abstract class CollectDelegatedArtifactSizeMetrics : DefaultTask() {


question: what is delegated here? I would clarify in the KDocs or remove it from the name

lauzadis · 2024-04-17T17:03:40Z

...lin/aws/sdk/kotlin/gradle/plugins/artifactsizemetrics/CollectDelegatedArtifactSizeMetrics.kt

+        val pullRequestNumber = if (project.hasProperty("pullRequest")) {
+            project.property("pullRequest")
+                .toString()
+                .let { releaseProperty ->
+                    releaseProperty.ifEmpty { // "-PpullRequest=" (no value set)
+                        null
+                    }
+                }
+        } else {
+            null
+        }


simplification: val pullRequestNumber = project.findProperty("pullRequest")?.toString()?.takeIf { it.isNotEmpty() }

same applies for releaseTag
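A minimal sketch of why the one-liner is equivalent to the longer if/else, assuming a plain Map as a hypothetical stand-in for Gradle's project properties (the `findProperty` and `optionalProperty` helpers below are illustrative, not part of the plugin):

```kotlin
// Hypothetical stand-in for Gradle's Project.findProperty: returns null when the property is not set.
fun findProperty(properties: Map<String, Any>, name: String): Any? = properties[name]

// Returns the property as a non-empty string, or null when it is unset or set to "" (e.g. "-PpullRequest=").
fun optionalProperty(properties: Map<String, Any>, name: String): String? =
    findProperty(properties, name)?.toString()?.takeIf { it.isNotEmpty() }

fun main() {
    println(optionalProperty(mapOf("pullRequest" to "36"), "pullRequest")) // 36
    println(optionalProperty(mapOf("pullRequest" to ""), "pullRequest"))   // null
    println(optionalProperty(emptyMap(), "pullRequest"))                   // null
}
```

The safe-call chain collapses the "not set" and "set but empty" cases into a single null, which is what the original branches did by hand.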

lauzadis · 2024-04-17T17:04:50Z

...lin/aws/sdk/kotlin/gradle/plugins/artifactsizemetrics/CollectDelegatedArtifactSizeMetrics.kt

+        val pluginConfig = this.project.rootProject.extensions.getByType(ArtifactSizeMetricsPluginConfig::class.java)
+
+        val relevantArtifactSizeMetricsFileKeys = artifactSizeMetricsFileKeys.filter {
+            it?.startsWith("[TEMP]${pluginConfig.projectRepositoryName}-$identifier-") == true


question: isn't the [TEMP] prefix only used for temporary metrics for PRs? What if the users want to fetch metrics using a releaseTag?

Yeah, this task is mainly meant for CI. Querying the bucket would be something we might have to do from time to time and sort of a secondary use case I noticed. To do so just change the prefix filter manually and publish to maven local. The format for a release tag for example would be: "$repo-$releaseTag-release"

lauzadis · 2024-04-17T17:07:46Z

...lin/aws/sdk/kotlin/gradle/plugins/artifactsizemetrics/CollectDelegatedArtifactSizeMetrics.kt

+        }
+    }
+
+    private fun getFiles(keys: List<String?>): List<String> {


style: Why should a key be null at this point? We should filter out null keys earlier and have this be a known List<String>
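One way this could look, as a sketch only: drop the nulls at the point where the keys are filtered, so everything downstream works with a List<String>. The `relevantKeys` helper and the sample keys are hypothetical.

```kotlin
// Filters a possibly-nullable key listing down to non-null keys that match the prefix.
// After this point callers never have to deal with null keys.
fun relevantKeys(listedKeys: List<String?>, prefix: String): List<String> =
    listedKeys
        .filterNotNull()
        .filter { it.startsWith(prefix) }

fun main() {
    // Hypothetical keys; the "[TEMP]<repo>-<identifier>-" shape follows the filter in the diff.
    val listed = listOf("[TEMP]aws-sdk-kotlin-36-s3.csv", null, "aws-sdk-kotlin-latest-release.csv")
    println(relevantKeys(listed, "[TEMP]aws-sdk-kotlin-36-")) // [[TEMP]aws-sdk-kotlin-36-s3.csv]
}
```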

lauzadis · 2024-04-17T17:08:48Z

...lin/aws/sdk/kotlin/gradle/plugins/artifactsizemetrics/CollectDelegatedArtifactSizeMetrics.kt

+                            },
+                        ) { file ->
+                            files.add(
+                                file.body?.decodeToString() ?: throw AwsSdkGradleException("Metrics file $file is missing a body"),


nit: the exception message should probably use key instead of file, Metrics file $key ...

lauzadis · 2024-04-17T17:11:42Z

...tlin/aws/sdk/kotlin/gradle/plugins/artifactsizemetrics/PutArtifactSizeMetricsInCloudWatch.kt

+    fun put() {
+        val currentTime = Instant.now()
+        val pluginConfig = this.project.rootProject.extensions.getByType(ArtifactSizeMetricsPluginConfig::class.java)
+        val release = project.property("release").toString().let {


naming (consistency): releaseTag

lauzadis · 2024-04-17T17:13:04Z

...tlin/aws/sdk/kotlin/gradle/plugins/artifactsizemetrics/PutArtifactSizeMetricsInCloudWatch.kt

+                                value = artifactSize
+                                dimensions = listOf(
+                                    Dimension {
+                                        name = "Release"


naming suggestion: instead of "Release" I think "Version" is more clear

lauzadis · 2024-04-17T17:13:50Z

...tlin/aws/sdk/kotlin/gradle/plugins/artifactsizemetrics/PutArtifactSizeMetricsInCloudWatch.kt

+
+                    cloudWatch.putMetricData {
+                        namespace = "Artifact Size Metrics"
+                        metricData = listOf(


question: Why does the same metric get included 3 times with different dimensions? Can it just be declared once with all the dimensions configured?

ianbotsf · 2024-04-17T19:05:13Z

...tlin/aws/sdk/kotlin/gradle/plugins/artifactsizemetrics/PutArtifactSizeMetricsInCloudWatch.kt

+        val currentTime = Instant.now()
+        val pluginConfig = this.project.rootProject.extensions.getByType(ArtifactSizeMetricsPluginConfig::class.java)
+        val release = project.property("release").toString().let {
+            check(it.isNotEmpty()) { "The release property is empty. Please specify a value." } // "-Prelease=" (no value set)


Suggestion: The hint about "-Prelease=" (no value set) would be even better in the error message itself.

ianbotsf · 2024-04-17T19:06:18Z

.../main/kotlin/aws/sdk/kotlin/gradle/plugins/artifactsizemetrics/AnalyzeArtifactSizeMetrics.kt

@@ -46,6 +46,8 @@ internal abstract class AnalyzeArtifactSizeMetricsTask : DefaultTask() {
        hasSignificantChangeFile.convention(project.layout.buildDirectory.file(OUTPUT_PATH + "has-significant-change.txt"))
    }

+    private val pluginConfig = this.project.rootProject.extensions.getByType(ArtifactSizeMetricsPluginConfig::class.java)


Nit: Unnecessary this

ianbotsf · 2024-04-17T19:07:58Z

.../main/kotlin/aws/sdk/kotlin/gradle/plugins/artifactsizemetrics/AnalyzeArtifactSizeMetrics.kt

                GetObjectRequest {
-                    bucket = "artifact-size-metrics-aws-sdk-kotlin" // TODO: Point to artifact size metrics bucket
-                    key = "artifact-size-metrics.csv" // TODO: Point to artifact size metrics for latest release
+                    bucket = S3_ARTIFACT_SIZE_METRICS_BUCKET
+                    key = "${pluginConfig.projectRepositoryName}-latest-release.csv"
                },


Question: What component will put this object? How will we track historical release metrics in addition to the latest release metrics?

This is done in a GitHub workflow that I haven't PR'd.
After every release and after calculating metrics:

aws s3 cp artifact-size-metrics.csv s3://${{ secrets.ARTIFACT_METRICS_BUCKET }}/$REPOSITORY-${{ github.event.release.tag_name }}-release.csv
aws s3 cp artifact-size-metrics.csv s3://${{ secrets.ARTIFACT_METRICS_BUCKET }}/$REPOSITORY-latest-release.csv

This stores the historical metrics and creates/overrides the latest metrics file
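The key naming used across this thread can be summarized in a small sketch. The three helper functions are hypothetical and only restate the formats quoted in the comments; they are not part of the plugin.

```kotlin
// Prefix of the temporary per-PR metrics objects, as used by the filter in CollectDelegatedArtifactSizeMetrics.
fun tempKeyPrefix(repo: String, identifier: String): String = "[TEMP]$repo-$identifier-"

// Historical metrics for a specific release, as written by the workflow's first `aws s3 cp`.
fun releaseKey(repo: String, releaseTag: String): String = "$repo-$releaseTag-release.csv"

// Latest release metrics, overwritten on every release and read by AnalyzeArtifactSizeMetrics.
fun latestReleaseKey(repo: String): String = "$repo-latest-release.csv"

fun main() {
    // "aws-sdk-kotlin", "36" and "v1.0.0" are example values only.
    println(tempKeyPrefix("aws-sdk-kotlin", "36"))   // [TEMP]aws-sdk-kotlin-36-
    println(releaseKey("aws-sdk-kotlin", "v1.0.0"))  // aws-sdk-kotlin-v1.0.0-release.csv
    println(latestReleaseKey("aws-sdk-kotlin"))      // aws-sdk-kotlin-latest-release.csv
}
```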

ianbotsf · 2024-04-17T20:57:40Z

...lin/aws/sdk/kotlin/gradle/plugins/artifactsizemetrics/CollectDelegatedArtifactSizeMetrics.kt

+    private fun getFiles(keys: List<String?>): List<String> {
+        val files = mutableListOf<String>()
+
+        runBlocking {
+            S3Client.fromEnvironment().use { s3 ->
+                keys.forEach { k ->
+                    k?.let {
+                        s3.getObject(
+                            GetObjectRequest {
+                                bucket = S3_ARTIFACT_SIZE_METRICS_BUCKET
+                                key = k
+                            },
+                        ) { file ->
+                            files.add(
+                                file.body?.decodeToString() ?: throw AwsSdkGradleException("Metrics file $file is missing a body"),
+                            )
+                        }
+                    }
+                }
+            }
+        }
+
+        return files
+    }


Question: In practice, how many objects will we be fetching serially?

For the SDK, which would have the most artifacts, it's up to us: we can select how many artifacts to generate metrics for. If we decide we want all the services, it would be 764+ (382 services * 2 because of the closures, plus a few other non-service artifacts).

The objects should be pretty lightweight (a few KB from what I've seen in the test objects I've created).

I think if the answer is more than 3 or 4 we should be parallelizing these calls.

ianbotsf · 2024-04-17T21:00:26Z

...tlin/aws/sdk/kotlin/gradle/plugins/artifactsizemetrics/PutArtifactSizeMetricsInCloudWatch.kt

+                metrics.forEach { metric ->
+                    val split = metric.split(",").map { it.trim() }
+                    val artifactName = split[0]
+                    val artifactSize = split[1].toDouble()


Question: Why toDouble()? Won't these sizes be in whole bytes?

The value is generated as a Long, but the CloudWatch PutMetricData operation requires a metric value to be a Double. Unless we have an absurdly large artifact, I think it should be fine to convert to Double without losing precision.

Ah I see. Maybe leave a comment on this line here about why we're going straight from String to double.
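To put a number on "absurdly large": an IEEE 754 double represents every non-negative integer up to 2^53 exactly, which is roughly 9 petabytes when read as a byte count. A small sketch illustrating this (the `isExactlyRepresentable` helper is hypothetical and deliberately conservative):

```kotlin
// Every integer in 0..2^53 has an exact Double representation.
val MAX_EXACT_INTEGER: Long = 1L shl 53

// Conservative check: true only for sizes that are guaranteed to convert to Double without rounding.
fun isExactlyRepresentable(sizeInBytes: Long): Boolean = sizeInBytes in 0L..MAX_EXACT_INTEGER

fun main() {
    // A size parsed straight from the CSV column, as the task does.
    val artifactSize = "123456789".toDouble()
    println(artifactSize.toLong())                         // 123456789
    println(isExactlyRepresentable(123_456_789L))          // true

    // Just past 2^53 the conversion starts rounding.
    val tooLarge = MAX_EXACT_INTEGER + 1
    println(tooLarge.toDouble().toLong() == tooLarge)      // false
}
```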

ianbotsf · 2024-04-17T21:01:22Z

...tlin/aws/sdk/kotlin/gradle/plugins/artifactsizemetrics/PutArtifactSizeMetricsInCloudWatch.kt

+        val metrics = metricsFile
+            .get()
+            .asFile
+            .readLines()
+            .drop(1) // Ignoring header
+
+        runBlocking {
+            CloudWatchClient.fromEnvironment().use { cloudWatch ->
+                metrics.forEach { metric ->
+                    val split = metric.split(",").map { it.trim() }
+                    val artifactName = split[0]
+                    val artifactSize = split[1].toDouble()


Question: In practice, how many metrics will we be sending to CloudWatch serially?

From this comment, the maximum we will see (for now) is ~764. I think this shouldn't cause any performance issues. I should confirm this though and run some tests.

PutMetricData requests can contain up to 1MB of data and 1000 metrics. I think we should batch this by collecting metrics up to 1000 and then calling putMetricData. Maybe we'll always limit the number of services to a handful but we can still be more efficient in how we call CloudWatch.
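A rough sketch of the batching idea, using only the standard library. The `ArtifactMetric` type, the CSV header, and the helper names are assumptions for illustration; the real task would build MetricDatum objects and call putMetricData once per batch. The sketch only enforces the 1000-metric count, not the 1MB payload limit.

```kotlin
// Per-request metric limit quoted in the comment above.
const val MAX_METRICS_PER_REQUEST = 1000

// Hypothetical in-memory form of one CSV row.
data class ArtifactMetric(val artifactName: String, val sizeInBytes: Double)

// Parses "name, size" rows, skipping the header line as the task does.
fun parseMetrics(csvLines: List<String>): List<ArtifactMetric> =
    csvLines
        .drop(1)
        .map { line ->
            val (name, size) = line.split(",").map { it.trim() }
            ArtifactMetric(name, size.toDouble())
        }

// Splits the metrics into batches that each fit in a single request.
fun batch(metrics: List<ArtifactMetric>): List<List<ArtifactMetric>> =
    metrics.chunked(MAX_METRICS_PER_REQUEST)

fun main() {
    val csv = listOf("Artifact, Size") + (1..2500).map { "artifact-$it, ${it * 1024}" }
    val batches = batch(parseMetrics(csv))
    println(batches.map { it.size }) // [1000, 1000, 500]
}
```

Since the task emits each metric several times with different dimensions, the number of data points per artifact should be taken into account when sizing batches.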


ianbotsf · 2024-04-18T22:10:00Z

...lin/aws/sdk/kotlin/gradle/plugins/artifactsizemetrics/CollectDelegatedArtifactSizeMetrics.kt

+    private fun getFiles(keys: List<String>): List<String> = runBlocking {
+        val files = mutableListOf<Deferred<String>>()

-        runBlocking {
-            S3Client.fromEnvironment().use { s3 ->
-                keys.forEach { k ->
-                    k?.let {
+        S3Client.fromEnvironment().use { s3 ->
+            keys.forEach { k ->
+                files.add(
+                    async {
                        s3.getObject(
                            GetObjectRequest {
                                bucket = S3_ARTIFACT_SIZE_METRICS_BUCKET
                                key = k
                            },
                        ) { file ->
-                            files.add(
-                                file.body?.decodeToString() ?: throw AwsSdkGradleException("Metrics file $file is missing a body"),
-                            )
+                            file.body?.decodeToString() ?: throw AwsSdkGradleException("Metrics file $k is missing a body")
                        }
-                    }
-                }
+                    },
+                )
            }
+            return@runBlocking files.awaitAll()
        }
-
-        return files
    }


Nit: You don't need a mutable list reference to track the deferred results, you can just use map:

private fun getFiles(keys: List<String>): List<String> = runBlocking {
    S3Client.fromEnvironment().use { s3 ->
        keys.map { k ->
            async {
                s3.getObject(
                    GetObjectRequest {
                        bucket = S3_ARTIFACT_SIZE_METRICS_BUCKET
                        key = k
                    },
                ) { it.body?.decodeToString() ?: throw AwsSdkGradleException("Metrics file $k is missing a body") }
            }
        }.awaitAll()
    }
}

For clarity you can even extract the getObject call to a helper function:

private suspend fun S3Client.getObjectAsText(key: String) = getObject(
    GetObjectRequest {
        bucket = S3_ARTIFACT_SIZE_METRICS_BUCKET
        this.key = key
    },
) { it.body?.decodeToString() ?: throw AwsSdkGradleException("Metrics file $key is missing a body") }

private fun getFiles(keys: List<String>): List<String> = runBlocking {
    S3Client.fromEnvironment().use { s3 ->
        keys.map { k -> async { s3.getObjectAsText(k) } }.awaitAll()
    }
}

lauzadis · 2024-04-19T14:04:20Z

...c/main/kotlin/aws/sdk/kotlin/gradle/plugins/artifactsizemetrics/ArtifactSizeMetricsPlugin.kt

@@ -85,8 +85,44 @@ private fun Project.registerRootProjectArtifactSizeMetricsTask(
 }

 open class ArtifactSizeMetricsPluginConfig {
+    /**
+     * Changes the prefix used to get artifact size metrics in the
+     * "collectDelegatedArtifactSizeMetrics" task. For developer use only


nit: I'd say this whole repository is "For developer use only", the comment is kind of unnecessary

lauzadis · 2024-04-19T14:05:50Z

...c/main/kotlin/aws/sdk/kotlin/gradle/plugins/artifactsizemetrics/ArtifactSizeMetricsPlugin.kt

    var artifactPrefixes: Set<String> = emptySet()
+
+    /**
+     * Same as artifactPrefixes but considers the whole closure


nit: Can you repeat the KDocs here instead of redirecting to artifactPrefixes

lauzadis · 2024-04-19T14:10:37Z

...tlin/aws/sdk/kotlin/gradle/plugins/artifactsizemetrics/PutArtifactSizeMetricsInCloudWatch.kt

-            check(it.isNotEmpty()) { "The release property is empty. Please specify a value." } // "-Prelease=" (no value set)
-            it
+        val pluginConfig = project.rootProject.extensions.getByType(ArtifactSizeMetricsPluginConfig::class.java)
+        val releaseTag = project.property("release").toString().also {


correctness: .property(...) will throw an exception if the property is not set. I think you want to use findProperty and then handle the potential null / empty value.
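A sketch of how the null and empty cases could be handled together, with the "-Prelease=" hint from the earlier suggestion folded into the error message. A plain Map stands in for Gradle's project properties, and `requiredProperty` is a hypothetical helper, not the plugin's actual code.

```kotlin
// Returns the named property, failing with an actionable message when it is unset or empty.
// `properties[name]` plays the role of project.findProperty(name), which returns null instead of throwing.
fun requiredProperty(properties: Map<String, Any>, name: String): String {
    val value = properties[name]?.toString()?.takeIf { it.isNotEmpty() }
    return checkNotNull(value) {
        "The '$name' property is missing or empty. Specify a value with -P$name=<value>."
    }
}

fun main() {
    // "v1.0.0" is an example value only.
    println(requiredProperty(mapOf("release" to "v1.0.0"), "release")) // v1.0.0

    // Both the unset case and the "-Prelease=" (no value) case end up in the same error path.
    println(runCatching { requiredProperty(mapOf("release" to ""), "release") }.exceptionOrNull()?.message)
    println(runCatching { requiredProperty(emptyMap(), "release") }.isFailure) // true
}
```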

0marperez added 9 commits April 8, 2024 10:40

Added task to combine delegated artifact size metrics from S3

d6e36c2

Put ASM in CloudWatch task

ac5b72a

Refactoring

8a0a4a9

Minor bug fixes in CollectDelegatedArtifactSizeMetrics

c4c38f8

Minor bug fixes for AnalyzeArtifactSizeMetrics

b9ca904

Add repo name to plugin config

817d40c

Documentation updates & formatting

86710ca

Self review

30aed78

Remove testing bucket name

025e42b

lauzadis reviewed Apr 17, 2024

ianbotsf reviewed Apr 17, 2024

0marperez added 7 commits April 18, 2024 10:25

Correctness and improvement changes. Pending possible parallelization of AWS operations

048cf97

Batch put cloudwatch metrics

53cf5da

Parallelize object GETs and chunk cloudwatch PUTs

0095c4c

Comment explaining metric chunking

8ac4398

Make it easier to query bucket with plugin config

b514d90

Docs and safety checks

742a582

Simplifications

c24f883

ianbotsf approved these changes Apr 18, 2024

lauzadis approved these changes Apr 19, 2024

Team feedback

55cba6e

0marperez merged commit fb38865 into main Apr 19, 2024
4 of 5 checks passed

0marperez deleted the artifact-size-metrics branch April 19, 2024 15:00
