Define base encoding in x509.serial_number #2383

Merged
merged 4 commits into from
Oct 15, 2024

Conversation

haesbaert
Contributor

@haesbaert haesbaert commented Sep 17, 2024

This proposes enforcing the base encoding of x509.serial_number to be 16.

The current definition is too loose and leaves room for interpretation. Base 16 is also more common, and it will help users correlate the value with existing tools. This change was prompted by elastic/sdh-beats#5089 (internal link), where a user expected to see the value in the same format as other tools.

This narrows the definition of x509.serial_number to hexadecimal encoding. Otherwise integrations end up choosing their own encoding: as noted below, Zeek uses base 16 while the rest of Beats uses base 10.

Particularly relevant comment from @andrewkroh (on an internal issue):

I think tls.client.x509.serial_number and tls.server.x509.serial_number would better serve our users if they were hex-encoded. It would provide better interoperability with other tools. I found one example in the Zeek integration that is already using hex (there may be other integrations).

The ECS description is not as "strong" as I would prefer. I think its original intent was uppercase hex when it said "if this value is alphanumeric, it should be formatted without colons and uppercase characters." So I'm of the opinion we should treat this as a bug.

The source of these values should be in the packetbeat/protos/tls Go code.

The tls.detailed.* description explicitly says it is base-10 so I don't see a reason to change those.

Information Gathering / Research

Field Definitions

| name | type | external | source | description |
| --- | --- | --- | --- | --- |
| tls.client.x509.serial_number | keyword | ecs | data_stream/tls/fields/ecs.yml:137:3 | Unique serial number issued by the certificate authority. For consistency, if this value is alphanumeric, it should be formatted without colons and uppercase characters. |
| tls.detailed.client_certificate_chain.issuer.serial_number | keyword | | data_stream/tls/fields/protocol.yml:188:19 | |
| tls.detailed.client_certificate_chain.serial_number | keyword | | data_stream/tls/fields/protocol.yml:210:15 | Base 10 representation of the certificate serial number. |
| tls.detailed.client_certificate_chain.subject.serial_number | keyword | | data_stream/tls/fields/protocol.yml:188:19 | |
| tls.detailed.server_certificate_chain.issuer.serial_number | keyword | | data_stream/tls/fields/protocol.yml:188:19 | |
| tls.detailed.server_certificate_chain.serial_number | keyword | | data_stream/tls/fields/protocol.yml:210:15 | Base 10 representation of the certificate serial number. |
| tls.detailed.server_certificate_chain.subject.serial_number | keyword | | data_stream/tls/fields/protocol.yml:188:19 | |
| tls.server.x509.serial_number | keyword | ecs | data_stream/tls/fields/ecs.yml:213:3 | Unique serial number issued by the certificate authority. For consistency, if this value is alphanumeric, it should be formatted without colons and uppercase characters. |

X.509 (RFC 5280) tells us that serial numbers must be integers and may be as long as 20 bytes.

openssl emits hex

$ openssl x509 -in googlecert.pem -noout -serial
serial=244E52D96B551F960A00000000F2BAF4

wireshark shows hex

serialNumber: 0x045d7ad2b40288f1d046fff688c14e0b0bb5
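
To illustrate the gap between the two encodings, here is a minimal Python sketch that turns a base 10 serial number into the uppercase, colon-free base 16 form that openssl prints, and normalizes a hex string such as the Wireshark one. The function names `decimal_serial_to_hex` and `normalize_hex_serial` are hypothetical and are not part of Beats or ECS. Rejecting negative values is an assumption of this sketch.

```python
HEX_DIGITS = set("0123456789ABCDEF")


def decimal_serial_to_hex(serial: str) -> str:
    """Convert a base 10 serial number string to uppercase base 16.

    Hypothetical helper: leading zeros are not preserved, because the
    value is round-tripped through an integer.
    """
    value = int(serial, 10)
    if value < 0:
        # Assumption of this sketch: negative serials are rejected.
        raise ValueError("serial number must not be negative")
    return format(value, "X")


def normalize_hex_serial(serial: str) -> str:
    """Drop an optional 0x prefix and any colons, then uppercase.

    Hypothetical helper: works on the string only, so leading zeros
    in the input are kept.
    """
    cleaned = serial.strip()
    if cleaned[:2].lower() == "0x":
        cleaned = cleaned[2:]
    cleaned = cleaned.replace(":", "").upper()
    if not cleaned or any(ch not in HEX_DIGITS for ch in cleaned):
        raise ValueError(f"not a hexadecimal serial number: {serial!r}")
    return cleaned


if __name__ == "__main__":
    print(decimal_serial_to_hex("255"))
    print(normalize_hex_serial("0x045d7ad2b40288f1d046fff688c14e0b0bb5"))
```

Note that the integer conversion drops leading zeros while the string normalization keeps them, so an implementation would need to pick one convention.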

References

Narrow the definition of x509.serial_number to be encoded in hexadecimal,
otherwise we end up with integrations choosing their own encoding, as noted
below, Zeek uses base 16 while the rest of beats is using base 10.

Related to elastic/sdh-beats#5089.
Reasoning in: elastic/sdh-beats#5089 (comment)
@haesbaert haesbaert requested a review from a team as a code owner September 17, 2024 20:19

Documentation changes preview: https://ecs_bk_2383.docs-preview.app.elstc.co/diff

@@ -52,8 +52,8 @@
  type: keyword
  short: Unique serial number issued by the certificate authority.
  description: >
- Unique serial number issued by the certificate authority. For consistency, if this value is alphanumeric, it should be
- formatted without colons and uppercase characters.
+ Unique serial number issued by the certificate authority. For consistency, this should be
Contributor

Given this is a should, the encoding is still optional, which might not completely resolve the original problem. Is that what you intend? If the wording is changed to something stronger, though, it could be considered a breaking change.

If you want it to be enforced, what do you think of having should in ECS v8, and changing to must in the next major release?

Member

I agree that making it a MUST would be breaking for v8, so I think it needs to remain a SHOULD.

Changing it in the next major sounds like a good idea.

Contributor Author

I'm fine with that; I'm not familiar with the process at all.
For the next release, shouldn't we restrict it even more? I'd prefer to have something that is strict on case, colons and base.
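
A strict rule of that kind could be checked with something like the following Python sketch. It accepts only uppercase base 16 digits with no colons or prefix. The name `is_strict_serial` is hypothetical, and the exact rule is an assumption, since the wording for the next major release had not been settled at this point in the discussion.

```python
import re

# Assumed strict form: one or more uppercase base 16 digits, nothing else.
_STRICT_SERIAL = re.compile(r"[0-9A-F]+")


def is_strict_serial(value: str) -> bool:
    """Return True if value is uppercase hex without colons or a prefix."""
    return _STRICT_SERIAL.fullmatch(value) is not None


if __name__ == "__main__":
    for candidate in ("244E52D96B551F960A00000000F2BAF4", "24:4E:52", "244e52"):
        print(candidate, is_strict_serial(candidate))
```

A string made only of the digits 0 to 9 passes this check whether it was produced in base 10 or base 16, so a format check alone cannot detect an integration that still emits decimal values.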

Comment on lines +55 to +56
Unique serial number issued by the certificate authority. For consistency, this should be
encoded in base 16 and formatted without colons and uppercase characters.
Member

@andrewkroh andrewkroh Sep 17, 2024

"without colons and uppercase characters" is ambiguous. Does it mean "without colons and without uppercase"? Or does it mean "without colons and with uppercase"? 😄

Suggested change
- Unique serial number issued by the certificate authority. For consistency, this should be
- encoded in base 16 and formatted without colons and uppercase characters.
+ Unique serial number issued by the certificate authority. For consistency, this should be
+ encoded in base 16 and formatted as uppercase characters without colons.

(edited to leave "base 16" as is)

Contributor Author

Ha! Good point, my brain didn't spot that.
I think it should still be base 16, not base-16, as that is consistent with the other cases that use a space, as in:
https://github.com/elastic/ecs/blob/main/schemas/dns.yml#L98
grep -r 'base-' doesn't return any other uses.

Contributor Author

If all agree, I'd bring this PR to 8.x with the wording changes from @andrewkroh, and then open a second PR to do the breaking/must changes.

Member

SGTM. If we haven't already created an 8.x branch then we should do that before you merge the second PR.

Contributor Author

Agreed, I'm just a bit confused: we have branches 8.{11,12,13,14}. When you say 8.x, does this mean 8.{12,13,14}?
Do we change ECS definitions (even if only a should) for current and older releases (<= 8.11)?
In other words, where should I branch from?

Member

This is what other repos like beats/kibana/elasticsearch are doing. We should create an 8.x branch from main, and future 8.{16,17,18} branches will then come from 8.x. We should backport to 8.x unless the change is something specifically for 9.0 (like a breaking change).

Contributor Author

Got it, in the meantime @mjwolf volunteered to take care of this.

Contributor

I've created an 8.x branch now. I think the best way to handle this is to merge this PR to main first (and I'll handle the backport to the 8.x branch), and then afterwards create another PR with the changes intended for 9.x only that will stay in main only.

Contributor

@haesbaert with this PR merged and backported to 8.x, you can now create another PR with the changes intended for 9.x only. You can target it to main and add a backport-skip label.

@mjwolf mjwolf merged commit 4fa0abd into main Oct 15, 2024
4 checks passed
@mjwolf mjwolf deleted the x509.serial_number branch October 15, 2024 20:51
mjwolf pushed a commit that referenced this pull request Oct 15, 2024
Narrow the definition of x509.serial_number to be encoded in hexadecimal,
otherwise we end up with integrations choosing their own encoding, as noted
below, Zeek uses base 16 while the rest of beats is using base 10.
haesbaert added a commit that referenced this pull request Nov 6, 2024
We made 8.x a `should` for the same field in
4fa0abd.

As discussed in #2383 (comment)
we are making this a `must` for 9.x.