Add some information on locale database to the ES docs (#113587)
thecoop committed Sep 30, 2024
1 parent 7b3d726 commit 53d9c3c
Showing 8 changed files with 105 additions and 20 deletions.
8 changes: 8 additions & 0 deletions docs/reference/ingest/processors/date.asciidoc
@@ -67,3 +67,11 @@ the timezone and locale values.
}
--------------------------------------------------
// NOTCONSOLE

[WARNING]
====
// tag::locale-warning[]
The text strings accepted by textual date formats, and calculations for week-dates, depend on the JDK version
that Elasticsearch is running on. For more information see <<custom-date-format-locales,custom date formats>>.
// end::locale-warning[]
====
52 changes: 44 additions & 8 deletions docs/reference/mapping/params/format.asciidoc
@@ -31,8 +31,38 @@ down to the nearest day.
[[custom-date-formats]]
==== Custom date formats

Completely customizable date formats are supported. The syntax for these is explained
https://docs.oracle.com/javase/8/docs/api/java/time/format/DateTimeFormatter.html[DateTimeFormatter docs].
Completely customizable date formats are supported. The syntax for these is explained in
https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/time/format/DateTimeFormatter.html[DateTimeFormatter docs].

[[custom-date-format-locales]]
===== Differences in locale information between JDK versions

Date formats can behave differently between JDK versions and between locales. In particular,
the text strings used for textual date formats can differ, and so can the results
of week-date calculations.

There can be differences in text strings used by the following field specifiers:

* `B`, `E`, `G`, `O`, `a`, `v`, `z` of any length
* `L`, `M`, `Q`, `q`, `c`, `e` of length 3 or greater
* `Z` of length 4

If the text format changes between Elasticsearch or JDK versions, it can cause significant problems
with ingest, output, and re-indexing. It is recommended to always use numerical fields in custom date formats,
which are not affected by locale information.
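
As a rough illustration of the difference between numerical and textual fields, the following Java sketch
formats one fixed date with a numeric-only pattern and with a pattern that uses the `E` and `MMM` specifiers.
It uses only the standard `java.time` API; `LocaleFormatDemo` and its `format` helper are hypothetical names
chosen for this example and are not part of Elasticsearch.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.util.Locale;

// Illustrative only: LocaleFormatDemo and format() are hypothetical names, not Elasticsearch code.
final class LocaleFormatDemo {

    // A fixed date, so that only the pattern and the locale vary.
    static final LocalDate SAMPLE = LocalDate.of(2024, 9, 30);

    // Formats SAMPLE with the given pattern and locale.
    static String format(String pattern, Locale locale) {
        return DateTimeFormatter.ofPattern(pattern, locale).format(SAMPLE);
    }

    public static void main(String[] args) {
        for (Locale locale : new Locale[] { Locale.ROOT, Locale.ENGLISH, Locale.GERMAN }) {
            // Numeric fields only: the digits do not come from the locale database.
            String numeric = format("yyyy-MM-dd", locale);
            // Textual fields (E and MMM): the strings come from the JDK's locale database.
            String textual = format("E, d MMM yyyy", locale);
            System.out.println(locale.toLanguageTag() + " | " + numeric + " | " + textual);
        }
    }
}
```

On a given JDK, the numeric column is the same for every locale. The textual column depends on the locale data
that the JDK provides, so it is the part that may change between JDK versions.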

There can also be differences in week-date calculations using the `Y`, `W`, and `w` field specifiers.
The underlying data used to calculate week-dates can vary depending on the JDK version and locale;
this can cause differences in the calculated week-date for the same calendar dates.
It is recommended that the built-in week-date formats are used, which will always use ISO rules
for calculating week-dates.
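
To see why the underlying week rules matter, the following Java sketch computes the week-date of a single
calendar date under two different sets of rules. `WeekFields.SUNDAY_START` is used here only as a stand-in for
locale-derived rules; which rules a particular locale and JDK version actually produce is not shown.
`WeekDateDemo` and `weekDate` are hypothetical names chosen for this example.

```java
import java.time.LocalDate;
import java.time.temporal.WeekFields;
import java.util.Locale;

// Illustrative only: WeekDateDemo and weekDate() are hypothetical names, not Elasticsearch code.
final class WeekDateDemo {

    // Returns "<weekyear>-W<week>" for the date under the given week rules.
    static String weekDate(LocalDate date, WeekFields rules) {
        int weekYear = date.get(rules.weekBasedYear());
        int week = date.get(rules.weekOfWeekBasedYear());
        return String.format(Locale.ROOT, "%04d-W%02d", weekYear, week);
    }

    public static void main(String[] args) {
        LocalDate date = LocalDate.of(2023, 1, 1); // a Sunday

        // ISO rules: weeks start on Monday and the first week needs at least 4 days.
        System.out.println("ISO rules:          " + weekDate(date, WeekFields.ISO));          // 2022-W52

        // Stand-in for locale-derived rules: weeks start on Sunday, first week needs 1 day.
        System.out.println("Sunday-start rules: " + weekDate(date, WeekFields.SUNDAY_START)); // 2023-W01
    }
}
```

The same calendar date lands in a different week and a different weekyear depending on the rules. In `java.time`
patterns, the `Y`, `W`, and `w` specifiers take their rules from the formatter's locale, which is why changed
locale data can change the result.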

In particular, there is a significant change in locale information between JDK releases 22 and 23.
Elasticsearch will use the _COMPAT_ locale database when run on JDK 22 and earlier,
and will use the _CLDR_ locale database when run on JDK 23 and later. This change can cause significant differences
in the textual date formats accepted by Elasticsearch, and in calculated week-dates. If you are using
affected specifiers, you may need to modify your ingest or output integration code to account
for the differences between these two JDK versions.

[[built-in-date-formats]]
==== Built In Formats
@@ -256,31 +286,37 @@ The following tables lists all the defaults ISO formats supported:
`week_date` or `strict_week_date`::

A formatter for a full date as four digit weekyear, two digit week of
weekyear, and one digit day of week: `xxxx-'W'ww-e`.
weekyear, and one digit day of week: `YYYY-'W'ww-e`.
This uses the ISO week-date definition.

`week_date_time` or `strict_week_date_time`::

A formatter that combines a full weekyear date and time, separated by a
'T': `xxxx-'W'ww-e'T'HH:mm:ss.SSSZ`.
'T': `YYYY-'W'ww-e'T'HH:mm:ss.SSSZ`.
This uses the ISO week-date definition.

`week_date_time_no_millis` or `strict_week_date_time_no_millis`::

A formatter that combines a full weekyear date and time without millis,
separated by a 'T': `xxxx-'W'ww-e'T'HH:mm:ssZ`.
separated by a 'T': `YYYY-'W'ww-e'T'HH:mm:ssZ`.
This uses the ISO week-date definition.

`weekyear` or `strict_weekyear`::

A formatter for a four digit weekyear: `xxxx`.
A formatter for a four digit weekyear: `YYYY`.
This uses the ISO week-date definition.

`weekyear_week` or `strict_weekyear_week`::

A formatter for a four digit weekyear and two digit week of weekyear:
`xxxx-'W'ww`.
`YYYY-'W'ww`.
This uses the ISO week-date definition.

`weekyear_week_day` or `strict_weekyear_week_day`::

A formatter for a four digit weekyear, two digit week of weekyear, and one
digit day of week: `xxxx-'W'ww-e`.
digit day of week: `YYYY-'W'ww-e`.
This uses the ISO week-date definition.

`year` or `strict_year`::

10 changes: 9 additions & 1 deletion docs/reference/mapping/types/date.asciidoc
@@ -81,6 +81,14 @@ on those dates so they should be avoided.
// end::decimal-warning[]
====

[WARNING]
====
// tag::locale-warning[]
The text strings accepted by textual date formats, and calculations for week-dates, depend on the JDK version
that Elasticsearch is running on. For more information see <<custom-date-format-locales,custom date formats>>.
// end::locale-warning[]
====

[[multiple-date-formats]]
==== Multiple date formats

@@ -126,7 +134,7 @@ The following parameters are accepted by `date` fields:

The locale to use when parsing dates since months do not have the same names
and/or abbreviations in all languages. The default is the
https://docs.oracle.com/javase/8/docs/api/java/util/Locale.html#ROOT[`ROOT` locale],
https://docs.oracle.com/javase/8/docs/api/java/util/Locale.html#ROOT[`ROOT` locale].

<<ignore-malformed,`ignore_malformed`>>::

19 changes: 18 additions & 1 deletion docs/reference/migration/migrate_8_16.asciidoc
@@ -16,5 +16,22 @@ coming::[8.16.0]
[[breaking-changes-8.16]]
=== Breaking changes

There are no breaking changes in {es} 8.16.
The following changes in {es} 8.16 might affect your applications
and prevent them from operating normally.
Before upgrading to 8.16, review these changes and take the described steps
to mitigate the impact.

[discrete]
[[breaking_816_locale_change]]
==== JDK locale database change

{es} 8.16 changes the version of the JDK that is included from version 22 to version 23. This changes
the locale database that is used by Elasticsearch from the _COMPAT_ database to the _CLDR_ database.
This can result in significant changes to custom textual date field formats,
and to calculations for custom week-date formats.

For more information see <<custom-date-format-locales,custom date formats>>.

If you run {es} 8.16 on JDK version 22 or below, it will use the _COMPAT_ locale database
to match the behavior of 8.15. However, please note that starting with {es} 9.0,
{es} will use the _CLDR_ database regardless of the JDK version it is run on.
19 changes: 16 additions & 3 deletions docs/reference/setup/install.asciidoc
@@ -5,8 +5,8 @@
[[hosted-elasticsearch-service]]
=== Hosted Elasticsearch Service

{ecloud} offers all of the features of {es}, {kib}, and Elastic’s {observability}, {ents}, and {elastic-sec} solutions as a hosted service
available on AWS, GCP, and Azure.
{ecloud} offers all of the features of {es}, {kib}, and Elastic’s {observability}, {ents}, and {elastic-sec} solutions as a hosted service
available on AWS, GCP, and Azure.

To set up Elasticsearch in {ecloud}, sign up for a {ess-trial}[free {ecloud} trial].

@@ -17,7 +17,7 @@ To set up Elasticsearch in {ecloud}, sign up for a {ess-trial}[free {ecloud} tri
If you want to install and manage {es} yourself, you can:

* Run {es} using a <<elasticsearch-install-packages,Linux, MacOS, or Windows install package>>.
* Run {es} in a <<elasticsearch-docker-images,Docker container>>.
* Run {es} in a <<elasticsearch-docker-images,Docker container>>.
* Set up and manage {es}, {kib}, {agent}, and the rest of the Elastic Stack on Kubernetes with {eck-ref}[{eck}].

TIP: To try out Elasticsearch on your own machine, we recommend using Docker and running both Elasticsearch and Kibana. For more information, see <<run-elasticsearch-locally,Run Elasticsearch locally>>. Please note that this setup is *not suitable for production use*.
@@ -98,6 +98,19 @@ the bundled JVM are treated as if they were within {es} itself.
The bundled JVM is located within the `jdk` subdirectory of the {es} home
directory. You may remove this directory if using your own JVM.

[discrete]
[[jdk-locale]]
=== JDK locale database

The locale database that {es} uses to map from various date formats to
the underlying date storage format depends on the version of the JDK
that {es} is running on. On JDK version 23 and above, {es} will use the
_CLDR_ database. On JDK version 22 and below, {es} will use the _COMPAT_
database. This means that the strings used for textual date formats,
and the output of custom week-date formats, may change when moving from
a previous JDK version to JDK 23 or above. For more information, see
<<custom-date-format-locales,custom date formats>>.

[discrete]
[[jvm-agents]]
=== JVM and Java agents
@@ -84,6 +84,7 @@ public enum ReferenceDocs {
FLOOD_STAGE_WATERMARK,
X_OPAQUE_ID,
FORMING_SINGLE_NODE_CLUSTERS,
JDK_LOCALE_DIFFERENCES,
// this comment keeps the ';' on the next line so every entry above has a trailing ',' which makes the diff for adding new links cleaner
;

@@ -9,6 +9,7 @@

package org.elasticsearch.common.time;

import org.elasticsearch.common.ReferenceDocs;
import org.elasticsearch.common.logging.DeprecationCategory;
import org.elasticsearch.common.logging.DeprecationLogger;
import org.elasticsearch.core.Predicates;
@@ -405,18 +406,18 @@ static void checkTextualDateFormats(String format) {
deprecationLogger.warn(
DeprecationCategory.PARSING,
"cldr_date_formats_" + format,
"Date format [{}] contains textual field specifiers that could change in JDK 23."
+ " For more information, see https://ela.st/jdk-23-locales",
format
"Date format [{}] contains textual field specifiers that could change in JDK 23. See [{}] for more information.",
format,
ReferenceDocs.JDK_LOCALE_DIFFERENCES
);
}
if (CONTAINS_WEEK_DATE_SPECIFIERS.test(format)) {
deprecationLogger.warn(
DeprecationCategory.PARSING,
"cldr_week_dates_" + format,
"Date format [{}] contains week-date field specifiers that are changing in JDK 23."
+ " For more information, see https://ela.st/jdk-23-locales",
format
"Date format [{}] contains week-date field specifiers that are changing in JDK 23. See [{}] for more information.",
format,
ReferenceDocs.JDK_LOCALE_DIFFERENCES
);
}
}
@@ -43,5 +43,6 @@
"MAX_SHARDS_PER_NODE": "size-your-shards.html#troubleshooting-max-shards-open",
"FLOOD_STAGE_WATERMARK": "fix-watermark-errors.html",
"X_OPAQUE_ID": "api-conventions.html#x-opaque-id",
"FORMING_SINGLE_NODE_CLUSTERS": "modules-discovery-bootstrap-cluster.html#modules-discovery-bootstrap-cluster-joining"
"FORMING_SINGLE_NODE_CLUSTERS": "modules-discovery-bootstrap-cluster.html#modules-discovery-bootstrap-cluster-joining",
"JDK_LOCALE_DIFFERENCES": "mapping-date-format.html#custom-date-format-locales"
}
