Clarification of time coordinates, especially leap seconds, define utc and tai calendars and leap_seconds in units_metadata #542

Open
JonathanGregory opened this issue Sep 16, 2024 · 5 comments · May be fixed by #541
Labels
enhancement Proposals to add new capabilities, improve existing ones in the conventions, improve style or format

Comments

@JonathanGregory
Contributor

Summary

This proposal aims to reorganise and clarify the existing text, mostly in section 4.4, about time coordinates, with no change in meaning. It includes a new subsection on leap seconds and their implications for the CF standard calendar, with examples and a diagram, and defines a new use of the units_metadata attribute to remove ambiguity in the interpretation of leap seconds in the standard calendar. It introduces two new CF calendars: utc for UTC with leap seconds properly accounted for, and tai for atomic clock time, used for some satellite data.

Benefits

Several previous lengthy but inconclusive CF discussions have shown that the treatment of leap seconds is unclear and unsatisfactory. In this proposal we hope to provide an acceptable solution to these difficulties.

Moderator

None yet

Associated pull request

#541

Detailed Proposal

A huge amount of hard thought has been spent on previous long discussions about CF calendars and leap seconds (including #148, discuss issue #297, Discussion #304). The last of these went quiet in April.

Since then, we (@davidhassell and @JonathanGregory) have been working on a proposal, on which we'd now like to invite comments. If you are interested, please look at our modified text, especially section 4.4 on time coordinates. You can find this in any of the following:

The main changes are these:

  • Reorganisation and clarification of the existing text, with no change in its meaning. We have put the text about units into its own subsection, including writing down the format of the reference date/time and time zone, which wasn't shown except by an example. We have put the detailed text and examples concerning the none and paleoclimate calendars into their own subsections as well, so that the subsection on calendars is limited to giving the definition of each calendar.

  • Opening statements defining date/times and time coordinates, and an explanation in the subsection on calendars of how they relate to time intervals. These points have been contentious in the past, so we feel it's best to state plainly how they should be understood in CF (according to this proposal).

  • A new subsection on leap seconds, which explains in detail their implications for the CF standard calendar. Difficulties arise because that calendar is, and has always been, used in practice both for data that truly does not have UTC leap seconds in its time axis (e.g. a model which uses the real-world Gregorian calendar with every day having 86400 seconds) and for data which does, or should, have leap seconds but they are ignored in the time coordinates (e.g. observational data recorded with UTC time). Rather than deprecating or prohibiting one or other of these variants, we propose a new convention for the units_metadata attribute to distinguish them, so that they can be handled correctly by the data-user. The units_metadata attribute was recently added to CF to handle the difficulty of degrees_celsius being used in two different ways that require different treatment by data-users, after a very long and difficult discussion. We are hoping that it can work the same magic with leap seconds.

  • A worked example and a diagram for leap seconds. The diagram was inspired by the graph posted by @ChrisBarker-NOAA. We've also produced a table illustrating how a selection of date/times and coordinates are related across many CF calendars, inspired by Lars's table. We propose to put this in an appendix to the convention, if this proposal is accepted. Thanks, Lars and Chris, for the ideas.

  • Two new calendars: utc for UTC with leap seconds properly accounted for, and tai for atomic clock time, used for some satellite data. The latter has been requested in previous discussions. The former hasn't explicitly been requested, but many comments imply that it would be preferred to standard for some purposes.
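
To make the ambiguity described in the leap-seconds bullet concrete, here is a minimal Python sketch of the two readings of the same pair of date/times. The partial leap-second table and the function names are illustrative assumptions, not part of the proposal or of any CF software; Python's `datetime` itself has no notion of leap seconds.

```python
from datetime import datetime, timedelta

# Partial, illustrative table: UTC dates whose final minute contained a
# positive leap second (23:59:60). A real application needs the full list.
LEAP_SECOND_DAYS = [datetime(2015, 6, 30), datetime(2016, 12, 31)]


def elapsed_ignoring_leap_seconds(start, end):
    """Seconds between two date/times if every day has 86400 seconds."""
    return (end - start).total_seconds()


def elapsed_counting_leap_seconds(start, end):
    """Seconds between two UTC date/times, adding one per leap second crossed."""
    # A leap second on day D ends at 00:00:00 on day D + 1.
    crossed = sum(
        1 for day in LEAP_SECOND_DAYS
        if start < day + timedelta(days=1) <= end
    )
    return (end - start).total_seconds() + crossed


start = datetime(2016, 12, 31, 23, 59, 59)
end = datetime(2017, 1, 1, 0, 0, 0)
without_leap = elapsed_ignoring_leap_seconds(start, end)   # 1.0
with_leap = elapsed_counting_leap_seconds(start, end)      # 2.0
```

The same two date/times are one second apart under the first reading and two seconds apart under the second, which is why a data-user needs to be told which reading applies.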

Previous discussions on these matters have evoked disagreements on principle which turned out to be irreconcilable by discussion in the issue, and no conclusion was reached. To avoid that outcome, we'd like to try a different method with the present proposal. If you find something in this proposal which you feel you couldn't possibly accept, even with modification, please say so in this issue. If anyone feels like that, we will convene a group to discuss the disagreements by video meeting, as we've done with a couple of other difficult issues. The group would be charged with reaching a resolution soon enough for some version of this proposal to be accepted for the next release, probably with a deadline in November. If that can't be done, we'll have to start again when someone has a new idea in future.

On the other hand, any suggestions, comments or concerns on clarity, presentation and details of the convention can probably be resolved by discussion in the usual way in this issue. We look forward to hearing what you think!

@JonathanGregory and @davidhassell

@JonathanGregory
Contributor Author

In discussion 304, @ChrisBarker-NOAA has given his support to this proposal (thanks, Chris). He writes:

My only real concern is that the UTC calendar is an "attractive nuisance", and there is very little software that handles it properly, and many people use "UTC" imprecisely. But the text is very clear about the leap seconds, so buyer beware, I guess.

Please could anyone who wants to comment on this proposal do so here in this issue, rather than in discussion 304. Thanks.

@JonathanGregory
Contributor Author

@ChrisBarker-NOAA has also made some comments on the PR (#541). I'm copying them here, because discussion of "substantive" points in a PR is awkward to follow subsequently. It's easier to have a single record in the issue. Marking typos etc. in a PR is fine, because they don't need discussion or reply.

I've usually seen this spelled datetime or date-time, rather than date/time. I think those forms are a little better. I'm not sure why, but date/time reads to me a bit like date or time, rather than a compound word.

I agree that "date/time" isn't ideal because "/" means "or", but I don't have a strong view on what we should write. We used "date/time" because it appears like that elsewhere in the convention document, especially chapter 7. If there is a consensus on a preferred way to write it, or a different term to use, we could change it throughout the document.

Regarding the sentence, "To mark this distinction, the canonical unit given for quantities used for time coordinates is s since 1958-1-1", just curious -- why 1958? I actually saw this in a file in the wild recently, and was wondering where in the heck it came from! I guess I'd expect 1970-1-1 [as that's the most common epoch used] as canonical, but it's not vital.

UTC and TAI have a complicated history, as described by Wikipedia. My understanding is that, to summarise it simply, TAI began on 1958-1-1, with the modern definition of a second in terms of the caesium atomic clock. In 1972 UTC was rebased on TAI, in such a way that they were treated as coincident at 1958-1-1, with 10 leap seconds having been added by 1972. Hence it's convenient to regard UTC as beginning in 1958 as well as TAI. There is a sentence of explanation elsewhere in the CF text, which Chris discovered later. I will put something at the point where this remark was made as well.
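
As a rough arithmetic check of that simplified account, the following Python sketch computes the coordinate of 1972-1-1 0:0:0 in `s since 1958-1-1`, first with every day counted as 86400 seconds and then with the 10 seconds of accumulated TAI-UTC difference added. The constant and variable names are assumptions for illustration only, and the sketch deliberately ignores the more complicated pre-1972 history.

```python
from datetime import date

SECONDS_PER_DAY = 86400
# Assumption taken from the simplified summary above:
# TAI was 10 seconds ahead of UTC at the start of 1972.
TAI_MINUS_UTC_AT_1972 = 10

days = (date(1972, 1, 1) - date(1958, 1, 1)).days        # 5113 days
seconds_without_leap = days * SECONDS_PER_DAY            # 441763200
seconds_tai = seconds_without_leap + TAI_MINUS_UTC_AT_1972  # 441763210
```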

[Where we discuss the definition of year and month: insert] "A day is exactly 24 hours (86400 sec). It is not a calendar day." I suggest this because in, e.g. the Python datetime library, a day is a calendar day, rather than 24 hours. I think that only makes a difference during a DST transition, which CF doesn't allow anyway (I hope!) -- but it wouldn't hurt to be extra clear here.

That's fine, thanks. I will insert it. The time zone definitions are plus/minus numbers of hours (and minutes), not names - no automatic transitions are implied by them!

[Where we discuss time zones, replace "time zone" with] "time zone offset" -- time zone is the administrative thing, and has a name, and maybe DST transitions -- the time zone offset is the clear and simple thing.

OK, thanks.

[Concerning the new utc calendar, we have proposed "Date/times in the future are not allowed in this calendar, because it is unknown when future leap seconds will occur." Chris comments:] I think some warning is given before a leap second is introduced -- so we could go a bit in the future (Wikipedia says "leap seconds are announced only six months in advance.") -- but I can't find a formal reference for that -- so I guess ruling out the future altogether is probably wise.

In practice I'm sure it's OK if data-writers produce data for the future which they know will be correct because of advance warning. The checker will give an error if it finds a date which is in the future when the checker is run, but the future becomes the past at the rate of 1 second per second, and the same file will not give an error once that has happened! Should this be a recommendation not to write future UTC, rather than a prohibition?
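
The point about the checker can be sketched in a few lines of Python. This is only an illustration of the behaviour described above, not the CF checker's actual implementation; the function name and the sample date/times are hypothetical.

```python
from datetime import datetime, timezone


def future_datetimes(datetimes, now=None):
    """Return the date/times later than `now` (default: current UTC time)."""
    if now is None:
        now = datetime.now(timezone.utc)
    return [dt for dt in datetimes if dt > now]


# Hypothetical time coordinates decoded from one file.
coords = [
    datetime(2024, 6, 1, tzinfo=timezone.utc),
    datetime(2025, 6, 1, tzinfo=timezone.utc),
]

# The same file checked on two different dates.
first_run = future_datetimes(coords, now=datetime(2024, 9, 16, tzinfo=timezone.utc))
later_run = future_datetimes(coords, now=datetime(2026, 1, 1, tzinfo=timezone.utc))
```

The first run flags one date/time and the later run flags none, although the file is unchanged.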

Thanks for these comments, Chris. I have resolved them in the PR.

@JonathanGregory
Contributor Author

Dear Chris

I have made changes (in the PR, html and pdf) following your suggestions. Two of them were more complicated than I had expected. Here are the new versions of various paragraphs:

In 4.4.1

UDUNITS defines a minute as 60 seconds, an hour as 3600 seconds and a day as 86400 seconds. These are not calendar units. When civil clock time changes at the start and end of summer in many countries, the day according to its calendar date lasts for 23 or 25 hours, but the UDUNITS and CF day is always 24 hours. When a leap second is inserted into UTC, the minute, hour and day affected differ by one second from their usual durations according to clock time, but the UDUNITS and CF minute, hour and day do not; they are fixed units of measure.
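
A small Python sketch of the distinction drawn in that paragraph: the units are fixed multiples of a second, whereas a civil calendar day can last 23 hours when clocks go forward. The conversion table, the helper name and the chosen date are assumptions for illustration; fixed offsets are used in place of a named time zone, which CF units do not allow anyway.

```python
from datetime import datetime, timedelta, timezone

# Fixed durations in seconds, as stated in the paragraph above.
SECONDS_PER = {"second": 1, "minute": 60, "hour": 3600, "day": 86400}


def to_seconds(value, unit):
    """Convert a value in a fixed time unit to seconds."""
    return value * SECONDS_PER[unit]


# A civil day on which clocks move from offset +0 to offset +1 (illustrative):
# local midnight at +00:00 to the next local midnight at +01:00.
start = datetime(2024, 3, 31, 0, 0, tzinfo=timezone.utc)
end = datetime(2024, 4, 1, 0, 0, tzinfo=timezone(timedelta(hours=1)))
civil_day_hours = (end - start) / timedelta(hours=1)     # 23.0

unit_day_hours = to_seconds(1, "day") / SECONDS_PER["hour"]  # 24.0
```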

The default time zone offset is zero. In a time zone with zero offset, time (approximately) equals mean solar time for 0 degrees_east of longitude. (Although this may be exact in a model, in reality the time with zero time zone offset differs by some seconds from mean solar time; see the discussion of UTC and leap seconds in <<4.4.2>>.) If both time and time zone offset are omitted the time is 00:00:00 (midnight, the start of the day). Thus, units = "days since 1990-1-1" means the same as units = "days since 1990-1-1 0:0:0".

For example, seconds since 1992-10-8 15:15:42.5 -6:00 indicates seconds since October 8th, 1992 at 3 hours, 15 minutes and 42.5 seconds in the afternoon, in a time zone where the date/time is six hours behind the default. Subtracting the time zone offset from a given date/time converts it to the equivalent date/time with zero time zone offset e.g. 1989-12-31 18:00:00 -6 identifies the same instant as 1990-1-1 0:0:0.
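
The subtraction rule in that paragraph can be checked with a short Python sketch. The helper name `to_zero_offset` is hypothetical; the cross-check uses the standard library's own offset-aware comparison.

```python
from datetime import datetime, timedelta, timezone


def to_zero_offset(local, offset):
    """Subtract the time zone offset from a naive local date/time."""
    return local - offset


minus_six = timedelta(hours=-6)

# 1989-12-31 18:00:00 -6 identifies the same instant as 1990-1-1 0:0:0.
converted = to_zero_offset(datetime(1989, 12, 31, 18, 0, 0), minus_six)

# Reference date/time of "seconds since 1992-10-8 15:15:42.5 -6:00".
reference = to_zero_offset(datetime(1992, 10, 8, 15, 15, 42, 500000), minus_six)

# Cross-check with datetime's built-in handling of fixed offsets.
aware = datetime(1989, 12, 31, 18, 0, 0, tzinfo=timezone(minus_six))
same_instant = aware == datetime(1990, 1, 1, tzinfo=timezone.utc)
```

Here `converted` is 1990-01-01 00:00:00 and `reference` is 1992-10-08 21:15:42.5, both expressed with zero time zone offset.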

In 4.4.2

In the real world, the international basis of civil timekeeping is Coordinated Universal Time (UTC). Leap seconds are adjustments occasionally made in UTC, in order to keep it close to mean solar time at 0 degrees_east i.e. the time zone with the default (zero) time zone offset in UDUNITS and CF (see <<4.4.1>>).

Do they look OK?

Cheers

Jonathan

@ChrisBarker-NOAA
Contributor

These look great -- thanks!

Where are we at with:

I agree that "date/time" isn't ideal because "/" means "or", but I don't have a strong view on what we should write. We used "date/time" because it appears like that elsewhere in the convention document, especially chapter 7. If there is a consensus on a preferred way to write it, or a different term to use, we could change it throughout the document.

I vote for either "datetime" or "date-time" -- but yes, it should be the same everywhere, so if this is too much churn, we can leave it as is.

Maybe wait to see if anyone else has a preference?

@JonathanGregory
Contributor Author

Dear @chris-little

Thanks for reviewing the PR. I am glad you found it clear. You commented

You might want to consider removing the word midnight, or replace it with midnight at 0 degrees longitude. It is a bit UK-centric. The ISO 8601 standard removed that word from its content some years ago.

Thanks for this point. I have qualified "midnight" with "at 0 degrees_east" in all the places I could find. It's updated in the PR, but I haven't updated the HTML and PDF.

Best wishes

Jonathan
