Overall dataset structure #10

Closed
lpilz opened this issue Nov 15, 2021 · 3 comments
Labels
enhancement New feature or request

Comments

@lpilz
Collaborator

lpilz commented Nov 15, 2021

As discussed in fmaussion/salem#205 and #2, we want xwrf to provide physical variables on an unstaggered grid. I want to start a discussion on how to implement this.

My idea is to give the user a CF-compliant dataset after calling open_dataset on a wrfout file. However, this raises a few questions for me:

  • How do we manage the dependency of variables like "air_pressure" on "P" and "PB"? Do we keep the NetCDF4DataStore in the background and reference its raw variables lazily, or do we also expose the raw data via something like ds.raw and reference those variables (possibly even the whole netCDF4 dataset, which would be easier but possibly more memory-intensive)?
  • It would be a bit more efficient to compute each diagnostic first and destagger it afterwards. Do diagnostics only depend on variables with the same staggering? (I'd guess so, but I'm not sure.)
  • Which diagnostics do we want to provide and do we want to expose them in a DataTree eventually?
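
To make the compute-then-destagger question concrete, here is a minimal NumPy sketch. It assumes a diagnostic whose inputs share the same staggering: total geopotential as the sum of WRF's PH and PHB, both staggered in the vertical. The destagger helper and the synthetic arrays are illustrative assumptions, not xwrf code.

```python
import numpy as np


def destagger(field: np.ndarray, axis: int) -> np.ndarray:
    """Average neighbouring points along a staggered axis (length n + 1 -> n)."""
    lower = np.take(field, np.arange(0, field.shape[axis] - 1), axis=axis)
    upper = np.take(field, np.arange(1, field.shape[axis]), axis=axis)
    return 0.5 * (lower + upper)


# Synthetic stand-ins for PH (perturbation) and PHB (base-state) geopotential,
# both on a vertically staggered grid: (bottom_top_stag, south_north, west_east).
rng = np.random.default_rng(0)
ph = rng.normal(size=(5, 3, 4))
phb = rng.normal(size=(5, 3, 4))

# Diagnostic first, then a single destagger call ...
geopotential = destagger(ph + phb, axis=0)

# ... versus destaggering each input separately (two calls).
geopotential_alt = destagger(ph, axis=0) + destagger(phb, axis=0)
```

Both orders agree here only because the sum is linear. For a nonlinear diagnostic the two orders would generally give different values, so the choice would no longer be purely about efficiency.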

One suggestion might be:

DataTree("root")
|-- DataNode("2d_variables")
|   |-- DataArrayNode("sea_surface_temperature")
|   |-- DataArrayNode("surface_temperature")
|   |-- DataArrayNode("surface_air_pressure")
|   |-- DataArrayNode("air_pressure_at_sea_level")
|   |-- DataArrayNode("air_temperature_at_2m") (?)
|   ....
|-- DataNode("3d_variables")
    |-- DataArrayNode("air_temperature")
    |-- DataArrayNode("air_pressure")
    |-- DataArrayNode("northward_wind")
    |-- DataArrayNode("eastward_wind")
    ....
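
As a rough illustration of how such a layout could be derived from a flat dataset, the sketch below splits data variables into the two groups by checking for a vertical dimension. The organize function, the dimension name "bottom_top" as the criterion, and the toy dataset are assumptions for illustration only.

```python
import numpy as np
import xarray as xr


def organize(ds: xr.Dataset, vertical_dim: str = "bottom_top") -> dict:
    """Hypothetical helper: split data variables into 2D and 3D groups."""
    names_3d = [name for name, var in ds.data_vars.items() if vertical_dim in var.dims]
    names_2d = [name for name in ds.data_vars if name not in names_3d]
    # Indexing a Dataset with a list of names returns a Dataset subset.
    return {"2d_variables": ds[names_2d], "3d_variables": ds[names_3d]}


# Toy dataset with one surface field and one field that has a vertical dimension.
ds = xr.Dataset(
    {
        "surface_temperature": (("south_north", "west_east"), np.zeros((3, 4))),
        "air_temperature": (("bottom_top", "south_north", "west_east"), np.zeros((2, 3, 4))),
    }
)

groups = organize(ds)
```

A dict of named datasets like this is, as far as I know, the kind of input that DataTree.from_dict accepts, so it could serve as an intermediate step toward the tree above.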

I just started thinking about this, so I will have missed a lot of implementation detail which needs to be discussed, but maybe this can be a start.


lpilz added the enhancement label Nov 15, 2021
kmpaul added this to xWRF Dec 16, 2021
kmpaul added this to the Project xWRF milestone Dec 21, 2021
kmpaul changed the title from "Diagnostic variables" to "Overall dataset structure" Jan 21, 2022
jthielen removed this from the Project xWRF milestone Sep 9, 2022
@jthielen
Collaborator

jthielen commented Sep 20, 2022

Revisiting this issue after the v0.0.1 release (with features essentially frozen for the upcoming v0.0.2), both to document where things stand (in case anyone stumbles upon this issue later) and to discuss components we want on a future roadmap:

  • How do we manage the dependency of variables like "air_pressure" on "P" and "PB"? Do we keep the NetCDF4DataStore in the background and reference its raw variables lazily, or do we also expose the raw data via something like ds.raw and reference those variables (possibly even the whole netCDF4 dataset, which would be easier but possibly more memory-intensive)?

With our accessor approach (xref #44), diagnostic fields are computed and added to the output dataset. By default this happens eagerly, after loading only the fields that are needed; it is fully lazy if the data are dask-chunked. Redundant component fields are dropped by default (they can be kept if desired), and non-redundant components like grid-relative winds are never dropped.
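
A minimal sketch of that pattern, assuming total pressure as the sum of P and PB. The add_air_pressure helper and its drop_components flag are hypothetical names for illustration, not the actual xwrf implementation or its options.

```python
import numpy as np
import xarray as xr


def add_air_pressure(ds: xr.Dataset, drop_components: bool = True) -> xr.Dataset:
    """Hypothetical helper: add total pressure and optionally drop P and PB."""
    pressure = ds["P"] + ds["PB"]  # perturbation + base-state pressure
    pressure.attrs = {"standard_name": "air_pressure", "units": "Pa"}
    out = ds.assign(air_pressure=pressure)
    if drop_components:
        # The components are redundant once the diagnostic exists.
        out = out.drop_vars(["P", "PB"])
    return out


# Toy dataset with constant fields so the result is easy to check.
dims = ("bottom_top", "south_north", "west_east")
ds = xr.Dataset(
    {
        "P": (dims, np.full((2, 3, 4), 500.0)),
        "PB": (dims, np.full((2, 3, 4), 90000.0)),
    }
)

processed = add_air_pressure(ds)
kept = add_air_pressure(ds, drop_components=False)
```

If the input dataset were dask-chunked, the addition would stay lazy until something triggers computation; with in-memory arrays, as here, it is evaluated immediately.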

  • It would be a bit more efficient to compute each diagnostic first and destagger it afterwards. Do diagnostics only depend on variables with the same staggering? (I'd guess so, but I'm not sure.)

After #100,

  • earth-relative wind fields (which aren't really ever useful without destaggering?) have destaggering internal to the diagnostic calculation
  • air_potential_temperature and air_pressure are not staggered to begin with
  • geopotential and geopotential_height are left staggered in the vertical (to be destaggered in a later step, or transformed directly to a more analysis-friendly vertical coordinate)

  • Which diagnostics do we want to provide and do we want to expose them in a DataTree eventually?

This is still TBD! My humble opinion is that this would be a great optional feature (perhaps a to_datatree() if separate from postprocess() or postprocess_and_organize() if combined) once DataTree no longer labels itself as WIP (though, xradar is going full speed ahead on adopting DataTree as it is right now, so perhaps we could too).

Should we update the title of this issue accordingly? Or create a new issue focused on DataTree integration into xWRF?

@lpilz
Collaborator Author

lpilz commented Sep 21, 2022

This is still TBD! My humble opinion is that this would be a great optional feature (perhaps a to_datatree() if separate from postprocess() or postprocess_and_organize() if combined) once DataTree no longer labels itself as WIP (though, xradar is going full speed ahead on adopting DataTree as it is right now, so perhaps we could too).

I agree with keeping it separate from postprocess(). The API I would prefer looks like ds.xwrf.postprocess().xwrf.organize(), but this depends on what functionality we need in organize(). If the two can't be separated, I think postprocess_and_organize() is fine.
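
For reference, chaining like this works with xarray accessors as long as the first step returns a Dataset. The sketch below registers an accessor under the hypothetical name "demo" (to avoid clashing with the real "xwrf" accessor), and both method bodies are stand-ins rather than actual xwrf behavior.

```python
import numpy as np
import xarray as xr


@xr.register_dataset_accessor("demo")  # hypothetical name, not the real accessor
class DemoAccessor:
    def __init__(self, ds: xr.Dataset):
        self._ds = ds

    def postprocess(self) -> xr.Dataset:
        """Stand-in step: returns a Dataset, so further accessor calls can chain."""
        return self._ds.assign_attrs(postprocessed=True)

    def organize(self) -> dict:
        """Stand-in step: group variables by their number of dimensions."""
        groups = {}
        for name, var in self._ds.data_vars.items():
            groups.setdefault(f"{var.ndim}d_variables", []).append(name)
        return {key: self._ds[names] for key, names in groups.items()}


ds = xr.Dataset(
    {
        "T2": (("y", "x"), np.zeros((2, 2))),
        "T": (("z", "y", "x"), np.zeros((3, 2, 2))),
    }
)

organized = ds.demo.postprocess().demo.organize()
```

Keeping the two steps as separate methods means organize() can change its return type (for example to a DataTree) without affecting code that only calls postprocess().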

Should we update the title of this issue accordingly? Or create a new issue focused on DataTree integration into xWRF?

This issue is so ancient, and so much has happened since I opened it, that I think we should create a new "DataTree in xwrf" issue.

@jthielen
Collaborator

This issue is so ancient, and so much has happened since I opened it, that I think we should create a new "DataTree in xwrf" issue.

Done! #110

Repository owner moved this from 🌳 Todo to ✅ Done in Xdev Project Board Sep 21, 2022
jthielen moved this to Done in xWRF Sep 21, 2022

3 participants