Add cloud utilities for multirows, datalinks #22

Open: wants to merge 12 commits into main

@zoghbi-a (Contributor) commented Jul 10, 2023

This PR is part of the effort to break PR #18 into smaller pieces. The code here processes the cloud information, and it uses the download functions from PR #21, so that PR should be approved first.

The main parts:

  • Functions that look for and process the cloud information: _process_json_column, _process_cloud_datalinks and _process_ucd_column. These return dictionary elements, e.g. {'url': ...} for prem or {'uri': ..., 'bucket_name': ..., 'key': ...} for aws, that can be passed to the relevant download functions.
  • The user-facing function get_data_product2. It is similar to what we have now, but it handles datalinks correctly and accepts multiple Records (QueryResult) and Rows (Table). Once review is done, it should be renamed to get_data_product.
  • get_data_product takes a product (pyvo Record or astropy Row/Table) and a provider (prem, aws, etc.) and returns a ProviderHandler instance (a list that has a download method).
  • get_data_product2 also takes a mode option for users who want to select json or datalink (others can be added).
  • ProviderHandler also has a get_links method that can be used to find the url/uri, for example when the user wants to use the cloud cutout.
  • All the code added relative to Add download utilities #21 is in _fornax.py so as not to interfere with fornax.py; once review is done, the former should replace the latter.
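
To make the shape of the returned object concrete, here is a minimal, self-contained sketch of a list-like handler with download and get_links methods, fed with link dictionaries of the form described above. It is not the code in this PR: the class name ProviderHandlerSketch, the constructor arguments, the fake_download helper and the example URLs are all assumptions made for illustration, and no network access happens.

```python
from typing import Callable, Dict, List

Link = Dict[str, str]


class ProviderHandlerSketch(list):
    """Hypothetical stand-in for ProviderHandler: a list of link dicts
    that also knows how to report and download its links."""

    def __init__(self, links: List[Link], provider: str,
                 downloader: Callable[[Link], str]):
        super().__init__(links)
        self.provider = provider
        self._downloader = downloader

    def get_links(self) -> List[str]:
        # Assumption: prem entries carry 'url', cloud entries carry 'uri'.
        key = 'url' if self.provider == 'prem' else 'uri'
        return [link[key] for link in self]

    def download(self) -> List[str]:
        # Hand each link dict to the provider-specific download function.
        return [self._downloader(link) for link in self]


def fake_download(link: Link) -> str:
    """Pretend download: return the local file name without any I/O."""
    target = link.get('url') or link['uri']
    return target.rsplit('/', 1)[-1]


# Made-up link dictionaries in the two shapes named in the description.
prem = ProviderHandlerSketch(
    [{'url': 'https://example.org/data/obs1.fits'}], 'prem', fake_download)
aws = ProviderHandlerSketch(
    [{'uri': 's3://example-bucket/data/obs1.fits',
      'bucket_name': 'example-bucket',
      'key': 'data/obs1.fits'}], 'aws', fake_download)

prem_links = prem.get_links()
aws_links = aws.get_links()
aws_files = aws.download()
```

Because the handler subclasses list, a result covering several rows can still be indexed, sliced and iterated like any other list, while get_links exposes the raw url/uri for callers that do not want to download.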

An example use case is:

import astropy.units as u
import pyvo

# `fornax` is the module from this PR; `pos` is assumed to be an
# astropy SkyCoord defined beforehand.

# Query data provider
chandra_url = 'https://heasarc.gsfc.nasa.gov/xamin_aws/vo/sia?table=chanmaster'
chandra_service = pyvo.dal.sia.SIAService(chandra_url)
query_result = chandra_service.search(pos=(pos.ra.deg, pos.dec.deg), size=10*u.arcsec)

# prem data
chandra_prem = fornax.get_data_product2(query_result[0], 'prem')
chandra_file = chandra_prem.download()

# aws data
chandra_aws = fornax.get_data_product2(query_result[0], 'aws')
chandra_file = chandra_aws.download()

@zoghbi-a changed the title from "Add cloud utilities" to "Add cloud utilities for multirows, datalinks" on Jul 11, 2023