Cloud computing tutorial #21

Merged
merged 24 commits into from
Aug 15, 2024

Conversation

abarciauskas-bgse
Contributor

@abarciauskas-bgse abarciauskas-bgse commented Aug 12, 2024

@abarciauskas-bgse
Contributor Author

I will look into the clean notebooks failure.

@abarciauskas-bgse abarciauskas-bgse added the preview Create a website preview label Aug 12, 2024
Contributor

github-actions bot commented Aug 12, 2024

@abarciauskas-bgse
Contributor Author

@scottyhq I'm having trouble getting cells not to run in this tutorial. There are 2 notebooks with code cells:

  1. book/tutorials/cloud-computing/04-cloud-optimized-icesat2.ipynb - this one should definitely be included in the navigation and tutorial but I don't think we want the cells to execute (it would be fine if we did but there is a geopandas import error, and other imports would probably fail too)
  2. book/tutorials/cloud-computing/atl08_parquet_files/atl08_parquet.ipynb - This notebook should definitely not be run, as it includes code to generate geoparquet. Ideally it would not show up in the table of contents at all and would only be accessible via the link in 04-cloud-optimized-icesat2.ipynb. But if I remove it from the TOC, then the link in 04-cloud-optimized-icesat2.ipynb shows up as broken. As a fallback option, I could remove this notebook from the TOC and just link to the GitHub URL for that directory.

Let me know what you think and thanks in advance for your help.

@scottyhq
Member

this one should definitely be included in the navigation and tutorial but I don't think we want the cells to execute

The way we've approached this in the past is adding its path under exclude_patterns in the config. Be sure to commit the rendered notebook if you do this:

execute_notebooks: 'force'
exclude_patterns:
- "**/geospatial-advanced.ipynb"
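
A rough way to preview which notebooks a pattern like the one above would skip is sketched below. It uses Python's `fnmatch` module, which is an assumption made for illustration only: Jupyter Book and Sphinx apply their own glob matcher, and its treatment of `*` versus `**` differs from `fnmatch`, so the two may disagree on other patterns. The first path in `notebooks` is hypothetical.

```python
from fnmatch import fnmatchcase

# Pattern copied from the config snippet above.
exclude_patterns = ["**/geospatial-advanced.ipynb"]

# The first path is hypothetical; the second is the notebook discussed in this PR.
notebooks = [
    "book/tutorials/geospatial-advanced.ipynb",
    "book/tutorials/cloud-computing/04-cloud-optimized-icesat2.ipynb",
]


def is_excluded(path, patterns):
    """Return True if the path matches any of the glob patterns."""
    return any(fnmatchcase(path, pattern) for pattern in patterns)


for nb in notebooks:
    status = "skipped" if is_excluded(nb, exclude_patterns) else "executed"
    print(f"{nb}: {status}")
```

Under these assumptions, adding `"**/04-cloud-optimized-icesat2.ipynb"` to the list would mark the second notebook as skipped as well.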

Ideally this would not show up in the table of contents at all but just accessible via the link in the 04-cloud-optimized-icesat2.ipynb.

Looks like you can use a {download} role. Not sure where this is documented, but the Jupyter Book docs show this:
https://github.com/jupyter-book/jupyter-book/blob/bb13ee76f266ca0b94b31bc22e5480b60a52002d/docs/file-types/notebooks.ipynb#L11

@abarciauskas-bgse
Contributor Author

I noticed the exclude pattern, but also that this line is commented out: https://github.com/ICESAT-2HackWeek/website-2024/blob/main/.github/workflows/ensure_clean_notebooks.py#L33 so I wasn't sure if I should use it. I'll try it with that line un-commented.

@scottyhq
Member

/Users/runner/work/website-2024/website-2024/book/tutorials/cloud-computing/03-cloud-optimized-data-access.ipynb:10011: WARNING: local id not found in doc 'tutorials/cloud-computing/02-cloud-data-access': 'cloud-vs-local-storage' [myst.xref_missing]

I'm not sure why this link isn't working in your third notebook. Maybe it is sensitive to Caps or maybe it only works to do this within the same notebook... Feel free to try and fix or just link to the notebook without the section

[Cloud Data Access Notebook: Cloud vs Local Storage](./02-cloud-data-access.ipynb#cloud-vs-local-storage).
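
For reference, a minimal sketch of how a heading is usually turned into an anchor such as `cloud-vs-local-storage`. This only approximates the GitHub-style slug rule (lowercase, drop punctuation, replace spaces with hyphens); it is not MyST's actual implementation, and as far as I know MyST only generates these anchors when `myst_heading_anchors` is set in the Sphinx config. The heading text used here is an assumption, not taken from the notebook.

```python
import re


def github_style_slug(heading):
    """Approximate a GitHub-style heading anchor.

    Lowercases the text, removes characters other than letters, digits,
    underscores, hyphens and spaces, then replaces spaces with hyphens.
    """
    slug = heading.strip().lower()
    slug = re.sub(r"[^\w\- ]", "", slug)
    return slug.replace(" ", "-")


# Hypothetical heading text for the section being linked to.
print(github_style_slug("Cloud vs Local Storage"))
```

If the slug printed for the real heading differs from the fragment after `#` in the link, that mismatch would be one possible cause of the `myst.xref_missing` warning.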

@abarciauskas-bgse
Contributor Author

abarciauskas-bgse commented Aug 14, 2024

@scottyhq the link works when I try it in hub.cryointhecloud.com, which is why I am also confounded by this error. I was wrong, it's not working; it just seemed like it was because of how I was working between notebooks 🤦🏽

Member

@scottyhq scottyhq left a comment


Fantastic tutorial @abarciauskas-bgse, thank you! I'm on a plane back to Seattle with limited wifi and the lonboard plot still worked well :)

"metadata": {},
"outputs": [],
"source": [
"dataset = pq.ParquetDataset(\"atl08_parquet\", partitioning=\"hive\", filters=[('year', '>=', 2021),\n",
Member


optional: could add some notes on what partitioning="hive" is
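
As a possible starting point for such a note: in a hive-partitioned dataset, column values are encoded in directory names of the form `key=value`, so a filter on a partition column can skip whole directories without opening any files. The sketch below illustrates the idea in plain Python; it is not pyarrow's implementation, and the file layout (including the `month` partition) is hypothetical.

```python
from pathlib import PurePosixPath


def parse_hive_partitions(path):
    """Return {column: value} parsed from key=value directory names in a path."""
    partitions = {}
    for segment in PurePosixPath(path).parent.parts:
        if "=" in segment:
            key, _, value = segment.partition("=")
            partitions[key] = value
    return partitions


def select_files(files, min_year):
    """Keep only files whose 'year' partition is at least min_year."""
    return [f for f in files if int(parse_hive_partitions(f)["year"]) >= min_year]


# Hypothetical layout for a dataset partitioned by year and month.
files = [
    "atl08_parquet/year=2020/month=12/part-0.parquet",
    "atl08_parquet/year=2021/month=1/part-0.parquet",
    "atl08_parquet/year=2022/month=6/part-0.parquet",
]

selected = select_files(files, 2021)
print(selected)
```

This mirrors the effect of a `('year', '>=', 2021)` filter on a partition column: the `year=2020` directory is never read.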

"df['geometry'] = df['geometry'].apply(wkb.loads)\n",
"\n",
"\n",
"gdf = gpd.GeoDataFrame(df, geometry='geometry')\n",
Member


Suggested change
- "gdf = gpd.GeoDataFrame(df, geometry='geometry')\n",
+ "gdf = gpd.GeoDataFrame(df, geometry='geometry', crs='EPSG:4326')\n",

(to avoid UserWarning: No CRS exists on data. from lonboard)

"h_canopy = gdf_filtered['h_canopy']\n",
"h_canopy_normalized = (h_canopy - min_bound) / (max_bound - min_bound)\n",
"\n",
"layer = ScatterplotLayer.from_geopandas(gdf_filtered, radius_min_pixels=0.5)\n",
Member


optional: note on how radius_min_pixels works (I noticed that when zooming in on the map, the points get very small)
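
A possible basis for that note: in deck.gl, which lonboard wraps, the minimum-pixel setting is a lower bound on the drawn radius, so a point is rendered at whichever is larger, its real-world radius converted to pixels or the minimum. The sketch below models this under stated assumptions: a Web Mercator map with 512-pixel tiles and a point radius of 1 meter (I believe that is the layer default, but treat it as an assumption). The function names are hypothetical.

```python
import math

# Assumption: Web Mercator with 512-pixel tiles, so at zoom 0 the
# equator (about 40,075 km) spans 512 pixels.
EQUATOR_METERS_PER_PIXEL_ZOOM0 = 40075016.686 / 512


def meters_per_pixel(zoom, latitude_deg=0.0):
    """Ground distance covered by one screen pixel at a given zoom and latitude."""
    return EQUATOR_METERS_PER_PIXEL_ZOOM0 * math.cos(math.radians(latitude_deg)) / (2 ** zoom)


def rendered_radius_px(radius_m, zoom, radius_min_pixels=0.0, latitude_deg=0.0):
    """Radius in pixels after applying the minimum-pixel lower bound."""
    natural_px = radius_m / meters_per_pixel(zoom, latitude_deg)
    return max(natural_px, radius_min_pixels)


for zoom in (5, 10, 15, 20):
    px = rendered_radius_px(1.0, zoom, radius_min_pixels=0.5)
    print(f"zoom {zoom}: {px:.2f} px")
```

Under these assumptions a 1 m point stays pinned at 0.5 px across most zoom levels and only grows once the map is zoomed in far enough, which may be why the points look small relative to the surrounding map when zooming in.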

@abarciauskas-bgse
Contributor Author

Thanks @scottyhq so much for your review ❤️ I have addressed your comments and made a few more minor changes (added 4 introductory sentences to the 00 notebook, modified one image and added one image to the 03 notebook), but given these are minor changes I am going to merge.

I am hoping to create a bonus notebook with more cloud computing resources 🤞🏽 if I have time.

@abarciauskas-bgse abarciauskas-bgse merged commit ff33f45 into main Aug 15, 2024
4 checks passed
@abarciauskas-bgse abarciauskas-bgse deleted the cloud-computing-tutorial branch August 15, 2024 23:01
@abarciauskas-bgse abarciauskas-bgse restored the cloud-computing-tutorial branch August 17, 2024 03:34