[FEATURE] Create a blog post to explain how to read an IPUMS extract using the IPUMS.jl package. #28

Open
7 tasks
00krishna opened this issue May 20, 2024 · 2 comments
Assignees
Labels
documentation Improvements or additions to documentation enhancement New feature or request good first issue Good for newcomers help wanted Extra attention is needed

Comments

@00krishna
Collaborator

00krishna commented May 20, 2024

Issue Description

This task involves writing a blog post that explains how to read an IPUMS extract into Julia. The post assumes that the user has already downloaded the extract, either through the download API of IPUMS.jl or directly from the IPUMS website. The user extracts the data to a directory and then reads the DDI file using the read_ddi function. The post should explain how to view a summary of the metadata using the appropriate IPUMS.jl function. Next, the post explains how to use the load_extract function to load the IPUMS data into a Julia DataFrame. Once the data is in a DataFrame, the post should show some simple manipulations of it, such as the describe function, which shows summary statistics (for example the mean, minimum, and maximum) for each column.
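For reference, here is a minimal sketch of the describe step. It uses a small made-up DataFrame in place of a real extract, so the column names (YEAR, AGE) and values are only illustrative assumptions. In the actual post the DataFrame would come from the IPUMS.jl loading function, and the exact IPUMS.jl calls should be checked against the package docstrings.

```julia
using DataFrames

# Synthetic stand-in for a loaded extract: these column names and values
# are made up for illustration and are not real IPUMS data.
df = DataFrame(YEAR = [2000, 2000, 2010, 2010],
               AGE  = [23, 41, 35, 67])

# describe returns a DataFrame with one row per column of df.
# Passing statistic names restricts the output to those statistics.
stats = describe(df, :mean, :min, :max)

println(stats)
```

Calling describe(df) with no extra arguments also reports the median, the number of missing values, and the element type of each column, which may be a better default for the post.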

Difficulty: Beginner

Time: 6 - 8 hours

Requirements

  • Explain how to download the extract -- but just refer to previous blog post.
  • Demonstrate how to use the read_ddi function to load the DDI file.
  • Show how to get a summary of the extract-level and variable-level metadata using the appropriate function in IPUMS.jl.
  • Show how to load the DAT file using the load_extract function.
  • Once the data is loaded, show how to see the metadata for the DataFrame and for the columns of the DataFrame.
  • Show how to create a simple plot of a variable from the dataset using the Makie.jl package.
  • Open and submit a PR for the post.
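
As a rough sketch of the last two technical items (metadata and plotting), the example below uses the metadata functions from DataFrames.jl and a histogram from CairoMakie, one of the Makie.jl backends. It assumes the loaded extract exposes its metadata through the standard DataFrames.jl metadata API, which should be confirmed against the IPUMS.jl documentation. The DataFrame, the metadata keys ("source", "label"), and the output file name are all made up for illustration.

```julia
using DataFrames
using CairoMakie

# Synthetic stand-in for a loaded extract (illustrative values only).
df = DataFrame(YEAR = [2000, 2000, 2010, 2010],
               AGE  = [23, 41, 35, 67])

# Attach example metadata by hand. With a real extract, the loading step
# would presumably populate this; the keys here are hypothetical.
metadata!(df, "source", "synthetic example"; style = :note)
colmetadata!(df, :AGE, "label", "Age in years"; style = :note)

# Table-level and column-level metadata lookups.
println(metadata(df))            # all table-level key/value pairs
println(colmetadata(df, :AGE))   # all key/value pairs for the AGE column
age_label = colmetadata(df, :AGE, "label")

# Simple histogram of one variable, labeled with its metadata.
fig = Figure()
ax = Axis(fig[1, 1]; xlabel = age_label, ylabel = "Count")
hist!(ax, df.AGE; bins = 5)
save("age_hist.png", fig)
```

CairoMakie is used here because it writes static image files without needing a display, which suits a blog post; GLMakie would work the same way for interactive use.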

Expected Outcomes

The expected outcome is a markdown blog post with the appropriate code and explanation, as per the list above.

Additional Notes

Additional information and code examples are available in the docstrings/documentation for the IPUMS.jl package. That is the best source for the Julia code.

Other Resources

Julia Slack:

  • documentation channel - you should post here first
  • helpdesk channel - this would be to get more attention to your issue but maybe not as precise as you need.
  • health-and-medicine channel - this is where most of JuliaHealth is located these days.

Julia Discourse - I would advise posting here if you have an issue that you feel is long or requires a lot of time to explain as you might lose it within Julia Slack. Consider cross-posting your forum post to the Julia Slack in helpdesk and/or documentation.

@00krishna 00krishna added documentation Improvements or additions to documentation enhancement New feature or request good first issue Good for newcomers help wanted Extra attention is needed labels May 20, 2024
@TheCedarPrince
Member

So this one is a bit curious, as I could see this being split into a documentation PR as well as a blog post. Actually, after some thought, I would be fine with this being a documentation PR whose content is then somewhat duplicated within #29. I think it is great to have these common use cases within the documentation itself. So, I think this issue should be split into a docs issue as well.

@TheCedarPrince
Member

Also, could you find an example of what should be the expected outcome for this particular issue? I definitely want this to be scoped out precisely -- the dataframes scoping is great already. Also, are you opposed to using Makie instead of Plots for this blog?

2 participants