Boucke Building 216 | Monday, Wednesday 2:30pm - 3:45pm
Anton Nekrutenko
[email protected]
Wartik 505
Office hours by appointment only
❗ When contacting the instructor, use the above e-mail and include "BMMB554" in the subject line (simply click on the e-mail address).
This course does not use Canvas. Canvas is a convoluted system with too many features and an undefined purpose. Instead, this course is served from GitHub. Each new lecture will be created as a new entry under the "Lectures" section of the website.
❗ Do not contact me through Canvas! I will not check my Inbox there. Instead, contact me via e-mail as described above.
- Attendance (25%)
- Homework (30%)
- Final Project (45%)
A short homework will be given once a week. We will discuss project requirements before the spring break.
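As a rough illustration of how the three weights above combine, here is a minimal Python sketch. It assumes each component is scored on a 0-100 scale and that the final score is a simple weighted sum; the syllabus does not state either, and the function name `course_score` and the example scores are hypothetical.

```python
# Weights taken from the grading breakdown above (they sum to 100%).
WEIGHTS = {"attendance": 0.25, "homework": 0.30, "final_project": 0.45}


def course_score(scores):
    """Return the weighted course score for component scores on a 0-100 scale.

    Assumption: the final score is a plain weighted sum of the components.
    """
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing component scores: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


# Hypothetical student: these scores are made up for illustration only.
example = {"attendance": 100, "homework": 80, "final_project": 90}
print(round(course_score(example), 2))  # 25 + 24 + 40.5 = 89.5
```

Under these assumptions the final project dominates: a 10-point change in the project score moves the total by 4.5 points, versus 2.5 points for attendance.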
We will cover a breadth of topics. The course will be divided into several blocks:
# | What | Why |
---|---|---|
1.1 | Introduction to this course | Who you are and what you want to accomplish |
1.2 | Jupyter and GitHub | Throughout this course Jupyter will be our primary framework for data analysis and GitHub will be used for version control |
1.3 | Python | Understanding transactional Python for data analysis |
1.4 | NumPy and SciPy | Libraries for numerical magic |
1.5 | Matplotlib | Plotting data in Python |
1.6 | Pandas and SQL | Operating on large datasets in Python |
1.7 | Galaxy | Data processing at scale |
# | What | Why |
---|---|---|
2.1 | Fundamentals | Probability, Descriptive statistics, Correlation analysis, and Logarithms |
2.2 | Statistical analyses | Distributions, Sampling, Significance, Permutation, Bayes' Theorem |
2.3 | Visualization | Useful versus meaningless |
2.4 | Linear algebra | Matrix operations, Eigenvalues and Eigenvectors |
2.5 | Discrete data and modeling | Understanding the data and going upward |
2.6 | Clustering | Stratifying the data |
2.7 | Testing | How to test your hypotheses |
2.8 | Multivariate analysis | Finding association among multiple variables |
# | What | Why |
---|---|---|
3.1 | DNA (and RNA) sequencing | From Sanger to Nanopores |
3.2 | Variation | Finding and interpreting genetic differences |
3.3 | Transcriptomics | Measuring gene expression |
3.4 | Transcriptomics | Measuring shapes |
3.5 | DNA/Protein interactions | Assessing gene regulation and genome architecture |
3.6 | Metagenomics | Analysis of complex mixtures |
3.7 | About counts | NGS data is count data. There are common themes in read count analysis |
# | What | Why |
---|---|---|
4.1 | Alignment | Fundamental concepts, Global and local alignment |
4.2 | Aligning many sequences quickly | Mapping in the age of billion-read datasets |
4.3 | Bloom filters | Searching in the age of Exabyte databases |
4.4 | Assembly | Reconstructing genomes and transcriptomes |
We will get into specifics of the final project before spring break.
In an examination setting, unless the instructor gives explicit prior instructions to the contrary, violations of academic integrity shall consist of any attempt to receive assistance from written or printed aids, from any person or papers or electronic devices, or of any attempt to give assistance, whether the student doing so has completed his or her own work or not. Other violations include, but are not limited to, any attempt to gain an unfair advantage in regard to an examination, such as tampering with a graded exam or claiming another's work to be one's own. Other assessments (including ANGEL-administered quizzes and assessments as well as homework assignments) are expected to represent your own independent work unless specifically stated otherwise. Failure to comply will lead to sanctions against the student in accordance with the Policy on Academic Integrity in the Eberly College of Science.
The Eberly College of Science Code of Mutual Respect and Cooperation (www.science.psu.edu/climate/Code-of-Mutual-Respect-final.pdf) embodies the values that we hope our faculty, staff, and students possess and will endorse to make The Eberly College of Science a place where every individual feels respected and valued, as well as challenged and rewarded.
The Eberly College of Science is committed to the academic success of students enrolled in the College's courses and undergraduate programs. When in need of help, students can utilize various College and University wide resources for learning assistance (http://www.science.psu.edu/advising/success).
Penn State welcomes students with disabilities into the University's educational programs. If you have a disability-related need for reasonable academic adjustments in this course, contact the Office for Disability Services (ODS) at 814-863-1807 (V/TTY). For further information regarding ODS, please visit the Office for Disability Services Web site at http://equity.psu.edu/ods/.
In order to receive consideration for course accommodations, you must contact ODS and provide documentation (see the documentation guidelines). If the documentation supports the need for academic adjustments, ODS will provide a letter identifying appropriate academic adjustments. Please share this letter and discuss the adjustments with your instructor as early in the course as possible. You must contact ODS and request academic adjustment letters at the beginning of each semester.