After reading tons of literature and forum questions, I've settled on a path to follow, which is pretty much what most of those answers have in common. The plan covers three parts: math, computer science, and practical knowledge (e.g. of tools). The theoretical part focuses on gaining math skills and knowledge in probability, statistics, optimization, and machine learning theory. The practical part, on the other hand, focuses on programming skills and getting familiar with the analysis tools. Finally, the computer science part is geared towards skills that help in day-to-day work as a data scientist, for example, implementing an algorithm efficiently. Unlike most resources and guides, this guide is meant to be very concise with regard to literature and resources, the abundance of which is overwhelming and confusing.
If there is one thing I have learned in the last few years of following this plan, it is practice, practice, and practice. Practice until you know how to apply the theory you have learnt; theory without practice is worthless. Fine-tune pretrained models, dive deep into a specific area such as NLP, and see how you can use the most recent research in your predictions. Whatever it is, as long as you practice it, you are taking the thought process you need to the next level.
You can always learn the tools that help you on your DS journey, but building that thinking process and developing an experimental mindset is what I bet on most. You can often find a few tricks here and there, so instead of continuously learning them, isn't it better to learn how to generate them yourself?
Take time to revise what you have learned, and step away from studying by giving yourself a little break every once in a while. As I recently read in an article on self-development, look back and reflect on the knowledge you have gained and where you have succeeded, not on where you have come up short, which is what we are naturally more inclined to do.
Completed items are marked with ✔️.
- Math
- Applied Part
- Machine Learning
- Deep Learning
- Natural Language Processing
- Computer Science and Software Engineering
- Other Interesting Courses
- Useful Resources
- Books
- Interview Questions and Brain Teasers
Items are to be followed in the order provided:
- ✔️ Single variable calculus Course page
- ✔️ Multi variable calculus Course page
- Linear algebra Course page
- ✔️ Introduction to Combinatorics Andrey Raygorodskiy on Coursera (to be taken in parallel with multi-variable calculus)
- ✔️ Introduction to Probability Andrey Raygorodskiy on Coursera
- Probability and Statistics Course page
- Selected topics from the Probability and Random Variables course MIT Spring 2015
- Introduction to Stochastic Processes MIT Spring 2015 ## TODO find materials on Markov Models
- Matrix Methods in Data Analysis, Signal Processing, and Machine Learning Course by Gilbert Strang
- Introduction to Graph Theory Andrey Raygorodskiy on Coursera
- Mathematics for Computer Science MIT / Fall 2010 or MIT / Spring 2015
- Differential Equations Spring 2010
- Logic
- Introduction to Numerical Analysis MIT's 18.330
- Analytical geometry (determine the resources)
- Algebra I MIT's 18.701
- Algebra II MIT's 18.702
- Number Theory MIT's 18.781
- Analysis I MIT's 18.100B
- Analysis II MIT's 18.101
- Introduction to Functional Analysis MIT's 18.102
- Convex Optimization Stanford Course by Stephen Boyd
- ✔️ Acquaintance with Numpy Numpy Tutorial on Scipy-Lectures
- ✔️ Pandas tutorial Official tutorial
- ✔️ Matplotlib Intro to Matplotlib
- ✔️ Machine Learning Open Course (https://github.com/Yorko/mlcourse.ai)
- More hard core machine learning math from Yandex (https://academy.yandex.ru/handbook/ml/)
- CS231n: Convolutional Neural Networks for Visual Recognition (http://cs231n.github.io/)
- Fast.ai - sequence of 4 practice-oriented courses Courses page
- Good theoretical overview of ML fundamentals (in Russian)
- Machine Learning lectures by K. Vorontsov (videos in Russian)
- ✔️ Deeplearning.ai (specialization on Coursera https://www.deeplearning.ai/)
- Best NLP competitions on Kaggle to learn from: video by Abhishek Thakur
- Understanding Unicode and Charsets: bare minimum
Here I refer to jwasham's coding-interview-university collection.
Special courses to take listed separately:
- Introduction to Computational Thinking and Data Science MIT's 6.0002
- Algorithms: Design and Analysis Stanford Course
- Machine Learning Stanford Course by Andrew Ng
- Data Structures and Algorithms Specialization (a sequence of 6 courses; specialization on Coursera https://www.coursera.org/specializations/data-structures-algorithms)
- Object-Oriented Programming and Design Patterns in Python (https://www.coursera.org/learn/oop-patterns-python/)
- C++ learning by doing
I have found these sources to be useful, especially when used in conjunction with each other:
- C++ video lectures - very good explanation (in Russian)
- C++ lessons (in Russian)
- Java
- Cool in-depth coverage of Java Core: Golovach Courses (in Russian)
- Effective Java (3rd ed.) - Joshua Bloch
- Clean Code - Robert C. Martin
- The Clean Coder - Robert C. Martin
- Test-Driven Development - Kent Beck
- The Art of Unit Testing - Roy Osherove
- Optimizing Java: Practical Techniques for Improving JVM Application Performance - Benjamin J. Evans, Chris Newland, James Gough
- Learn to work with the command line
- Shell scripting lessons
- System design - engineering approach very cool collection
- Real world systems explained by those who build them Architecturenotes
- Infrastructure explained blogpost
Some courses with useful material worth getting a grasp of:
- Topics in Mathematics with Applications in Finance (topics include stochastic calculus, stochastic differential equations, time series analysis, and direct applications to finance) MIT's 18.S096
- Linear Algebra and Learning from Data Book by Gilbert Strang
- Good collection of data science resources awesome-datascience
- Growing set of Kaggle kernels with emphasis on practice awesome-kaggle-kernels
- Vector Derivatives Notes
- Colah's blog on machine learning
- Clarification of statistical concepts -- in addition, there are stats videos available.
- The Elements of Statistical Learning by Hastie, Tibshirani, Friedman