Data Science & Machine Learning with Python

This repository contains a collection of data mining and machine learning algorithms implemented in Python on the Anaconda platform. It also includes a full section on machine learning with Apache Spark, which scales these techniques up to big data analyzed on a computing cluster.

Covered techniques

The following techniques, all used by data scientists in the tech industry, are covered:

Regression analysis
K-Means Clustering
Principal Component Analysis
Train/Test Split and Cross-Validation
Bayesian Methods
Decision Trees and Random Forests
Multivariate Regression
Multi-Level Models
Support Vector Machines
Reinforcement Learning
Collaborative Filtering
K-Nearest Neighbor
Bias/Variance Tradeoff
Ensemble Learning
Term Frequency / Inverse Document Frequency
Experimental Design and A/B Tests

Projects

To practice these techniques, I've built the following projects:

Movie recommendation system using actual user rating data
Search engine for Wikipedia data
Spam classifier

Getting started with Python

This tutorial is designed for programmers who need to learn the Python programming language from scratch.

Statistics and Probability Refresher

Mean, median, and mode, with an introduction to NumPy, SciPy, and Matplotlib
Standard deviation, population and sample variance
Data distributions
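
As a quick illustration of the first two topics above, here is a minimal sketch that computes the mean, median, and mode, and contrasts population variance (divide by n) with sample variance (divide by n - 1). It uses Python's standard-library `statistics` module rather than NumPy or SciPy so that it runs without extra dependencies; that choice, the helper name `summarize`, and the sample data are assumptions made for this example and are not taken from the notebooks.

```python
import statistics


def summarize(values):
    """Return basic descriptive statistics for a list of numbers.

    Hypothetical helper for illustration only.
    """
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "mode": statistics.mode(values),
        # Population variance divides the squared deviations by n.
        "population_variance": statistics.pvariance(values),
        # Sample variance divides by n - 1 (Bessel's correction).
        "sample_variance": statistics.variance(values),
        # Standard deviation is the square root of the matching variance.
        "population_std": statistics.pstdev(values),
        "sample_std": statistics.stdev(values),
    }


if __name__ == "__main__":
    data = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up sample data
    for name, value in summarize(data).items():
        print(f"{name}: {value}")
```

For the same data, the sample variance is always larger than the population variance, because the sum of squared deviations is divided by n - 1 instead of n.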

a-djebali/data-science-python
