### Module 0: Introduction & Outline

#### About

#### What is Data Science?

### Module 1: Required Background

#### Math: Stats, Calculus, Linear Algebra

#### Programming: Basics, Data Structures, Algorithms

#### Databases: Relational Algebra, SQL

#### Important Concepts: Regular expressions, Information Entropy, Distance measurements, OLAP, ETL, BI vs. BA, and CAP

#### Tools Introduction: WEKA, Python, R

#### Tutorials:

#### Beginner's guide to using servers for data science

#### Installing Python Data Science Stack

#### R tutorial

#### SQL Tutorial

#### Big Data Tools Introduction: Hadoop, Hive, Pig, and many more!

#### Note: there is a big data section below; this is just an introduction

### Module 2: Data Science Framework

#### Ask > Acquire > Assimilate > Analyze > Answer > Act

#### What questions to ask

#### What is Data Mining and Analysis Overview

#### Feature Engineering

### Module 3: Acquire Data

#### Downloading Data, Scraping Data, Logging Data, Streaming

### Module 4: Assimilate Data

#### Processing Data: Extract/Transform/Load, Data Cleaning, Outlier Detection, Filtering, Imputation, Dimensionality Reduction, Normalization and Transformation

#### Aggregation: Exploratory data analysis

### Module 5: Analyze Data

#### Analyze Framework: Describe, Discover, Predict, Advise

### Module 6: Describe

#### Exploratory Data Analysis

#### Clustering: K-means clustering, X-means clustering, topic modeling

### Module 7: Discover

#### Clustering

#### Association Rule Mining

#### Hypothesis Testing

### Module 8: Predict

#### Model Evaluation: Evaluation metrics

#### Model Selection: Cross validation

#### Learning Curves: Bias vs Variance Trade-off

#### Parameter Tuning: Grid search

#### Ensembling: Combining Models

#### Boosting: Creating Data

### Module 9: Regression

#### Tutorials: Simple Linear, generalized, non-linear, multi regression (coming soon!)

### Module 10: Classification

#### Naive Bayes: Bayes' theorem, naive Bayes classifier

#### Decision Trees: Entropy based Decision Tree, C4.5, boosted trees, ensembled trees, Random forests

#### Rule-Based Learning: OneR, Prism, Trees and Rules

#### Logistic Regression

#### Support Vector Machines

#### K Nearest Neighbor

#### Hidden Markov Model

#### Bayesian Network

#### Artificial Neural Networks

### Module 11: Deep Learning

#### Natural Language Processing: text mining, topic modeling, sentiment analysis

#### Bioinformatics