1. What is Data Science
Demand of Data Science
Venn Diagram
Pipeline
Roles
Team
Knowledge check
2. Field of study
Big Data overview
Programming involvement in Data Science
Statistics
Knowledge check
3. Ethics
Ethical issues
Knowledge check
4. Data Sources (Getting Data)
Data Metrics
Existing data
APIs
Scraping
Creating Data
Knowledge check
5. Data Exploration (Cleaning Data)
Exploratory graphs
Exploratory statistics
Knowledge check
6. Programming
Spreadsheets
R programming
Python
SQL
Web formats
Knowledge check
7. Mathematics
Algebra
Systems of equations
Calculus
Big O
Bayes probability
Knowledge check
8. Applied Statistics
Hypothesis
Confidence
Problems
Validating
Knowledge check
9. Machine Learning
Linear regression with one and multiple variables.
Linear regression predicts a real-valued output from an input value. We apply linear regression to housing price prediction, present the notion of a cost function, and introduce the gradient descent method for learning. For inputs with more than one value, we show how the same model extends to multiple input features.
Cost function
Gradient descent
Normal Equations
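The three topics above can be illustrated with a minimal sketch in plain Python. The toy housing data, the function names, the learning rate and the step count are all assumptions chosen for illustration, not course material. The closed-form function is the one-variable special case of the normal equations.

```python
# Hypothetical toy data: house size (1000 sq ft) -> price ($100k).
# The points lie exactly on price = 1 + 1 * size.
sizes = [1.0, 2.0, 3.0, 4.0]
prices = [2.0, 3.0, 4.0, 5.0]


def cost(theta0, theta1, xs, ys):
    """Squared-error cost J = (1 / 2m) * sum((h(x) - y)^2), h(x) = theta0 + theta1 * x."""
    m = len(xs)
    return sum((theta0 + theta1 * x - y) ** 2 for x, y in zip(xs, ys)) / (2 * m)


def gradient_descent(xs, ys, alpha=0.1, steps=5000):
    """Batch gradient descent; alpha and steps are illustrative choices."""
    m = len(xs)
    theta0, theta1 = 0.0, 0.0
    for _ in range(steps):
        errors = [theta0 + theta1 * x - y for x, y in zip(xs, ys)]
        grad0 = sum(errors) / m                             # dJ/dtheta0
        grad1 = sum(e * x for e, x in zip(errors, xs)) / m  # dJ/dtheta1
        theta0 -= alpha * grad0
        theta1 -= alpha * grad1
    return theta0, theta1


def closed_form(xs, ys):
    """Least-squares solution for one variable (special case of the normal equations)."""
    m = len(xs)
    mean_x, mean_y = sum(xs) / m, sum(ys) / m
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    theta1 = sxy / sxx
    return mean_y - theta1 * mean_x, theta1


gd_theta = gradient_descent(sizes, prices)
exact_theta = closed_form(sizes, prices)
print(gd_theta, exact_theta, cost(*gd_theta, sizes, prices))
```

On this data both approaches should agree on an intercept and slope close to 1. Gradient descent reaches the answer iteratively, while the closed form computes it in a single step.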
Logistic regression. Logistic regression classifies an input into discrete categories by modeling the probability of each class. We present its cost function and a gradient descent solution.
Cost function
Gradient descent solution
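As a rough illustration of the logistic regression cost function and its gradient descent solution, here is a minimal one-feature sketch in plain Python. The data set, function names, learning rate and step count are hypothetical choices made for this example only.

```python
import math

# Hypothetical one-feature data with overlapping classes (label 0 or 1).
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [0, 0, 1, 0, 1, 1]


def sigmoid(z):
    """Map a real number to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def log_loss(theta0, theta1, xs, ys):
    """Cross-entropy cost: the average of -[y*log(p) + (1-y)*log(1-p)]."""
    total = 0.0
    for x, y in zip(xs, ys):
        p = sigmoid(theta0 + theta1 * x)
        total -= y * math.log(p) + (1 - y) * math.log(1 - p)
    return total / len(xs)


def fit_logistic(xs, ys, alpha=0.1, steps=5000):
    """Batch gradient descent on the cross-entropy cost (illustrative settings)."""
    m = len(xs)
    theta0, theta1 = 0.0, 0.0
    for _ in range(steps):
        errors = [sigmoid(theta0 + theta1 * x) - y for x, y in zip(xs, ys)]
        theta0 -= alpha * sum(errors) / m
        theta1 -= alpha * sum(e * x for e, x in zip(errors, xs)) / m
    return theta0, theta1


theta0, theta1 = fit_logistic(xs, ys)
print(log_loss(0.0, 0.0, xs, ys), log_loss(theta0, theta1, xs, ys))
print(sigmoid(theta0 + theta1 * 1.0), sigmoid(theta0 + theta1 * 6.0))
```

The update rule has the same form as in linear regression. The only change is that the prediction passes through the sigmoid, so the output can be read as a class probability.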
Neural Networks. A neural network is a model inspired by how the brain works. Neural networks are widely used today in many applications: when your phone interprets and understands your voice commands, it is likely that a neural network is helping to understand your speech.
Back propagation
Applications of neural networks
Support Vector Machines (SVM). Support vector machines, or SVMs, are machine learning algorithms for classification. We introduce the idea and intuitions behind SVMs and discuss how to use them in practice.
Large margin classification
Kernels
Unsupervised learning
Clustering
Gaussian Mixture Models
Hidden Markov Models (HMM)
10. R Programming
Writing code and setting your working directory
Getting started and R nuts and bolts
R console input and evaluation
Data types – R Objects and attributes
Data types – Vectors and Lists
Data types – Matrices
Data types – Factors
Data types – Missing values
Data types – Data frames
Data types – Names attribute
Data types – Summary
Reading Tabular Data
Reading large tables
Textual data formats
Connections: Interfaces to outside world
Subsetting – Basics
Subsetting – Lists
Subsetting – Matrices
Subsetting – Partial matching
Subsetting – Removing missing values
Vectorized Operations
11. Communicating
Interpretability
Actionable insights
Visualization for presentation
Reproducible research
Knowledge check
Conclusion and final test