Table of Contents

16/01/2019 By Achyuthuni Sri Harsha

Review of math fundamentals


  1. Probability and discrete distribution
  2. Vectors
  3. Matrices

Data Processing

Web scraping
  1. Handling Google Maps Location Data (in-time problem)
  2. Class size paradox and web scraping (using Amrita University placement data)
Data cleaning and imputation
  1. Null value imputation using KNN (mtcars data)

Python tutorials

Python for data science
  1. Getting started with Python
  2. Data Visualisation with Python
  3. Visualisation of tabular data
  4. A Not-so-Quick-but-Conceptual guide to Python | Intermediate | Part 1
  5. A Not-so-Quick-but-Conceptual guide to Python Notebook | Intermediate | Part 2
  6. All you ‘really’ need to know | Python Notebook | Advanced – Pandas
  1. NetworkX introduction
  2. Introduction to network science
  3. Network Centrality
  4. Shortest path
  5. Network Flow problems
  6. Community detection
  7. Bipartite matching
  1. ML deployment in Flask
  2. Handling databases using Python
  3. ORM

Exploratory Data Analytics

  1. Univariate Analysis (in-time problem)
  2. Multivariate Analysis (in-time problem)
  3. Multicollinearity (in-time problem)
  4. Time Series EDA (in-time problem)
  5. Combined: EDA in Python
  6. Visualising tabular data

Factor analysis

  1. Curse of dimensionality
  2. Exploratory factor analysis

Inferential data analytics (Hypothesis testing)

  1. z-test and t-test (in-time problem)
  2. ANOVA test (smart cities data)
  3. Chi-Square Goodness of fit test (in-time problem)
  4. Chi-Square test of independence

Prediction algorithms (Supervised learning)

  1. Classification
    1. Logistic Regression
    2. CHAID decision trees
    3. CART classification
  2. Regression
    1. Part and partial correlation
    2. Linear regression (Boston housing problem)
  3. Machine Learning (Simulation on Shiny)
    1. Handling Imbalanced Classes
    2. Feature engineering (Python)
    3. Streaming Machine Learning (Blog post on Rolls Royce Data Labs website)
    4. ML using scikit-learn

Prescriptive Analytics (Optimization)

  1. Linear Programming and Sensitivity analysis (basic)
  2. Inventory planning model (with CPLEX code)
  3. Gradient descent for non-linear optimization (Adoption of a new product)
  4. Analytic Hierarchy Process for multi-criterion optimization (Selecting a phone)
  5. Bass forecasting model (Python)

Reinforcement Learning (Stochastic modelling)

  1. Recommendation system (association mining)
  2. Markov Chains introduction (Customer Lifetime Value)

Time series forecasting

  1. Introduction to stationarity
  2. Stationarity hypothesis tests (in-time problem)
  3. Forecasting using ARIMA (in-time problem)
  4. ARIMA in Python
  5. Seasonal time series


  1. Hierarchical Clustering (Market segmentation using wine data)
  2. K-means clustering (Customer segmentation using credit card data)

Deep Learning

  1. Artificial Neural Network – part 1
  2. The math behind ANN

Other interesting posts

  1. Why are basics important in data sciences
  2. Boy-girl paradox

Higher education in Data Science

  1. Review on IIMB Business Analytics and Intelligence course
  2. Part-time data science masters – why and options
  3. What to look at when choosing part-time masters (as part of Imperial College student blogs)
  4. Why should you study MSc in Business Analytics part-time (as part of Imperial College student blogs)

Papers and publications

  1. Parametric Study of Cantilever Plates Exposed to Supersonic and Hypersonic Flows
  2. Personal analytics: Time management using Google Maps