Brian Sunter

Machine Learning Study Guide

A structured outline of topics to study for understanding machine learning fundamentals, from mathematical foundations to practical implementation.

Mathematics

Linear algebra

Core concepts for understanding how ML algorithms represent and transform data as vectors and matrices.

  • Vector spaces
  • Linear transformations
  • Matrices and determinants
  • Eigenvectors and eigenvalues
  • Systems of linear equations

Vector operations:

  • Addition, subtraction, scalar multiplication
  • Dot product and cross product
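
As a quick illustration of the two products listed above, here is a minimal sketch in plain Python. The function names `dot` and `cross` and the example vectors are my own choices for illustration; in practice a library such as NumPy would normally be used instead.

```python
def dot(u, v):
    """Dot product of two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    return sum(a * b for a, b in zip(u, v))


def cross(u, v):
    """Cross product of two 3-dimensional vectors."""
    if len(u) != 3 or len(v) != 3:
        raise ValueError("cross product is defined here for 3-D vectors only")
    return [
        u[1] * v[2] - u[2] * v[1],
        u[2] * v[0] - u[0] * v[2],
        u[0] * v[1] - u[1] * v[0],
    ]


x_axis = [1.0, 0.0, 0.0]
y_axis = [0.0, 1.0, 0.0]

print(dot(x_axis, y_axis))    # orthogonal vectors have dot product 0
print(cross(x_axis, y_axis))  # perpendicular to both inputs: the z axis
```

The dot product returns a scalar and works in any dimension, while the cross product returns a vector and is only defined here for three dimensions.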

Advanced topics:

  • Linear independence and span
  • Orthogonality and projections
  • Eigendecomposition
  • Singular value decomposition (SVD)
  • Principal component analysis (PCA)
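
To make eigenvectors and eigenvalues more concrete, the sketch below uses power iteration, one simple way to estimate the dominant eigenvalue of a matrix. It is only an illustration under stated assumptions: the 2x2 matrix, the starting vector, and the helper names are hypothetical, and real eigendecomposition, SVD, or PCA code would rely on a numerical library.

```python
import math


def mat_vec(matrix, vector):
    """Multiply a square matrix (list of rows) by a vector."""
    return [sum(a * x for a, x in zip(row, vector)) for row in matrix]


def power_iteration(matrix, iterations=100):
    """Estimate the dominant eigenvalue and a unit eigenvector."""
    # Assumed starting guess: the first standard basis vector. It must not be
    # orthogonal to the dominant eigenvector, or the method will not find it.
    vector = [1.0] + [0.0] * (len(matrix) - 1)
    for _ in range(iterations):
        product = mat_vec(matrix, vector)
        norm = math.sqrt(sum(x * x for x in product))
        vector = [x / norm for x in product]
    # Rayleigh quotient v^T A v; v already has unit length.
    eigenvalue = sum(v * p for v, p in zip(vector, mat_vec(matrix, vector)))
    return eigenvalue, vector


# Hypothetical example matrix: symmetric, with eigenvalues 3 and 1.
A = [[2.0, 1.0], [1.0, 2.0]]
eigenvalue, eigenvector = power_iteration(A)
print(eigenvalue)
print(eigenvector)
```

For this matrix the estimate should approach 3, with an eigenvector pointing along the diagonal direction (1, 1). PCA uses the same idea: the leading eigenvectors of a covariance matrix give the directions of greatest variance.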

Calculus

Essential for understanding optimization and how models learn.

  • Limits and continuity
  • Derivatives and integrals
  • Partial derivatives
  • Gradients and gradient descent
  • Vector fields
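
Partial derivatives and gradient descent fit together naturally: the gradient collects the partial derivatives, and gradient descent steps against it. The following is a minimal sketch, assuming a hypothetical two-variable test function and a fixed learning rate chosen for illustration.

```python
def gradient(point):
    """Partial derivatives of f(x, y) = (x - 1)^2 + 2 * (y + 2)^2."""
    x, y = point
    return (2.0 * (x - 1.0), 4.0 * (y + 2.0))


def gradient_descent(grad, start, learning_rate=0.1, steps=200):
    """Repeatedly step against the gradient from a starting point."""
    point = tuple(start)
    for _ in range(steps):
        g = grad(point)
        point = tuple(p - learning_rate * gi for p, gi in zip(point, g))
    return point


# f is a hypothetical convex test function with its minimum at (1, -2).
minimum = gradient_descent(gradient, start=(0.0, 0.0))
print(minimum)
```

With a learning rate that is too large the iterates can diverge, so the value 0.1 here is an assumption that happens to suit this particular function.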

Optimization methods:

  • Convex optimization
  • Newton’s method
  • Conjugate gradient
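
For contrast with gradient descent, here is a small sketch of Newton's method applied to one-dimensional minimization. It uses the second derivative to choose the step size. The test function, tolerance, and function name are assumptions made for this example.

```python
import math


def newton_minimize(first_derivative, second_derivative, start,
                    tolerance=1e-12, max_steps=50):
    """Find a stationary point of a 1-D function with Newton's method."""
    x = start
    for _ in range(max_steps):
        step = first_derivative(x) / second_derivative(x)
        x -= step
        if abs(step) < tolerance:
            break
    return x


# Hypothetical convex example: f(x) = exp(x) - 2x, minimized where exp(x) = 2.
x_min = newton_minimize(
    first_derivative=lambda x: math.exp(x) - 2.0,
    second_derivative=lambda x: math.exp(x),
    start=0.0,
)
print(x_min, math.log(2.0))
```

Because this function is convex, the stationary point is the global minimum. On non-convex functions the same iteration can land on a maximum or saddle point instead.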

Probability and statistics

Fundamentals:

  • Random variables and events
  • Probability mass and density functions (PMF and PDF)
  • Distributions (normal, binomial, Poisson)
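
The three distributions named above can be written directly from their textbook formulas. This is a minimal sketch using only the standard library; the function names are my own.

```python
import math


def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)


def poisson_pmf(k, lam):
    """Probability of exactly k events when the expected count is lam."""
    return lam**k * math.exp(-lam) / math.factorial(k)


def normal_pdf(x, mu=0.0, sigma=1.0):
    """Density of a normal distribution with mean mu and std dev sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


print(binomial_pmf(5, 10, 0.5))  # chance of 5 heads in 10 fair coin flips
print(poisson_pmf(2, 3.0))       # chance of 2 events when 3 are expected
print(normal_pdf(0.0))           # peak density of the standard normal
```

The binomial and Poisson functions return probabilities of discrete outcomes, while the normal function returns a density, which must be integrated over an interval to give a probability.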

Recommended: “Introduction to Probability” by Blitzstein and Hwang

Advanced topics:

  • Markov chains
  • Hidden Markov models
  • Bayesian inference
  • Estimation and hypothesis testing
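
Bayesian inference comes down to updating a prior belief with evidence through Bayes' rule. The sketch below works through a single update; the diagnostic-test numbers are hypothetical and chosen only to show how a low prior limits the posterior.

```python
def posterior(prior, likelihood, false_positive_rate):
    """P(hypothesis | positive evidence) by Bayes' rule."""
    evidence = likelihood * prior + false_positive_rate * (1.0 - prior)
    return likelihood * prior / evidence


# Hypothetical numbers: 1% prevalence, 95% sensitivity, 5% false positives.
p = posterior(prior=0.01, likelihood=0.95, false_positive_rate=0.05)
print(round(p, 3))  # about 0.161
```

Even with a fairly accurate test, the posterior stays well below one half because the condition is rare to begin with.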

Recommended: “The Elements of Statistical Learning” by Hastie, Tibshirani, and Friedman

Regularization:

  • L1 regularization (Lasso)
  • L2 regularization (Ridge)
  • Sparsity
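
One way to see why L1 regularization produces sparsity while L2 does not is to look at how each penalty shrinks a single weight. The sketch below assumes the simplified one-dimensional problems noted in the docstrings. The weights and penalty strength are hypothetical.

```python
def l1_shrink(z, lam):
    """Minimizer of 0.5 * (w - z)^2 + lam * |w| (soft-thresholding)."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def l2_shrink(z, lam):
    """Minimizer of 0.5 * (w - z)^2 + 0.5 * lam * w^2."""
    return z / (1.0 + lam)


weights = [3.0, -0.2, 0.5, -2.0]  # hypothetical unregularized weights
lam = 0.5                         # hypothetical penalty strength

l1_result = [l1_shrink(w, lam) for w in weights]
l2_result = [l2_shrink(w, lam) for w in weights]
print(l1_result)  # small weights are set exactly to zero
print(l2_result)  # every weight shrinks but stays nonzero
```

The L1 penalty zeroes out any weight whose magnitude is at most the penalty strength, which is the mechanism behind Lasso's feature selection. The L2 penalty only scales weights toward zero.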

Computer science foundations

  • Algorithms and complexity
  • Data structures
  • Software engineering practices

Machine learning

Supervised learning

Learning from labeled data.

  • Linear regression
  • Logistic regression
  • Support vector machines (SVM)
  • Decision trees and random forests
  • Neural networks
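
As a small worked example of supervised learning, the sketch below fits a one-variable linear regression with the closed-form least-squares solution. The data points are hypothetical, and scikit-learn would be the usual choice for anything beyond a toy case.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical labeled data lying exactly on y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]

slope, intercept = fit_line(xs, ys)
print(slope, intercept)
```

The labels `ys` are what make this supervised: the model is fitted so its predictions match known targets.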

Unsupervised learning

Finding patterns in unlabeled data.

  • Clustering (k-means, hierarchical)
  • Dimensionality reduction
  • Feature selection
  • Anomaly detection
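
To illustrate clustering, here is a minimal k-means sketch on one-dimensional data. The points, the initial centroids, and the fixed iteration count are assumptions for the example. Real implementations also handle initialization and convergence checks more carefully.

```python
def kmeans_1d(points, initial_centroids, iterations=10):
    """Alternate between assigning points and recomputing centroids."""
    centroids = list(initial_centroids)
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            # Assign each point to its nearest centroid.
            nearest = min(range(len(centroids)),
                          key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        # Move each centroid to the mean of its cluster (keep it if empty).
        centroids = [
            sum(cluster) / len(cluster) if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
    return centroids


# Hypothetical unlabeled data with two visible groups.
points = [1.0, 1.5, 2.0, 9.0, 10.0, 11.0]
centroids = kmeans_1d(points, initial_centroids=[0.0, 5.0])
print(centroids)
```

No labels are supplied anywhere: the two groups are found only from the distances between points.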

Deep learning

  • Artificial neural networks (ANN)
  • Convolutional neural networks (CNN)
  • Recurrent neural networks (RNN)
  • Long short-term memory (LSTM), a gated RNN variant
  • Autoencoders
  • Transformers
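
All of the architectures above are built from layers that compute weighted sums followed by nonlinear activations. The sketch below shows a forward pass through a tiny two-layer network in plain Python. The weights are hypothetical and untrained, and a framework such as PyTorch or TensorFlow would handle this, plus training, in practice.

```python
import math


def sigmoid(z):
    """Squash a real number into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def relu(z):
    """Rectified linear unit: pass positives, zero out negatives."""
    return max(0.0, z)


def dense(inputs, weights, biases, activation):
    """One fully connected layer: activation(W x + b)."""
    return [
        activation(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


# Hypothetical, untrained parameters for a 2-input, 2-hidden, 1-output net.
hidden_weights = [[0.5, -0.2], [0.1, 0.4]]
hidden_biases = [0.0, 0.1]
output_weights = [[1.0, -1.0]]
output_biases = [0.0]


def forward(inputs):
    """Run the inputs through the hidden layer, then the output layer."""
    hidden = dense(inputs, hidden_weights, hidden_biases, relu)
    return dense(hidden, output_weights, output_biases, sigmoid)[0]


output = forward([1.0, 2.0])
print(output)
```

Training would adjust the weights and biases by gradient descent on a loss function. This sketch covers only the forward computation.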

Ensemble methods

  • Bagging and boosting
  • Random forests
  • Gradient boosting (XGBoost, LightGBM)

Data pipeline

Preprocessing

  • Data cleaning
  • Handling missing values
  • Normalization and scaling
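
The preprocessing steps above can be sketched with the standard library alone. This example assumes missing values are marked with `None` and uses mean imputation, which is only one of several possible strategies. The column of data is hypothetical.

```python
import statistics


def impute_mean(values):
    """Replace missing entries (None) with the mean of the observed ones."""
    observed = [v for v in values if v is not None]
    mean = statistics.fmean(observed)
    return [mean if v is None else v for v in values]


def min_max_scale(values):
    """Rescale values linearly to the range [0, 1]."""
    low, high = min(values), max(values)
    if high == low:
        return [0.0 for _ in values]
    return [(v - low) / (high - low) for v in values]


def standardize(values):
    """Shift to zero mean and scale to unit (population) standard deviation."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    if std == 0:
        return [0.0 for _ in values]
    return [(v - mean) / std for v in values]


raw = [10.0, None, 30.0, 20.0]  # hypothetical column with one missing value
filled = impute_mean(raw)
print(filled)
print(min_max_scale(filled))
print(standardize(filled))
```

In a real pipeline the mean, minimum, maximum, and standard deviation should be computed on the training data only and then reused for the test data, to avoid leaking information.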

Feature engineering

  • Feature extraction
  • Feature selection
  • Dimensionality reduction

Model evaluation

  • Train/test splits
  • Cross-validation
  • Metrics (accuracy, precision, recall, F1, ROC AUC)
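
Most of the classification metrics listed above follow from four confusion-matrix counts. Here is a minimal sketch for binary labels, with hypothetical predictions. AUC is left out because it needs predicted scores rather than hard labels.

```python
def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn


def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall, and F1 for binary classification."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical test-set labels and model predictions.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 1, 1, 1, 0]

metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

Here precision and recall differ because the model makes more false positives than false negatives. Accuracy alone would hide that imbalance.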

Tools and libraries

  • Python: Primary language for ML
  • NumPy/pandas: Numerical arrays and tabular data manipulation
  • scikit-learn: Traditional ML algorithms
  • TensorFlow/PyTorch: Deep learning frameworks
  • Keras: High-level neural network API

Next steps

See my AI learning resources for courses, books, and tutorials on these topics.