Best resources for studying machine learning
Overview
I’m very eager to start studying AI, but with so much to learn, I’m not sure where to start. Do I need a lot of math? What kind? Which areas should I focus on? How can I make sense of all the topics? What tools should I use? Additionally, how is the field developing, and what direction is it headed in?
I’m a software engineer with some math under my belt, and my goal is to gain a thorough understanding of AI so I can apply it to my work. I’m particularly interested in generative AI, natural language processing, and building intelligent agents. I also want to gain the necessary math skills to understand the fundamentals, but I don’t want to get too bogged down in some of the advanced mathematical details.
To help me on this journey, I’ve collected and summarized the best courses and books to help me get started on the right foot. I included the topics covered by each resource, so the list got pretty long, but it gives me a better idea of what to prioritize and how things fit together.
These materials are listed roughly in the order I plan to study them, though some of the materials at the end are pretty advanced. I want to start by getting a practical overview, then go deep on the math and fundamentals.
I plan on starting with some of the most popular courses, like Andrew Ng’s Deeplearning.ai courses and the fast.ai courses.
Next, I plan to take a few other high-quality courses, like the 3blue1brown math YouTube series, the Hugging Face course, and Andrej Karpathy’s Neural Networks: Zero to Hero course.
Then I plan to study a variety of O’Reilly books focused on practical topics.
Finally, I plan to study more in-depth materials, such as a Berkeley AI course, some math textbooks, and some of the famous AI textbooks.
Deeplearning.ai Intro Course
Introductory course by Andrew Ng covering practical machine learning topics using Python
Time: 2.5 months (5 hours/week)
Topics
Supervised learning
Linear regression
Logistic regression
Neural networks
Decision trees
Tree Ensembles
Unsupervised learning
Clustering
Dimensionality reduction
Recommender systems
Anomaly detection
Tools
Python
numpy
scikit-learn
TensorFlow
XGBoost
Best Practices
Regularization to Avoid Overfitting
Evaluating and tuning models
Improving performance
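To get a feel for the workflow this course teaches, here’s a minimal scikit-learn sketch I put together (my own example, not course code): fit a regularized logistic regression, then compare train and test accuracy to watch for overfitting.

```python
# Minimal sketch of the supervised-learning workflow (my own toy example):
# scale features, fit a regularized logistic regression, check held-out accuracy.
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

scaler = StandardScaler().fit(X_train)
model = LogisticRegression(C=1.0)  # C is the inverse regularization strength
model.fit(scaler.transform(X_train), y_train)

print("train accuracy:", model.score(scaler.transform(X_train), y_train))
print("test accuracy:", model.score(scaler.transform(X_test), y_test))
```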
Deeplearning.ai Deep Learning Course
Practical intermediate deep learning course by Andrew Ng
Topics
TensorFlow
Artificial Neural Networks
Convolutional Neural Networks
Recurrent Neural Networks
Transformers
Python Programming
Deep Learning
Backpropagation
Optimization
Hyperparameter Tuning
Machine Learning
Transfer Learning
Multi-Task Learning
Object Detection and Segmentation
Facial Recognition System
Gated Recurrent Unit (GRU)
Long Short Term Memory (LSTM)
Attention Models
Natural Language Processing
Practical Deep Learning Fast.ai
A free course designed for people with some coding experience who want to learn how to apply deep learning and machine learning to practical problems.
Topics
Deployment
Neural net foundations
Natural Language (NLP)
From-scratch model
Random forests
Collaborative filtering
Convolutions (CNNs)
Deeplearning.ai Natural Language Course
How to design NLP applications that perform question-answering and sentiment analysis, create tools to translate languages and summarize text, and even build chatbots.
Time: 4 months (6 hours/week)
Topics
Sentiment Analysis
Transformers
Attention Models
Machine Translation
Word2vec
Word Embeddings
Locality-Sensitive Hashing
Vector Space Models
Parts-of-Speech Tagging
N-gram Language Models
Autocorrect
Sentiment with Neural Networks
Siamese Networks
Natural Language Generation
Named Entity Recognition (NER)
Reformer Models
Neural Machine Translation
Chatbots
T5 + BERT Models
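As a taste of the early material, here’s a toy bag-of-words sentiment classifier I sketched with scikit-learn; the course itself builds these ideas up from scratch, so treat this as an illustration only.

```python
# Toy bag-of-words sentiment classifier (my own sketch, not course code).
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression

texts = ["great movie", "terrible plot", "loved it", "boring and bad", "really fun"]
labels = [1, 0, 1, 0, 1]           # 1 = positive, 0 = negative

vec = CountVectorizer()
X = vec.fit_transform(texts)       # sparse word-count matrix
clf = LogisticRegression().fit(X, labels)

# On this toy data, should come out positive then negative.
print(clf.predict(vec.transform(["fun movie", "boring plot"])))
```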
Deeplearning.ai TensorFlow: Data and Deployment Course
Learn how to get your machine learning models into the hands of real people on all kinds of devices. Start by understanding how to train and run machine learning models in browsers and in mobile applications. Learn how to leverage built-in datasets with just a few lines of code, build data pipelines with TensorFlow data services, use APIs to control data splitting, process all types of unstructured data, and retrain deployed models with user data while maintaining data privacy.
Time: 4 months (3 hours/week)
Topics
TensorFlow
Object Detection
JavaScript
Convolutional Neural Network
TensorFlow.js
TensorFlow Lite
Mathematical Optimization
Extraction, Transformation And Loading (ETL)
Data Pipelines
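The data-pipeline material centers on the tf.data API. Here’s a minimal sketch of my own showing the standard shuffle/batch/prefetch pattern on synthetic data (not course code):

```python
# Sketch of a tf.data input pipeline (my own minimal example).
import tensorflow as tf

# Pretend dataset: 1000 feature vectors with binary labels.
features = tf.random.normal([1000, 8])
labels = tf.random.uniform([1000], maxval=2, dtype=tf.int32)

ds = (tf.data.Dataset.from_tensor_slices((features, labels))
      .shuffle(buffer_size=1000)      # randomize order each epoch
      .batch(32)                      # mini-batches for training
      .prefetch(tf.data.AUTOTUNE))    # overlap data prep with training

for batch_x, batch_y in ds.take(1):
    print(batch_x.shape, batch_y.shape)  # (32, 8) (32,)
```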
Deeplearning.ai Generative Adversarial Networks Course
Introduction to image generation with GANs, charting a path from foundational concepts to advanced techniques through an easy-to-understand approach.
Time: 3 months (8 hours/week)
Topics
Generator
Image-to-Image Translation
Glossary of Computer Graphics
Discriminator
Generative Adversarial Networks
Controllable Generation
WGANs
Conditional Generation
Components of GANs
DCGANs
Bias in GANs
StyleGANs
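To make the Generator/Discriminator vocabulary concrete, here’s a minimal PyTorch sketch of my own of the two networks in a GAN (untrained, illustration only):

```python
# The two networks at the heart of a GAN (my own minimal sketch).
import torch
import torch.nn as nn

latent_dim, data_dim = 16, 784  # e.g. flattened 28x28 images

# Generator: maps random noise to a fake sample.
G = nn.Sequential(nn.Linear(latent_dim, 128), nn.ReLU(),
                  nn.Linear(128, data_dim), nn.Tanh())

# Discriminator: scores how "real" a sample looks.
D = nn.Sequential(nn.Linear(data_dim, 128), nn.LeakyReLU(0.2),
                  nn.Linear(128, 1), nn.Sigmoid())

z = torch.randn(8, latent_dim)   # a batch of noise vectors
fake = G(z)                      # generated samples
print(D(fake).shape)             # (8, 1) realness scores
```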
Deeplearning.ai TensorFlow Advanced Techniques Course
Expand your knowledge of the Functional API and build exotic non-sequential model types. You will learn how to optimize training in different environments with multiple processors and chip types, and get introduced to advanced computer vision scenarios such as object detection, image segmentation, and interpreting convolutions. You will also explore generative deep learning, including the ways AIs can create new content, from Style Transfer to Auto Encoding, VAEs, and GANs.
Time: 5 months (6 hours/week)
Topics
Model Interpretability
Custom Training Loops
Custom and Exotic Models
Generative Machine Learning
Object Detection
Functional API
Custom Layers
Custom and Exotic Models with Functional API
Custom Loss Functions
Distribution Strategies
Basic Tensor Functionality
GradientTape for Optimization
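The custom-training-loop material revolves around tf.GradientTape. Here’s a bare-bones sketch of my own of that pattern, on toy data:

```python
# Bare-bones custom training step with tf.GradientTape (my own sketch).
import tensorflow as tf

model = tf.keras.Sequential([tf.keras.layers.Dense(1)])
optimizer = tf.keras.optimizers.SGD(learning_rate=0.01)
loss_fn = tf.keras.losses.MeanSquaredError()

x = tf.random.normal([64, 3])
y = tf.reduce_sum(x, axis=1, keepdims=True)  # toy regression target

for step in range(100):
    with tf.GradientTape() as tape:
        loss = loss_fn(y, model(x))              # forward pass, recorded on tape
    grads = tape.gradient(loss, model.trainable_variables)
    optimizer.apply_gradients(zip(grads, model.trainable_variables))

print(float(loss))  # should shrink toward zero
```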
Deeplearning.ai MLOps Course
How to conceptualize, build, and maintain integrated systems that continuously operate in production.
Time: 4 months (5 hours/week)
Topics
Data Pipelines
Model Pipelines
Deploy Pipelines
Managing Machine Learning Production systems
ML Deployment Challenges
Project Scoping and Design
Concept Drift
Model Baseline
Human-level Performance (HLP)
TensorFlow Extended (TFX)
ML Metadata
Data transformation
Data augmentation
Data validation
AutoML
Precomputing predictions
Fairness Indicators
Explainable AI
Model Performance Analysis
TensorFlow Serving
Model Monitoring
General Data Protection Regulation (GDPR)
Model Registries
Deeplearning.ai Data Science on AWS Course
Develop the practical skills to effectively deploy your data science projects and overcome challenges at each step of the ML workflow using Amazon SageMaker.
Time: 3 months (5 hours/week)
Topics
Automated Machine Learning (AutoML)
Natural Language Processing with BERT
ML Pipelines and ML Operations (MLOps)
A/B Testing, Model Deployment, and Monitoring
Data Labeling at Scale
Data Ingestion
Exploratory Data Analysis
Statistical Data Bias Detection
Multi-class Classification with FastText and BlazingText
Feature Engineering and Feature Store
Model Training, Tuning, and Deployment with BERT
Model Debugging, Profiling, and Evaluation
ML Pipelines and MLOps
Artifact and Lineage Tracking
Distributed Model Training and Hyperparameter Tuning
Cost Savings and Performance Improvements
Human-in-the-Loop Pipelines
Hugging Face Course
This course will teach you about natural language processing (NLP) using libraries from the Hugging Face ecosystem — 🤗 Transformers, 🤗 Datasets, 🤗 Tokenizers, and 🤗 Accelerate — as well as the Hugging Face Hub.
Topics
Transformer Models
Fine-tuning a pretrained model
Sharing models and tokenizers
Datasets library
Tokenizers Library
Building and sharing demos
Optimizing for production
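The course opens with the pipeline API, which really is this short. A quick sketch (it downloads the library’s default pretrained sentiment model):

```python
# The Hugging Face "hello world": a ready-made pipeline that downloads a
# default pretrained model and runs sentiment analysis.
from transformers import pipeline

classifier = pipeline("sentiment-analysis")
print(classifier("I've been waiting for a HuggingFace course my whole life."))
# -> [{'label': 'POSITIVE', 'score': 0.99...}]
```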
Hugging Face Diffusion Models Class
👩‍🎓 Study the theory behind diffusion models
🧨 Learn how to generate images and audio with the popular 🤗 Diffusers library
🏋️‍♂️ Train your own diffusion models from scratch
📻 Fine-tune existing diffusion models on new datasets
🗺️ Explore conditional generation and guidance
🧑‍🔬 Create your own custom diffusion model pipelines
Topics
PyTorch
Diffusers and diffusion models
Fine-tuning
Stable Diffusion
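Basic Diffusers usage looks roughly like this. A sketch of my own: the checkpoint name is just one popular option, and a GPU is assumed.

```python
# Text-to-image with the Diffusers library (my own sketch; assumes a CUDA GPU
# and the runwayml/stable-diffusion-v1-5 checkpoint, one of several options).
import torch
from diffusers import DiffusionPipeline

pipe = DiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

image = pipe("an astronaut riding a horse on the moon").images[0]
image.save("astronaut.png")
```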
Hugging Face Deep Reinforcement Learning Course
📖 Study Deep Reinforcement Learning in theory and practice.
🧑‍💻 Learn to use famous Deep RL libraries such as Stable Baselines3, RL Baselines3 Zoo, Sample Factory and CleanRL.
🤖 Train agents in unique environments such as SnowballFight, Huggy the Doggo 🐶, MineRL (Minecraft ⛏️), VizDoom (Doom) and classical ones such as Space Invaders and PyBullet.
💾 Share your trained agents with one line of code to the Hub and also download powerful agents from the community.
🏆 Participate in challenges where you will evaluate your agents against other teams. You’ll also get to play against the agents you’ll train.
Topics
Q-Learning
Policy Gradient with PyTorch
Actor Critic Methods
Proximal Policy Optimization
Multi-Agents
Decision Transformers
Offline Reinforcement Learning
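The foundation of the course is the tabular Q-learning update. Here’s my own numpy sketch of it, with epsilon-greedy action selection:

```python
# Tabular Q-learning sketch (my own): the update rule
#   Q(s,a) <- Q(s,a) + lr * (r + gamma * max_a' Q(s',a') - Q(s,a))
# plus epsilon-greedy action selection.
import numpy as np

n_states, n_actions = 5, 2
Q = np.zeros((n_states, n_actions))
lr, gamma, epsilon = 0.1, 0.99, 0.1
rng = np.random.default_rng(0)

def choose_action(s):
    if rng.random() < epsilon:                 # explore occasionally
        return int(rng.integers(n_actions))
    return int(Q[s].argmax())                  # otherwise act greedily

def q_update(s, a, r, s_next):
    target = r + gamma * Q[s_next].max()       # bootstrapped return estimate
    Q[s, a] += lr * (target - Q[s, a])         # move Q(s,a) toward the target
```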
Andrej Karpathy Neural Networks Zero to Hero Course
This is the most step-by-step spelled-out explanation of backpropagation and training of neural networks. It only assumes basic knowledge of Python and a vague recollection of calculus from high school.
Topics
Backpropagation
PyTorch
Multi-layer perceptron
Loss function
Gradient descent optimization
Bigrams
Vector normalization
Tensor broadcasting
Model smoothing
One-hot encodings
Vectorized loss
Embeddings
Hidden layers
Negative log likelihood loss
Cross entropy
Overfitting
Learning rate
Character embeddings
Sampling from models
Google colab
Tanh activation function
Batch normalization
Forward pass activation statistics
Backward pass gradient
Kaiming init
Parameter activation
Gradient statistics
Batchnorm
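The heart of the course is backpropagation spelled out by hand. Here’s a tiny example of my own in plain Python: one tanh neuron, a forward pass, the chain rule applied step by step, and a gradient-descent update.

```python
# Backprop "spelled out" for a single neuron (my own toy example).
import math

# Forward pass: y = tanh(w*x + b), loss = (y - target)^2
x, w, b, target = 0.5, -1.2, 0.3, 1.0
z = w * x + b
y = math.tanh(z)
loss = (y - target) ** 2

# Backward pass: chain rule, from the loss back to the parameters.
dloss_dy = 2 * (y - target)
dy_dz = 1 - y ** 2            # derivative of tanh
dloss_dz = dloss_dy * dy_dz
dloss_dw = dloss_dz * x       # z = w*x + b, so dz/dw = x
dloss_db = dloss_dz * 1.0     # dz/db = 1

# One gradient-descent step on the parameters.
lr = 0.1
w -= lr * dloss_dw
b -= lr * dloss_db
print(f"loss={loss:.3f}, dL/dw={dloss_dw:.3f}, dL/db={dloss_db:.3f}")
```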
3blue1brown YouTube courses
Neural Networks from the Ground Up
The basics of neural networks, and the math behind how they learn
Topics
Neural Networks
Gradient Descent
Backpropagation
Essence of Linear Algebra
An introduction to visualizing what matrices are really doing
Topics
Vectors
Linear Combinations
Span
Basis Vectors
Linear Transformation
Matrices
Matrix Multiplication
Three dimensional linear transformations
Determinant
Inverse Matrices
Column Space
Null Space
Nonsquare Matrices
Dot Product
Duality
Cross Products
Cramer’s Rule
Change of basis
Eigenvectors and Eigenvalues
Abstract Vector spaces
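To connect the visuals to code, here’s a small numpy sketch of my own of the series’ central ideas: a matrix as a transformation of basis vectors, the determinant as an area-scaling factor, and eigenvectors as the directions that only get stretched.

```python
# Matrices as transformations (my own sketch of the series' main ideas).
import numpy as np

A = np.array([[3.0, 1.0],
              [0.0, 2.0]])

# Columns of A are where the basis vectors land:
print(A @ np.array([1.0, 0.0]))   # image of x-hat -> [3. 0.]
print(A @ np.array([0.0, 1.0]))   # image of y-hat -> [1. 2.]

# Determinant = factor by which areas scale under the transformation.
print(np.linalg.det(A))           # 6.0

# Eigenvectors: directions that are only scaled (by their eigenvalue).
vals, vecs = np.linalg.eig(A)
v = vecs[:, 0]
print(A @ v, vals[0] * v)         # the two should match
```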
Essence of Calculus
Visual introductions to the core ideas of derivatives, integrals, limits and more
Topics
Derivative
Chain Rule
Product Rule
Euler’s Number
Implicit Differentiation
Limits
L’Hôpital’s rule
Epsilon Delta
Integration
Fundamental Theorem of Calculus
Higher Order Derivatives
Taylor Series
Probability
An assortment of introductory ideas in probability
Topics
Bayes Theorem
Binomial Distribution
Probability Density Functions
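Bayes’ theorem is easy to sanity-check numerically. A worked example of my own, with made-up numbers for the classic medical-test setup:

```python
# Bayes' theorem worked numerically (my own example numbers):
# P(disease | positive test) from the prior and the test's error rates.
p_disease = 0.01            # prior probability of disease
p_pos_given_disease = 0.95  # test sensitivity
p_pos_given_healthy = 0.05  # false positive rate

p_pos = (p_pos_given_disease * p_disease
         + p_pos_given_healthy * (1 - p_disease))   # law of total probability
p_disease_given_pos = p_pos_given_disease * p_disease / p_pos
print(p_disease_given_pos)  # ~0.16: still unlikely despite a positive test
```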
Hands-On Machine Learning with Scikit-Learn and TensorFlow
Using concrete examples, minimal theory, and two production-ready Python frameworks (scikit-learn and TensorFlow), you gain an intuitive understanding of the concepts and tools for building intelligent systems. You’ll start with simple linear regression and progress to deep neural networks. With exercises in each chapter to help you apply what you’ve learned, all you need is programming experience to get started.
Topics
Types of Machine Learning Systems
Supervised/Unsupervised Learning
Batch and Online Learning
Instance-Based Versus Model-Based Learning
Challenges of Machine Learning
End-to-End Machine Learning Project
Classification
Binary Classifier
Performance Measures
Cross validation
Confusion matrix
Multiclass classification
Training Models
Linear Regression
Gradient Descent
Polynomial Regression
Learning Curves
Regularized Linear Models
Logistic Regression
Support Vector Machines
Linear SVM Classification
Soft Margin Classification
Nonlinear SVM Classification
Decision Function and Predictions
Training Objective
Quadratic Programming
The Dual Problem
Kernelized SVMs
Online SVMs
Decision Trees
Ensemble Learning and Random Forests
Voting Classifiers
Bagging and Pasting
Bagging and Pasting in Scikit-Learn
Out-of-Bag Evaluation
Random Patches and Random Subspaces
Random Forests
Extra-Trees
Feature Importance
Boosting
AdaBoost
Gradient Boosting
Dimensionality Reduction
PCA
Projection
Manifold Learning
Kernel PCA
LLE
Unsupervised Learning
Clustering
Gaussian Mixtures
Introduction to Artificial Neural Networks with Keras
From Biological to Artificial Neurons
Implementing MLPs with Keras
Fine-Tuning Neural Network Hyperparameters
Training Deep Neural Networks
Vanishing/Exploding Gradients Problems
Reusing Pretrained Layers
Faster Optimizers
Avoiding Overfitting Through Regularization
Custom Models and Training with TensorFlow
Using TensorFlow like NumPy
Customizing Models and Training Algorithms
TensorFlow Functions and Graphs
Loading and Preprocessing Data with TensorFlow
Data API
TFRecord
Preprocessing the Input Features
TF Transform
Deep Computer Vision Using Convolutional Neural Networks
Convolutional Layers
Pooling Layers
CNN Architectures
Implementing a ResNet-34 CNN Using Keras
Object Detection
Semantic Segmentation
Processing Sequences Using RNNs and CNNs
Recurrent Neurons and Layers
Training RNNs
Forecasting a Time Series
Handling Long Sequences
Natural Language Processing with RNNs and Attention
Generating Shakespearean Text Using a Character RNN
Sentiment Analysis
An Encoder–Decoder Network for Neural Machine Translation
Attention Mechanisms
Transformers
Representation Learning and Generative Learning Using Autoencoders and GANs
Stacked Autoencoders
Generative Adversarial Networks
Reinforcement Learning
Policy Search
Neural Network Policies
Policy Gradients
Q-Learning
TF-Agents Library
Training and Deploying TensorFlow Models at Scale
Serving a TensorFlow Model
Deploying a Model to a Mobile or Embedded Device
Training Models Across Multiple Devices
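Most of the book’s Keras material reduces to the define/compile/fit pattern. A minimal sketch of my own with synthetic data (not the book’s code):

```python
# Define/compile/fit a small Keras MLP classifier (my own toy example).
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.Input(shape=(20,)),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(10, activation="softmax"),  # 10-class output
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

# Synthetic data just to make the example self-contained.
x = tf.random.normal([256, 20])
y = tf.random.uniform([256], maxval=10, dtype=tf.int32)
model.fit(x, y, epochs=3, validation_split=0.2)
```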
SQL for Data Analysis
You’ll learn how to use both common and exotic SQL functions such as joins, window functions, subqueries, and regular expressions in new, innovative ways—as well as how to combine SQL techniques to accomplish your goals faster, with understandable code.
Topics
Databases
Preparing Data for Analysis
Data cleaning
Deduplication
Nulls
Shaping data
Time Series Data
Dates and time
Trends
Windows
Seasonality
Cohort Analysis
Retention
Related Cohort Analysis
Cross section analysis
Text Analysis
Anomaly Detection
Experiment Analysis
Complex Data Sets
Practical Statistics for Data Scientists
Many data science resources incorporate statistical methods but lack a deeper statistical perspective. If you’re familiar with the R or Python programming languages and have some exposure to statistics, this quick reference bridges the gap in an accessible, readable format.
Topics
Rectangular Data
Data Frames
Estimates
Mean, median, mode, variability, percentile
Distribution and sampling
Bias, central limit theorem, standard error, resampling, confidence interval, normal distribution
Normal, long tail, t, binomial, Poisson, Weibull distributions
Statistical Experiments and Significance Testing
A/B testing, Hypothesis testing, null hypothesis
p-value, alpha, t-test, ANOVA, chi-square, multi-armed bandit
Regression and Prediction
Simple Linear regression
Multiple linear regression
Confidence and prediction intervals
Classification
Naive bayes
Discriminant analysis
Logistic regression
Imbalanced data
Statistical Machine Learning
KNN
Tree models
Bagging and Random Forest
Boosting
Unsupervised Learning
Principal Components Analysis
k-means clustering
Hierarchical clustering
Model-based clustering
Scaling and categorical variables
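The resampling chapter’s key trick, the bootstrap, fits in a few lines. My own sketch:

```python
# The bootstrap in a few lines (my own sketch of the resampling idea):
# estimate a confidence interval for the mean by resampling with replacement.
import numpy as np

rng = np.random.default_rng(0)
data = rng.exponential(scale=2.0, size=200)   # a skewed sample

boot_means = [rng.choice(data, size=len(data), replace=True).mean()
              for _ in range(5000)]
lo, hi = np.percentile(boot_means, [2.5, 97.5])
print(f"mean = {data.mean():.2f}, 95% bootstrap CI = ({lo:.2f}, {hi:.2f})")
```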
Essential Math for Data Science
Master the math needed to excel in data science, machine learning, and statistics. In this book, author Thomas Nield guides you through areas like calculus, probability, linear algebra, and statistics.
Topics
Calculus
Probability
Linear algebra
Vectors
Matrices
Matrix decomposition
Statistics
p-values
Statistical significance
Linear regression
Logistic regression
Neural networks
SymPy
NumPy
scikit-learn
Data Science Career
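The book leans on SymPy for the calculus material. A quick sketch of my own of what that looks like:

```python
# Symbolic calculus with SymPy (my own quick sketch).
import sympy as sp

x = sp.symbols("x")
f = x**3 + 2*x

print(sp.diff(f, x))                 # derivative: 3*x**2 + 2
print(sp.integrate(f, x))            # antiderivative: x**4/4 + x**2
print(sp.limit(sp.sin(x)/x, x, 0))   # classic limit: 1
```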
Data Science from Scratch
Get comfortable with the math and statistics at the core of data science, and with the hacking skills you need to get started as a data scientist. Packed with new material on deep learning, statistics, and natural language processing.
Topics
Python
Matplotlib
Linear Algebra
Vectors
Matrices
Statistics
Probability
Bayes Theorem
Distribution
Central Limit Theorem
Hypothesis and Inference
p-value
Confidence intervals
p-hacking
Bayesian inference
Gradient Descent
Scraping Data
Working with Data
Dataclasses
Rescaling
Cleaning
Dimensionality reduction
Machine learning
Modeling
Overfitting
Bias-variance
Feature extraction
k-nearest neighbors
Model
Dimensionality
Naive Bayes
Simple Linear Regression
Multiple Regression
Logistic Regression
Decision Tree
Neural Networks
Deep Learning
Clustering
Natural Language Processing
Network Analysis
Eigenvector
Directed graphs
Recommender Systems
Collaborative Filtering
Matrix Factorization
Databases and SQL
MapReduce
Python
numpy
pandas
scikit-learn
visualization
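In the book’s from-scratch spirit, here’s a plain-Python gradient descent I sketched that fits a line without any libraries (my example, not the book’s):

```python
# Gradient descent "from scratch" (my own sketch): fit y = m*x + c by
# minimizing mean squared error with hand-computed gradients.
data = [(x, 3.0 * x + 1.0) for x in range(10)]  # points on y = 3x + 1
m, c, lr = 0.0, 0.0, 0.01

for _ in range(2000):
    grad_m = sum(2 * (m * x + c - y) * x for x, y in data) / len(data)
    grad_c = sum(2 * (m * x + c - y) for x, y in data) / len(data)
    m -= lr * grad_m
    c -= lr * grad_c

print(m, c)  # approaches 3.0 and 1.0
```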
Practical Natural Language Processing
This book gives a comprehensive view of building real-world NLP applications. It covers the complete lifecycle of a typical NLP project, from data collection to deploying and monitoring the model. Some of these steps apply to any ML pipeline, while others are specific to NLP. The book also includes task-specific case studies and domain-specific guides for building an NLP system from scratch.
Topics
NLP: A Primer
NLP Pipeline
Text Representation
Text Classification
Information Extraction
Chatbots
Topics in Brief
Social Media
E-Commerce and Retail
Healthcare, Finance, and Law
The End-to-End NLP Process
Deep Learning from Scratch
Shows you how neural networks work using a first-principles approach. You’ll learn how to apply multilayer neural networks, convolutional neural networks, and recurrent neural networks from the ground up, building a thorough understanding of how neural networks work mathematically, computationally, and conceptually.
Topics
Math Foundations
Fundamentals
Deep Learning from Scratch
Extensions
Convolutional Neural Networks
Recurrent Neural Networks
PyTorch
Generative Deep Learning by David Foster
Discover how to re-create some of the most impressive examples of generative deep learning models, such as variational autoencoders, generative adversarial networks (GANs), encoder-decoder models, and world models.
Topics
Generative Versus Discriminative Modeling
Probabilistic Generative Models
Deep Neural Networks
Convolutional Layers
Batch Normalization
Dropout Layers
Autoencoders
Variational Autoencoder
Using VAEs to Generate Faces
Generative Adversarial Networks
Oscillating Loss
Mode Collapse
Uninformative Loss
Hyperparameters
Discriminators
Wasserstein GAN
CycleGAN
Neural Style Transfer
LSTM Network
Stacked Recurrent Networks
Gated Recurrent Units
Bidirectional Cells
Encoder–Decoder Models
Music-Generating RNN
Reinforcement Learning
MDN-RNN
Controller Architecture
In-Dream Training
Transformer
ProGAN
Self-Attention GAN (SAGAN)
BigGAN
StyleGAN
Introducing MLOps
Introduces the key concepts of MLOps to help data scientists and application engineers not only operationalize ML models to drive real business change but also maintain and improve those models over time. Through lessons based on numerous MLOps applications around the world, nine experts in machine learning provide insights into the five steps of the model life cycle: Build, Preproduction, Deployment, Monitoring, and Governance.
Topics
People of MLOps
Model Development
Data Sources and Exploratory Data Analysis
Feature Engineering and Selection
Training and Evaluation
Reproducibility
Productionalization and Deployment
Monitoring
Iteration and Life Cycle
Governance
Evaluating and Comparing Models
Adaptation from Development to Production Environments
Quality Assurance for Machine Learning
Reproducibility and Auditability
Machine Learning Security
Building ML Artifacts
Scaling Deployments
Model Degradation
Drift Detection in Practice
The Feedback Loop
Model Governance
Responsible AI
MLOps in Practice
Consumer Credit Risk Management
Marketing Recommendation Engines
Consumption Forecast
Introduction to Statistical Learning
Accessible overview of the field of statistical learning, an essential toolset for making sense of the vast and complex data sets that have emerged in fields ranging from biology to finance to marketing to astrophysics in the past twenty years. This book presents some of the most important modeling and prediction techniques, along with relevant applications
This book is targeted at statisticians and non-statisticians alike who wish to use cutting-edge statistical learning techniques to analyze their data. The text assumes only a previous course in linear regression and no knowledge of matrix algebra.
Easier than Elements of Statistical Learning
Topics
linear regression
classification
resampling methods
shrinkage approaches
tree-based methods
support vector machines
clustering
deep learning
survival analysis
multiple testing
naïve Bayes
generalized linear models
Bayesian additive regression trees
matrix completion
UC Berkeley CS188 Intro to AI
This introductory Berkeley course accompanies the “Artificial Intelligence: A Modern Approach” book and provides lectures and course materials
Topics
Uninformed Search
A* Search and Heuristics
Constraint Satisfaction Problems
Game Trees
Minimax
Expectimax
Markov Decision Processes
Reinforcement Learning
Probability
Markov Models
Hidden Markov Models
Bayes’ Nets
Decision Diagrams
Naive Bayes
Perceptrons
Kernels and Clustering
Advanced Applications: NLP, Games, Cars, Robotics, and Computer Vision
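The game-trees unit centers on minimax. Here’s a plain-Python sketch of my own over a hand-built tree (the leaf values are made up):

```python
# Plain minimax over a tiny game tree (my own sketch). Leaves are payoffs
# for the maximizing player; the players alternate by depth.
def minimax(node, maximizing):
    if isinstance(node, (int, float)):     # leaf: just return its payoff
        return node
    values = [minimax(child, not maximizing) for child in node]
    return max(values) if maximizing else min(values)

# Depth-2 tree: max chooses a branch, then min picks the worst leaf for us.
tree = [[3, 12], [8, 2], [14, 1]]
print(minimax(tree, maximizing=True))  # 3: max takes the left branch
```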
Artificial Intelligence: A Modern Approach
The de facto bible of artificial intelligence. It combines in-depth treatments of introductory and advanced concepts with historical background and accessible explanations. Including algorithms, code, and pseudocode, the book sits between master’s and PhD level.
Focuses on machine learning, deep learning, probabilistic programming, and multiagent systems, and includes sections where the AI’s utility function is treated as uncertain rather than fixed.
Topics
Problem-solving
Searching
Adversarial Search and Games
Constraint Satisfaction Problems
Knowledge, reasoning, and planning
Logical Agents
First-Order Logic
Knowledge Representation
Automated Planning
Uncertain knowledge and reasoning
Probabilistic Reasoning
Decision Making
Machine Learning
Learning from Example
Learning Probabilistic Models
Deep Learning
Reinforcement Learning
Communicating, perceiving, and acting
Natural Language Processing
Deep Learning for NLP
Computer Vision
Robotics
An Introduction to Probability and Inductive Logic
Book focused on probability and logic from a philosophical rather than mathematical perspective.
The book has been designed to offer maximal accessibility to the widest range of students (not only those majoring in philosophy) and assumes no formal training in elementary symbolic logic. It offers a comprehensive course covering all basic definitions of induction and probability, and considers such topics as decision theory, Bayesianism, frequency ideas, and the philosophical problem of induction.
Probability for the Enthusiastic Beginner
This book is a resource for high school and college students learning about probability for the first time. It covers all of the standard introductory topics, such as combinatorics, the rules of probability, Bayes’ theorem, and expectation value, and includes 150 worked-out problems. Calculus is not required, although some problems involve it. It can be used as a main text or supplement in an introductory probability course.
Topics
Combinatorics
Bayes Theorem
Stirling’s Formula
Expected values
Variance
Standard deviation
Distributions
Uniform
Bernoulli
Binomial
Exponential
Poisson
Gaussian
Gaussian approximations
Law of large numbers
Central limit theorem
Correlation and regression
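Two of the book’s punchlines, the law of large numbers and the central limit theorem, are easy to see by simulation. My own numpy sketch:

```python
# Simulating the law of large numbers and the CLT (my own sketch):
# averages of dice rolls settle near 3.5, with spread shrinking like 1/sqrt(n).
import numpy as np

rng = np.random.default_rng(1)
rolls = rng.integers(1, 7, size=(10000, 100))  # 10000 experiments of 100 rolls

means = rolls.mean(axis=1)
print(means.mean())   # ~3.5 (law of large numbers)
print(means.std())    # ~ sigma/sqrt(n) = 1.71/10 ~ 0.17 (CLT scaling)
```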
Elements of Statistical Learning
This book describes the important ideas in areas such as data mining, machine learning, and bioinformatics in a common conceptual framework. While the approach is statistical, the emphasis is on concepts rather than mathematics. Many examples are given, with a liberal use of color graphics. It should be a valuable resource for statisticians and anyone interested in data mining in science or industry.
Topics
Overview of supervised learning
Linear methods for regression
Linear methods for classification
Basis expansions and regularization
Kernel smoothing methods
Model assessment and selection
Model inference and averaging
Additive models, trees, and related methods
Boosting and additive trees
Neural networks
Support vector machines and flexible discriminants
Prototype methods and nearest-neighbors
Unsupervised learning
Random forests
Ensemble learning
Undirected graphical models
High-dimensional problems
Statistical Rethinking: A Bayesian Course
A modern text focused on Bayesian statistics, with an accompanying lecture course.
The text presents generalized linear multilevel models from a Bayesian perspective, relying on a simple logical interpretation of Bayesian probability and maximum entropy. It covers from the basics of regression to multilevel models. The author also discusses measurement error, missing data, and Gaussian process models for spatial and network autocorrelation.
Topics
Sampling
Linear models
Multivariate linear models
Overfitting, regularization, and information criteria
Interactions
Markov chain Monte Carlo
Big entropy and the generalized linear model
Counting and classification
Multilevel models
Covariance
Missing data
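The MCMC machinery behind the book’s tools can be sketched with a bare-bones Metropolis sampler. My own toy example, targeting an unnormalized Normal(2, 1) density:

```python
# Bare-bones Metropolis sampler (my own sketch of the MCMC idea):
# draw samples from an unnormalized log-density via a random walk.
import numpy as np

def log_density(theta):
    return -0.5 * (theta - 2.0) ** 2   # unnormalized Normal(2, 1)

rng = np.random.default_rng(0)
theta, samples = 0.0, []
for _ in range(10000):
    proposal = theta + rng.normal(scale=0.5)           # random-walk proposal
    if np.log(rng.random()) < log_density(proposal) - log_density(theta):
        theta = proposal                               # accept the move
    samples.append(theta)

print(np.mean(samples[1000:]))  # ~2.0 after discarding burn-in
```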
Pattern Recognition and Machine Learning
Topics
Probability Theory
Model Selection
The Curse of Dimensionality
Decision Theory
Information Theory
Probability Distributions
The Gaussian Distribution
Exponential Family
Nonparametric Methods
Linear Models for Regression
Linear Basis Function Models
Bayesian Linear Regression
Linear Models for Classification
Discriminant Functions
Probabilistic Generative Models
Probabilistic Discriminative Models
The Laplace Approximation
Neural Networks
Feed-forward Network Functions
Network Training
Error Backpropagation
Hessian Matrix
Mixture Density Networks
Bayesian Neural Networks
Kernel Methods
Gaussian Processes
Sparse Kernel Machines
Maximum Margin Classifiers
Relevance Vector Machines
Graphical Models
Conditional Independence
Markov Random Fields
Inference in Graphical Models
Mixture Models and EM
K-means Clustering
Mixtures of Gaussians
Approximate Inference
Variational Inference
Variational Linear Regression
Sampling Methods
Basic Sampling Algorithms
Markov Chain Monte Carlo
Hybrid Monte Carlo Algorithm
Continuous Latent Variables
Principal Component Analysis
Probabilistic PCA
Kernel PCA
Nonlinear Latent Variable Models
Sequential Data
Markov Models
Hidden Markov Models
Linear Dynamical Systems
Combining Models
Bayesian Model Averaging
Committees
Boosting
Tree-based Models
Conditional Mixture Models
Deep Learning Goodfellow Book
The text offers mathematical and conceptual background, covering relevant concepts in linear algebra, probability theory and information theory, numerical computation, and machine learning. It describes deep learning techniques used by practitioners in industry, including deep feedforward networks, regularization, optimization algorithms, convolutional networks, sequence modeling, and practical methodology
Topics
Linear Algebra
Probability and Information Theory
Numerical Computation
Deep Feedforward Networks
Regularization for Deep Learning
Optimization for Training Deep Models
Gradient Descent and Structure of Neural Network Cost Functions
Tutorial on Optimization for Deep Networks
Batch Normalization
Convolutional Networks
Sequence Modeling: Recurrent and Recursive Networks
Linear Factors
Autoencoders
Representation Learning
Structured Probabilistic Models for Deep Learning
Monte Carlo Methods
Confronting the Partition Function
Reinforcement Learning: An Introduction
Part I covers as much of reinforcement learning as possible without going beyond the tabular case for which exact solutions can be found, including UCB, Expected Sarsa, and Double Learning.
Part II extends these ideas to function approximation, such as artificial neural networks and the Fourier basis, and offers expanded treatment of off-policy learning and policy-gradient methods.
Part III has new chapters on reinforcement learning’s relationships to psychology and neuroscience, as well as an updated case-studies chapter including AlphaGo and AlphaGo Zero, Atari game playing, and IBM Watson’s wagering strategy.
The final chapter discusses the future societal impacts of reinforcement learning.
Topics
Tabular Solution Methods
Multi-armed Bandits
Finite Markov Decision Processes
Dynamic Programming
Monte Carlo Methods
Temporal-Difference Learning
Eligibility Traces
Planning and Learning with Tabular Methods
Approximate Solution Methods
On-policy Approximation of Action Values
Off-policy Approximation of Action Values
Policy Approximation
Psychology
Neuroscience
Applications and case studies
Papers
Adam: A Method for Stochastic Optimization
Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift
Faster R-CNN: towards real-time object detection with region proposal networks
Neural Machine Translation by Jointly Learning to Align and Translate
Human-level control through deep reinforcement learning
Mastering the game of Go with deep neural networks and tree search
Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks
Semi-Supervised Classification with Graph Convolutional Networks
Explaining and Harnessing Adversarial Examples
ImageNet Classification with Deep Convolutional Neural Networks (AlexNet)
Deep Residual Learning for Image Recognition (ResNet)
Attention Is All You Need (Transformers)
Mike Jordan Book list
Extremely rigorous books recommended by Mike Jordan from Berkeley, intended for those focused on research. I will probably never read these; it feels impossible to get through them all in one lifetime.
Essentially all of the material in the following intermediate-level statistics book:
1.) Casella, G. and Berger, R.L. (2001). “Statistical Inference” Duxbury Press.
For a slightly more advanced book that’s quite clear on mathematical techniques, the following book is quite good:
2.) Ferguson, T. (1996). “A Course in Large Sample Theory” Chapman & Hall/CRC.
You’ll need to learn something about asymptotics at some point, and a good starting place is:
3.) Lehmann, E. (2004). “Elements of Large-Sample Theory” Springer.
Those are all frequentist books. You should also read something Bayesian:
4.) Gelman, A. et al. (2003). “Bayesian Data Analysis” Chapman & Hall/CRC.
You should also start to read about Bayesian computation:
5.) Robert, C. and Casella, G. (2005). “Monte Carlo Statistical Methods” Springer.
On the probability front, a good intermediate text is:
6.) Grimmett, G. and Stirzaker, D. (2001). “Probability and Random Processes” Oxford.
At a more advanced level, a very good text is the following:
7.) Pollard, D. (2001). “A User’s Guide to Measure Theoretic Probability” Cambridge.
The standard advanced textbook is Durrett, R. (2005). “Probability: Theory and Examples” Duxbury.
Machine learning research also reposes on optimization theory. A good starting book on linear optimization that will prepare you for convex optimization:
8.) Bertsimas, D. and Tsitsiklis, J. (1997). “Introduction to Linear Optimization” Athena.
And then you can graduate to:
9.) Boyd, S. and Vandenberghe, L. (2004). “Convex Optimization” Cambridge.
Getting a full understanding of algorithmic linear algebra is also important. At some point you should feel familiar with most of the material in
10.) Golub, G., and Van Loan, C. (1996). “Matrix Computations” Johns Hopkins.
It’s good to know some information theory. The classic is:
11.) Cover, T. and Thomas, J. “Elements of Information Theory” Wiley.
Finally, if you want to start to learn some more abstract math, you might want to start to learn some functional analysis (if you haven’t already). Functional analysis is essentially linear algebra in infinite dimensions, and it’s necessary for kernel methods, for nonparametric Bayesian methods, and for various other topics. Here’s a book that I find very readable:
12.) Kreyszig, E. (1989). “Introductory Functional Analysis with Applications” Wiley.
Other
Karpathy’s CS231n: Convolutional Neural Networks for Visual Recognition
Bayesian Reasoning and Machine Learning
Principles of Mathematical Analysis by Walter Rudin