Hey,
This list covers the statistical ML topics that may be useful to anyone preparing for ML software positions. How deeply each topic needs to be studied depends on several factors, including the company, the team within the company, and the individual candidate.
-
- What is entropy? Information gain (IG) concepts
- Gradient Boosting
- Bagging
- XGBoost (Why popular - parallelization)
- Trees for classification versus regression
- CART/Regression Trees, algorithmic change to incorporate regression in trees (majority class for classification versus mean of the samples in each leaf for regression to make the final prediction)
- Variance reduction method instead of IG
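The split criteria named above can be sketched in a few lines. This is a minimal illustration, not a tree implementation: the function names are made up for this example, entropy is measured in bits, and the variance is the population variance of the targets in a node.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(parent, children):
    """Parent entropy minus the size-weighted entropy of the child nodes."""
    n = len(parent)
    weighted = sum(len(ch) / n * entropy(ch) for ch in children)
    return entropy(parent) - weighted

def variance(values):
    """Population variance of a list of regression targets."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)

def variance_reduction(parent, children):
    """Regression analogue of IG: parent variance minus weighted child variance."""
    n = len(parent)
    weighted = sum(len(ch) / n * variance(ch) for ch in children)
    return variance(parent) - weighted
```

A classification tree would pick the split with the largest information gain, while a regression tree would pick the split with the largest variance reduction and predict the mean target of each leaf.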
-
Estimation strategies: Maximum likelihood (MLE) versus Maximum a posteriori (MAP)
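A coin-flip example is one common way to see the difference between the two estimators. The sketch below assumes a Bernoulli likelihood and a Beta(alpha, beta) prior with alpha, beta > 1; the function names and default prior values are illustrative choices, not part of the list above.

```python
def bernoulli_mle(heads, n):
    """MLE of the heads probability: the observed frequency."""
    return heads / n

def bernoulli_map(heads, n, alpha=2.0, beta=2.0):
    """MAP estimate under an assumed Beta(alpha, beta) prior.

    This is the mode of the Beta(alpha + heads, beta + tails) posterior,
    which is well defined in the interior when alpha, beta > 1.
    """
    return (heads + alpha - 1) / (n + alpha + beta - 2)
```

With 3 heads out of 3 flips, the MLE is 1.0, while the MAP estimate under Beta(2, 2) is 0.8: the prior acts like extra pseudo-observations that pull the estimate away from the extreme.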
-
Naive Bayes, Logistic Regression
- Generative versus Discriminative models
- Logistic regression intuition from a perceptron
- Loss functions for Logistic regression
- Multiclass LR (derivations for likelihood estimation and gradient calculations)
- How Multiclass LR is different from MLPs (Multi-layer perceptron)
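For the multiclass LR derivation mentioned above, the standard softmax formulation is summarized below. The notation is an assumption made for this sketch: $x_i$ is the feature vector of example $i$, $y_{ik}$ is its one-hot label, and $w_k$ is the weight vector of class $k$.

```latex
p_{ik} = \frac{\exp(w_k^\top x_i)}{\sum_{j} \exp(w_j^\top x_i)}
\qquad
\mathcal{L}(W) = -\sum_{i} \sum_{k} y_{ik} \log p_{ik}
\qquad
\frac{\partial \mathcal{L}}{\partial w_k} = \sum_{i} \left(p_{ik} - y_{ik}\right) x_i
```

The gradient has the same "prediction minus target, times input" form as in binary logistic regression.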
-
Regularization
- Types, differences, uniqueness in norms L0, L1, L2
- Why L3, L4, L5, ... norms are not used
- Why does L1 lead to sparse solutions?
- Bagging - Boosting - Cross validation
- Boosting loss similarity to log-loss/Logistic regression
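One way to build intuition for the L1 sparsity question above is a one-dimensional problem with a closed-form solution. The sketch below assumes the objective 0.5*(w - a)**2 plus a penalty, where a is the unregularized solution; the function names are illustrative.

```python
def l1_solution(a, lam):
    """argmin over w of 0.5*(w - a)**2 + lam*abs(w) (soft-thresholding)."""
    if a > lam:
        return a - lam
    if a < -lam:
        return a + lam
    return 0.0  # any |a| <= lam is mapped exactly to zero

def l2_solution(a, lam):
    """argmin over w of 0.5*(w - a)**2 + 0.5*lam*w**2 (proportional shrinkage)."""
    return a / (1.0 + lam)
```

The L1 penalty sets small coefficients exactly to zero, while the L2 penalty only scales them toward zero without ever reaching it.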
-
Regularization in Deep Networks
- Dropout
- BatchNorm (Is it a regularizer?)
- Data augmentation as regularization
- Early stopping, multitask learning, adversarial learning
- Zoneout, DropConnect (specifically for LSTMs)
-
- What is PCA?
- Loss of PCA
- Difference between the two, convexity of both their losses
- Eigenvalue calculations
- What they depict, why important
-
Class imbalance issues
- Algorithmic ways
- Sampling ways
-
BayesNet and unsupervised learning
- Why is inference on a BayesNet intractable?
- Inference
- Monte Carlo methods
- Gibbs Sampling
- Expectation-Maximization
- Gaussian Mixture models
- KMeans - loss and code from scratch
- KNNs and how they are different from KMeans
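For the "KMeans loss and code from scratch" item, a minimal sketch of Lloyd's algorithm is shown here. It assumes points are equal-length tuples of numbers and, purely to keep the example deterministic, initializes the centroids with the first k points; real implementations usually use random or k-means++ initialization.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans(points, k, iters=100):
    """Lloyd's algorithm; returns (centroids, clusters)."""
    # Illustrative deterministic init: the first k points.
    centroids = [tuple(p) for p in points[:k]]
    clusters = []
    for _ in range(iters):
        # Assignment step: each point goes to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: sq_dist(p, centroids[c]))
            clusters[nearest].append(p)
        # Update step: each centroid moves to the mean of its cluster.
        new_centroids = [
            tuple(sum(col) / len(cl) for col in zip(*cl)) if cl else centroids[j]
            for j, cl in enumerate(clusters)
        ]
        if new_centroids == centroids:
            break  # converged: assignments can no longer change
        centroids = new_centroids
    return centroids, clusters

def kmeans_loss(centroids, clusters):
    """Within-cluster sum of squared distances, the objective KMeans minimizes."""
    return sum(sq_dist(p, c) for c, cl in zip(centroids, clusters) for p in cl)
```

Unlike KMeans, KNN is a supervised method with no training loop: it classifies a query point by looking up the labels of its k nearest training points.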
-
Metrics to test a model
- Precision, recall, F1 - differences, use cases
- AUC, area under ROC curve
- What does the area signify? Use-case based questions
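The metrics above can be computed directly from labels and scores. The sketch below assumes binary labels encoded as 1 (positive) and 0 (negative); the function names are illustrative. The AUC is computed from its ranking interpretation, which is quadratic in the number of examples and is meant for clarity rather than speed.

```python
def precision_recall_f1(y_true, y_pred):
    """Precision, recall and F1 for binary labels in {0, 1}."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

def roc_auc(y_true, scores):
    """Probability that a random positive outscores a random negative.

    Ties count as half a win. Requires at least one positive and one negative.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

The ranking view explains what the area signifies: an AUC of 0.5 means the scores order positives and negatives no better than chance, and 1.0 means every positive is ranked above every negative.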
-
Different Sampling Techniques
- Gibbs
- Reservoir Sampling
- Importance Sampling
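Reservoir sampling is a frequent coding question, so a short sketch may help. This is the classic single-pass version (often called Algorithm R) for drawing k items uniformly from a stream of unknown length; the `rng` parameter is an addition for reproducibility in this example.

```python
import random

def reservoir_sample(stream, k, rng=None):
    """Return k items chosen uniformly at random from an iterable, in one pass."""
    rng = rng or random.Random()
    reservoir = []
    for i, item in enumerate(stream):
        if i < k:
            # Fill the reservoir with the first k items.
            reservoir.append(item)
        else:
            # Item i (0-based) replaces a random slot with probability k / (i + 1).
            j = rng.randint(0, i)  # inclusive on both ends
            if j < k:
                reservoir[j] = item
    return reservoir
```

If the stream has fewer than k items, the function simply returns all of them.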
-
SVMs
- Hinge loss
- Code implementation
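As one possible take on the code implementation item, here is a minimal sketch of a linear SVM trained by full-batch subgradient descent on the L2-regularized hinge loss. It assumes labels in {+1, -1} and features given as tuples; the hyperparameter defaults are arbitrary, and this is not how production solvers (for example SMO-based ones) work.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def hinge_loss(w, b, X, y, lam):
    """Mean hinge loss plus 0.5*lam*||w||^2; labels must be +1 or -1."""
    margins = [yi * (dot(w, xi) + b) for xi, yi in zip(X, y)]
    return sum(max(0.0, 1.0 - m) for m in margins) / len(X) + 0.5 * lam * dot(w, w)

def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=200):
    """Full-batch subgradient descent on the objective in hinge_loss."""
    w = [0.0] * len(X[0])
    b = 0.0
    n = len(X)
    for _ in range(epochs):
        grad_w = [lam * wj for wj in w]  # gradient of the L2 penalty
        grad_b = 0.0
        for xi, yi in zip(X, y):
            # Only points inside the margin contribute a hinge subgradient.
            if yi * (dot(w, xi) + b) < 1.0:
                for j, xij in enumerate(xi):
                    grad_w[j] -= yi * xij / n
                grad_b -= yi / n
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b
```

A prediction for a new point x is the sign of dot(w, x) + b.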
-
Linear Regression - loss function calculation and derivations
- MLE vs MAP (Different estimation strategies)
- How MAP brings regularization in linear regression loss
- Convexity, solving the loss directly
- Kernel regression
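The link between MAP and regularization can be seen in the simplest possible case. The sketch below assumes a single feature with no intercept, Gaussian noise with variance sigma^2, and a zero-mean Gaussian prior on the weight with variance tau^2; under those assumptions the MAP objective is the squared error plus lam * w**2 with lam = sigma^2 / tau^2. The function name is illustrative.

```python
def fit_slope(xs, ys, lam=0.0):
    """Minimizer of sum((y - w*x)**2) + lam*w**2 for one feature, no intercept.

    lam = 0 gives the MLE (ordinary least squares); lam > 0 gives the MAP
    estimate under a Gaussian prior, i.e. ridge regression. Setting the
    derivative to zero yields w = sum(x*y) / (sum(x*x) + lam).
    """
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + lam)
```

Because the objective is convex in w, this closed-form stationary point is the global minimum, and a larger lam (a tighter prior) shrinks the slope toward zero.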
-
ICA (Independent component analysis) - difference from PCA/SVD
- When to use ICA?
-
Difference in decision boundaries for all algorithms (Trees vs Logistic Regression vs Linear Regression vs SVMs vs Naive Bayes)