100 Machine Learning Tips and Tricks to Celebrate YouTube Partner

This channel is now a YouTube Partner!

Celebrate with me and these 100 machine learning tips!

Whether it's training, learning, evaluation, or MLOps.

———————————————————————————————————————
Sign up for my newsletter for more insights:
https://dramsch.net/nieuwsbrief
———————————————————————————————————————

The blog post contains links and code snippets for you to try:
https://dramsch.net/posts/100-machine-learning-tips/

Here are a few:

Missing data https://buttondown.email/jesper/archive/towels-have-quite-a-dry-sense-of-humor/
ConvNext 2020 https://paperswithcode.com/paper/a-convnet-for-the-2020s
The Illustrated Transformer https://jalammar.github.io/illustrated-transformer/
GANs https://www.kaggle.com/code/jesperdramsch/getting-started-with-standard-gans-tutorial
Cycle GANs https://www.kaggle.com/code/jesperdramsch/understanding-and-improving-cyclegans-tutorial

———————————————————————————————————————
My camera gear
https://dramsch.net/r/gear
My music
https://dramsch.net/r/epidemic
———————————————————————————————————————
Timestamps

00:00 Start
00:20 Learn more about shortcuts and compression
00:31 Handle missing data correctly
00:50 Read the ConvNeXt 2020 paper for CNNs
01:25 Let experts label your data
01:49 Learn more about transformers
01:59 For regression, don't forget R²
02:15 GANs are easier to train than you think
02:32 Get to know your data
02:44 Split off your test set as early as possible
03:01 Transfer learning is great
03:23 Go with the basics
03:37 Tune your hyperparameters
03:48 Use cross-validation and baseline models (snippet below)
04:17 Use data augmentation
04:31 Use explainable AI
04:48 Be careful with benchmark results
05:02 Put your papers on arXiv
05:15 Cut through the noise
05:25 Publish your code
05:49 Talk to domain scientists
06:16 Research the literature
06:38 Use benchmarks
06:50 Check for class imbalances
07:02 Build trust through communication
07:19 Build credibility with benchmarks
07:38 Why class imbalance is difficult
08:04 Use PyTorch Lightning
08:14 Never upgrade CUDA
08:22 Train your models online
08:36 Don't overpromise solutions
08:51 Overfit a small batch for debugging (snippet below)
09:03 Use Adam or SGD optimizers
09:25 Set your gradients to None (snippet below)
09:37 Try gradient clipping if you get NaNs (snippet below)
09:50 Fuse small operations
10:06 Reduce batch size to replicate paper results
10:16 Don't mix BatchNorm with bias
10:26 Pin PyTorch memory and watch your weight decay
10:45 Use gradient accumulation (snippet below)
11:06 Be careful with Softmax
11:32 Use mixed precision (snippet below)
11:42 Inspect bad data points
11:52 Build redundancy into your MLOps
12:01 Load PyTorch data asynchronously
12:17 Use the classification report (snippet below)
12:28 Keras Lambda layers
12:38 Don't use random forests just for feature importance
12:55 Use XGBoost and neural networks
13:05 Einsum is great! (snippet below)
13:25 Examine adjacent fields
13:40 Hydra for configurations
13:54 MissingNo library (snippet below)
14:04 Pandas Profiling
14:15 Papers with Code
14:34 Try U-Nets
14:44 Use early stopping (snippet below)
14:54 Set your dropout correctly
15:04 Look at profilers
15:14 Repeat your experiments
15:24 Use schedules in production
15:34 Empty PyTorch and TF caches
15:44 Normalize your input
15:54 Use robust scalers (snippet below)
16:05 Find hard-to-train samples
16:21 Random input sizes
16:34 Use GANs for real-world data
16:52 Set up data pipelines
17:02 Use confusion matrices and find the maximum batch size
17:21 Use checkpoints on Colab
17:35 Learn the different model APIs
17:51 Debug with TensorBoard
18:01 Preallocate memory for dynamic tensors
18:13 Feature engineering
18:37 Random forests can overvalue noisy features
18:48 Read the documentation
19:05 Ensemble models
19:15 Always consider whether a model should be built at all
19:25 Remove correlated samples from training data
19:35 Dare to deviate from standards
19:45 Log your experiments
20:02 Build smaller models
20:14 Change the Kaggle sort order
20:39 Learn from Kaggle
20:49 Create ablation studies
20:57 Look into regularization techniques
21:15 Learning rate schedulers
21:39 Don't overdo manual tuning
21:56 Create decorrelated validation and test sets
22:35 Create tensors on device
22:45 Fix all randomness before publication
22:59 Visualize your training
23:18 Compare models with AIC
23:30 Publish your model weights
23:40 Look at your results
24:01 Huber loss
24:19 Trust domain scientists
24:50 Don't believe all the old ML wisdom
25:04 Outro
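
———————————————————————————————————————
Code snippets

A few of the tips above fit into a handful of lines of code. These are quick, minimal sketches to get you started rather than the exact code from the video; the blog post linked above has the full snippets.

03:48 Cross-validation and baseline models: always compare against a dummy baseline, evaluated with the same cross-validation. A sketch with scikit-learn and one of its built-in datasets:

from sklearn.datasets import load_breast_cancer
from sklearn.dummy import DummyClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

X, y = load_breast_cancer(return_X_y=True)

# Baseline that always predicts the most frequent class
baseline = cross_val_score(DummyClassifier(strategy="most_frequent"), X, y, cv=5)
# A real model, evaluated the exact same way
model = cross_val_score(RandomForestClassifier(random_state=0), X, y, cv=5)

print(f"baseline: {baseline.mean():.2f}  model: {model.mean():.2f}")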
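
08:51 Overfit a small batch: a minimal PyTorch debugging sketch with a toy model and one fixed batch of random data. If the loss doesn't drop towards zero, something in the model, loss, or optimizer wiring is broken.

import torch
from torch import nn

# Toy stand-ins; swap in your own model, loss and batch
model = nn.Sequential(nn.Linear(10, 32), nn.ReLU(), nn.Linear(32, 1))
loss_fn = nn.MSELoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-2)
xb, yb = torch.randn(8, 10), torch.randn(8, 1)  # one tiny batch

for step in range(500):
    optimizer.zero_grad(set_to_none=True)
    loss = loss_fn(model(xb), yb)
    loss.backward()
    optimizer.step()

print(loss.item())  # should be close to zero if everything is wired up correctly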
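
09:25 / 09:37 Set your gradients to None and clip gradients: two one-liners inside a standard PyTorch training step. This reuses model, loss_fn, optimizer and the (xb, yb) batch from the sketch above.

optimizer.zero_grad(set_to_none=True)  # None instead of zeroed tensors saves memory and a kernel launch
loss = loss_fn(model(xb), yb)
loss.backward()
torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)  # clip before stepping to tame exploding gradients and NaNs
optimizer.step()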
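
10:45 Gradient accumulation: simulate a larger batch by accumulating gradients over several small batches before stepping. Here train_loader stands in for a regular PyTorch DataLoader; model, loss_fn and optimizer as above.

accumulation_steps = 4  # effective batch size = DataLoader batch size x 4

optimizer.zero_grad(set_to_none=True)
for i, (xb, yb) in enumerate(train_loader):
    loss = loss_fn(model(xb), yb) / accumulation_steps  # scale so the summed gradient matches one big batch
    loss.backward()  # gradients add up across iterations
    if (i + 1) % accumulation_steps == 0:
        optimizer.step()
        optimizer.zero_grad(set_to_none=True)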
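
11:32 Mixed precision: on a CUDA GPU, autocast plus a gradient scaler gives you float16 speed with float32 stability. Same placeholder names as above.

scaler = torch.cuda.amp.GradScaler()

for xb, yb in train_loader:
    optimizer.zero_grad(set_to_none=True)
    with torch.cuda.amp.autocast():  # forward pass and loss in mixed precision
        loss = loss_fn(model(xb), yb)
    scaler.scale(loss).backward()  # scale the loss so small gradients don't underflow in float16
    scaler.step(optimizer)
    scaler.update()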
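
12:17 Classification report: scikit-learn prints precision, recall, F1 and support per class in one call. The labels here are made up.

from sklearn.metrics import classification_report

y_true = [0, 0, 1, 1, 1, 0, 1, 0]
y_pred = [0, 1, 1, 1, 0, 0, 1, 0]
print(classification_report(y_true, y_pred, target_names=["negative", "positive"]))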
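
13:05 Einsum: one compact notation for many tensor operations. Two examples in PyTorch (numpy.einsum works the same way):

import torch

a = torch.randn(8, 3, 5)  # batch of 8 matrices, each 3x5
b = torch.randn(8, 5, 4)  # batch of 8 matrices, each 5x4
batched_product = torch.einsum("bij,bjk->bik", a, b)  # batched matrix multiply -> shape (8, 3, 4)

c = torch.randn(8, 5, 5)
traces = torch.einsum("bii->b", c)  # trace of every matrix in the batch -> shape (8,)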
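
13:54 MissingNo: visualize missing values in a DataFrame with a single call. The CSV path is a placeholder for your own data.

import missingno as msno
import pandas as pd

df = pd.read_csv("your_data.csv")  # placeholder path
msno.matrix(df)  # nullity matrix: gaps show where values are missing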
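
14:44 Early stopping: in Keras this is a callback that stops training once the validation loss stops improving and rolls back to the best weights. Here model, X_train and y_train are placeholders for your own setup.

from tensorflow import keras

early_stopping = keras.callbacks.EarlyStopping(
    monitor="val_loss",
    patience=5,  # wait 5 epochs without improvement before stopping
    restore_best_weights=True,
)
model.fit(X_train, y_train, validation_split=0.2, epochs=100, callbacks=[early_stopping])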
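
15:54 Robust scalers: scikit-learn's RobustScaler centres on the median and scales by the interquartile range, so a few extreme outliers don't dominate the scaling.

import numpy as np
from sklearn.preprocessing import RobustScaler, StandardScaler

X = np.array([[1.0], [2.0], [3.0], [4.0], [1000.0]])  # one extreme outlier
print(StandardScaler().fit_transform(X).ravel())  # the outlier squashes all the normal values together
print(RobustScaler().fit_transform(X).ravel())    # median/IQR scaling keeps the normal values spread out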

———————————————————————————————————————
Social

Linkedin: https://dramsch.net/linkedin
Twitter: https://dramsch.net/twitter

Main website: https://dramsch.net
Community: https://dramsch.net/patreon

———————————————————————————————————————
Disclaimer
Jesper Dramsch is a participant in the Amazon Services LLC Associates Program, an affiliate advertising program designed to provide a means for us to earn fees by linking to Amazon.com and affiliated sites.

Opinions are my own. No financial advice. Sponsors are acknowledged. For entertainment purposes only.

Please take the opportunity to connect and share this video with your friends and family if you find it useful.