Linklog
A curated collection of links and resources I have found over time.
March 2025
- Learning Word Embedding (lilianweng.github.io)
An old but fantastic reference on vector embeddings.
February 2025
- Binary vector embeddings are so cool (emschwartz.me)
A description of the effect of binary quantization on embeddings. By restricting the dtype of the embedding vectors, you trade accuracy in latent space for embedding size. A binary dtype seems to retain a surprisingly high share of the original information content (about 97%) while cutting storage enormously (also about 97% here, since each float32 component shrinks to a single bit).
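As a rough illustration of the idea (not the article's code), here is a minimal NumPy sketch that thresholds each float32 component to a single bit and compares packed vectors by Hamming similarity; the vectors and dimensions are made up for the example.

```python
import numpy as np

def binarize(embeddings: np.ndarray) -> np.ndarray:
    """Quantize float embeddings to bits: 1 where the component is positive, 0 otherwise."""
    bits = (embeddings > 0).astype(np.uint8)
    # Pack 8 bits per byte: a 1024-dim float32 vector (4096 bytes) becomes 128 bytes.
    return np.packbits(bits, axis=-1)

def hamming_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Similarity between two packed binary vectors: fraction of matching bits."""
    distance = np.unpackbits(np.bitwise_xor(a, b)).sum()
    total_bits = a.size * 8
    return 1.0 - distance / total_bits

# Toy usage with random vectors standing in for real embeddings.
rng = np.random.default_rng(0)
x = rng.normal(size=1024).astype(np.float32)
y = rng.normal(size=1024).astype(np.float32)
print(hamming_similarity(binarize(x), binarize(y)))
```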
- A Deep Dive into Memorization in Deep Learning (blog.kjamistan.com)
An interesting series of articles explaining how machine learning models memorize data.
- Solving differential equations using neural networks (labpresse.com)
Toy example of how to use neural networks to solve differential equations. This blew my mind when I first read it.
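For a flavor of the technique, here is a minimal physics-informed-style sketch in PyTorch (my own toy setup, not the article's code): a tiny network is trained so that its derivative satisfies dy/dx = -y with y(0) = 1, whose exact solution is exp(-x).

```python
import torch
import torch.nn as nn

# Small network approximating y(x); the loss enforces dy/dx = -y and y(0) = 1.
model = nn.Sequential(nn.Linear(1, 32), nn.Tanh(), nn.Linear(32, 1))
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

for step in range(2000):
    x = torch.rand(64, 1, requires_grad=True)              # collocation points in [0, 1]
    y = model(x)
    dy_dx = torch.autograd.grad(y.sum(), x, create_graph=True)[0]
    residual = (dy_dx + y).pow(2).mean()                    # enforce y' = -y
    boundary = (model(torch.zeros(1, 1)) - 1.0).pow(2).mean()  # enforce y(0) = 1
    loss = residual + boundary
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

# Compare the network's prediction at x = 1 with the exact solution exp(-1).
print(model(torch.tensor([[1.0]])).item(), torch.exp(torch.tensor(-1.0)).item())
```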
- A Visual Guide to How Diffusion Models Work (towardsdatascience.com)
An interesting dive into what makes diffusion models work. In short, diffusion models are trained to recover the original data from noisy versions of it, at various levels of noise. They eventually learn the probability distribution of images in the space of all possible pixel arrangements. You can then iteratively denoise a picture of pure Gaussian noise until a new image emerges: this amounts to sampling from the learned probability distribution.
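To make the "iteratively denoise pure noise" step concrete, here is a schematic DDPM-style sampling loop. It assumes a trained model that takes a noisy tensor and a timestep and predicts the noise, plus a precomputed beta schedule; all names are illustrative, not the article's code.

```python
import torch

@torch.no_grad()
def sample(model, shape, betas):
    """Schematic reverse diffusion: start from pure Gaussian noise and repeatedly
    remove the model's predicted noise, one timestep at a time."""
    alphas = 1.0 - betas
    alphas_cumprod = torch.cumprod(alphas, dim=0)
    x = torch.randn(shape)                         # pure noise, the starting "image"
    for t in reversed(range(len(betas))):
        predicted_noise = model(x, t)              # model guesses the noise in x at step t
        coef = betas[t] / torch.sqrt(1.0 - alphas_cumprod[t])
        mean = (x - coef * predicted_noise) / torch.sqrt(alphas[t])
        noise = torch.randn(shape) if t > 0 else torch.zeros(shape)
        x = mean + torch.sqrt(betas[t]) * noise    # small noise re-injected except at t = 0
    return x
```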
January 2025
- Understanding LSTM Networks (colah.github.io)
In-depth explanation of LSTM networks. The figures on this blog are incredible and truly help explain what happens inside the network.
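For reference, a minimal NumPy sketch of a single LSTM step, with the forget, input and output gates and the cell state that the post's figures walk through (my own simplification, not the post's code):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_cell(x, h_prev, c_prev, W, b):
    """One LSTM step. W maps [h_prev, x] to the stacked forget/input/candidate/output
    pre-activations; the cell state c carries long-term memory between steps."""
    z = W @ np.concatenate([h_prev, x]) + b
    f, i, g, o = np.split(z, 4)
    f, i, o = sigmoid(f), sigmoid(i), sigmoid(o)    # gates squashed into (0, 1)
    g = np.tanh(g)                                  # candidate values
    c = f * c_prev + i * g                          # forget some memory, write some new
    h = o * np.tanh(c)                              # expose part of the cell state
    return h, c

# Toy usage: hidden size 4, input size 3.
hidden, inputs = 4, 3
rng = np.random.default_rng(0)
W = rng.normal(size=(4 * hidden, hidden + inputs))
b = np.zeros(4 * hidden)
h, c = np.zeros(hidden), np.zeros(hidden)
h, c = lstm_cell(rng.normal(size=inputs), h, c, W, b)
```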
- A Brief Introduction to Recurrent Neural Networks (jaketae.github.io)
Introduction and example of how to build a recurrent neural network from scratch.
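In the same spirit, a minimal NumPy sketch of a vanilla RNN unrolled over a sequence (a toy illustration, not the article's code): the same weights are applied at every timestep and a hidden state carries context forward.

```python
import numpy as np

def rnn_forward(xs, Wxh, Whh, Why, bh, by):
    """Vanilla RNN over a sequence: one hidden-state update and one readout per timestep."""
    h = np.zeros(Whh.shape[0])
    outputs = []
    for x in xs:                                    # one timestep per input vector
        h = np.tanh(Wxh @ x + Whh @ h + bh)         # update hidden state
        outputs.append(Why @ h + by)                # readout at this step
    return outputs, h

# Toy usage: sequence of 5 random 3-dim inputs, hidden size 8, output size 2.
rng = np.random.default_rng(0)
Wxh = rng.normal(size=(8, 3))
Whh = rng.normal(size=(8, 8))
Why = rng.normal(size=(2, 8))
outs, h = rnn_forward(rng.normal(size=(5, 3)), Wxh, Whh, Why, np.zeros(8), np.zeros(2))
```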
October 2024
- Transformers From Scratch (blog.matdmiller.com)
Thorough explanation of the Transformer architecture. If, like me, you've been confused about what's so special about transformers compared to RNNs or LSTMs, this might help.
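A big part of the difference is that transformers replace recurrence with attention: every position can look at every other position in parallel, instead of passing a hidden state step by step. Here is a minimal NumPy sketch of scaled dot-product attention (illustrative only, not the post's code):

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Core transformer operation: each position attends to every position at once,
    weighting values V by query/key similarity (no recurrence involved)."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                 # pairwise similarities, shape (seq, seq)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over the sequence
    return weights @ V                              # each output is a mix of all values

# Toy usage: a sequence of 4 tokens with 8-dim queries, keys, and values.
rng = np.random.default_rng(0)
Q, K, V = (rng.normal(size=(4, 8)) for _ in range(3))
out = scaled_dot_product_attention(Q, K, V)         # shape (4, 8)
```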