Linklog
A curated collection of links and resources I have found over time.
March 2025
- Learning Word Embedding (lilianweng.github.io)
An old but fantastic reference on vector embeddings.
February 2025
- Binary vector embeddings are so cool (emschwartz.me)
A description of the effect of binary quantization on embeddings. By restricting the dtype of the embedding vectors, you trade accuracy in latent space for embedding size. A binary dtype seems to retain a surprisingly high share of the original information content (about 97%) while cutting storage enormously (also about 97% here, since each float32 component shrinks to a single bit).
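As a rough illustration of the idea (not the article's code), here is a minimal NumPy sketch that thresholds each float32 component to a single bit and compares packed vectors by Hamming similarity; the vectors and dimensions are made up for the example.

```python
import numpy as np

def binarize(embeddings: np.ndarray) -> np.ndarray:
    """Quantize float embeddings to bits: 1 where the component is positive, 0 otherwise."""
    bits = (embeddings > 0).astype(np.uint8)
    # Pack 8 bits per byte: a 1024-dim float32 vector (4096 bytes) becomes 128 bytes.
    return np.packbits(bits, axis=-1)

def hamming_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Similarity between two packed binary vectors: fraction of matching bits."""
    distance = np.unpackbits(np.bitwise_xor(a, b)).sum()
    total_bits = a.size * 8
    return 1.0 - distance / total_bits

# Toy usage with random vectors standing in for real embeddings.
rng = np.random.default_rng(0)
x = rng.normal(size=1024).astype(np.float32)
y = rng.normal(size=1024).astype(np.float32)
print(hamming_similarity(binarize(x), binarize(y)))
```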
- A Deep Dive into Memorization in Deep Learning (blog.kjamistan.com)
An interesting series of articles explaining how machine learning models memorize data.
- Solving differential equations using neural networks (labpresse.com)
Toy example of how to use neural networks to solve differential equations. This blew my mind when I first read it.
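For a flavor of the technique, here is a minimal physics-informed-style sketch in PyTorch (my own toy setup, not the article's code): a tiny network is trained so that its derivative satisfies dy/dx = -y with y(0) = 1, whose exact solution is exp(-x).

```python
import torch
import torch.nn as nn

# Small network approximating y(x); the loss enforces dy/dx = -y and y(0) = 1.
model = nn.Sequential(nn.Linear(1, 32), nn.Tanh(), nn.Linear(32, 1))
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

for step in range(2000):
    x = torch.rand(64, 1, requires_grad=True)              # collocation points in [0, 1]
    y = model(x)
    dy_dx = torch.autograd.grad(y.sum(), x, create_graph=True)[0]
    residual = (dy_dx + y).pow(2).mean()                    # enforce y' = -y
    boundary = (model(torch.zeros(1, 1)) - 1.0).pow(2).mean()  # enforce y(0) = 1
    loss = residual + boundary
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

# Compare the network's prediction at x = 1 with the exact solution exp(-1).
print(model(torch.tensor([[1.0]])).item(), torch.exp(torch.tensor(-1.0)).item())
```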
- A Visual Guide to How Diffusion Models Work (towardsdatascience.com)
An interesting dive into what makes diffusion models work. In short, diffusion models are trained to recover the original data from noisy versions of it, at various levels of noise. They eventually learn the probability distribution of images in the space of all possible pixel arrangements. You can then iteratively denoise a picture of pure Gaussian noise until a new image emerges: this amounts to sampling from the learned probability distribution.
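To make the "iteratively denoise pure noise" step concrete, here is a schematic DDPM-style sampling loop. It assumes a trained model that takes a noisy tensor and a timestep and predicts the noise, plus a precomputed beta schedule; all names are illustrative, not the article's code.

```python
import torch

@torch.no_grad()
def sample(model, shape, betas):
    """Schematic reverse diffusion: start from pure Gaussian noise and repeatedly
    remove the model's predicted noise, one timestep at a time."""
    alphas = 1.0 - betas
    alphas_cumprod = torch.cumprod(alphas, dim=0)
    x = torch.randn(shape)                         # pure noise, the starting "image"
    for t in reversed(range(len(betas))):
        predicted_noise = model(x, t)              # model guesses the noise in x at step t
        coef = betas[t] / torch.sqrt(1.0 - alphas_cumprod[t])
        mean = (x - coef * predicted_noise) / torch.sqrt(alphas[t])
        noise = torch.randn(shape) if t > 0 else torch.zeros(shape)
        x = mean + torch.sqrt(betas[t]) * noise    # small noise re-injected except at t = 0
    return x
```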
January 2025
- Understanding LSTM Networks (colah.github.io)
In-depth explanation of LSTM networks. The figures on this blog are incredible and truly help explain what happens inside the network.
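For reference, a minimal NumPy sketch of a single LSTM step, with the forget, input and output gates and the cell state that the post's figures walk through (my own simplification, not the post's code):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_cell(x, h_prev, c_prev, W, b):
    """One LSTM step. W maps [h_prev, x] to the stacked forget/input/candidate/output
    pre-activations; the cell state c carries long-term memory between steps."""
    z = W @ np.concatenate([h_prev, x]) + b
    f, i, g, o = np.split(z, 4)
    f, i, o = sigmoid(f), sigmoid(i), sigmoid(o)    # gates squashed into (0, 1)
    g = np.tanh(g)                                  # candidate values
    c = f * c_prev + i * g                          # forget some memory, write some new
    h = o * np.tanh(c)                              # expose part of the cell state
    return h, c

# Toy usage: hidden size 4, input size 3.
hidden, inputs = 4, 3
rng = np.random.default_rng(0)
W = rng.normal(size=(4 * hidden, hidden + inputs))
b = np.zeros(4 * hidden)
h, c = np.zeros(hidden), np.zeros(hidden)
h, c = lstm_cell(rng.normal(size=inputs), h, c, W, b)
```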
- A Brief Introduction to Recurrent Neural Networks (jaketae.github.io)
Introduction and example of how to build a recurrent neural network from scratch.
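In the same spirit, a minimal NumPy sketch of a vanilla RNN unrolled over a sequence (a toy illustration, not the article's code): the same weights are applied at every timestep and a hidden state carries context forward.

```python
import numpy as np

def rnn_forward(xs, Wxh, Whh, Why, bh, by):
    """Vanilla RNN over a sequence: one hidden-state update and one readout per timestep."""
    h = np.zeros(Whh.shape[0])
    outputs = []
    for x in xs:                                    # one timestep per input vector
        h = np.tanh(Wxh @ x + Whh @ h + bh)         # update hidden state
        outputs.append(Why @ h + by)                # readout at this step
    return outputs, h

# Toy usage: sequence of 5 random 3-dim inputs, hidden size 8, output size 2.
rng = np.random.default_rng(0)
Wxh = rng.normal(size=(8, 3))
Whh = rng.normal(size=(8, 8))
Why = rng.normal(size=(2, 8))
outs, h = rnn_forward(rng.normal(size=(5, 3)), Wxh, Whh, Why, np.zeros(8), np.zeros(2))
```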
October 2024
- Transformers From Scratch (blog.matdmiller.com)
Thorough explanation of the Transformer architecture. If, like me, you've been confused about what's so special about transformers compared to RNNs or LSTMs, this might help.
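A big part of the difference is that transformers replace recurrence with attention: every position can look at every other position in parallel, instead of passing a hidden state step by step. Here is a minimal NumPy sketch of scaled dot-product attention (illustrative only, not the post's code):

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Core transformer operation: each position attends to every position at once,
    weighting values V by query/key similarity (no recurrence involved)."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                 # pairwise similarities, shape (seq, seq)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over the sequence
    return weights @ V                              # each output is a mix of all values

# Toy usage: a sequence of 4 tokens with 8-dim queries, keys, and values.
rng = np.random.default_rng(0)
Q, K, V = (rng.normal(size=(4, 8)) for _ in range(3))
out = scaled_dot_product_attention(Q, K, V)         # shape (4, 8)
```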