Linklog
A curated collection of links and resources I have found over time.
February 2025
- A Gentle Intro to Running a Local LLM (www.dbreunig.com)
Lots of great insights on which LLMs to run locally, depending on your needs and the hardware performance available.
- Binary vector embeddings are so cool (emschwartz.me)
A description of the effect of binary quantization on embeddings. Restricting the dtype of the embedding vectors trades accuracy in latent space against embedding size. A binary dtype seems to preserve a surprisingly large share of the original information content (about 97%) while yielding a gigantic saving in space (also about 97%, i.e. 32x smaller than float32). See the numpy sketch after this list.
- A Deep Dive into Memorization in Deep Learning (blog.kjamistan.com)
An interesting series of articles explaining how machine learning models memorize data.
- A Visual Guide to How Diffusion Models Work (towardsdatascience.com)
An interesting dive into what makes diffusion models work. In short, diffusion models are trained to recover the original data from noised versions of it, at various levels of noise. In doing so, they learn the probability distribution of images within the space of all possible pixel arrangements. You can then iteratively denoise a picture of pure Gaussian noise until a new image emerges, which amounts to sampling from that learned distribution (see the sampling sketch below).
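The mechanics of binary quantization fit in a few lines of numpy. This is a minimal sketch of the idea, not the code from the post (the helper names are mine): threshold each float dimension at zero to get one bit, pack the bits, and compare vectors with Hamming distance.

```python
import numpy as np

def binarize(embedding: np.ndarray) -> np.ndarray:
    """Quantize a float embedding to 1 bit per dimension: 1 if positive, else 0."""
    return np.packbits(embedding > 0)

def hamming_similarity(a: np.ndarray, b: np.ndarray, n_dims: int) -> int:
    """Count of matching bits between two packed binary embeddings."""
    return n_dims - int(np.unpackbits(a ^ b).sum())

rng = np.random.default_rng(0)
v1 = rng.normal(size=1024).astype(np.float32)
v2 = rng.normal(size=1024).astype(np.float32)
b1, b2 = binarize(v1), binarize(v2)

# float32: 1024 dims * 4 bytes = 4096 bytes; binary: 1024 bits = 128 bytes (32x smaller).
print(v1.nbytes, "->", b1.nbytes)                 # 4096 -> 128
print(hamming_similarity(b1, b2, 1024), "/ 1024 bits match")
```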
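And as a concrete picture of the diffusion sampling loop, here is a minimal DDPM-style sketch. The schedule constants and the `model` stub are standard textbook choices (Ho et al., 2020), not taken from the article; a real sampler would plug in a trained network such as a U-Net.

```python
import torch

# Standard DDPM-style linear noise schedule.
T = 1000
betas = torch.linspace(1e-4, 0.02, T)
alphas = 1.0 - betas
alpha_bars = torch.cumprod(alphas, dim=0)

@torch.no_grad()
def sample(model, shape):
    """Turn pure Gaussian noise into a sample by iterative denoising.

    `model(x, t)` is assumed to predict the noise contained in x at step t.
    """
    x = torch.randn(shape)                                  # start from pure noise
    for t in reversed(range(T)):
        eps = model(x, torch.tensor([t]))                   # predicted noise
        # DDPM posterior mean: subtract the estimated noise, then rescale.
        x = (x - betas[t] / torch.sqrt(1.0 - alpha_bars[t]) * eps) / torch.sqrt(alphas[t])
        if t > 0:
            x = x + torch.sqrt(betas[t]) * torch.randn(shape)  # re-inject noise
    return x

# Untrained stand-in just to make the loop runnable end to end.
dummy_model = lambda x, t: torch.zeros_like(x)
img = sample(dummy_model, (1, 3, 32, 32))
```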
November 2024
- Perspectives on diffusion – Sander Dieleman (sander.ai)
Some interesting thoughts on diffusion models.