Linklog | Nicolas Chagnet

August 2025

Week 33

Best Practices for Building Agentic AI Systems: What Actually Works in Production - UserJot (userjot.com) - #llm
This is an excellent article about multi-agent structure. It resonated with my own experience building such systems: the advice is on point, and the overall recipe is great for building resilient agentic systems.

July 2025

Week 31

Jujutsu For Busy Devs, Part 2: "How Do I...?" (maddie.wtf) - #vcs
When picking up a new tool like jujutsu, it's always useful to have a guided introduction with the previous tool as a reference. This article is a great collection of "How do I..." questions you might ask yourself with jj.

Week 30

Writing Python like it’s Rust (kobzol.github.io) - #rust #python
I love writing in Rust for fun, and I have to write in Python for work, where I tend to miss some of Rust's features. This article is a great review of how to adapt Rust's idiomatic style to write better Python.
Functional Documentation (www.dzombak.com) - #best-practices
This article brings forth an interesting aspect of documentation: making it "load bearing" ensures it is maintained more regularly.
Jujutsu For Busy Devs (maddie.wtf) - #vcs
As I keep going on my discovery journey with jj, I like finding articles like this one, which give you a bird's eye view of a full workflow.

Week 28

How to Think About Time in Programming - Shan Rauf (shanrauf.com)
Everything you've ever wanted to know about handling time in computer systems, and then some.

Week 27

330× faster: Four different ways to speed up your code (pythonspeed.com) - #python
The title of the article feels a bit clickbait-y, but the content is very interesting and takes you through a whole performance improvement journey. To add to this, the author also accounts for rewriting of code in lower level languages like Rust.
Application Logging in Python: Recipes for Observability (www.dash0.com) - #python
I always find the logging module in Python to be quite complex and hard to use. This article helps make it clearer how to build a complex logging system using the standard library only.
greyblake/kinded: Generate Rust enum variants without associated data (github.com) - #rust #library
Useful Rust crate to work with complex Enums and automatically build a 'Kind' Enum.

June 2025

Week 26

Rust: A unique perspective (limpet.net) - #rust
Rust's ownership system can sometimes feel a bit complex, but this article does a really good job at explaining why it is how it is and how each data structure plays a role in this.

Week 25

Introduction to the A* Algorithm (www.redblobgames.com) - #algorithms
I always love a good algorithm visualisation, and this article does a really good job at explaining the A* and Dijkstra graph exploration algorithms.

Week 23

Why do philosophy of physics when you can do physics itself? | Aeon Essays (aeon.co) - #physics
As a physicist, I have always been drawn towards philosophical works that tackle the deep questions I was trying to understand. This article really resonated with that part of myself. There is more to being a physicist than pure calculation or experimentation. The curiosity and interest for the knowledge about our reality is what drives us, and it is something we have in common with philosophers.
jcrist/msgspec: A fast serialization and validation library, with builtin support for JSON, MessagePack, YAML, and TOML (github.com) - #python
A well-known Python library to handle structured data with serialization/deserialization. Very similar to pydantic, but much faster.

May 2025

Week 22

Reservoir Sampling (samwho.dev) - #algorithms
A very interesting sampling method specialized in fair selection of streaming events.
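As a rough illustration of the idea (a minimal sketch of the classic "Algorithm R" variant, not code taken from the article), the function below keeps a uniform sample of `k` items from a stream whose length is not known in advance. The name `reservoir_sample` and its parameters are hypothetical.

```python
import random


def reservoir_sample(stream, k, rng=None):
    """Return up to k items chosen uniformly at random from an iterable."""
    rng = rng or random.Random()
    reservoir = []
    for i, item in enumerate(stream):
        if i < k:
            # Fill the reservoir with the first k items.
            reservoir.append(item)
        else:
            # Item i (0-based) is kept with probability k / (i + 1),
            # replacing a uniformly chosen slot.
            j = rng.randint(0, i)
            if j < k:
                reservoir[j] = item
    return reservoir


sample = reservoir_sample(range(1_000_000), k=5, rng=random.Random(0))
short = reservoir_sample(range(3), k=5)
print(sample)
```

Only `k` items are held in memory at any time, which is what makes the method suitable for streaming events.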
Five simple things that will immediately improve your diagrams (vexlio.com)
I'm always looking to improve my diagrams and visualisations. It's a tough topic, and I'm not really an artist at heart. This article not only contains really useful (and specific) advice on how to improve diagrams, it also illustrates each tip quite well, further reinforcing the point.

Week 21

The Copilot Delusion (deplet.ing)
Opinionated take on using Copilot. I love it.
Are you more likely to die on your birthday? (pudding.cool) - #data #statistics
A fun analysis of the birthday effect using actual data and thorough methodology.

Week 20

dtolnay/thiserror: derive(Error) for struct and enum error types (github.com) - #rust #library
This is a very useful crate providing macros that make it easy to wrap custom or various error types in one error Enum. Importantly, it is equivalent to using the standard library and does not introduce any custom error handling; it just reduces boilerplate.
Adventures in Imbalanced Learning and Class Weight | andersource (andersource.dev) - #data-science
The question of imbalanced classes is a source of recurring discussions within the data science community. Common wisdom says to weight the samples in inverse proportion to their class frequency in order to make sure all classes get enough representation during training. This post provides a thorough mathematical derivation that this does not work for the F1-score. It does work for different metrics, though, and so has some value there. The important takeaway from this is to always consider the metric most relevant to the problem at hand, and adapt the methodology to that.
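For reference, one common form of the inverse-frequency heuristic can be sketched as follows. This is an illustration of the "common wisdom" only, not the derivation from the post; the helper name `inverse_frequency_weights` and the toy labels are made up.

```python
from collections import Counter


def inverse_frequency_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}


# Toy dataset: 90 negative samples, 10 positive samples.
labels = ["neg"] * 90 + ["pos"] * 10
weights = inverse_frequency_weights(labels)

# With these weights, every class carries the same total weight.
total_per_class = {cls: weights[cls] * labels.count(cls) for cls in weights}
print(weights)
print(total_per_class)
```

Whether this weighting is appropriate still depends on the metric being optimized, which is the point of the post.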

Week 19

astral-sh/ty: An extremely fast Python type checker and language server, written in Rust. (github.com) - #python #tools
I've been waiting for the astral type checker for a while: mypy is just excruciatingly slow, and every astral product tends to be a superperformer. And finally, it looks like it's coming together, with a preview release!
zerowidth positive lookahead (zerowidth.com) - #vcs
I'm still itching to try jujutsu in more detail, but every time I dive into the documentation, I feel like I'm missing a piece. I've been wondering if I just need to "see it at work", see how someone actually uses it, especially from my perspective. This post is much closer to that and I found it useful.

Week 18

dataframely — A declarative, 🐻‍❄️-native data frame validation library (tech.quantco.com) - #data #library
I've been working a lot on our data pipelines at work, switching to polars mostly for performance and introducing rigorous checks and validations of data at various stages. I haven't yet used dataframely, but its principle really resonates with my use case, so I recommend checking it out.

April 2025

Week 17

Bloom Filters: A Memory-Saving Solution for Set Membership Checks (www.thecoder.cafe)
Bloom filters are interesting data structures. This blog post explains them very well!
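To make the idea concrete, here is a minimal sketch of a Bloom filter (not the implementation from the blog post). The class name, the default sizes, and the choice of seeding SHA-256 to get several hash functions are all assumptions made for illustration.

```python
import hashlib


class BloomFilter:
    """Set-membership sketch: no false negatives, occasional false positives."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item):
        # Derive num_hashes bit positions by prefixing the item with a seed.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        # An item may be present only if every one of its bits is set.
        return all(
            self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(item)
        )


bf = BloomFilter()
for word in ["polars", "rust", "jujutsu"]:
    bf.add(word)

print("rust" in bf)    # True: added items are always found
print("pandas" in bf)  # usually False, but a false positive is possible
```

The memory cost is fixed by `num_bits` regardless of how large the stored items are, at the price of a small false-positive rate.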
Are polynomial features the root of all evil? (alexshtf.github.io) - #data-science
This is a great article presenting various polynomial bases used in mathematics (canonical, Legendre, Chebyshev) and how they can be used to fit data. It is well-known that they tend to overfit and be hard to regularize, but by using an appropriate basis for this kind of problem (Bernstein), you can get really good results. Interestingly, the reasoning behind this choice reminds me a lot of the kind of physics reasoning with regards to scaling and units.
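As a small illustration of the Bernstein basis mentioned above (a sketch, not the article's fitting code), the functions below evaluate the basis polynomials on [0, 1]. It assumes the inputs have already been rescaled to that interval; the function names are hypothetical.

```python
from math import comb


def bernstein_basis(n, x):
    """Values of the n + 1 Bernstein basis polynomials of degree n at x in [0, 1]."""
    return [comb(n, i) * x**i * (1 - x) ** (n - i) for i in range(n + 1)]


def bernstein_poly(coeffs, x):
    """Evaluate a polynomial given by its coefficients in the Bernstein basis."""
    n = len(coeffs) - 1
    return sum(c * b for c, b in zip(coeffs, bernstein_basis(n, x)))


basis = bernstein_basis(4, 0.3)
print(basis)
print(sum(basis))  # the basis functions sum to 1 (up to rounding)

coeffs = [1.0, 4.0, -2.0, 0.5]
print(bernstein_poly(coeffs, 0.0), bernstein_poly(coeffs, 1.0))
```

Because the basis functions are non-negative and sum to one, the fitted value at any point is a weighted average of the coefficients, and the endpoints reproduce the first and last coefficient exactly.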

Week 16

14 Advanced Python Features | Edward Li's Blog (blog.edward-li.com) - #python
You can find all sorts of beginner "top 10 features of X" online, and most of the time, they're basic and barely interesting. This article attempts to run counter to that experience and, at least in my case, succeeded in teaching me a few things and provided some interesting points of discussion.

Week 15

A Visual Exploration of Gaussian Processes (distill.pub) - #optimization
This is a very thorough and well designed visual introduction to Gaussian Processes and Bayesian optimisation. The article features interactive visualisations, which I found great to truly get a feel for what's happening.
A feel for the data | Briefer (briefer.cloud)
This is a high-quality review of how visualisations shape our understanding of data. Its focus on the strengths of each visualisation type makes it a great learning resource to improve our storytelling skills.
Managing friction (arslan.io)
This article shares an interesting viewpoint on the role of friction in our lives, both as a positive and negative influence.
Getting Started with TDD: A Practical Guide to Beginning a Lasting Practice (8thlight.com)
TDD can feel daunting, and advocacy for strict adherence to TDD principles can be off-putting when you are starting with it. This article does a good job of reminding all of us of the pragmatic take that some testing is better than no testing, and that TDD, just like any other practice, is something you learn with time.

March 2025

Week 14

Writing useful Documentation (www.blog.philodev.one)
A great write-up on how to write good documentation.

Week 12

Don't Be Afraid Of Types (lmika.org)
Adding new types to existing codebases can be daunting, but one shouldn't be shy to do what's necessary. This is a good opinion piece on this topic!
"Vibe Coding" vs Reality (cendyne.dev)
Read this if you want a good reality check on the current "vibe coding" trend.
A Visual Guide to LLM Agents (newsletter.maartengrootendorst.com) - #llm
It is possibly one of the best summaries out there of how LLMs function, broken down by high-level components (memory, tools), and well illustrated.

Week 11

Learning Word Embedding (lilianweng.github.io) - #deep-learning
An old but fantastic reference on vector embeddings.
On the Importance of Naming in Programming (wasp.sh) - #best-practices
Some musings on the importance of good naming conventions in programming.
How To Boil the Mediterranean Sea (benbyfax.substack.com)
This is an extremely interesting take on recent global warming data and the role of sulfur in masking some of the effects.
Algorithms Books (algorithmsbook.com) - #optimization
A fantastic collection of free textbooks on algorithms for optimization, decision making and validation.
Slidev (sli.dev) - #tools
During my PhD, I wrangled with beamer for important presentations, but I always yearned for a simpler markdown-based system for smaller, recurrent presentations. I just discovered slidev, and it just checked every feature I would want from this, and more.

Week 10

Succinct data structures (blog.startifact.com) - #data
Succinct data structures are clever ways to pack a lot of information in lightweight structures like bit vectors. A very interesting read!
patrick-kidger/jaxtyping (github.com) - #python #library
I've been looking for a good numpy and pytorch typing system in Python. Initially written for Jax, this library looks like exactly what I wanted.
Understanding Attention in LLMs (bartoszmilewski.com) - #llm
This is a good example that even if you understand the math behind a concept, there's nothing like good storytelling. I knew how attention worked, but this post brilliantly summarized it and clarified some steps for me. A great read!
Death of Best Practices (korshakov.com) - #best-practices
An interesting take on the rigidity of best practices and how much more productive we can be once we let go of them.
Markov Chains explained visually (setosa.io) - #algorithms
A very neat summary of what Markov chains are and how they work, with beautiful animations.

Week 9

Some Advanced Typing Concepts in Python (jellis18.github.io) - #python
Another article about python's type system. This one is addressed to a more advanced audience. I had been looking for such a resource for a while, and I wasn't disappointed.
Abstract Base Classes and Protocols: What Are They? When To Use Them?? Lets Find Out! (jellis18.github.io) - #python
Very cool breakdown of the difference between abstract classes and protocols in python. Well written and with lots of clear examples.

February 2025

Week 9

Git Branching for Small Teams (victoria.dev) - #vcs
A good git workflow for small teams. As it happens, it is the one we use in my team, and I can confirm it's a very effective one!
SolracHQ/bmath (github.com) - #tools
An interesting CLI math tool, with its own language and sane defaults.
Do not log (sobolevn.me)
An interesting analysis of the cost of modern logging infrastructure and its usefulness (or lack thereof).

Week 8

Generating Mazes (healeycodes.com) - #algorithms
A great introduction to maze generation algorithms with informative visuals.
Summary of Major Changes Between Python Versions (www.nicholashairs.com) - #python
This is a very useful reference sheet containing major changes added by every new python version. Extremely handy if you have to update an old codebase multiple versions at once.
Prototyping in Rust (corrode.dev) - #rust
This article presents various pieces of advice and tips on how to efficiently write Rust code at the prototype stage. Most introductory material on the language focuses on "proper use of syntax." But prototyping is often a compromise between code quality and coding efficiency, and this article makes some great suggestions on how to do that.
Deep dive into LLMs like ChatGPT by Andrej Karpathy (TL;DR) (anfalmushtaq.com) - #llm
A TL;DR version of Andrej Karpathy's "Deep dive into LLMs like ChatGPT" video. Manages to keep the essentials but presents them in clear, digestible chunks.
(Ab)using General Search Algorithms on Dynamic Optimization Problems (dubovik.eu) - #optimization #algorithms
An interesting analysis of various optimization algorithms applied to a simple dynamic programming problem. Features beautiful visualizations of those algorithms.
uchū (uchu.style) - #web-dev
uchū is a minimalistic color palette based on OKLCH color space. I personally find it very aesthetically pleasing.
flywhl/logis (github.com) - #python #vcs #library
An interesting library to record ML experiments metadata through commit messages. Even better, it supports a query language to find which commit satisfies a given criterion.

Week 7

jj init (v5.chriskrycho.com) - #vcs
I've been more and more tempted by jujutsu as a drop-in replacement for git. Its default way to handle changes seems so sane compared to git. This article is a very thorough and accessible introduction to how it works, and it definitely nudged me further along the jj train.
Luxa CSS (www.luxacss.com) - #web-dev
This is an interesting CSS framework which picks some parts out of Tailwind while also being more minimalistic. Bonus point: it was made by the creator of the fantastic Dracula theme.
How I Use Git Worktrees (matklad.github.io) - #vcs
Useful example of a git worktree workflow. Worktrees help avoid all the stashing and branch hopping a typical workflow would have. You can check out the repository multiple times on different branches and work on different features, review pull requests, run automated tests, etc., without having to break your flow.
Binary vector embeddings are so cool (emschwartz.me) - #llm #deep-learning #data
A description of the effect of binary quantization on embeddings. By restricting the dtype of embedding vectors, you can get a tradeoff between accuracy in latent space and size of the embedding. Using binary dtype seems to conserve a surprisingly high amount of the original information content (about 97%) while yielding a gigantic amount of saving in space (about 97% too here).
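The mechanism can be sketched in a few lines. This is a toy illustration with random vectors, not the article's experiment: each component is reduced to its sign bit, and vectors are compared with the Hamming distance. If the original embeddings are stored as 32-bit floats (an assumption), keeping one bit per dimension is a 32x reduction, i.e. roughly 97% less space. All names and noise levels below are made up.

```python
import random


def binarize(vector):
    """Keep only the sign of each component: 1 if positive, else 0."""
    return [1 if x > 0 else 0 for x in vector]


def hamming_distance(a, b):
    """Number of positions where two bit vectors differ."""
    return sum(x != y for x, y in zip(a, b))


rng = random.Random(0)
dim = 256
query = [rng.gauss(0, 1) for _ in range(dim)]
near = [q + rng.gauss(0, 0.3) for q in query]  # small perturbation of the query
far = [rng.gauss(0, 1) for _ in range(dim)]    # unrelated random vector

d_near = hamming_distance(binarize(query), binarize(near))
d_far = hamming_distance(binarize(query), binarize(far))
print(d_near, d_far)
```

The perturbed vector stays much closer to the query than the unrelated one, even though only one bit per dimension survives.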
efugier/smartcat (github.com) - #cli #LLM #tools
An interesting CLI tool designed to call LLMs from the CLI with isolated short prompts. Seems to adhere to core Unix philosophy unlike most AI tools out there. Handles both local and hosted LLMs.
How I program with LLMs (crawshaw.io) - #llm
Some interesting reflections on how to use LLMs in daily development work. I personally adhere mostly to the "autocomplete" part with GitHub Copilot, and I'm getting used to the "search" part where the LLM helps me find information on some language or coding paradigm faster than I can search for it. I'm not yet onboard with "Chat-driven programming".
A Deep Dive into Memorization in Deep Learning (blog.kjamistan.com) - #deep-learning
An interesting series of articles explaining how machine learning models memorize data.
ben-nour/SQL-tips-and-tricks (github.com) - #SQL #best-practices
I'm not a great SQL user. I have experience (mainly from database management in web development and now as a data scientist), but I don't consider myself an SQL wizard. This list of opinionated "tips" was quite useful to me.
How to fine-tune open LLMs in 2025 with Hugging Face (www.philschmid.de) - #llm
An in-depth example of how to fine-tune an LLM using the Hugging Face ecosystem.
aneeshnaik/lintsampler (github.com) - #python #library
A useful Python library to sample custom probability distributions. Looks useful if the PDF is expensive to compute.
Skforecast (skforecast.org) - #library #data-science #python
A Python library for timeseries forecasting with very extensive features. The documentation also features some in-depth pedagogical explanations of how to properly forecast data and what methods can be used to improve results.
Bayesian Methods for Hackers (dataorigami.net) - #statistics #python
An illustrated introduction to Bayesian statistics using Jupyter notebooks. I was always confused about the difference between Bayesian and Frequentist approaches until I read this.
Helix (helix-editor.com) - #tools #rust
A rust-based alternative to neovim with opinionated defaults. After setting up an LSP for Python, it immediately became my daily driver.
LoRA (jaketae.github.io) - #llm
Explanation of LoRA methods for LLMs.
Thinking About Recipe Formats More Than Anyone Should (rknight.me) - #markdown
An interesting reflection on markup languages for recipes. I was surprised other people spent as much time as I did pondering on recipe formats.
Rust for the Polyglot Programmer (www.chiark.greenend.org.uk) - #rust
A book introducing Rust for programmers with experience in other languages. I'm not polyglot enough, but some of it helped me better understand the design choices in the language.
Effective Simulated Annealing with Python (nathan.fun) - #optimization #python
Fantastic introduction to the simulated annealing metaheuristic in Python. This is a powerful method to build good approximate solutions to optimization problems.
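For readers new to the method, a bare-bones version can be sketched as below. This is a generic sketch rather than the article's code; the geometric cooling schedule, the parameter defaults, and the toy cost function are all assumptions chosen for illustration.

```python
import math
import random


def simulated_annealing(cost, neighbor, start, t_start=1.0, t_end=1e-3,
                        cooling=0.995, rng=None):
    """Minimize cost() by random local moves, tolerating uphill moves while hot."""
    rng = rng or random.Random()
    current, current_cost = start, cost(start)
    best, best_cost = current, current_cost
    t = t_start
    while t > t_end:
        candidate = neighbor(current, rng)
        candidate_cost = cost(candidate)
        delta = candidate_cost - current_cost
        # Always accept improvements; accept worse moves with probability exp(-delta / t).
        if delta <= 0 or rng.random() < math.exp(-delta / t):
            current, current_cost = candidate, candidate_cost
            if current_cost < best_cost:
                best, best_cost = current, current_cost
        t *= cooling  # geometric cooling schedule
    return best, best_cost


def cost(x):
    # Toy 1D objective with several local minima.
    return (x - 3) ** 2 + 2 * math.sin(5 * x)


def neighbor(x, rng):
    # Propose a small random step around the current point.
    return x + rng.uniform(-0.5, 0.5)


best_x, best_cost = simulated_annealing(cost, neighbor, start=-5.0,
                                        rng=random.Random(0))
print(best_x, best_cost)
```

The temperature controls how often worse candidates are accepted: early on the search can escape local minima, and as it cools it settles into a good approximate solution.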
Taking a Look at Compression Algorithms (cefboud.com) - #algorithms
A short blog post summarizing the main compression algorithms. It's incredible how little I knew about something I use so much.
Linklog (ewintr.nl)
Example of what a linklog should look like, and what I aim for with my own.
Solving differential equations using neural networks (labpresse.com) - #deep-learning #physics
Toy example of how to use neural networks to solve differential equations. This blew my mind when I first read it.
Blogging in Djot instead of Markdown (www.jonashietala.se) - #web-dev #rust #markdown
Interesting dive into how to handle multiple markup languages in a Rust-based static website generator. My Rust journey hasn't taken me there yet, but it probably will eventually!

Week 6

A Visual Guide to How Diffusion Models Work (towardsdatascience.com) - #deep-learning #diffusion
An interesting dive into what makes diffusion models work. The summary is that diffusion models are models trained on data with noise to find the original data, at various levels of noise. They eventually learn the probability distribution of the images in the space of all possible pixel arrangements. You can then iteratively denoise a pure Gaussian noise picture until you generate a new image: this is like sampling the learned probability distribution.

January 2025

Week 5

Understanding LSTM Networks (colah.github.io) - #deep-learning
In-depth explanation of LSTM Networks. The figures on this blog are incredible and truly help explain what happens inside the network.
A Brief Introduction to Recurrent Neural Networks (jaketae.github.io) - #deep-learning #python
Introduction and example of how to build a recurrent neural network from scratch.
Data Contracts as Therapy (benrutter.github.io) - #data
Musings about the use of data contracts to validate data sources. If you've ever been frustrated by a data source suddenly changing its schema or sending unexpected data, this is for you!

Week 4

Polars for initial data analysis, Polars for production (pythonspeed.com) - #python #data
Article about the use of Polars for both production and development stages. When starting with Polars, I found it easy to write production code (usually a long pipeline of LazyFrames ending with a collect), but struggled with writing optimal development code.
Modern Polars (kevinheavey.github.io) - #python #data
Great online book about Polars targeted to Pandas users. If you haven't heard about Polars yet, do yourself a favor and read this.

Week 2

Einsum in Depth (einsum.joelburget.com) - #python
A guide on how to use "einsum" in Python for tensor manipulation. Einstein notation made working with algebra a much nicer experience in physics, and it should do the same for anyone doing heavy tensor operations. But I always found the Python implementation a bit awkward and difficult to understand. This article really helped with that.
Building effective agents (www.anthropic.com) - #llm
Advice on agentic workflow for practical applications from Anthropic. A good read to better understand what structure you should use when establishing your project.

Week 1

Hyperparameter Tuning LightGBM (macalusojeff.github.io) - #data-science
A useful guide for hyperparameter tuning of LGBM models. Mostly, if like me you always forget what parameter range is sensible, you can find it in there.

December 2024

Week 52

Software design principles for machine learning applications (github.com) - #best-practices #python
A series of examples of proper software design in data science beyond Jupyter notebooks. Very good examples of proper refactoring, step by step, from a messy script to a properly encapsulated program.

Week 51

Quick software tips for new ML researchers (www.eugenevinitsky.com) - #best-practices
A short list of best practices. Some are obvious from a software development perspective (VCS, package manager, linter), but some others have some good recommendations on ML specific tools (Hydra for configs, Optuna for hyperparameter tuning).
Hands-on Optimization with OR-Tools in Python (kunlei.github.io) - #python #optimization
Detailed use cases of the OR-Tools library for optimization problems. Many problems can be solved in a more efficient way with linear programming, and this library makes it a breeze to do so.

Week 50

GitHub Actions by Example (www.actionsbyexample.com) - #vcs
I always have to google the GitHub Actions format and snippets, or prompt an LLM for them. This is a collection of examples so you never have to google it again.

Week 49

Data Science at the Command Line (jeroenjanssens.com) - #cli #data-science
Online book on how to use command-line tools for quick data science results. This is for when your boss asks you about some statistics of your recent data output and you don't want to write a whole script for it.

November 2024

Week 47

Perspectives on diffusion (sander.ai) - #diffusion
Some interesting thoughts on diffusion models.
Thoughts on Riemannian metrics and its connection with diffusion/score matching [Part I] (blog.christianperone.com) - #physics #diffusion
An in-depth description of the connections between diffusion models and Riemannian geometry.

Week 46

shshemi/tabiew (github.com) - #cli #rust #tools
A handy rust-based TUI application to view and manipulate data from CSV and databases. Supports SQL syntax to query the data regardless of its sources.

Week 45

Algorithm Afternoon (algorithmafternoon.com) - #optimization #algorithms
This is a collection of all the optimization metaheuristics you can possibly imagine, with comments on how to implement them and what parameters can be tuned. The aim is to take it one algorithm per afternoon.

October 2024

Week 44

First aid for figures: all resources (helenajamborwrites.netlify.app) - #data
A collection of resources to help make better data visualizations. Definitely useful as a refresher or reference before making a report or a presentation.
Transformers From Scratch (blog.matdmiller.com) - #deep-learning #llm
Thorough explanation of the Transformers model. If like me you've been confused about what's so special about transformers compared to RNNs or LSTMs, this might help.

Week 43

dry-python/returns (github.com) - #library #python
Bring some sanity to Python and remove null checks. Clearly inspired by Haskell's Maybe or Rust's Option type. I am mostly familiar with the latter, and I often wish it existed in Python, and now it does.
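To show what the Maybe/Option pattern looks like in Python, here is a hand-rolled sketch. It is deliberately not the API of the `returns` library; the `Some`, `Nothing`, `map`, and `unwrap_or` names are borrowed from Rust's vocabulary for illustration, and `find_user` is a hypothetical helper.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class Some:
    """Container for a value that is present."""
    value: Any

    def map(self, fn):
        return Some(fn(self.value))

    def unwrap_or(self, default):
        return self.value


@dataclass(frozen=True)
class Nothing:
    """Container for an absent value; every operation is a no-op."""

    def map(self, fn):
        return self

    def unwrap_or(self, default):
        return default


def find_user(users, name):
    # Return a container instead of None, so callers cannot forget the check.
    return Some(users[name]) if name in users else Nothing()


users = {"ada": {"email": "ada@example.com"}}
email = find_user(users, "ada").map(lambda u: u["email"]).unwrap_or("unknown")
missing = find_user(users, "bob").map(lambda u: u["email"]).unwrap_or("unknown")
print(email, missing)
```

The chain reads the same whether or not the user exists, and the missing case is handled once, at the end, instead of through scattered `if x is None` checks.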

Week 42

Blog of Claudio Jolowicz (cjolowicz.github.io) - #python #best-practices
Series of articles on best practices around Python coding and tooling. Definitely worth checking it out if you're still building your workflow.

Week 41

shap/shap (github.com) - #python #library
Useful library to estimate feature importance of machine learning models, based on game theory principles. The main idea is to estimate how much each feature contributes to moving a sample's prediction from the mean prediction value to its actual prediction value. These contributions can also be aggregated over samples to understand global feature importance, conditional on feature value.
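The game-theoretic idea can be illustrated with a brute-force Shapley value computation on a toy model. This sketch does not use the `shap` library and is not how it computes values in practice: it enumerates every feature ordering, and it stands in a single fixed baseline vector for the mean prediction (an assumption). The model and the inputs are made up.

```python
from itertools import permutations


def shapley_values(model, x, baseline):
    """Exact Shapley values by averaging marginal contributions over all orderings."""
    n = len(x)
    contributions = [0.0] * n
    orderings = list(permutations(range(n)))
    for order in orderings:
        current = list(baseline)
        previous = model(current)
        for i in order:
            # Switch feature i from its baseline value to its actual value.
            current[i] = x[i]
            value = model(current)
            contributions[i] += value - previous
            previous = value
    return [c / len(orderings) for c in contributions]


def model(features):
    # Toy model with an interaction between the first two features.
    a, b, c = features
    return 2 * a + 3 * b - c + a * b


x = [1.0, 2.0, 3.0]
baseline = [0.0, 0.0, 0.0]
phi = shapley_values(model, x, baseline)
print(phi)
print(sum(phi), model(x) - model(baseline))
```

The contributions add up to the gap between the prediction for `x` and the baseline prediction, and the interaction term is split evenly between the two features involved.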
Modern Good Practices for Python Development (www.stuartellis.name) - #python #best-practices
A set of best-practices in Python development. Given the permissiveness of Python in terms of syntax and design, I find that following community accepted best practices is the best way to learn how to write good code too.

September 2024

Week 40

Introduction to Data Science (rafalab.dfci.harvard.edu) - #statistics #data-science
An online book focusing on the fundamentals of data science (statistics, traditional machine learning). I don't know much about R (on which this book is based) but most of the theory in there is relevant for any junior data scientist.

Week 39

Visualizing Algorithms (bost.ocks.org) - #algorithms
A beautiful set of visualizations of common algorithms. Perfect to truly understand what happens in a quicksort algorithm, or to compare different sampling algorithms.
Was Michael Scott the World’s Best Boss? (datacream.substack.com) - #data
I always love when data scientists take it too far on their hobbies. This is a cool example of data science applied to "The Office", to figure out through sentiment analysis if Michael Scott was truly appreciated.

Week 38

dleemiller/WordLlama (github.com) - #library #llm
Natural language processing toolkit optimized for CPU hardware. I haven't tested it yet but it looks really useful for quick clustering, deduplication, similarity search, etc...

Week 37

Pico CSS (picocss.com) - #web-dev #library
A minimalistic take on CSS frameworks which is simple and lightweight. Hopefully I one day have the time to rewrite this blog with it. Update: it looks semi-abandoned, but some forks are keeping the torch alive.

Week 36

posit-dev/great-tables (github.com) - #library #python
Library to make great-looking tables from Polars dataframes. It works with Pandas too, but there you can just generate HTML directly, while Polars currently does not have many other options.

August 2024

Week 35

sharkdp/hyperfine (github.com) - #cli #tools
A very useful CLI tool to perform benchmarking tests. Very useful to test a bash script or a simple script file without any complicated profiling.
REDOKU (padolsey.github.io)
A fun RegExp-based crossword. Not easy though, and you might see English differently afterwards.

Week 33

Modern SQL Style Guide (gist.github.com) - #SQL #best-practices
An interesting and opinionated take on SQL formatting. You might not be able to impose it at work, but you can always try!

Week 31

Column Names as Contracts (emilyriederer.netlify.app) - #best-practices #data
An interesting explanation of implicit data contracts through naming conventions.

July 2024

Week 31

A User’s Guide to Statistical Inference and Regression (mattblackwell.github.io) - #statistics
Brief introductory book to essential statistics. This online book is very clear and helped me understand concepts I always found confusing.

March 2024

Week 10

Machine Learning Notebooks (sebastianraschka.com) - #python
A collection of detailed Python notebooks written by Sebastian Raschka. It's like a big cheatsheet of machine learning methods.

February 2024

Week 7

Python Data Science Handbook (jakevdp.github.io) - #python #data-science
A must-read for anyone beginning in data science. Chapter 5 features some great in-depth notebooks on classical machine learning methods like SVM, random forests, etc...
faif/python-patterns (github.com) - #python
A detailed list of design patterns in Python. While I don't believe you should always look to insert design patterns everywhere you can, knowing them is often the key to writing more robust code when relevant.

Week 5

Scientific Computing with Python (caam37830.github.io) - #python
A reference on how to use Python for efficient computations in science. I wish I had read this before my PhD.