Building an LLM-based CLI tool to generate commit messages
A simple afternoon project to familiarize myself with the Ollama API.
Introduction
I’ve been chronically late to the LLM party. When ChatGPT first became public, it took me a while to even try it. I heard a lot about it, but I tend to be quite suspicious of tech hype, so I took my time. Then things accelerated. By now, large language models are everywhere: I’ve used an AI chatbot out of convenience (when it’s useful1), I’ve integrated GitHub Copilot into my IDE, and I’ve studied the theory behind LLMs on my path to becoming a data scientist.
But somehow, I still don’t see the appeal of making everything “AI-something”. I still keep those chats as a last-resort option when all else fails, and I mostly use Copilot as a (great) auto-complete for boring tasks I really dislike, like writing Python docstrings. But all of this changed (slightly) recently: the last part of the training we have at work is on LLMs, and it culminates in an LLM-focused project.
During the introductory part of the module, we went through practical examples of how to use these models both locally and through an API, covered best practices in model development, and did deep dives into tuning and prompt engineering. We also went over all sorts of RAG examples, and spent a healthy amount of time on evaluating the models.2 And this is when it clicked! The reason I’ve struggled to adopt LLMs in my own workflow is the chat UX. I like my tools to be simple, functional interfaces, and having to explain my problem through a chat, then copy/paste code back and forth, is just very unpleasant to me. But now that I know how to go beyond that and use models in a more programmatic way, it feels like a whole new world has opened up to me.
So this is how this little exercise/afternoon project was born. I wanted to play around with a local model — I chose Ollama for this, though I appreciate how most frameworks for these tasks are so easy to use that you can just swap them out — and I looked for a real problem I could solve with it. After docstrings, the thing I hate writing most is commit messages, so I figured it would be great if I could get an assistant to help me write them.3 Of course, it’s a bit of an overkill solution for such a simple problem, but simplicity was just not one of my selection criteria that day.
In this blog post, I just want to show you how I made this project, but before I start, let me open with the fact that this is not original by any measure (in fact, I got the idea from a recent HackerNews post), and there are most likely smarter or better ways to do what I did. In particular, this project and this other project were a great source of inspiration. However, I often find that when it comes to specialized scripts or tools you want to integrate tightly into your workflow, writing them yourself from scratch often beats using ready-made ones, because you can tweak and adapt them to your needs as those change.
LLM commit
Without further ado, let’s dive into the project! I wanted to build a simple CLI tool, so I used the click library to make my life easier. Then I set out to define the parameters and behaviour of the tool.
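As a rough sketch (the exact flag names and defaults below are illustrative, not necessarily the project’s), the command and its options can be declared with click decorators:

```python
import click


@click.command()
@click.option("--model", default="qwen2.5-coder:3b", help="Ollama model to use.")
@click.option("--temperature", default=0.8, type=float, help="Sampling temperature.")
@click.option("--top-p", default=0.9, type=float, help="Nucleus sampling threshold.")
@click.option("--top-k", default=40, type=int, help="Top-k sampling cutoff.")
@click.option("--type-commit", default=None, help="Conventional commit type to enforce.")
def cli(model, temperature, top_p, top_k, type_commit):
    """Generate a commit message for the currently staged changes."""
    ...
```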
As you can see, I wanted to be able to quickly change LLM parameters such as the temperature, top-p, and top-k values to tweak the output of the model. The `type-commit` parameter can be used to enforce the correct type in the conventional commit format if you already know what you want. When you have to deal with so many inputs, validation can quickly become a pain. For that, of course, you should use Pydantic, and here in particular I used pydanclick to specify the CLI parameters using Pydantic `BaseModel`-inherited objects like this:
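Here is a minimal sketch of that, assuming pydanclick’s `from_pydantic` decorator as shown in its documentation (the parameter names are my own):

```python
import click
from pydanclick import from_pydantic


@click.command()
@from_pydantic("config", ToolConfiguration)
@from_pydantic("options", ModelOptions)
def cli(config: ToolConfiguration, options: ModelOptions):
    """Generate a commit message for the currently staged changes."""
    ...
```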
The two objects are `ToolConfiguration`, which defines the general parameters, and `ModelOptions`, which defines model-specific options. To make this work, you then need to define these classes somewhere:
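A sketch of the two models follows; apart from the 3B coder model and the temperature of 0.8 discussed below, the default values shown here are placeholders, not the project’s actual ones:

```python
from typing import Optional

from pydantic import BaseModel


class ToolConfiguration(BaseModel):
    model: str = "qwen2.5-coder:3b"    # small coder model that fits on a modest GPU
    max_size_diff: int = 8000          # truncate diffs beyond this size (placeholder)
    message_max_length: int = 50       # cap on the number of words in the description
    type_commit: Optional[str] = None  # conventional commit type to enforce, if any


class ModelOptions(BaseModel):
    temperature: float = 0.8  # stochasticity of the sampling
    top_p: float = 0.9        # nucleus sampling threshold (placeholder)
    top_k: int = 40           # top-k sampling cutoff (placeholder)
    num_predict: int = 128    # number of tokens to produce (placeholder)
```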
Let’s take a moment here to discuss some of these choices. I didn’t go through a systematic check of which parameter values are best; instead, I modified the defaults to improve the results I was getting as I was developing the package. In a more structured project, you might want an actual strategy to select the LLM hyperparameters. So what did I choose?
- `model`, for which I chose the `qwen2.5` family, as it offers lots of small models which fit on my GPU-poor laptop, in the `coder` edition since I want to analyze code diffs. I initially tried the 1.5B version, but the results were poor; I eventually found that the 3B one worked much better.
- `max_size_diff` truncates the diffs if they get too big, to avoid overflowing the context window. In practice I haven’t needed to touch this (I tend to commit often and aim for small diffs), but you might need to tweak it to your taste.
- `message_max_length` restricts the number of words to include in the commit message description.
- `type_commit`, which I explained previously.
As for the more model-specific parameters:

- `temperature` defines how stochastically the model behaves. I found that 0.8 works well for my needs: I didn’t get any hallucinations, and the model is creative enough to actually describe the commit instead of listing keywords for the changes.
- `top_p`, `top_k`, and `num_predict` (the number of tokens to produce) I didn’t touch much; these values were fine for my needs.
The rest of the code in the project then goes rather simply (a sketch of these helpers follows the list):

- Call the command `git diff --cached`, capture the output, and truncate it so it fits inside the context window.
- Craft the user prompt from the git diff and the `type_commit` parameter.
- Send the prompt to the LLM and get the answer. In the user prompt, I request that the commit message be placed inside XML `<summary>` tags, so the message must be extracted from the full response and cleaned of symbols which could interfere later on in the shell.
- Finally, print the candidate commit message to the user and ask if they want to finalize the commit. If so, the commit is made.
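Here is what these steps might look like; the helper names are hypothetical (the real ones in the repository may differ), and the prompt wording is illustrative:

```python
import re
import subprocess
from typing import Optional


def get_staged_diff(max_size: int) -> str:
    """Capture the staged diff and truncate it to fit the context window."""
    result = subprocess.run(
        ["git", "diff", "--cached"], capture_output=True, text=True, check=True
    )
    return result.stdout[:max_size]


def build_user_prompt(diff: str, type_commit: Optional[str]) -> str:
    """Hypothetical prompt builder: ask for a message wrapped in <summary> tags."""
    type_hint = f"The commit type must be '{type_commit}'. " if type_commit else ""
    return (
        "Write a conventional commit message for the following diff, "
        f"wrapped in <summary></summary> tags. {type_hint}\n\n{diff}"
    )


def extract_summary(response: str) -> str:
    """Pull the message out of the <summary> tags and strip shell-hostile symbols."""
    match = re.search(r"<summary>(.*?)</summary>", response, re.DOTALL)
    message = match.group(1) if match else response
    return message.strip().replace('"', "").replace("`", "")


def make_commit(message: str) -> None:
    """Create the commit with the accepted message."""
    subprocess.run(["git", "commit", "-m", message], check=True)
```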
Okay, now all the pieces are in place; they just need to be linked together inside the entrypoint function like this:
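A rough sketch, reusing the hypothetical helpers and Pydantic models from above (`SYSTEM_PROMPT` is shown just below):

```python
import click
import ollama
from pydanclick import from_pydantic


@click.command()
@from_pydantic("config", ToolConfiguration)
@from_pydantic("options", ModelOptions)
def entrypoint(config: ToolConfiguration, options: ModelOptions) -> None:
    """Generate a commit message for the staged changes, then optionally commit."""
    diff = get_staged_diff(config.max_size_diff)
    prompt = build_user_prompt(diff, config.type_commit)
    answer = ollama.generate(
        model=config.model,
        prompt=prompt,
        system=SYSTEM_PROMPT,             # defined below
        options=options.model_dump(),     # temperature, top_p, top_k, num_predict
    )
    message = extract_summary(answer["response"])
    click.echo(f"Proposed commit message:\n\n{message}\n")
    if click.confirm("Commit with this message?"):
        make_commit(message)


if __name__ == "__main__":
    entrypoint()
```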
With the Ollama API, it’s very easy to interact with the model through the `generate` function, and you can override pretty much any option by specifying it inside the `options` dictionary. The system prompt can also be overridden; for this tool, I used the following prompt4:
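The exact wording lives in the project’s repository; an illustrative system prompt in the same spirit could look like this:

```python
# Illustrative system prompt: an assumption, not the exact one from the project.
SYSTEM_PROMPT = """\
You are an expert software engineer who writes excellent git commit messages.
Given a git diff, produce a single concise commit message in the Conventional
Commits format: type(scope): short description. Focus on the intent of the
change rather than listing every modified file. Wrap the final message in
<summary></summary> tags and output nothing else.
"""
```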
And here we go: with these simple ingredients, you have a versatile LLM assistant to summarize your code changes. So what does it look like? Well, a GIF is worth 1000 words per frame:
Summary
As I mentioned at the beginning of this post, this was a small afternoon project, but it fills a real need for me, and it encapsulates well how LLMs can act as assistants in our day-to-day lives through simple programs. I really enjoyed writing this tool, and I encourage anyone who wants to learn how to work with large language models to try their hand at this kind of small project: it’s the best way to learn and to build some intuition about their, at times mysterious, behaviour.
NOTE
All the code for this project can be found in this GitHub repository.
Footnotes
1. I tend to really like LeChat by Mistral. Not only is it European, but I salute their open-source contributions. ↩
2. As far as I can see, this is something which is often skipped or ignored, and honestly I’m surprised any company would put an LLM in production without strict checks. ↩
3. I’ve read some very valid opinions on what exactly should go into a commit message (and I daresay there is little consensus here beyond the agreement that it should be useful), and I agree that fully automating it makes little sense, as an LLM will rarely truly catch your intent (the why) and will just end up describing what changed (the what). I personally use it to speed up the process, not as a replacement. ↩
4. I used the system prompt in this project for inspiration. ↩