September Update

I just came back from a two weeks vacation in France with my daughter, and without my laptop, which was pretty nice. As a result I don’t have much to say for this month ! After coming back I had an idea for a small project, a “guess who” app that I vibecoded in a few hours with Firebase Studio. It got me a Next.js+tailwindcss app pretty quickly, with LLM calls handled by Genkit. Originally it was using the Gemini API and hosted on Firebase, but I’ve changed it to GPT-5 and Fly.io for hosting. ...

September 25, 2025 · 2 min · Jerome Marhic

Training a small LLM

So in my [last post about Pilish]({% post_url 2025-08-09-pillmish %}) I mentioned that a follow up to get better results would be to use a LLM with word level tokenization. It’s actually a bad idea in general because the vocabulary size can be huge, and that’s why most LLM these days use BPE or subword tokenization, but I decided to give it a quick try and train a LLM from scratch with word level tokenization. Back in 2023 I used to be quite into finetuning and all that but this year I haven’t done much low level tinkering with LLM so it was a good exercise. ...

August 16, 2025 · 4 min · Jerome Marhic

Pillmish

A few days ago, I heard about Pilish for the first time. As Wikipedia puts it, Pilish is a style of constrained writing in which the lengths of consecutive words or sentences match the digits of the number π (pi). My second thought (right after “who would enjoy doing that”) was, this sounds easy for LLMs! So I set out to make a minimal proof of concept. Long story short, the trick is to constrain the output to follow the digits of Pi—the same way that structured output (JSON, etc.) works. With Hugging Face Transformers, it’s easy to do with a custom logits_processor. Here, we just constrain the tokens to those matching the desired number of characters at each step. I shared the code on GitHub. ...

August 9, 2025 · 2 min · Jerome Marhic

August 2025 Update (GPT-5, Veo 3, Gemini CLI, Open Source)

Lot of things going on this summer, but yesterday’s big news was the release of ChatGPT 5. I don’t have an opinion on it yet, but I guess it’s nice to have a single model (at least exposed, even if this is routed under the hood) instead of switching between 4o and o4-mini-high, etc. Recently I’ve been on the market for a “CLI Agent” (synchronous, like Claude Code, not async like Devin and co.). I’ve been using Gemini CLI with good results — as long as I’m within the free PRO requests. As soon as it switches to the “flash” model it becomes useless, can’t edit a file properly, and just loops on itself. I wouldn’t pay for it. Claude Code seems too expensive; I might give OpenAI Codex a try with GPT-5, but I haven’t heard much about it. ...

August 8, 2025 · 3 min · Jerome Marhic

Maths notes (1)

So I started reading a paper about diffusion (the original DDPM paper) and I was quickly out of my depth. I needed a refresher about probabilities, and actually even more basic stuff like exponential and integration. And I thought why not share the notes here! So this blog post is the content of a python notebook about exponential and the normal distribution, exported to markdown with jupyter nbconvert. It’s not deep or anything, just a nice refresher for myself. Kind of a pain to write math formulas and get them displayed in my Jekyll blog but got it working with Mathjax. ...

July 30, 2025 · 6 min · Jerome Marhic

I apologize for the confusion. You're absolutely right.

Google did me dirty this week and I’m salty, so let me tell you. The other day I received an email informing me that my YouTube Premium subscription had been renewed, which surprised me because I canceled it about 6 months ago, and I haven’t missed it ever since (thanks, yt-dlp!). I first thought it was my daughter’s shenanigans, but after a closer look I had only “paused” the subscription instead of canceling it, and Google conveniently failed to inform me it was going to be restarted… No advance email like “your subscription will restart in a few days,” just “whoops, we restarted your subscription and you’ve been charged 13 euros, teehee.” ...

July 25, 2025 · 2 min · Jerome Marhic

July Update

Just a quick update to say I am still making progress on the Password Game Agent project I mentioned last post. I have now reached up to step 16 where we need to solve a chess position… Seems like a suitable job for a reasoning model ! The main changes that enabled going from step 11 (Wordle answer) to 16 were adding a “search tool” based on OpenAI web search tool and changing the reasoning effort from “medium” to “high”. The current version of the code is here : https://github.com/goverture/password-game-agent/blob/master/manually.py I’m still making regular changes and trying new ideas to make progress. I’ve noticed we often seem to get stuck for various reason (for instance a badly recognized Capcha), so I want to add a new step to ensure that we make progress consistently, or backtrack, in order to not get stuck.

July 5, 2025 · 1 min · Jerome Marhic

The password game agent

There is this javascript game called The Password Game where you have to chose a password, following increasingly ludicrous requirements. Agentic coding/research is all the rage today and I got the idea of making an “agent” that would solve it. Conceptually it’s just about calling a LLM in a loop, giving it the proper tools and context until it solved the task at hand. Here’s where I got so far, using ChatGPT 4o and Playwright MCP (only gave it the navigate and type tools for now). It’s solving correctly the first few steps, but gets stuck at the “sponsor” rule because it cannot view the image. The next step is to give it access to the playwright’s screenshot tool. Let’s see how far it can go ! ...

June 18, 2025 · 2 min · Jerome Marhic

May update

Another May update ! Not much going on as usual, rainy season arrived in Ho Chi Minh city so we get some strong rain and nice sunsets. Yesterday we got the announcement that Pocket was saying goodbye. Kind of bummed because I found the service convenient, though I just used it as a link bookmark (I never cared much about the reading mode, I always go to the original link - though apparently that worked well with the Kobo e-reader). Anyway I switched to a self-hosted alternative immediately: Wallabag. It’s kind of sluggish but it does the job, and it has an Android app so I can still “share link” to save a link from my phone. ...

May 25, 2025 · 3 min · Jerome Marhic

My First MCP server

So MCP is all the rage these days, and it’s not very complex: it’s basically a standardized way to provide tools to LLMs. So you have an RPC server that provides a description of the tools and the parameters they expect, and your client (a LLM based application) can connect to it and tell the LLM what tools are available, and the LLM can decide to call them when appropriate. Easy stuff and the SDK provided do most of the heavy lifting. ...

May 4, 2025 · 2 min · Jerome Marhic