A few days ago, I heard about Pilish for the first time. As Wikipedia puts it, Pilish is a style of constrained writing in which the lengths of consecutive words or sentences match the digits of the number π (pi). My second thought (right after “who would enjoy doing that?”) was: this sounds easy for LLMs! So I set out to make a minimal proof of concept.

Long story short, the trick is to constrain the output to follow the digits of π, the same way that structured output (JSON, etc.) works. With Hugging Face Transformers, this is easy to do with a custom logits_processor. At each generation step, we only allow the tokens whose character count matches the current digit and rule out all the others. I shared the code on GitHub.
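To make the masking step concrete, here is a minimal, dependency-free sketch. It is not the code from the GitHub repo: `toy_vocab`, `allowed_token_ids`, and `constrain_logits` are made-up names, and it assumes that a token which starts a new word begins with a space once decoded. It also only covers the first 15 digits of π, which contain no zero.

```python
import math

# Leading digits of pi with the decimal point dropped (no zero among them).
PI_DIGITS = "314159265358979"


def allowed_token_ids(vocab, length):
    """Return ids of tokens that are one whole alphabetic word of `length` letters.

    Assumption: a word-initial token starts with a single space.
    """
    return [
        tok_id
        for tok_id, text in vocab.items()
        if text.startswith(" ") and text[1:].isalpha() and len(text) - 1 == length
    ]


def constrain_logits(logits, vocab, step):
    """Keep the logits of tokens matching the digit at `step`; set the rest to -inf."""
    target = int(PI_DIGITS[step])
    keep = set(allowed_token_ids(vocab, target))
    return [x if i in keep else -math.inf for i, x in enumerate(logits)]


# Hypothetical toy vocabulary; a real one would come from the tokenizer.
toy_vocab = {0: " and", 1: " I", 2: " have", 3: " a", 4: "ing", 5: " you", 6: " small"}
toy_logits = [0.1, 2.0, 0.3, 1.5, 3.0, 0.2, 0.9]

# Step 0 corresponds to the digit 3, so only " and" and " you" survive.
step0 = constrain_logits(toy_logits, toy_vocab, 0)
```

In Transformers, the same logic would sit in the `__call__(input_ids, scores)` method of a `LogitsProcessor` subclass passed to `generate` through the `logits_processor` argument, with the step derived from the number of tokens generated so far.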

Here are a few samples I generated:

  • and I have a small apartment in London which has three bedrooms including private bathrooms and my own bathroom with
  • you d have a whole community of people where you could interact virtually without requiring you to get involved with
  • isn t even a place dedicated to making money and doing anything important besides promoting you to the audience that

Count the letters in each word and you get 3, 1, 4, 1, 5, 9, …: the digits of π.
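If you don't feel like counting by hand, a small checker does the job. This is just an illustrative sketch: `word_lengths` and `is_pilish` are names I made up, and it only knows the first 20 digits of π, which happens to be the length of the samples above.

```python
# First 20 digits of pi with the decimal point dropped.
PI_DIGITS = "31415926535897932384"


def word_lengths(text):
    """Return the length of each whitespace-separated word."""
    return [len(word) for word in text.split()]


def is_pilish(text):
    """True if the word lengths match the leading digits of pi (up to 20 words)."""
    lengths = word_lengths(text)
    if len(lengths) > len(PI_DIGITS):
        return False  # more words than digits known to this sketch
    expected = [int(d) for d in PI_DIGITS[:len(lengths)]]
    return lengths == expected


sample = (
    "and I have a small apartment in London which has three bedrooms "
    "including private bathrooms and my own bathroom with"
)
print(word_lengths(sample))
print(is_pilish(sample))
```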

So that’s that. Now, the model I used, facebook/opt-125m, is rather small, and its tokenizer uses Byte Pair Encoding (BPE), meaning a lot of words are split into multiple tokens, so the results are obviously subpar. But you get the idea. That’s about as far as I’m willing to go, as I’ve already lost interest in the whole affair. Some possible next steps would be to stop the generation or backtrack depending on the loss, and maybe to train a model from scratch using a word-level encoding. You could also easily implement a trick where two consecutive digits like “11” can be read either as one eleven-letter word or as two one-letter words, etc.
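The two-digit trick could be sketched like this. The function name `allowed_lengths` and the `max_len` cap are assumptions of mine, and the sketch leaves open what to do with a lone zero (it simply offers no one-digit option there).

```python
def allowed_lengths(digits, pos, max_len=12):
    """Map each usable word length at `pos` to the number of digits it consumes.

    A single digit 1-9 is read as a word of that length. Two consecutive
    digits may also be read as one longer word, up to `max_len` letters
    (a hypothetical cap, since very long words are rare).
    """
    options = {}
    one = int(digits[pos])
    if one > 0:
        options[one] = 1
    if pos + 1 < len(digits):
        two = int(digits[pos:pos + 2])
        if 10 <= two <= max_len:
            options[two] = 2
    return options


# "11" can be one eleven-letter word or the first of two one-letter words.
print(allowed_lengths("11", 0))
# "31" is too long for a single word under the cap, so only 3 is offered.
print(allowed_lengths("3141", 0))
```

At generation time, the constraint would then allow tokens of any length in the returned mapping and advance the position by the number of digits the chosen length consumes.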
