William: a tiny poetry model in the browser

William is a tiny local language model I trained to write short poems. The model on this page is loaded by the browser and sampled locally, one token at a time. There is no server endpoint behind the button.

The title line is editable. William tokenizes it in the browser before generating the poem.
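
A minimal sketch of what "sampled locally, one token at a time" can look like, written in Python for readability rather than as the page's actual browser code. Everything named here is an assumption for illustration: `next_token_logits` stands in for a forward pass of the model, `toy_model` is a hypothetical five-token stand-in, and the temperature, end-of-sequence id, and token limit are made-up defaults. Only the 256-token window comes from the model description.

```python
import math
import random
from typing import Callable, List, Optional, Sequence


def softmax(logits: Sequence[float], temperature: float = 1.0) -> List[float]:
    """Turn raw logits into probabilities, with temperature scaling."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def generate(
    next_token_logits: Callable[[List[int]], Sequence[float]],
    prompt_ids: List[int],
    eos_id: int,
    max_new_tokens: int = 64,     # hypothetical limit
    context_window: int = 256,    # matches the stated context size
    temperature: float = 0.8,     # hypothetical sampling temperature
    rng: Optional[random.Random] = None,
) -> List[int]:
    """Autoregressive sampling: append one sampled token per step."""
    rng = rng or random.Random()
    ids = list(prompt_ids)
    for _ in range(max_new_tokens):
        window = ids[-context_window:]  # the model only sees the last N tokens
        probs = softmax(next_token_logits(window), temperature)
        next_id = rng.choices(range(len(probs)), weights=probs, k=1)[0]
        if next_id == eos_id:
            break
        ids.append(next_id)
    return ids


def toy_model(ids: List[int]) -> List[float]:
    """Hypothetical stand-in: strongly prefers (last token + 1) mod 5."""
    logits = [0.0] * 5
    logits[(ids[-1] + 1) % 5] = 50.0
    return logits


# The tokenized title plays the role of prompt_ids; here it is just [0].
poem_ids = generate(toy_model, [0], eos_id=4, rng=random.Random(0))
```

In the real page the title text would first pass through the tokenizer, and the resulting ids would be the prompt that the loop extends.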

William is a small decoder-only transformer: 6 layers, a hidden size of 384, 6 attention heads, and a 256-token context window. I trained it locally with MLX on Apple Silicon.
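
For a rough sense of scale, here is a back-of-the-envelope parameter estimate for those dimensions. It is a sketch under assumptions that the description above does not confirm: a standard GPT-style block (four attention projections plus an MLP with 4x expansion), learned positional embeddings, tied input and output embeddings, and no biases or normalization weights counted. The vocabulary size is not stated, so it is left as a parameter.

```python
def block_params(d_model: int, mlp_ratio: int = 4) -> int:
    """Weights in one transformer block, ignoring biases and norms."""
    attention = 4 * d_model * d_model          # Q, K, V, and output projections
    mlp = 2 * mlp_ratio * d_model * d_model    # up-projection and down-projection
    return attention + mlp


def estimate_params(
    n_layers: int,
    d_model: int,
    context: int,
    vocab_size: int,
    tied_embeddings: bool = True,  # assumption: output head reuses token embeddings
) -> int:
    """Approximate total weight count for a decoder-only transformer."""
    blocks = n_layers * block_params(d_model)
    token_embeddings = vocab_size * d_model
    position_embeddings = context * d_model    # assumption: learned positions
    output_head = 0 if tied_embeddings else vocab_size * d_model
    return blocks + token_embeddings + position_embeddings + output_head


# The six blocks alone, before any vocabulary-dependent terms.
blocks_only = 6 * block_params(384)
```

Under these assumptions the six blocks come to 10,616,832 weights; the embedding terms depend on the tokenizer's vocabulary and would be added on top.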

Training ran in two stages. First, the model learned general poem-shaped text from biglam/gutenberg-poetry-corpus, a corpus of individual poetry lines, after I filtered out Project Gutenberg boilerplate, headers, editorial apparatus, prose-like blocks, and non-English fragments. Then I fine-tuned it on title/body poem pairs from suayptalha/Poetry-Foundation-Poems, dropping rows that were too long or too prose-like for the short context window. I also ran prism-ml/Bonsai-8B-mlx-1bit locally as a grading model to help reject rows that were a poor fit for fine-tuning and to audit the pretraining data for artifacts.
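
The kind of rule-based filtering described above could be sketched roughly as follows. This is not the actual pipeline: the regular expression, the 14-words-per-line prose threshold, the ASCII-letter ratio used as a crude proxy for English, and the whitespace token counter are all hypothetical choices made for illustration. The grading-model step is not shown.

```python
import re
from typing import Callable, List

# Hypothetical patterns for Project Gutenberg headers and editorial notes.
BOILERPLATE = re.compile(
    r"project gutenberg|\*\*\*\s*(?:start|end) of|transcriber'?s note",
    re.IGNORECASE,
)


def looks_like_boilerplate(line: str) -> bool:
    return bool(BOILERPLATE.search(line))


def looks_like_prose(lines: List[str], max_avg_words: float = 14.0) -> bool:
    """Treat a block as prose if its lines are long on average."""
    non_empty = [ln for ln in lines if ln.strip()]
    if not non_empty:
        return True  # nothing usable, so drop it
    avg_words = sum(len(ln.split()) for ln in non_empty) / len(non_empty)
    return avg_words > max_avg_words


def mostly_ascii_letters(line: str, min_ratio: float = 0.9) -> bool:
    """Crude English proxy: most alphabetic characters are ASCII."""
    letters = [c for c in line if c.isalpha()]
    if not letters:
        return True  # punctuation-only line, nothing to judge
    return sum(c.isascii() for c in letters) / len(letters) >= min_ratio


def keep_row(
    title: str,
    body: str,
    count_tokens: Callable[[str], int] = lambda s: len(s.split()),  # stand-in tokenizer
    max_tokens: int = 256,  # the stated context window
) -> bool:
    """Return True if a title/body pair passes every heuristic."""
    lines = body.splitlines()
    if any(looks_like_boilerplate(ln) for ln in lines):
        return False
    if looks_like_prose(lines):
        return False
    if not all(mostly_ascii_letters(ln) for ln in lines if ln.strip()):
        return False
    return count_tokens(title) + count_tokens(body) <= max_tokens
```

A real filter would count tokens with the model's own tokenizer instead of splitting on whitespace, since the 256-token limit is measured in model tokens.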

For this page, the MLX checkpoint was converted to ONNX and dynamically quantized to int8. The page downloads that static model file and runs it with ONNX Runtime Web in your browser; the model asset is around 14 MB.
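
To illustrate why int8 weights shrink the file, here is a small sketch of symmetric per-tensor int8 quantization in plain Python. It shows the general idea only (one float scale per tensor, one byte per weight) and is not a description of what the conversion tooling did internally; ONNX Runtime's Python package provides `quantize_dynamic` for this kind of step, though the exact tool and settings used for William are not stated here.

```python
from typing import List, Tuple


def quantize_int8(weights: List[float]) -> Tuple[List[int], float]:
    """Map floats to integers in [-127, 127] using a single scale."""
    max_abs = max((abs(w) for w in weights), default=0.0)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize(quantized: List[int], scale: float) -> List[float]:
    """Recover approximate float weights from int8 values."""
    return [q * scale for q in quantized]


# Illustrative weights, not taken from the model.
weights = [-1.0, -0.25, 0.0, 0.1, 1.0]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
```

Each restored weight differs from the original by at most half a quantization step, while storage drops from four bytes per float32 weight to one byte plus a shared scale.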