Theia Vogel on LLMs and the infamous seahorse emoji

Theia Vogel wrote about why LLMs freak out when asked for the seahorse emoji, and why they keep claiming that one exists:

Maybe LLMs believe a seahorse emoji exists because so many humans in the training data do. Or maybe it’s a convergent belief – given how many other aquatic animals are in Unicode, it’s reasonable for both humans and LLMs to assume (generalize, even) that such a delightful animal is as well. A seahorse emoji was even formally proposed at one point, but was rejected in 2018.

Regardless of the root cause, many LLMs begin each new context window fresh with the mistaken latent belief that the seahorse emoji exists. But why does that produce such strange behavior? I mean, I used to believe a seahorse emoji existed myself, but if I had tried to send it to a friend, I would’ve simply looked for it on my keyboard and realized it wasn’t there, not sent the wrong emoji and then gone into an emoji spam doomloop. So what’s happening inside the LLM that causes it to act like this?
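The premise in the quote, that Unicode includes plenty of aquatic animals but no seahorse, is easy to check for yourself. Here is a minimal sketch using Python's standard `unicodedata` module. The helper name `has_character` and the list of animals are my own choices for illustration, not something from Theia's post, and the result reflects whichever Unicode version ships with your Python build.

```python
import unicodedata


def has_character(name: str) -> bool:
    """Return True if this Python's Unicode database has a character with this name."""
    try:
        unicodedata.lookup(name)
        return True
    except KeyError:
        # lookup() raises KeyError when no character carries that name
        return False


# Illustrative sample of aquatic animals (an assumption, not an exhaustive list)
AQUATIC = ["FISH", "TROPICAL FISH", "BLOWFISH", "DOLPHIN", "OCTOPUS", "WHALE"]

for name in AQUATIC + ["SEAHORSE"]:
    status = "present" if has_character(name) else "missing"
    print(f"{name}: {status}")
```

Unlike an LLM, the lookup simply fails when the character isn't there, which is roughly the "check the keyboard" step described in the quote.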

Theia then gets into the weeds, examining the tokens and training data to figure out what could be causing it. It should be obvious by now, but as always: don’t believe everything chatbots output.
