Pokémon is something of a testing ground for passionate coders who want to see how good large language models (LLMs) and vision models really are. You may have heard of ClaudePlaysPokémon, where Anthropic’s Claude model tries to play through Pokémon (it’s currently on Claude 4, launched in May this year). Results have been mixed, but it’s got as far as Celadon City, if that means anything to you.
piman, though, opted for a GPT model (GPT-4V, the vision-capable version of GPT-4) to see if it could play Pokémon Crystal. The results weren’t so much mixed as just terrible. The video shows the run-up to the character reaching Professor Elm’s lab and trying to pick Cyndaquil, which took over an hour in all, followed by a post-run synopsis.
As you’ll see, this thing needs work and isn’t built for this use case.