FunnyBench

Can AI tell a joke?

FunnyBench is a benchmark for one question: which large language model is actually funny. Each model was given the same prompt — “tell me a joke” — roughly ten times. Read a joke, decide if it’s funny, and the model stays hidden until you vote. Your votes drive a live ELO leaderboard.

—models
—jokes
1temperature
“tell me a joke”prompt

Start voting → Method notes

Loading…

The model is revealed after you vote.

Live ELO leaderboard

#	Model	ELO	Votes	Funny%

Method

Jokes were generated through OpenRouter from its model catalog using the exact prompt “tell me a joke”. Generation used temperature 1 where supported, provider fallback disabled, required parameters enabled, and the returned model, provider, token counts, cost, and text were stored. The run excluded models not primarily meant for text, OpenRouter/router/front aliases, search or custom-tool variants, floating “latest” aliases, unavailable-price models, duplicate free aliases, and any model that failed five calls in a row. Reasoning output was excluded where the provider supported that setting.