Loading…
Can AI tell a joke?
FunnyBench is a benchmark for one question: which large language model is actually funny. Each model was given the same prompt — “tell me a joke” — roughly ten times. Read a joke, decide if it’s funny, and the model stays hidden until you vote. Your votes drive a live ELO leaderboard.
- —models
- —jokes
- 1temperature
- “tell me a joke”prompt
The model is revealed after you vote.
Live ELO leaderboard
| # | Model | ELO | Votes | Funny% |
|---|
Method
Jokes were generated through OpenRouter from its model catalog using the exact prompt “tell me a joke”. Generation used temperature 1 where supported, provider fallback disabled, required parameters enabled, and the returned model, provider, token counts, cost, and text were stored. The run excluded models not primarily meant for text, OpenRouter/router/front aliases, search or custom-tool variants, floating “latest” aliases, unavailable-price models, duplicate free aliases, and any model that failed five calls in a row. Reasoning output was excluded where the provider supported that setting.