KONA 1.0

EBM

Compare our EBM reasoning model against the latest frontier AI models. Enter a Sudoku or load a random hard puzzle.

Sudoku Text Format

9 lines of 9 digits each, using 0 or _ for blanks
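
As an illustration of this format, here is a minimal Python sketch of a parser. The function name `parse_sudoku` and the sample puzzle are hypothetical and not part of the demo; the sketch assumes blanks are stored as 0 and that any other character is an error.

```python
def parse_sudoku(text: str) -> list[list[int]]:
    """Parse 9 lines of 9 characters each; '0' or '_' marks a blank (stored as 0)."""
    # Ignore surrounding whitespace and empty lines.
    rows = [line.strip() for line in text.strip().splitlines() if line.strip()]
    if len(rows) != 9:
        raise ValueError(f"expected 9 lines, got {len(rows)}")

    grid = []
    for line_no, row in enumerate(rows, start=1):
        if len(row) != 9:
            raise ValueError(f"line {line_no}: expected 9 characters, got {len(row)}")
        cells = []
        for ch in row:
            if ch == "_":
                cells.append(0)          # underscore is a blank
            elif ch in "0123456789":
                cells.append(int(ch))    # '0' is also a blank
            else:
                raise ValueError(f"line {line_no}: unexpected character {ch!r}")
        grid.append(cells)
    return grid


# Hypothetical sample input, mixing '_' and '0' for blanks.
EXAMPLE = """\
53__7____
6__195___
_98____6_
8___6___3
4__8_3__1
7___2___6
_6____28_
___419__5
0000800790
"""[:-2] + "\n"  # trim the last row to 9 characters

grid = parse_sudoku(EXAMPLE)
```

Both blank markers map to the same value, so a puzzle written with underscores and one written with zeros parse to identical grids.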

Puzzle Preview

Results

*To ensure we are testing the AI models' ability to actually reason and self-align, we disabled code execution for both the EBM and the LLMs. If you run these tests on public LLMs, they will "cheat" by running a brute-force search in Python rather than reasoning through the puzzles themselves. Kona reasons through the Sudoku without access to code execution.
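
For context, the kind of brute-force search referred to above can be sketched in a few lines of Python. This is a generic backtracking solver written for illustration, assuming a 9x9 grid with 0 for blanks; it is not the code any particular LLM produces, and the names `is_allowed`, `solve`, and `PUZZLE_ROWS` are hypothetical.

```python
def is_allowed(grid, r, c, d):
    """Return True if digit d can go at (r, c) without a row, column, or box clash."""
    if d in grid[r]:
        return False
    if any(grid[i][c] == d for i in range(9)):
        return False
    br, bc = 3 * (r // 3), 3 * (c // 3)  # top-left corner of the 3x3 box
    return all(grid[i][j] != d
               for i in range(br, br + 3)
               for j in range(bc, bc + 3))


def solve(grid):
    """Fill the grid in place by trial and error; return True if a solution is found."""
    for r in range(9):
        for c in range(9):
            if grid[r][c] == 0:
                for d in range(1, 10):
                    if is_allowed(grid, r, c, d):
                        grid[r][c] = d
                        if solve(grid):
                            return True
                        grid[r][c] = 0  # undo the guess and try the next digit
                return False  # no digit fits here, so backtrack
    return True  # no blanks left


# Hypothetical sample puzzle, 0 for blanks.
PUZZLE_ROWS = [
    "530070000", "600195000", "098000060",
    "800060003", "400803001", "700020006",
    "060000280", "000419005", "000080079",
]
puzzle = [[int(ch) for ch in row] for row in PUZZLE_ROWS]
solved = solve(puzzle)
```

The search never reasons about the puzzle; it guesses a digit, recurses, and undoes the guess on failure, which is why it is disabled in this comparison.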