SCALE

SQL Capability Leaderboard for LLMs

This leaderboard evaluates the SQL capabilities of major large language models. For more details, check out our GitHub.

The Large Model SQL Capability Leaderboard (SCALE) measures how well large language models (LLMs) handle SQL. It evaluates core SQL capabilities along three dimensions: SQL Optimization (improving query efficiency and performance), Dialect Conversion (migrating SQL between database platforms), and Deep SQL Comprehension (accurately parsing complex logic and user intent).

To reflect performance in real-world database operations, we established a multi-dimensional, multi-metric evaluation system that tests models on real-world cases of varying difficulty. Each test case is weighted by its technical complexity and practical value, so harder cases count more and the final score emphasizes performance on high-value, high-challenge tasks. SCALE gives developers, database administrators, and enterprise technology decision-makers a benchmark for comparing models' relative strengths in SQL processing, with the aim of advancing intelligent database application development and implementation.
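As a rough illustration of difficulty-weighted scoring, the sketch below computes a weighted pass rate in Python. The weight values, the `CaseResult` type, the pass/fail outcome model, and the case names are all assumptions made for this example; the text does not specify SCALE's actual calibration or scoring formula.

```python
from dataclasses import dataclass


@dataclass
class CaseResult:
    """Outcome of one test case (hypothetical structure, for illustration)."""
    name: str
    difficulty: str  # "easy", "medium", or "hard"
    passed: bool


# Hypothetical weights: higher difficulty gets a greater weight.
# SCALE's real calibration is not given in the text.
DIFFICULTY_WEIGHTS = {"easy": 1.0, "medium": 2.0, "hard": 3.0}


def weighted_score(results):
    """Return the weighted pass rate on a 0-100 scale."""
    total = sum(DIFFICULTY_WEIGHTS[r.difficulty] for r in results)
    if total == 0:
        return 0.0  # no test cases, so there is nothing to score
    earned = sum(DIFFICULTY_WEIGHTS[r.difficulty] for r in results if r.passed)
    return 100.0 * earned / total


# Made-up results for a single model.
results = [
    CaseResult("simple_filter", "easy", True),
    CaseResult("join_rewrite", "medium", True),
    CaseResult("nested_subquery", "hard", False),
]
score = weighted_score(results)
```

Under these assumed weights, failing the single hard case costs as much as failing the easy and medium cases combined, which is the effect the weighting is meant to have.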

SCALE: Large Model SQL Capability Leaderboard - August 2025

1. Executive Summary In August 2025, the SCALE evaluation benchmark continues to track the cutting edge of AI technology. This month, we welcome several high...

SQL Optimization

N/A

Dialect Conversion

N/A

SQL Understanding

N/A