# EHR SQL Benchmark (NAACL 2024)
## Overview
Benchmark results comparing different models on the EHRSQL dataset, using 100 questions that cover a range of medical queries, including cost analysis, temporal measurement differences, medication prescriptions, lab results, and patient demographics.
**Source**: [ehrsql-2024](https://github.com/glee4810/ehrsql-2024)
Each model folder contains:
- **Model answers** extracted from conversations
- **Golden truth answers** and SQL queries for comparison
- **Correct/Incorrect** annotations with detailed notes
- **Chat conversation links** (Claude.ai shared links or local conversation files)
The questions are complex medical queries that can only be answered by querying a database. Each model answer is compared against the ground truth answer and judged correct or incorrect by a human assessor.
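
As an illustration of how the Correct/Incorrect annotations could be summarized per model, here is a minimal sketch. It assumes the annotations have already been loaded as a list of label strings. The function name `accuracy` and the sample labels are hypothetical and not part of this repository, and the actual annotation file layout may differ.

```python
from collections import Counter


def accuracy(annotations):
    """Return the fraction of labels marked 'Correct' (case-insensitive).

    Labels other than 'Correct' or 'Incorrect' are ignored.
    Returns 0.0 when there are no usable labels.
    """
    counts = Counter(label.strip().lower() for label in annotations)
    total = counts["correct"] + counts["incorrect"]
    if total == 0:
        return 0.0
    return counts["correct"] / total


# Hypothetical labels for illustration only, not real benchmark results.
sample = ["Correct", "Incorrect", "correct", "Correct"]
print(f"{accuracy(sample):.2f}")
```

Because the assessment is done by hand, a tally like this only summarizes the human labels. It does not re-check model answers against the golden SQL results.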