An LLM scores each response against the rubric.
No scoring data for this case yet. Run step 4:
uv run python scripts/04_score.py