Evaluations/LLM As A Judge
gemini-flash
rag_instruct_benchmark_tester.jsonl
text / text
Unknown/Gemini 1.5 Flash
Google
is_correct
Are the answers equivalent? Answer "true" or "false". All lowercase.

Answer 1: {answer}
Answer 2: {prediction}
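
A minimal sketch of how the judge prompt above could be filled in and its reply turned into an is_correct value. The template text and the {answer} / {prediction} placeholders are taken from this page; the function names build_prompt and parse_verdict are hypothetical, and the call to the model itself is left out because the page does not show how it is made.

```python
from typing import Optional

# Judge prompt as shown on this page; {answer} and {prediction} are the
# placeholders filled in for each dataset row.
PROMPT_TEMPLATE = (
    'Are the answers equivalent? Answer "true" or "false". All lowercase.\n'
    "\n"
    "Answer 1: {answer}\n"
    "Answer 2: {prediction}"
)


def build_prompt(answer: str, prediction: str) -> str:
    """Fill the template for one row (hypothetical helper)."""
    return PROMPT_TEMPLATE.format(answer=answer, prediction=prediction)


def parse_verdict(reply: str) -> Optional[bool]:
    """Map the judge's reply to a boolean (hypothetical helper).

    Tolerates surrounding whitespace, quotes, a trailing period, and
    capitalization. Returns None when the reply is neither "true" nor
    "false", so that malformed replies can be counted separately.
    """
    text = reply.strip().strip("\"'.").lower()
    if text == "true":
        return True
    if text == "false":
        return False
    return None


if __name__ == "__main__":
    print(build_prompt("Paris", "The capital is Paris."))
    print(parse_verdict(" True.\n"))
```

Returning None for unexpected replies, rather than treating them as false, is a design choice made here for illustration: it keeps judge formatting failures from being counted as wrong answers.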
Nov 7, 2024, 11:40 PM UTC
Nov 7, 2024, 11:42 PM UTC
200 rows processed, 11469 tokens used ($0.0002)
completed
8 columns, 1-100 of 200 rows