gemini-flash
rag_instruct_benchmark_tester.jsonl
text → text
is_correct
Are the answers equivalent? Answer "true" or "false". All lowercase.
Answer 1: {answer}
Answer 2: {prediction}
gemini-flash-results
Nov 7, 2024, 11:40 PM UTC
Nov 7, 2024, 11:42 PM UTC
200 rows processed, 11469 tokens used ($0.0002)
completed
8 columns, 1-100 of 200 rows
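
For reference, a minimal sketch of how the is_correct prompt above might be filled in per row and how the model's reply might be turned into a boolean. The template text and the {answer} and {prediction} placeholders come from the job configuration; the function names build_prompt and parse_is_correct are hypothetical, and how the tool itself performs substitution and parsing is not shown on this page.

```python
from typing import Optional

# Prompt text copied from the job configuration; the placeholders
# {answer} and {prediction} are filled from each dataset row.
PROMPT_TEMPLATE = (
    'Are the answers equivalent? Answer "true" or "false". All lowercase.\n'
    "Answer 1: {answer}\n"
    "Answer 2: {prediction}"
)


def build_prompt(answer: str, prediction: str) -> str:
    """Fill the template for one row (hypothetical helper)."""
    return PROMPT_TEMPLATE.format(answer=answer, prediction=prediction)


def parse_is_correct(reply: str) -> Optional[bool]:
    """Map the model's reply to a boolean (hypothetical helper).

    Returns None when the reply is neither "true" nor "false", so that
    malformed replies can be counted separately instead of being
    silently treated as incorrect.
    """
    text = reply.strip().lower()
    if text == "true":
        return True
    if text == "false":
        return False
    return None
```

Returning None for unexpected replies is a design choice made for this sketch: it keeps formatting failures distinguishable from genuine "false" judgments when the is_correct column is aggregated.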