Evaluations/LLM As A Judge/Iteration history
History
Total running cost: $0.0002
Columns: Prompt, Rows, Type, Model, Target, Status, Runtime, Run, By, Tokens, Cost

Run
Prompt: Are the answers equivalent? Answer "true" or "false". All lowercase. Answer 1: {answer} Answer 2: {prediction}
Rows: 200
Type: text, text
Model: Unknown/Gemini 1.5 Flash
Target: 78b9940028d1e24ed8fd189168a45c39
Status: completed
Runtime: 00:01:20
Run: 6 months ago
By: ox
Tokens: 11469
Cost: $0.0002

Sample
Prompt: Are the answers equivalent? Answer "true" or "false". All lowercase. Answer 1: {answer} Answer 2: {prediction}
Rows: 5
Type: text, text
Model: Unknown/Gemini 1.5 Flash
Target: Sample - N/A
Status: completed
Runtime: 00:00:01
Run: 6 months ago
By: ox
Tokens: 213
Cost: $0.0000

Sample
Prompt: Are the answers equivalent? Answer "true" or "false". Answer 1: {answer} Answer 2: {prediction}
Rows: 5
Type: text, text
Model: Unknown/Gemini 1.5 Flash
Target: Sample - N/A
Status: completed
Runtime: 00:00:01
Run: 6 months ago
By: ox
Tokens: 198
Cost: $0.0000
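
The prompts in this history use two placeholders, {answer} and {prediction}, and ask the judge model for a bare "true" or "false". The later iterations add "All lowercase.", which suggests the raw output was being compared or parsed as a string. The following is a minimal Python sketch of how such a template might be filled per row and how a reply might be parsed leniently. The names PROMPT_TEMPLATE, build_prompt, and parse_verdict are hypothetical and are not part of the evaluation tool; the call to the model itself is left out.

```python
from typing import Optional

# Prompt text copied from the most recent iteration in the history.
PROMPT_TEMPLATE = (
    'Are the answers equivalent? Answer "true" or "false". All lowercase. '
    "Answer 1: {answer} Answer 2: {prediction}"
)


def build_prompt(answer: str, prediction: str) -> str:
    """Fill the two placeholders for one dataset row (hypothetical helper)."""
    return PROMPT_TEMPLATE.format(answer=answer, prediction=prediction)


def parse_verdict(raw: str) -> Optional[bool]:
    """Map a judge reply to True/False, or None if it is neither.

    Assumption: replies may carry stray whitespace, quotes, a trailing
    period, or capitalization, so the text is normalized before comparing.
    """
    cleaned = raw.strip().strip(".\"'").lower()
    if cleaned == "true":
        return True
    if cleaned == "false":
        return False
    return None


if __name__ == "__main__":
    print(build_prompt("4", "four"))
    print(parse_verdict(" True.\n"))
    print(parse_verdict("not sure"))
```

Returning None for anything other than "true" or "false" keeps unparseable replies visible instead of silently counting them as a mismatch.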