#benchmarks #moo #paper #emo #nas
https://arxiv.org/abs/2208.04321
#legal #nlp #llm #bar_exam
https://taxprof.typepad.com/taxprof_blog/2023/05/re-evaluating-gpt-4s-bar-exam-performance.html
#g_eval #llm #evaluation #microsoft #stanford #google #team
G-Eval: NLG Evaluation using GPT-4 with Better Human Alignment
https://arxiv.org/abs/2303.16634