This technology integrates a set of benchmarks for LLMs and computes model performance on them. It includes medical and general-purpose benchmarks, as well as bias and toxicity benchmarks.

Contact:
License: Apache License (Version 2.0)
Apache License (Version 2.0) (Latest Version)
LLM eval
Release Notes