This technology integrates a set of benchmarks for large language models (LLMs) and computes model performance on each of them. It includes medical and general-purpose benchmarks, as well as bias and toxicity benchmarks.
Contact:
License:
Apache License (Version 2.0)