Use the vitals package with ellmer to evaluate and compare the accuracy of LLMs, including writing evals to test local models.
Amber Glenn made a critical error on one of her jumps in the short program.
Looking for help with today's New York Times Pips? We'll walk you through today's puzzle and help you match dominoes to tiles ...