an open source tool for automated behavioral evaluations
Summary
We're releasing <strong>Bloom</strong>, an open source agentic framework for generating behavioral evaluations of frontier AI models. <strong>Bloom</strong> takes a researcher-specified behavior and quantifies its frequency and severity across automatically generated scenarios. <strong>Bloom</strong>'s evaluations correlate strongly with our ...