The Dev Loop
When you’re developing evals, you want to iterate quickly. Running your entire test suite every time you make a change can be slow and frustrating.
Evalite gives you several tools to speed up your development workflow: watch mode for automatic re-runs, file filtering to run specific evals, and selective execution to focus on particular test cases.
Watch Mode
Watch mode automatically re-runs your evals whenever you make changes to your files.
    evalite watch
This command watches all .eval.ts files in your project and re-runs them whenever they change.
Serve Mode
If you have slow-running evals, you might not want them to re-run every time you make a change. Serve mode runs your evals once and then keeps the UI available for inspection.
    evalite serve
This runs your evals once and then serves the UI at http://localhost:3006. Your tests won’t re-run when files change.
You can then re-run your evals by pressing the “Rerun” button in the UI.
Run Specific Files
Sometimes you don’t want to run your entire eval suite. You can run specific eval files by passing them as arguments:
    evalite my-eval.eval.ts
You can also run multiple files at once:
    evalite eval1.eval.ts eval2.eval.ts
This works with both watch and serve modes:
    evalite watch my-eval.eval.ts
    evalite serve my-eval.eval.ts
Skip Entire Evals
If you want to temporarily disable an eval without deleting it, you can use evalite.skip():
    import { evalite } from "evalite";

    // Temporarily disabled: this eval is skipped.
    evalite.skip("My Eval", {
      data: () => [],
      task: () => {},
    });
Focus on Specific Test Cases
Sometimes you want to focus on a single test case within an eval. You can use the only flag to do this:
    import { evalite } from "evalite";

    evalite("My Eval", {
      data: () => [
        { input: "test1", expected: "output1" },
        { input: "test2", expected: "output2", only: true },
        { input: "test3", expected: "output3" },
      ],
      task: async (input) => {
        // Only runs for "test2"
      },
    });
When any data entry has only: true, only those entries will run.
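To make the selection rule concrete, here is a minimal stand-alone sketch of the behaviour described above. It is not Evalite's implementation: the Case type and the selectCases helper are hypothetical names introduced only for illustration, and the sketch assumes the rule is exactly "if any entry is marked only: true, run just those entries; otherwise run everything".

```typescript
// Illustrative model of the `only` rule; NOT Evalite's source code.
// `Case` and `selectCases` are hypothetical names used for this sketch.
type Case<I, E> = { input: I; expected: E; only?: boolean };

// Returns the entries that would run under the assumed rule.
function selectCases<I, E>(cases: Case<I, E>[]): Case<I, E>[] {
  const focused = cases.filter((c) => c.only === true);
  // With no focused entries, every entry runs.
  return focused.length > 0 ? focused : cases;
}

const data: Case<string, string>[] = [
  { input: "test1", expected: "output1" },
  { input: "test2", expected: "output2", only: true },
  { input: "test3", expected: "output3" },
];

// Inputs of the entries that would be passed to the task.
const selected: string[] = selectCases(data).map((c) => c.input);
console.log(selected);
```

Under this model, removing only: true from "test2" returns all three entries to the run, which is why the flag is easy to leave in place while iterating and then delete when you are done.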