# runEvalite()

```bash
pnpm add evalite@beta
```
Run evaluations programmatically from Node.js scripts or custom tooling.
## Signature
```ts
runEvalite(opts: {
  mode: "run-once-and-exit" | "watch-for-file-changes" | "run-once-and-serve";
  path?: string;
  cwd?: string;
  scoreThreshold?: number;
  outputPath?: string;
  hideTable?: boolean;
  storage?: Evalite.Storage;
  forceRerunTriggers?: string[];
}): Promise<void>
```

## Parameters
### opts.mode

Type: `"run-once-and-exit" | "watch-for-file-changes" | "run-once-and-serve"` (required)
The execution mode for running evals.
Modes:
- `"run-once-and-exit"`: Run evals once and exit. Ideal for CI/CD pipelines.
- `"watch-for-file-changes"`: Watch for file changes and re-run automatically. Starts the UI server.
- `"run-once-and-serve"`: Run evals once and serve the UI without watching for changes.
```ts
import { runEvalite } from "evalite/runner";

// CI/CD mode
await runEvalite({
  mode: "run-once-and-exit",
});

// Development mode with watch
await runEvalite({
  mode: "watch-for-file-changes",
});

// Run once and keep UI open
await runEvalite({
  mode: "run-once-and-serve",
});
```

### opts.path
Type: `string` (optional)

Path filter to run specific eval files. If not provided, runs all `.eval.ts` files.
```ts
await runEvalite({
  mode: "run-once-and-exit",
  path: "my-eval.eval.ts",
});
```

### opts.cwd
Type: `string` (optional)

The working directory to run evals from. Defaults to `process.cwd()`.
```ts
await runEvalite({
  mode: "run-once-and-exit",
  cwd: "/path/to/my/project",
});
```

### opts.scoreThreshold
Type: `number` (optional, 0-100)
Minimum average score threshold. If the average score falls below this threshold, the process will exit with code 1.
```ts
await runEvalite({
  mode: "run-once-and-exit",
  scoreThreshold: 80, // Fail if average score < 80
});
```

Useful for CI/CD pipelines where you want to fail the build if evals don't meet a quality threshold.
### opts.outputPath
Type: `string` (optional)
Path to write test results in JSON format after evaluation completes.
```ts
await runEvalite({
  mode: "run-once-and-exit",
  outputPath: "./results.json",
});
```

The exported JSON contains the complete run data, including all evals, results, scores, and traces.
Note: Not supported in `watch-for-file-changes` mode.
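The exact schema of this file isn't specified on this page, so any post-processing should treat the shape as version-dependent. As a sketch, assuming each eval entry carries a numeric `score` field (a hypothetical shape, not the documented schema), a CI script could aggregate the exported results like this:

```typescript
// Hypothetical shape of one eval entry in the exported JSON.
// This is an assumption for illustration; check the actual file your
// evalite version writes before relying on it.
interface ExportedEval {
  name: string;
  score: number; // average score for this eval, 0-100
}

// Average the per-eval scores from a parsed results file.
function averageScore(evals: ExportedEval[]): number {
  if (evals.length === 0) return 0;
  const total = evals.reduce((sum, e) => sum + e.score, 0);
  return total / evals.length;
}

// In CI you would parse the file written via outputPath, e.g.:
// const evals: ExportedEval[] = JSON.parse(fs.readFileSync("./results.json", "utf8")).evals;
console.log(averageScore([
  { name: "chat", score: 80 },
  { name: "completion", score: 90 },
])); // 85
```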
### opts.hideTable
Type: `boolean` (optional, default: `false`)
Hide the detailed results table in terminal output. Keeps the score summary but removes the detailed table.
```ts
await runEvalite({
  mode: "watch-for-file-changes",
  hideTable: true, // Useful for debugging with console.log
});
```

### opts.storage
Type: `Evalite.Storage` (optional)

Custom storage backend instance. If not provided, uses the storage from `evalite.config.ts` or defaults to in-memory storage.
```ts
import { runEvalite } from "evalite/runner";
import { createSqliteStorage } from "evalite/sqlite-storage";

await runEvalite({
  mode: "run-once-and-exit",
  storage: createSqliteStorage("./custom.db"),
});
```

See Storage for more details.
### opts.forceRerunTriggers
Type: `string[]` (optional)

Extra file globs that trigger eval reruns in watch mode. This overrides any `forceRerunTriggers` setting in `evalite.config.ts`.
```ts
await runEvalite({
  mode: "watch-for-file-changes",
  forceRerunTriggers: ["src/**/*.ts", "prompts/**/*", "data/**/*.json"],
});
```

This is useful when your evals depend on files that aren't automatically detected as dependencies (e.g., prompt templates, external data files).
Tip: `forceRerunTriggers` globs are evaluated relative to the `cwd` you pass (or `process.cwd()` if unspecified).
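To make that concrete, here is a tiny sketch. The `resolveTriggerBase` helper is hypothetical, purely to show how the same glob maps to different absolute paths under different `cwd` values; the actual matching is done by evalite:

```typescript
import path from "node:path";

// Hypothetical helper: prefix a trigger glob with the cwd it will be
// evaluated against. Only illustrates the relative-resolution rule.
function resolveTriggerBase(cwd: string, glob: string): string {
  return path.posix.join(cwd, glob);
}

console.log(resolveTriggerBase("/path/to/my/project", "prompts/**/*"));
// /path/to/my/project/prompts/**/*
```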
## Usage Examples
### Basic CI/CD Script
```ts
import { runEvalite } from "evalite/runner";

async function runTests() {
  try {
    await runEvalite({
      mode: "run-once-and-exit",
      scoreThreshold: 75,
      outputPath: "./results.json",
    });
    console.log("All evals passed!");
  } catch (error) {
    console.error("Evals failed:", error);
    process.exit(1);
  }
}

runTests();
```

### Development Script
```ts
import { runEvalite } from "evalite/runner";

// Run specific eval in watch mode
await runEvalite({
  mode: "watch-for-file-changes",
  path: "chat.eval.ts",
  hideTable: true,
});
```

### Custom Storage
```ts
import { runEvalite } from "evalite/runner";
import { createSqliteStorage } from "evalite/sqlite-storage";

const storage = createSqliteStorage("./evalite.db");

await runEvalite({
  mode: "run-once-and-exit",
  storage,
});
```

### Multi-Environment Testing
```ts
import { runEvalite } from "evalite/runner";

const environments = [
  { name: "staging", url: "https://staging.example.com" },
  { name: "production", url: "https://example.com" },
];

for (const env of environments) {
  console.log(`Running evals for ${env.name}...`);

  process.env.API_URL = env.url;

  await runEvalite({
    mode: "run-once-and-exit",
    scoreThreshold: 80,
    outputPath: `./results-${env.name}.json`,
  });
}
```

### Parallel Eval Execution
```ts
import { runEvalite } from "evalite/runner";

// Run multiple eval sets in parallel
await Promise.all([
  runEvalite({
    mode: "run-once-and-exit",
    path: "chat.eval.ts",
  }),
  runEvalite({
    mode: "run-once-and-exit",
    path: "completion.eval.ts",
  }),
]);
```

## Configuration Priority
Options merge in this order (highest to lowest priority):

1. Function arguments (`opts`)
2. Config file (`evalite.config.ts`)
3. Defaults
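The merge behaves like a shallow spread in which defined values from a higher-priority layer win. The sketch below is a generic illustration of those semantics, not evalite's internal code:

```typescript
// Generic illustration of option precedence: defaults < config file < opts.
// Not evalite's implementation; it only mirrors the documented merge order.
function resolveOptions<T extends object>(
  defaults: T,
  configFile: Partial<T>,
  opts: Partial<T>
): T {
  // Drop undefined entries first so an explicitly-undefined option
  // doesn't clobber a value from a lower-priority layer.
  const defined = (o: Partial<T>): Partial<T> =>
    Object.fromEntries(
      Object.entries(o).filter(([, v]) => v !== undefined)
    ) as Partial<T>;
  return { ...defaults, ...defined(configFile), ...defined(opts) };
}

console.log(resolveOptions(
  { scoreThreshold: 0, hideTable: false }, // built-in defaults
  { scoreThreshold: 70, hideTable: true }, // evalite.config.ts
  { scoreThreshold: 80 }                   // runEvalite() arguments
)); // { scoreThreshold: 80, hideTable: true }
```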
Example:

```ts
// evalite.config.ts
export default defineConfig({
  scoreThreshold: 70,
  hideTable: true,
});
```

```ts
// script.ts
await runEvalite({
  mode: "run-once-and-exit",
  scoreThreshold: 80, // Overrides config (80 used)
  // hideTable not specified, uses config (true)
});
```

## Error Handling
The function throws an error if:
- Evals fail to run
- Score threshold is not met
- Invalid options are provided
```ts
try {
  await runEvalite({
    mode: "run-once-and-exit",
    scoreThreshold: 90,
  });
} catch (error) {
  console.error("Eval run failed:", error);
  // Handle error or exit
  process.exit(1);
}
```

## Return Value
Returns a `Promise<void>`. The function completes when:

- `run-once-and-exit`: all evals finish
- `watch-for-file-changes`: never (runs indefinitely)
- `run-once-and-serve`: all evals finish, but the UI server keeps the process alive
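Because `watch-for-file-changes` never resolves, avoid awaiting it in a script that needs to do anything afterwards. If you ever need to bound a long-running promise, a generic watchdog pattern (not part of evalite's API) looks like this:

```typescript
// Generic timeout wrapper: settle with the wrapped promise, or reject
// once the deadline passes. Not evalite-specific.
function withTimeout<T>(p: Promise<T>, ms: number): Promise<T> {
  return Promise.race([
    p,
    new Promise<T>((_, reject) =>
      setTimeout(() => reject(new Error(`timed out after ${ms}ms`)), ms)
    ),
  ]);
}

// Hypothetical usage:
// await withTimeout(runEvalite({ mode: "run-once-and-exit" }), 60_000);
withTimeout(
  new Promise<string>((resolve) => setTimeout(() => resolve("done"), 10)),
  1_000
).then((v) => console.log(v)); // prints "done"
```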
## See Also
- CLI - Command-line interface
- defineConfig() - Configuration file
- Storage - Custom storage backends
- Run Evals Programmatically Guide - More examples