Open-source AI regression testing

The open-source AI framework for regression testing.

Built on Playwright. Describe tests in plain English. AI runs them once, caches every action, and self-heals when your UI changes. Your regression suite, on autopilot.

Simpler than hand-written Playwright. More reliable than real-time AI.

~/your-project
$ npx passmark init
Created passmark.config.ts
Connected to Redis
Ready to write tests
$ npx passmark test
Running 12 tests across 3 suites...
9 replayed from cache (~4ms each)
3 healed and re-cached (~28s each)
12 passed · 0 failed · 1.4s total

The problem

Your test suite can't keep up with your team.

Your engineers ship 50 PRs a week. AI coding tools made them faster. But testing? Still manual. Still brittle. Still someone else's problem. Most teams live in one of three places:

No tests.

You skipped e2e testing because nobody had time to write and maintain a Playwright suite. Regressions hit production. Users find the bugs.

Broken tests.

You wrote tests six months ago. Half are skipped. The rest flake on every CI run. Your team ignores the failures. The suite is dead weight.

AI tools too slow.

AI testing tools promise to fix this. But they run every action through an LLM, every time. That works for checking a single PR. It doesn't work for 500 regression tests running on every commit. Too slow. Too expensive. Too unpredictable for CI.

Passmark takes a different approach.

How it works

AI once. Playwright speed forever.

01. Describe

Write test steps in plain English. No selectors. No page objects. No boilerplate.

steps: [
  { description: "Add the first product to cart" },
  { description: "Open cart and proceed to checkout" },
  { description: 'Fill shipping address with zip "90210"' },
  { description: "Complete payment with test card" },
]

02. Execute

On first run, AI agents navigate your app using accessibility snapshots and screenshots. Every action is cached to Redis.

First run:  AI agents → ~30s per step

03. Replay

Every run after uses cached Playwright actions. Native speed. No LLM calls. No API costs. When a cached action fails because the UI changed, AI re-engages only for that step and re-caches.

First run:        AI agents → ~30s per step
Every run after:  Cached Playwright → ~ms per step
Auto-heal:        Cache miss → AI re-discovers → re-caches
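
The replay-then-heal flow above can be sketched as a small toy model. This is only an illustration under stated assumptions, not Passmark's implementation: `CachedAction`, `StepRunner`, and `runStep` are hypothetical names, a `Map` stands in for Redis, and the "browser" is a stub that only checks whether a selector still matches.

```typescript
// Toy model of cache-then-replay with per-step healing.
// Every name here is hypothetical; a Map stands in for Redis.
type CachedAction = { selector: string; action: "click" | "fill"; value?: string };

interface StepRunner {
  // Replays a cached action; returns false if it no longer works (UI changed).
  replay(action: CachedAction): boolean;
  // Slow path: asks the AI agent to work out the action for a description.
  discover(description: string): CachedAction;
}

type StepOutcome = "replayed" | "discovered" | "healed";

function runStep(
  description: string,
  cache: Map<string, CachedAction>,
  runner: StepRunner,
): StepOutcome {
  const cached = cache.get(description);
  if (cached !== undefined && runner.replay(cached)) {
    return "replayed"; // fast path: no AI call
  }
  // Cache miss or stale action: involve the AI for this step only, then re-cache.
  const fresh = runner.discover(description);
  cache.set(description, fresh);
  return cached === undefined ? "discovered" : "healed";
}

// Example: a stub runner whose "add to cart" button is later renamed.
const cache = new Map<string, CachedAction>();
let aiCalls = 0;
let liveSelector = "#add-to-cart";
const runner: StepRunner = {
  replay: (a) => a.selector === liveSelector,
  discover: () => {
    aiCalls++;
    return { selector: liveSelector, action: "click" };
  },
};

const step = "Add the first product to cart";
const first = runStep(step, cache, runner);  // first run: AI discovers
const second = runStep(step, cache, runner); // cached replay
liveSelector = "#add-item";                  // simulate a UI change
const third = runStep(step, cache, runner);  // stale cache: heal and re-cache
const fourth = runStep(step, cache, runner); // cached replay again
console.log(first, second, third, fourth, aiCalls);
```

In this model the AI is called twice across four runs: once to discover the step and once to heal it after the selector changes.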
Code

This is a Passmark test.

8 lines of test logic. Plain English. Runs on Playwright. Self-heals when your UI changes. Cached after first run.

Compare that to the 40-line Playwright test with data-testid selectors, waitForSelector calls, and hardcoded timeouts you'd write by hand. Then imagine maintaining 200 of those.

purchase.test.ts
import { test, expect } from "@playwright/test";
import { runSteps } from "@bug0/ai";

test("user can complete purchase", async ({ page }) => {
  await runSteps({
    page,
    expect,
    userFlow: "purchase flow",
    steps: [
      { description: "Navigate to the store" },
      { description: "Add the first available item to cart" },
      { description: "Proceed to checkout and complete purchase" },
    ],
    assertions: [
      { assertion: "Confirmation page shows an order number" },
    ],
  });
});

Comparison

Where Passmark sits.

Hand-written Playwright

  • Plain English test authoring
  • Runs at native Playwright speed
  • Self-heals on UI changes
  • Built for CI (500+ test suites)
  • Cross-test state sharing: Manual wiring
  • Email and OTP testing: External tools
  • Multi-model consensus assertions
  • $0 AI cost on repeat runs
  • Open-source

Passmark

Recommended
  • Plain English test authoring
  • Runs at native Playwright speed: Yes (cached)
  • Self-heals on UI changes
  • Built for CI (500+ test suites)
  • Cross-test state sharing: Built-in (Redis)
  • Email and OTP testing: Built-in
  • Multi-model consensus assertions
  • $0 AI cost on repeat runs
  • Open-source

Real-time AI tools

  • Plain English test authoring
  • Runs at native Playwright speed
  • Self-heals on UI changes: Partial
  • Built for CI (500+ test suites)
  • Cross-test state sharing
  • Email and OTP testing
  • Multi-model consensus assertions
  • $0 AI cost on repeat runs
  • Open-source: Some

Built for regression

Everything you need at test 200 that you don't think about at test 1.

Cross-test state

Sign up in test 1. Log in with the same credentials in test 2. Global placeholders persist across tests via Redis.

// Test 1: Sign up
data: { email: "{{global.dynamicEmail}}" }

// Test 2: Log in (same execution, same email)
data: { email: "{{global.dynamicEmail}}" }

Built-in email testing

Disposable inboxes. OTP extraction. Verification link clicks. No Mailosaur. No third-party infra.

{ description: "Sign up", data: { email: "{{run.dynamicEmail}}" } },
{ description: "Enter OTP", data: { otp: "{{email.otp:Extract the 6-digit code}}" } }
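
To make the OTP step concrete, here is a minimal sketch of pulling a 6-digit code out of an email body. It swaps in a plain regular expression as a stand-in; the source does not say how Passmark performs the extraction, and the natural-language hint in the placeholder suggests it may be model-driven. `extractSixDigitCode` is a hypothetical helper.

```typescript
// Stand-in for OTP extraction using a regex (an assumption, not Passmark's method).
// Returns the first standalone run of exactly six digits, or null if there is none.
function extractSixDigitCode(emailBody: string): string | null {
  const match = emailBody.match(/\b\d{6}\b/);
  return match?.[0] ?? null;
}

const otp = extractSixDigitCode("Your verification code is 482913. It expires in 10 minutes.");
const none = extractSixDigitCode("Order 1234567 has shipped.");
console.log(otp, none);
```

The word boundaries keep longer numbers, such as a 7-digit order ID, from being mistaken for a code.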

Consensus assertions

Every assertion is evaluated by Claude and Gemini independently. A third model breaks ties. Passes only on consensus. Returns a confidence score (0-100).

assertions: [
  { assertion: "Dashboard shows a welcome message for the new user" }
]
// Evaluated by multiple models. Passes only on agreement.
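
The voting rule described above (two independent judges, a third only on disagreement) can be sketched as follows. This is an illustration under assumptions: `Verdict`, `Judge`, and `evaluateWithConsensus` are hypothetical names, the judges are stubs rather than real model calls, and averaging the majority's confidence is a guess at how a single 0-100 score could be produced.

```typescript
// Hypothetical two-judge vote with a tie-breaker; not Passmark's code.
type Verdict = { pass: boolean; confidence: number }; // confidence in 0..100
type Judge = (assertion: string) => Verdict;

function evaluateWithConsensus(
  assertion: string,
  judgeA: Judge,
  judgeB: Judge,
  tieBreaker: Judge,
): Verdict {
  const a = judgeA(assertion);
  const b = judgeB(assertion);
  // The third judge is consulted only when the first two disagree.
  const votes = a.pass === b.pass ? [a, b] : [a, b, tieBreaker(assertion)];
  const passCount = votes.filter((v) => v.pass).length;
  const pass = passCount * 2 > votes.length;
  // Assumption: report the mean confidence of the judges on the winning side.
  const majority = votes.filter((v) => v.pass === pass);
  const total = majority.reduce((sum, v) => sum + v.confidence, 0);
  return { pass, confidence: Math.round(total / majority.length) };
}

// Example with stub judges.
let tieBreakerCalls = 0;
const yes = (confidence: number): Judge => () => ({ pass: true, confidence });
const no = (confidence: number): Judge => () => ({ pass: false, confidence });
const referee: Judge = () => {
  tieBreakerCalls++;
  return { pass: false, confidence: 90 };
};

const text = "Dashboard shows a welcome message for the new user";
const agreed = evaluateWithConsensus(text, yes(90), yes(80), referee);
const split = evaluateWithConsensus(text, yes(70), no(60), referee);
console.log(agreed, split, tieBreakerCalls);
```

When the first two judges agree, the tie-breaker is never called, which keeps the common case to two model calls.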

Dynamic test data

Every run gets unique IDs, emails, names, phone numbers. Parallel tests never collide. Three scopes: {{run.*}} (per test), {{global.*}} (per suite), {{data.*}} (per project).

// Per test run
data: { email: "{{run.dynamicEmail}}" }

// Per suite (shared across tests)
data: { email: "{{global.dynamicEmail}}" }

// Per project
data: { apiKey: "{{data.stagingApiKey}}" }
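
One way to picture the three scopes is a small resolver that substitutes `{{scope.key}}` placeholders from per-scope value maps. This is a sketch under assumptions, not Passmark's implementation: `ScopeValues` and `resolvePlaceholders` are hypothetical names, and the sample values are made up for illustration.

```typescript
// Hypothetical resolver for {{run.*}}, {{global.*}}, and {{data.*}} placeholders.
type Scope = "run" | "global" | "data";
type ScopeValues = Record<Scope, Record<string, string>>;

function resolvePlaceholders(template: string, values: ScopeValues): string {
  return template.replace(
    /\{\{(run|global|data)\.(\w+)\}\}/g,
    (match: string, scope: Scope, key: string) => {
      const value: string | undefined = values[scope][key];
      // Unknown keys are left untouched so the gap stays visible.
      return value === undefined ? match : value;
    },
  );
}

// Example: two tests in one suite share global values but get their own run values.
const sharedGlobal = { dynamicEmail: "suite-user@example.test" };
const projectData = { stagingApiKey: "demo-key" };
const test1: ScopeValues = { run: { dynamicEmail: "t1@example.test" }, global: sharedGlobal, data: projectData };
const test2: ScopeValues = { run: { dynamicEmail: "t2@example.test" }, global: sharedGlobal, data: projectData };

const signUpEmail = resolvePlaceholders("{{global.dynamicEmail}}", test1);
const logInEmail = resolvePlaceholders("{{global.dynamicEmail}}", test2);
const runEmail1 = resolvePlaceholders("{{run.dynamicEmail}}", test1);
const runEmail2 = resolvePlaceholders("{{run.dynamicEmail}}", test2);
const unknown = resolvePlaceholders("{{data.missingKey}}", test1);
console.log(signUpEmail, logInEmail, runEmail1, runEmail2, unknown);
```

The global value is identical in both tests, while the run-scoped values differ, which is what lets parallel tests avoid collisions.
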
CI integration

npx passmark test runs Playwright.

Your existing Playwright config, fixtures, and CI setup work. Passmark doesn't replace Playwright. It sits on top of it.

  • GitHub Actions, GitLab CI, Jenkins, any CI runner
  • Parallel execution with zero shared-state collisions
  • OpenTelemetry tracing for every step, cache hit, and assertion
  • Redis for action caching and cross-test state

test.yml
# .github/workflows/test.yml
- name: Run regression tests
  run: npx passmark test

About

What is Passmark?

Passmark is an open-source AI regression testing framework built on Playwright. Developers write end-to-end tests in plain English. AI agents execute tests on first run and cache every browser action to Redis. Subsequent runs replay cached actions at native Playwright speed with zero LLM calls. When UI changes cause a cached action to fail, the AI automatically re-discovers the correct interaction and updates the cache. Passmark includes built-in email testing, cross-test state management, dynamic test data generation, and multi-model consensus assertions. It requires Node.js 18+, Playwright 1.57+, Redis, and API keys for Anthropic and Google AI.

Team

Built by the team behind Bug0 and Hashnode.

Passmark is the open-source core of Bug0, an AI-native QA platform trusted by 200+ engineering teams. Built by Fazle Rahman and Sandeep Panda, who previously created Hashnode, a developer blogging platform with millions of monthly readers.

Backed by Accel, Guillermo Rauch (Vercel), and Naval Ravikant.

Go managed

Love the engine? Let someone else run it.

Passmark is the open-source core. Bug0 Managed is the service built on top of it. A dedicated QA engineer writes your tests, triages every failure, and gates your releases. Same engine. Zero maintenance on your end.

Open-source

Passmark

  • Test engine: Passmark
  • Who writes tests: You
  • Who triages failures: You
  • Infrastructure: Yours
  • Price: Free

Service

Bug0 Managed

  • Test engine: Passmark
  • Who writes tests: Dedicated QA engineer
  • Who triages failures: Human-verified reports
  • Infrastructure: Bug0 AI agents for testing
  • Price: From $2,500/mo
Learn more at bug0.com