AI/ML, experimental

AI Testing Agent

An AI agent that writes and runs end-to-end tests by reading your app like a user would. It opens pages with Playwright, identifies elements visually and structurally, drafts test cases, and runs them. The honest result: it is great for happy paths and visual regression, mediocre for complex flows, and useless for anything that requires domain knowledge it does not have. Worth it as a first sweep, not as your test suite.
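
The first step described above, finding the pages to test, can be sketched as a bounded breadth-first crawl. This is a minimal illustration, not the lab's actual crawler: `discoverRoutes`, `LinkReader`, and `maxRoutes` are hypothetical names, and the link reader is injected so the logic stays independent of the browser. In a real run it could wrap Playwright (navigate with `page.goto`, then read the `href` of each anchor) and is assumed to return absolute same-origin paths.

```typescript
// Hypothetical helper type: returns the same-origin paths linked from a route.
// Assumption: in practice this wraps a Playwright page; here it is injected.
type LinkReader = (path: string) => Promise<string[]>;

// Breadth-first discovery of routes, capped so the crawl always terminates.
async function discoverRoutes(
  start: string,
  readLinks: LinkReader,
  maxRoutes = 50,
): Promise<string[]> {
  const seen = new Set<string>([start]);
  const queue: string[] = [start];

  while (queue.length > 0 && seen.size < maxRoutes) {
    const current = queue.shift()!;
    const links = await readLinks(current);
    for (let i = 0; i < links.length; i++) {
      // Strip query strings and fragments so /a?x=1 and /a#top count once.
      const path = links[i].split(/[?#]/)[0];
      if (!path) continue; // skip pure "#fragment" or empty links
      if (!seen.has(path) && seen.size < maxRoutes) {
        seen.add(path);
        queue.push(path);
      }
    }
  }
  // Routes come back in discovery order, starting with the entry point.
  return Array.from(seen);
}
```

The cap matters more than it looks: apps with paginated or parameterized URLs can otherwise produce an effectively unbounded route list.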

OpenAI, Playwright, TypeScript, Computer Vision

What this is

A lab, not a product.

5 features, 4 learnings, 4 technologies

Capabilities

What it does

The features that actually got built and run in this prototype.

feature_01.ts
Automatic test generation by crawling routes and inferring user flows
feature_02.ts
Visual regression detection that compares screenshots across runs, with a learned tolerance
feature_03.ts
Self-healing selectors that fall back to vision when DOM selectors break
feature_04.ts
Natural language test descriptions, generated as readable prose alongside the code
feature_05.ts
CI/CD integration via the CI/CD playbook
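
The visual regression feature in the list above can be illustrated with a small pixel comparison. This is a sketch under stated assumptions, not the lab's implementation: the function names are hypothetical, the inputs are assumed to be already-decoded RGBA buffers (Playwright's `page.screenshot()` returns an encoded image, so decoding is a separate step not shown here), and "learned tolerance" is interpreted as the mean plus k standard deviations of the diff ratios seen between known-good runs. The source does not say how its tolerance is actually learned.

```typescript
// Fraction of pixels that differ between two equally sized RGBA buffers.
function diffRatio(a: Uint8Array, b: Uint8Array, channelThreshold = 16): number {
  if (a.length !== b.length || a.length % 4 !== 0) {
    throw new Error("Screenshots must be RGBA buffers of identical size");
  }
  const pixels = a.length / 4;
  if (pixels === 0) return 0;

  let changed = 0;
  for (let i = 0; i < a.length; i += 4) {
    // A pixel counts as changed if R, G, or B moves past the threshold.
    // Alpha is ignored in this sketch.
    const delta = Math.max(
      Math.abs(a[i] - b[i]),
      Math.abs(a[i + 1] - b[i + 1]),
      Math.abs(a[i + 2] - b[i + 2]),
    );
    if (delta > channelThreshold) changed++;
  }
  return changed / pixels;
}

// Assumption: tolerance = mean + k * stddev of ratios from known-good runs.
function learnTolerance(knownGoodRatios: number[], k = 3): number {
  const n = knownGoodRatios.length;
  if (n === 0) return 0; // no history yet: any difference is flagged
  const mean = knownGoodRatios.reduce((sum, r) => sum + r, 0) / n;
  const variance =
    knownGoodRatios.reduce((sum, r) => sum + (r - mean) * (r - mean), 0) / n;
  return mean + k * Math.sqrt(variance);
}

// A run is flagged only when it differs more than normal noise would explain.
function isRegression(ratio: number, tolerance: number): boolean {
  return ratio > tolerance;
}
```

For a fixed rather than learned threshold, Playwright's own `expect(page).toHaveScreenshot()` assertion with its `maxDiffPixelRatio` option covers similar ground without custom code.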

The stack

What it is built with

The libraries and runtimes I picked for this lab and why they earned their place.

OpenAI
Playwright
TypeScript
Computer Vision

What I learned

Learnings, in order of how much they surprised me

The things I would tell another engineer before they tried the same experiment.

01
Vision models help the agent understand page structure for test generation, especially on design-heavy pages
02
Self-healing selectors reduce test maintenance dramatically but introduce non-determinism. Pick your poison
03
Generated tests work best for happy paths. Edge cases still need human input and domain knowledge
04
See the AI agent orchestration blueprint for how to wire this into a wider agent setup
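
The trade-off in learning 02 is easier to live with when every healed lookup is recorded. The sketch below shows one way to do that, assuming a priority-ordered list of lookup strategies. `LocateStrategy`, `Resolution`, and `resolveElement` are hypothetical names, not the lab's code. In a real setup the DOM strategies might wrap Playwright locators and the last strategy might call the vision model.

```typescript
// Hypothetical shape: each strategy tries to find an element and returns
// an opaque handle, or null when it finds nothing.
interface LocateStrategy<T> {
  name: string; // e.g. "data-testid", "css", "vision"
  locate: () => Promise<T | null>;
}

interface Resolution<T> {
  element: T;
  strategy: string; // which strategy actually matched
  healed: boolean; // true when anything other than the first strategy matched
}

// Try strategies in priority order and report which one matched, so a
// healed run shows up in the report instead of passing silently.
async function resolveElement<T>(
  strategies: LocateStrategy<T>[],
): Promise<Resolution<T>> {
  const tried: string[] = [];
  for (let i = 0; i < strategies.length; i++) {
    const current = strategies[i];
    tried.push(current.name);
    const element = await current.locate();
    if (element !== null) {
      return { element, strategy: current.name, healed: i > 0 };
    }
  }
  throw new Error("No strategy matched (tried: " + tried.join(", ") + ")");
}
```

Treating `healed: true` as a warning in CI keeps the maintenance benefit while making the non-determinism visible: the test still passes, but someone is told the primary selector has drifted.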

Note: This is an experimental project. It is a learning exercise and technical exploration rather than a production-ready solution. Patterns and code may change.

Want me to build something like this for you?

If this kind of work fits your roadmap, I take on a small number of paid projects each quarter.