How I Build Cypress Automation & AI Agent QA

I build two layers of protection into every client project. Cypress E2E Automation runs automated browser tests that click, type, and verify your app like a real user. AI Agent QA adds an intelligent agent that reviews code, spots security gaps, and writes tests using context about your product. The two work together to catch bugs before your users do.

The implementation

One spec file per page or feature

Instead of dumping all tests in one giant file, I follow the rule: one file per menu or page.

    inventory/
        inventory-items.cy.js      ← all tests for the Inventory page
        inventory-reports.cy.js    ← all tests for Inventory Reports
    trading/
        trading-journal.cy.js
        portfolio.cy.js

Stable selectors with unique IDs

A common mistake in Cypress is targeting elements by class names or structural CSS selectors, which change when a designer updates the UI. Instead, I attach a unique id attribute to every interactive element:

    <button id="saveItem_inventoryPage">Save</button>
    <input id="itemName_inventoryPage" />

Then in tests:

    cy.get('#saveItem_inventoryPage').click()
    cy.get('#itemName_inventoryPage').type('Laptop Stand')

When the styling changes, the test doesn't break. The ID stays the same.
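The two IDs above appear to follow an `<element>_<page>Page` pattern. Assuming that convention (it is inferred from the examples, not stated as a rule), a small check like the one below could flag IDs that drift from it or collide. The names `ID_PATTERN` and `auditElementIds` are hypothetical.

```javascript
// Assumed convention, inferred from the examples: camelCase element name,
// an underscore, then a camelCase page name ending in "Page".
const ID_PATTERN = /^[a-z][A-Za-z0-9]*_[a-z][A-Za-z0-9]*Page$/;

// Return a list of human-readable problems found in a set of element IDs.
function auditElementIds(ids) {
  const problems = [];
  const seen = new Set();
  for (const id of ids) {
    if (!ID_PATTERN.test(id)) {
      problems.push(`"${id}" does not match <element>_<page>Page`);
    }
    if (seen.has(id)) {
      problems.push(`"${id}" is used more than once`);
    }
    seen.add(id);
  }
  return problems;
}

// "save-btn" is a made-up ID that breaks the assumed convention.
console.log(
  auditElementIds(["saveItem_inventoryPage", "itemName_inventoryPage", "save-btn"])
);
```

A check like this could run before the test suite, so a renamed or duplicated ID is reported by name rather than surfacing as a failed element lookup.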

A shared constants file

Every test reads from a single source of truth — app-constants.json — which stores all element IDs, API endpoints, and test values:

    {
      "selectors": {
        "saveItem": "saveItem_inventoryPage",
        "itemName": "itemName_inventoryPage"
      },
      "endpoints": {
        "items": "/api/items"
      }
    }

If an ID changes, you update it in one place and all tests stay in sync.
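As a minimal sketch of how a spec might consume this file, the helper below turns a key from the selectors map into an ID selector and fails loudly on a typo. The helper name `sel` and the inline `constants` object are assumptions for illustration; a real spec would more likely load the JSON file itself (for example through `cy.fixture`).

```javascript
// Inline copy of the constants shown above; a real spec would load the JSON file instead.
const constants = {
  selectors: {
    saveItem: "saveItem_inventoryPage",
    itemName: "itemName_inventoryPage",
  },
  endpoints: {
    items: "/api/items",
  },
};

// Hypothetical helper: look up an element ID by key and return a CSS ID selector.
function sel(key) {
  if (!Object.prototype.hasOwnProperty.call(constants.selectors, key)) {
    // Failing here names the bad key instead of a vague "element not found" later.
    throw new Error(`Unknown selector key: ${key}`);
  }
  return `#${constants.selectors[key]}`;
}

console.log(sel("saveItem")); // prints "#saveItem_inventoryPage"
```

Inside a test this would read `cy.get(sel("saveItem")).click()`, so the spec never hard-codes an ID string.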

How the Tester Agent signals other agents

One of the more interesting behaviors is how the Tester Agent communicates back to other agents when it finds problems.

All inter-agent communication goes through a single file: .claude/agents/signals/pending-signals.md. It's a structured inbox — every agent checks it at kickoff and resolves pending signals before starting new work.

When the Tester Agent audits a component and finds a missing data-testid:

    🔖 data-testid Request — Tester → Frontend
    Component: ProductsTable (app/main/inventory/product-list/list/ProductsTable.jsx)
    Missing IDs:
    - product-table-row → the <tr> element for each product row
    - product-stock-badge → the stock level indicator badge
    Needed for: cypress/e2e/inventory_management/product/product-list-ui.cy.js
    Action: Add data-testid attributes + register in cypress/fixtures/app-constants.yaml

When the Tester Agent finds an endpoint missing an edge case:

    ⚠️ Endpoint Gap — Tester → Backend
    Endpoint: POST /api/inventory/v1/product/create
    Missing edge case:
    - Returns 200 instead of 400 when usage_quantity is a negative number
    Action: Add validation so tests can assert the correct behavior

The signal sits in pending-signals.md as [PENDING]. The Frontend or Backend Agent picks it up at their next kickoff, makes the fix, and marks it [RESOLVED: YYYY-MM-DD].

This creates a tight feedback loop that would normally require a Jira ticket, a Slack message, and two meetings.
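To make the kickoff check concrete, here is a rough sketch of how an agent could list the signals that are still open. The exact layout of pending-signals.md is not shown here, so the sketch assumes each signal starts with a heading line that carries its status; the regular expression, the function name `pendingSignals`, and the sample entries (including the date) are all made up for illustration.

```javascript
// Assumed layout: every signal starts with a heading line that carries its status,
// e.g. "## [PENDING] Endpoint Gap: Tester to Backend". The real file may differ.
const STATUS_LINE = /^##\s+\[(PENDING|RESOLVED: \d{4}-\d{2}-\d{2})\]\s+(.+)$/;

// Return the titles of all signals that are still pending.
function pendingSignals(markdown) {
  const titles = [];
  for (const line of markdown.split("\n")) {
    const match = STATUS_LINE.exec(line.trim());
    if (match && match[1] === "PENDING") {
      titles.push(match[2]);
    }
  }
  return titles;
}

// Sample inbox with one resolved and one pending signal (illustrative data only).
const sample = [
  "## [RESOLVED: 2025-01-15] data-testid Request: Tester to Frontend",
  "Component: ProductsTable",
  "",
  "## [PENDING] Endpoint Gap: Tester to Backend",
  "Endpoint: POST /api/inventory/v1/product/create",
].join("\n");

console.log(pendingSignals(sample)); // [ 'Endpoint Gap: Tester to Backend' ]
```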

Running tests headed (visible browser)

I always run tests in headed mode — meaning you can watch the browser do exactly what the test does:

npx cypress run --headed --browser chrome --spec "cypress/e2e/inventory/**"

No black-box magic. You see every click, every form fill, every assertion. This makes debugging fast and gives clients confidence that the tests are real.

What Cypress catches

Scenario	Does Cypress catch it?
A button that does nothing when clicked	Yes
A form that submits without validation	Yes
An API that returns the wrong data	Yes
A page that looks broken after a CSS change	No (that needs visual regression testing)
A feature that works in Chrome but not Firefox	Yes, when the suite also runs in Firefox
A regression introduced by a new feature	Yes
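Catching a Chrome-versus-Firefox difference presumes the same specs run in both browsers. A small sketch of that idea: generate one `cypress run` command per browser from the command shown earlier. The function name `buildRunCommands` is hypothetical, and each browser in the list has to be installed on the machine that runs the tests.

```javascript
// Build one `cypress run` command per browser for the same spec pattern.
function buildRunCommands(specPattern, browsers) {
  return browsers.map(
    (browser) => `npx cypress run --headed --browser ${browser} --spec "${specPattern}"`
  );
}

const commands = buildRunCommands("cypress/e2e/inventory/**", ["chrome", "firefox"]);
commands.forEach((command) => console.log(command));
```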

What is an AI Agent for QA?

A regular QA engineer reads code, thinks about what could go wrong, and writes tests. An AI Agent does the same thing — but it has read every file in your codebase, understands your product requirements, and never gets tired.

In my workflow, I use a Tester Agent built on top of Claude (Anthropic's AI model). It's not just a chatbot — it's a specialized agent with a defined role, specific tools, and strict rules about what it can and cannot do.

How the AI Tester Agent works

Step 1: It reads everything first

Before the agent writes a single line of test code, it reads:

The PRD (Product Requirements Document) — what the feature is supposed to do
The GitHub issue — the exact acceptance criteria
The backend API routes — what endpoints exist and what they return
The frontend components — what UI elements are on the page
The existing Cypress fixtures — what IDs and endpoints are already registered

This is what separates an AI agent from a simple autocomplete tool. It has context.

Step 2: It writes a task plan before touching any file

The agent is required to write its test plan in the GitHub issue first — before creating any files:

Tester

Test item creation — happy path
Test item creation — missing required fields
Test item creation — duplicate item name
Test item list loads correctly
Test delete item — confirm dialog appears
Test delete item — item disappears from list after confirm

This plan is visible to the developer and the client. It's a contract — the agent commits to exactly what it will test.
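One way such a plan could be posted is as a GitHub task list, so each item can be ticked off in the issue as it is completed. The sketch below is an assumption about formatting rather than the agent's actual output: `renderTestPlan` is a hypothetical helper, and the heading level is a guess.

```javascript
// Render a test plan as a GitHub task list ("- [ ] item") under a role heading.
function renderTestPlan(role, items) {
  const lines = [`### ${role}`, ""];
  for (const item of items) {
    lines.push(`- [ ] ${item}`);
  }
  return lines.join("\n");
}

const plan = renderTestPlan("Tester", [
  "Test item creation: happy path",
  "Test item creation: missing required fields",
  "Test item list loads correctly",
]);
console.log(plan);
```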

Step 3: It writes the Cypress tests

Once the plan is approved, the agent generates the spec file, following the same structure, naming conventions, and selector patterns described above.

It knows the rules:

Use id selectors, not class names
One spec file per page
Happy path + error handling + edge cases
Register new IDs in app-constants.json

Step 4: It runs the tests until 100% pass

The agent runs the tests itself using the Bash tool:

npx cypress run --headed --browser chrome --spec "cypress/e2e/inventory/inventory-items.cy.js"

If a test fails, it reads the error, diagnoses whether it's a test bug or an app bug, fixes it, and reruns. It loops until everything is green.
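The run, diagnose, fix, rerun cycle can be sketched as a bounded loop. This is an illustration under assumptions, not the agent's actual implementation: `runUntilGreen`, the injected `runTests` and `applyFix` callbacks, and the attempt cap are all hypothetical. The cap is there so that a genuinely broken app ends the loop instead of spinning forever.

```javascript
// Re-run the tests until they pass or the attempt budget is spent.
// `runTests` and `applyFix` are injected so the loop can be exercised without Cypress.
function runUntilGreen(runTests, applyFix, maxAttempts = 5) {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const result = runTests();
    if (result.failures.length === 0) {
      return { passed: true, attempts: attempt };
    }
    // Only attempt a fix when another run will follow to verify it.
    if (attempt < maxAttempts) {
      applyFix(result.failures);
    }
  }
  return { passed: false, attempts: maxAttempts };
}

// Stand-in runner for the demo: fails twice, then passes.
let calls = 0;
const fakeRun = () => ({ failures: ++calls < 3 ? ["saves a new item"] : [] });
const fixed = [];
const outcome = runUntilGreen(fakeRun, (failures) => fixed.push(...failures));
console.log(outcome); // { passed: true, attempts: 3 }
```

In a real setup, `runTests` would wrap the `npx cypress run` command above and collect the failing test names from its output.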

Step 5: It updates the QA report

After all tests pass, the agent updates three QA tracking files:

Feature coverage report — which features have tests
Pass/fail history — when tests last ran and their results
Spec file registry — which file covers which feature
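The format of these three files is not shown here, so the following is only a sketch of the bookkeeping under an assumed in-memory shape. `recordRun`, the field names, and the sample timestamp are hypothetical.

```javascript
// Fold one test run into an assumed report shape and return the updated copy.
function recordRun(report, run) {
  return {
    // Feature coverage: which features have at least one spec.
    coverage: { ...report.coverage, [run.feature]: true },
    // Pass/fail history: append-only log of runs.
    history: [...report.history, { spec: run.spec, ranAt: run.ranAt, passed: run.passed }],
    // Spec registry: which file covers which feature.
    registry: { ...report.registry, [run.spec]: run.feature },
  };
}

const emptyReport = { coverage: {}, history: [], registry: {} };
const updated = recordRun(emptyReport, {
  feature: "Inventory items",
  spec: "cypress/e2e/inventory/inventory-items.cy.js",
  ranAt: "2025-01-15T10:00:00Z", // placeholder timestamp
  passed: true,
});
console.log(JSON.stringify(updated, null, 2));
```

Returning a new object instead of mutating the old one keeps earlier snapshots intact, which is convenient when the history is meant to be an audit trail.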

How the AI Code Reviewer works

Alongside the Tester, I use a Code Reviewer Agent and a Security Reviewer Agent — both read-only agents that inspect the code before tests are written.

Code Reviewer looks for:

Logic bugs (wrong condition, off-by-one errors)
Missing validation (what if the input is empty? what if the API call fails?)
Code that's harder to maintain than it needs to be

Security Reviewer looks for:

Unauthorized access (can User A see User B's data?)
Exposed secrets (API keys in client-side code)
SQL injection, XSS, and other OWASP vulnerabilities

Both agents produce a structured report with severity levels:

CRITICAL — must fix before merge
WARNING — should fix, has risk
SUGGESTION — optional improvement

Only after all CRITICALs are resolved does the Tester Agent begin.
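The gate between review and testing can be expressed in a few lines. This is a sketch under an assumed finding shape (`severity`, `title`, `resolved`); the function names and the sample findings are hypothetical.

```javascript
// Sample findings in an assumed shape: { severity, title, resolved }.
const findings = [
  { severity: "CRITICAL", title: "User A can read User B's items", resolved: true },
  { severity: "WARNING", title: "No handling for a failed API call", resolved: false },
  { severity: "SUGGESTION", title: "Extract duplicated validation", resolved: false },
];

// The Tester Agent may start only when no CRITICAL finding is still open.
function canStartTesting(allFindings) {
  return !allFindings.some((f) => f.severity === "CRITICAL" && !f.resolved);
}

// Count open findings per severity for a short summary line.
function openCounts(allFindings) {
  const counts = { CRITICAL: 0, WARNING: 0, SUGGESTION: 0 };
  for (const f of allFindings) {
    if (!f.resolved) counts[f.severity] += 1;
  }
  return counts;
}

console.log(canStartTesting(findings)); // true
console.log(openCounts(findings)); // { CRITICAL: 0, WARNING: 1, SUGGESTION: 1 }
```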

The Full QA Pipeline

Here's how all the pieces connect in a real feature delivery:

PM writes requirements → GitHub issue created

Developer writes the code (Backend + Frontend)

Code Reviewer reviews the PR → flags issues

Security Reviewer checks for vulnerabilities

Developer fixes all CRITICAL issues

Tester Agent reads the issue + code

Tester Agent writes plan → writes tests → runs until 100% pass

Feature ships to production

Every step is tracked in the same GitHub issue. Every agent leaves a trace — what it planned, what it found, what it fixed.
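As a rough model of the flow above, the sketch below runs stages in order, stops at the first one that blocks, and keeps a trace entry for every stage that ran. The stage objects, `runPipeline`, and the notes are hypothetical stand-ins; in practice the trace lives in the GitHub issue.

```javascript
// Run stages in order, stop at the first one that fails, and keep a trace of each step.
function runPipeline(stages) {
  const trace = [];
  for (const stage of stages) {
    const result = stage.run();
    trace.push({ stage: stage.name, ok: result.ok, note: result.note });
    if (!result.ok) break;
  }
  return trace;
}

// Demo: the security review blocks, so the Tester Agent stage never runs.
const trace = runPipeline([
  { name: "Code review", run: () => ({ ok: true, note: "0 open CRITICAL" }) },
  { name: "Security review", run: () => ({ ok: false, note: "1 open CRITICAL" }) },
  { name: "Tester Agent", run: () => ({ ok: true, note: "all specs green" }) },
]);
console.log(trace.map((t) => `${t.stage}: ${t.ok ? "ok" : "blocked"} (${t.note})`).join("\n"));
```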

Why this approach works

Traditional QA	AI Agent QA
Manual testing — slow, inconsistent	Automated — runs in minutes, same every time
QA engineer has to learn your codebase	Agent reads the codebase before it starts
Tests written after the bug is found	Tests cover acceptance criteria from day one
Hard to scale across multiple features	One agent, unlimited features
Expensive for small teams	Efficient — agent works in parallel with developers

Modern software teams can't afford to ship blind. By combining Cypress E2E Automation with an AI Agent QA pipeline, every feature goes through structured browser testing, intelligent code review, and security checks — all tracked in a single GitHub issue from start to finish. The result is software you can confidently change tomorrow, because if something breaks, you'll know before your users do.
