Building an E2E Testing Architecture for a Complex Micro-Frontend
Before I built an E2E architecture for a complex project, I did not realize how involved the work would become.
At the beginning, I thought it would mostly be about writing browser automation: open a page, click a button, fill in a form, and verify the result. But after working on it for a while, I found that E2E testing is much closer to building a small testing platform.
There are page objects to design, test data to prepare, roles and permissions to consider, browser behavior to handle, files to upload and download, and many flaky situations to avoid.
After finishing one stage of the work, I want to write down what happened during this busy period. Hopefully, when I or someone else looks back at this later, it can make the next round of E2E work a little easier.
Background
The project I am working on is a large micro-frontend inside an existing enterprise application.
It is not a small standalone page. The tested features are loaded inside a host system, and the user flows depend on real routing, login state, permissions, feature flags, backend data, and environment configuration.
This makes E2E testing more valuable, but also more complicated.
The test is not only checking whether a button can be clicked. It is checking whether the micro-frontend works correctly inside the real product context.
Some flows are simple, such as opening a page or clicking a menu item. Some flows are much longer, involving creation, editing, search, upload, download, preview, permission control, and cleanup.
Because of this, I cannot treat the E2E suite as a group of random scripts. It needs a structure.
Goals
The first goal is broad coverage.
I want the E2E suite to cover all major modules, not just a few happy paths. Each important feature should have at least one user-facing flow that proves it still works.
The second goal is to focus on front-end behavior.
The E2E tests do not need to verify every backend detail. If a feature can be validated through its result in the UI, the test should use the simplest visible signal.
The third goal is to avoid unnecessary hard waits.
Using browser.pause is easy, but it makes tests slow and unstable. A better test waits for a real condition: an element appears, a button becomes clickable, a file exists, a loading state disappears, or navigation has finished.
The fourth goal is to support different runtime conditions.
The product has old and new service paths, different roles, different feature flags, and different UI states. A useful E2E structure should make these differences manageable instead of duplicating the same test logic everywhere.
Overall Structure
The E2E architecture is organized in layers:
E2E Test Infrastructure
-> Base Element
-> Base Module
-> Base Page Object
-> Case
-> Case Group
-> Spec
The idea is simple: lower layers handle browser and DOM details, while upper layers describe business behavior.
In practice, the directory structure looks like this:
e2e/
  specs/
  cases/
  pageObject/
  core/
  files/
  constants.ts
  fileUtils.js
  envUtils.js
Each folder has a clear responsibility.
specs are the entry points of test suites. They decide which user to use, which page to open, which cases to run, and how to clean up data.
cases contain reusable business steps and assertions.
pageObject contains page-level and module-level operations.
core contains shared logic such as login, logout, and common hooks.
files stores test files used for upload scenarios.
fileUtils handles download directory setup, file existence checks, and temporary file cleanup.
This structure takes time to build, but it prevents the test suite from becoming a pile of selectors and duplicated browser commands.
Base Element
Base Element is the lowest abstraction above WebDriver.
It wraps basic DOM operations such as:
click
hover
getText
getValue
setValue
waitForDisplayed
waitForClickable
scrollIntoView
dragAndDrop
The main reason to create this layer is consistency.
Without this layer, each test may handle waiting differently. One test may click directly. Another may use a pause. Another may wait for display but not for clickable state.
That inconsistency is a common source of flaky tests.
By putting common behavior into Base Element, each interaction can follow the same pattern:
await element.waitForClickable()
await element.click()
This also makes future improvement easier. If we find that a certain type of click is unstable, we can improve it once in the base layer instead of changing dozens of test files.
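As a minimal sketch of this idea, the wrapper below routes every click through the same wait-then-click path. The `RawElement` interface declares only the methods used here (WebdriverIO elements expose methods with these names); the `BaseElement` class shape, the `resolve` callback, and the default timeout are assumptions for illustration, not the actual implementation.

```typescript
// Minimal shape of the underlying element. Only the methods used below
// are declared; a WebdriverIO element has methods with these names.
interface RawElement {
  waitForClickable(options?: { timeout?: number }): Promise<unknown>;
  click(): Promise<void>;
  getText(): Promise<string>;
}

// Hypothetical wrapper: all interactions share one waiting policy.
class BaseElement {
  private readonly resolve: () => Promise<RawElement>;
  private readonly timeout: number;

  constructor(resolve: () => Promise<RawElement>, timeout = 10000) {
    this.resolve = resolve; // looks the element up lazily, at interaction time
    this.timeout = timeout; // assumed default, in milliseconds
  }

  // Wait until the element is clickable, then click it.
  async click(): Promise<void> {
    const el = await this.resolve();
    await el.waitForClickable({ timeout: this.timeout });
    await el.click();
  }

  // Read the visible text with surrounding whitespace removed.
  async getText(): Promise<string> {
    const el = await this.resolve();
    return (await el.getText()).trim();
  }
}
```

Because the element is resolved lazily, a stale reference from an earlier render is less likely to leak into a later step. If clicks on some component turn out to be unstable, a retry or a scroll step can be added inside `click()` without touching any spec.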
Base Module
A page in this project is often too large to model as a single object.
Many pages contain reusable sections: filters, tables, dialogs, forms, search panels, upload panels, and setting blocks.
So I use Base Module to describe a reusable part of a page.
A module usually contains related elements and actions. For example:
Search module
Table module
Upload module
Dialog module
Form section module
This makes the Page Object smaller and easier to understand.
Instead of putting every selector into one huge page file, a page can be composed from smaller modules.
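A rough sketch of that composition, assuming each module owns a root selector and scopes its child selectors under it. The class names, the `within` helper, and all `data-testid` values are hypothetical.

```typescript
// Hypothetical module base: every selector is scoped under the module's root.
class BaseModule {
  readonly root: string;

  constructor(root: string) {
    this.root = root;
  }

  // Build a descendant selector that only matches inside this module.
  within(child: string): string {
    return `${this.root} ${child}`;
  }
}

class SearchModule extends BaseModule {
  get input(): string {
    return this.within('input[type="search"]');
  }
  get submit(): string {
    return this.within('button[type="submit"]');
  }
}

class TableModule extends BaseModule {
  // Zero-based row index mapped to a one-based :nth-child selector.
  row(index: number): string {
    return this.within(`tbody tr:nth-child(${index + 1})`);
  }
}

// The page is composed from modules instead of owning every selector itself.
class ItemListPage {
  readonly search = new SearchModule('[data-testid="item-search"]');
  readonly table = new TableModule('[data-testid="item-table"]');
}
```

Scoping selectors under a module root also helps when two sections of a page contain similar controls, since `page.search.submit` and a dialog's submit button can no longer be confused.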
Base Page Object
Base Page Object represents a real page or a major product area.
It usually contains:
launchUrl
credibleElement
visit()
hasLaunched()
hasLeft()
waitUntilPageIsLoaded()
waitForLoading()
waitForAjaxLoading()
The most important concept here is credibleElement.
When a test opens a page, it should not only check the URL. Some pages may share similar routes, or the page may load slowly after navigation.
A credible element is a stable element that proves the expected page is really loaded.
For example, after visiting a page, the test should wait until a unique title, table, button, or root container appears.
This makes the test more reliable than only checking navigation.
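The sketch below shows one way `credibleElement` could drive `visit()` and `hasLaunched()`. The member names come from the list above; the `BrowserLike` interface (in particular its `find` method, a stand-in for an element lookup), the timeout, and the example page are assumptions made to keep the example self-contained.

```typescript
// Minimal surfaces used by the page object. `find` is a hypothetical
// stand-in for the driver's element lookup.
interface Displayable {
  waitForDisplayed(options?: { timeout?: number }): Promise<unknown>;
}
interface BrowserLike {
  url(path: string): Promise<unknown>;
  find(selector: string): Promise<Displayable>;
}

abstract class BasePageObject {
  abstract readonly launchUrl: string;
  // Selector of a stable element that only exists on this page.
  abstract readonly credibleElement: string;
  protected readonly browser: BrowserLike;

  constructor(browser: BrowserLike) {
    this.browser = browser;
  }

  // Navigate, then wait for proof that the expected page rendered.
  async visit(): Promise<void> {
    await this.browser.url(this.launchUrl);
    await this.waitUntilPageIsLoaded();
  }

  async waitUntilPageIsLoaded(timeout = 15000): Promise<void> {
    const el = await this.browser.find(this.credibleElement);
    await el.waitForDisplayed({ timeout });
  }

  // True only if the credible element shows up within the timeout.
  async hasLaunched(timeout = 15000): Promise<boolean> {
    try {
      await this.waitUntilPageIsLoaded(timeout);
      return true;
    } catch {
      return false;
    }
  }
}

// Hypothetical concrete page.
class ItemListPage extends BasePageObject {
  readonly launchUrl = "/items";
  readonly credibleElement = '[data-testid="item-list-root"]';
}
```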
Case
A Case is a reusable business action or assertion.
For example:
Open page
Click add button
Fill basic information
Save form
Verify table result
Upload file
Download file
Delete created data
Each case follows a common lifecycle:
before()
expect()
after()
test()
This structure is useful because many business flows share the same steps.
Instead of writing the same logic in multiple specs, I can create a case once and reuse it.
A case is smaller than a full spec, but more meaningful than a raw browser action. It usually represents one step that a real user would understand.
This layer also makes the test suite easier to expand. When a new flow needs the same behavior, I can compose existing cases instead of starting from zero.
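One possible reading of the lifecycle, shown as a sketch: `test()` is the entry point and runs `before()`, `expect()`, and `after()` in that order, with `after()` running even when the step fails. That wiring is an assumption; the four method names are from the list above, while `VerifyTableResult` and its constructor arguments are hypothetical.

```typescript
abstract class BaseCase {
  readonly name: string;

  constructor(name: string) {
    this.name = name;
  }

  // Optional preparation; no-op by default.
  async before(): Promise<void> {}

  // The user-facing step and its assertion.
  abstract expect(): Promise<void>;

  // Optional teardown; no-op by default.
  async after(): Promise<void> {}

  // Assumed entry point: runs the hooks in order and always runs after().
  async test(): Promise<void> {
    await this.before();
    try {
      await this.expect();
    } finally {
      await this.after();
    }
  }
}

// Hypothetical case: passes when the expected row is present in the table.
class VerifyTableResult extends BaseCase {
  private readonly readRows: () => Promise<string[]>;
  private readonly expectedName: string;

  constructor(readRows: () => Promise<string[]>, expectedName: string) {
    super("Verify table result");
    this.readRows = readRows;
    this.expectedName = expectedName;
  }

  async expect(): Promise<void> {
    const rows = await this.readRows();
    if (!rows.includes(this.expectedName)) {
      throw new Error(`Row "${this.expectedName}" not found in table`);
    }
  }
}
```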
Case Group
A Case Group combines multiple cases into a complete flow.
For example:
new BaseCaseGroup(userId, [
  LaunchPage,
  CreateItem,
  UpdateItem,
  SearchItem,
  DeleteItem,
])
This gives the test suite a pipeline-like structure.
The spec does not need to know every low-level detail. It can focus on the order of business steps.
This is especially useful for long flows. A full E2E flow may include many actions, but each action can still stay small and reusable.
The tradeoff is that case order becomes important.
Some cases depend on data created by previous cases. For this reason, the spec still needs to manage shared state carefully, especially created data names and cleanup logic.
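To make the ordering dependency concrete, here is a minimal sketch of a group runner that passes one shared context to every case. Only the constructor shape `new BaseCaseGroup(userId, [...])` comes from the example above; the `CaseContext` type, the `run()` method, and the bodies of the two cases are assumptions.

```typescript
// Hypothetical context handed to every case in the group.
interface CaseContext {
  userId: string;
  shared: Map<string, string>; // data created by earlier cases
}

interface RunnableCase {
  test(): Promise<void>;
}

type CaseClass = new (ctx: CaseContext) => RunnableCase;

class BaseCaseGroup {
  private readonly ctx: CaseContext;
  private readonly cases: CaseClass[];

  constructor(userId: string, cases: CaseClass[]) {
    this.ctx = { userId, shared: new Map<string, string>() };
    this.cases = cases;
  }

  // Run cases strictly in order; the first failure stops the flow.
  async run(): Promise<string[]> {
    const finished: string[] = [];
    for (const Case of this.cases) {
      await new Case(this.ctx).test();
      finished.push(Case.name);
    }
    return finished;
  }
}

class CreateItem implements RunnableCase {
  private readonly ctx: CaseContext;
  constructor(ctx: CaseContext) {
    this.ctx = ctx;
  }
  async test(): Promise<void> {
    // Record the created name so later cases and cleanup can find it.
    this.ctx.shared.set("itemName", `auto_${this.ctx.userId}_item`);
  }
}

class SearchItem implements RunnableCase {
  private readonly ctx: CaseContext;
  constructor(ctx: CaseContext) {
    this.ctx = ctx;
  }
  async test(): Promise<void> {
    // Fail loudly instead of searching for an undefined name.
    if (!this.ctx.shared.has("itemName")) {
      throw new Error("SearchItem requires CreateItem to run first");
    }
  }
}
```

Making the dependency an explicit check turns a confusing downstream failure into an error message that names the missing prerequisite.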
Spec
Spec is the orchestration layer.
A spec usually does the following things:
Choose a test user
Log in
Open target page
Create page objects or case groups
Run cases
Record created data
Clean up
Log out
This layer should describe the business journey clearly.
A good spec should be readable without knowing every selector. It should tell the reader what the test is trying to verify.
For example, instead of writing all interactions directly inside the spec, the spec can say:
Log in as a user with a specific permission
Open the target module
Create a record
Verify it appears in the list
Update the record
Delete it after the test
This makes the test closer to documentation.
User Matrix
One difficult part of this project is that different users may see different behavior.
The same page can behave differently depending on:
Role
Permission
Feature flag
Data privacy setting
Old or new service path
To handle this, I use a user matrix.
Each test user represents a specific business context. The spec passes a user ID into the test flow, and the case can read the related role, feature status, or permission setting.
This avoids duplicating the same flow for every user type.
The benefit is clear: one test flow can be reused under different conditions.
The cost is also clear: the user matrix must be documented well. Otherwise, IDs like test001 or test101 are hard to understand for new maintainers.
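One way to keep the matrix self-documenting is to store it as typed data with a description per user. The sketch below is illustrative only: the IDs `test001` and `test101` are taken from the text above, but the roles, flags, service paths, and descriptions assigned to them are invented for the example.

```typescript
type Role = "admin" | "editor" | "viewer";

interface TestUser {
  id: string;
  role: Role;
  featureFlags: string[];
  servicePath: "old" | "new";
  description: string; // tells maintainers why this user exists
}

// Hypothetical entries; the real matrix would list the actual test accounts.
const USER_MATRIX: Record<string, TestUser> = {
  test001: {
    id: "test001",
    role: "admin",
    featureFlags: ["newUpload"],
    servicePath: "new",
    description: "Admin on the new service path with the new upload flow",
  },
  test101: {
    id: "test101",
    role: "viewer",
    featureFlags: [],
    servicePath: "old",
    description: "Read-only user on the old service path",
  },
};

// Look a user up by ID and fail fast on typos.
function getUser(id: string): TestUser {
  const user = USER_MATRIX[id];
  if (!user) {
    throw new Error(`Unknown test user "${id}"`);
  }
  return user;
}

// Cases branch on properties of the user, not on hard-coded IDs.
function canDelete(user: TestUser): boolean {
  return user.role === "admin";
}

function hasFlag(user: TestUser, flag: string): boolean {
  return user.featureFlags.includes(flag);
}
```

A case can then decide whether to expect a delete button with `canDelete(getUser(userId))`, so the same flow runs for every user without a copy per role.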
File Upload and Download
Upload and download flows need special handling.
For upload, the test needs stable test files. These files should be stored in the repository, under a clear folder such as:
e2e/files/
For download, clicking the button is not enough.
The test should verify that the expected file actually appears in the download directory.
The browser download path must be configured, and the test should wait for the file to exist:
Click download button
Wait for file to appear
Fail if timeout is reached
Clean temporary download files
This is much better than only checking whether the button is clickable.
A download function is only useful if the file is really created.
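The four steps above can be sketched with Node's file system API. The function names, default timeouts, and the rule that a download must be non-empty are assumptions, not the contents of the real fileUtils; configuring the browser's download directory is driver-specific and is left out.

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

// Poll the download directory until the file exists and is non-empty,
// or fail when the timeout is reached. Returns the full file path.
async function waitForFile(
  dir: string,
  fileName: string,
  timeoutMs = 15000,
  intervalMs = 250,
): Promise<string> {
  const target = path.join(dir, fileName);
  const deadline = Date.now() + timeoutMs;
  while (Date.now() <= deadline) {
    // Assumption: a valid download is never empty.
    if (fs.existsSync(target) && fs.statSync(target).size > 0) {
      return target;
    }
    await new Promise<void>((resolve) => setTimeout(resolve, intervalMs));
  }
  throw new Error(`Download not found within ${timeoutMs} ms: ${target}`);
}

// Remove everything in the download directory and recreate it empty.
function cleanDownloads(dir: string): void {
  fs.rmSync(dir, { recursive: true, force: true });
  fs.mkdirSync(dir, { recursive: true });
}
```

Calling `cleanDownloads` before the click as well as after the test avoids a false pass caused by a file left over from an earlier run.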
React Controlled Input Issue
One issue I ran into involves React controlled inputs.
For normal input fields, setting the value directly may seem sufficient. But when the input is controlled by React state, changing the DOM value directly may not update the internal state correctly.
The safer approach is to behave more like a real user:
Read current value
Clear it with keyboard actions
Type the new value
Trigger the expected input/change events
This is slower than direct assignment, but it is more reliable.
The lesson is that E2E tests should interact with the application through the same path as the user whenever possible.
If the test only changes the DOM but the framework state is not updated, the test result is not trustworthy.
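A sketch of the keyboard-based approach follows. The two interfaces declare only what is used (WebdriverIO has an element `getValue()` and a `browser.keys()` with these names); `typeLikeUser` itself is a hypothetical helper. The key constants are the WebDriver code points for End and Backspace. Real key strokes make the browser fire the input events on their own, which is why no event is dispatched manually here.

```typescript
interface InputLike {
  click(): Promise<void>;
  getValue(): Promise<string>;
}
interface KeyboardLike {
  keys(value: string | string[]): Promise<void>;
}

const END = "\uE010"; // WebDriver key code for End
const BACKSPACE = "\uE003"; // WebDriver key code for Backspace

// Hypothetical helper: replace an input's content using key strokes only.
async function typeLikeUser(
  input: InputLike,
  keyboard: KeyboardLike,
  text: string,
): Promise<void> {
  await input.click(); // focus the field the way a user would
  const current = await input.getValue();
  if (current.length > 0) {
    await keyboard.keys(END); // move the caret behind the last character
    await keyboard.keys(Array.from({ length: current.length }, () => BACKSPACE));
  }
  await keyboard.keys(text);
  // Confirm the field ended up with the intended value before moving on.
  const actual = await input.getValue();
  if (actual !== text) {
    throw new Error(`Expected input value "${text}" but found "${actual}"`);
  }
}
```

The final read-back is cheap and catches the case where a mask, a formatter, or a lost focus changed what was typed.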
Dialog and Legacy UI Issues
Another problem is legacy UI.
Not every part of the product is built with the same front-end stack. Some dialogs and components still use older patterns.
This creates special problems for E2E tests:
Dialog not destroyed after close
Hidden elements still exist in DOM
Multiple dialogs share similar selectors
Click event does not behave like a modern component
In these cases, improving the test code is not always enough.
Sometimes the application code also needs small changes, such as adding stable data-testid attributes or making sure dialogs are properly destroyed after closing.
This is an important point: testability is part of product quality.
If a UI is hard to test, it is often also hard to reason about.
Waiting Strategy
The most dangerous shortcut in E2E testing is a fixed wait.
For example:
await browser.pause(1000)
This may work on a developer machine but fail in CI or in a slower environment.
Instead, I try to wait for meaningful conditions:
Element is displayed
Element is clickable
Table row appears
Toast message appears
Loading indicator disappears
URL changes
File exists
Dialog closes
This makes the test both faster and more stable.
The goal is not to wait longer. The goal is to wait for the right thing.
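The difference can be illustrated with a small standalone poller. WebdriverIO already provides `browser.waitUntil` and element-level waits for this purpose, so the function below is not meant as a replacement; it is a sketch of the mechanism, with hypothetical option names and defaults.

```typescript
interface WaitOptions {
  timeoutMs?: number;
  intervalMs?: number;
  message?: string;
}

// Resolve as soon as the condition holds and return the elapsed time in ms.
// Reject with a descriptive error once the timeout is exceeded.
async function waitUntil(
  condition: () => boolean | Promise<boolean>,
  options: WaitOptions = {},
): Promise<number> {
  const { timeoutMs = 10000, intervalMs = 100, message = "condition not met" } = options;
  const start = Date.now();
  for (;;) {
    if (await condition()) {
      return Date.now() - start;
    }
    if (Date.now() - start >= timeoutMs) {
      throw new Error(`Timed out after ${timeoutMs} ms: ${message}`);
    }
    await new Promise<void>((resolve) => setTimeout(resolve, intervalMs));
  }
}
```

Unlike a fixed pause, the timeout here is only an upper bound: a fast environment continues immediately, and a slow one gets a failure message that names the condition that never became true.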
Test Data and Cleanup
E2E tests create real data.
That means cleanup is not optional.
Each spec needs to know what it creates and how to delete it after the test.
A common pattern is to use a test-only naming convention:
auto_{name}
e2e_{timestamp}
test_{runId}
This makes cleanup easier and reduces the risk of deleting real data.
The cleanup logic should run in after hooks, but that is not always enough. If a test fails in the middle, some data may still remain.
A better approach is to clean old test data before starting a new run, then clean again after the run finishes.
This makes the test suite more repeatable.
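A minimal sketch of that pattern, assuming every record the suite creates carries a fixed prefix and that records can be listed and removed by name. The `RecordStore` interface, the `e2e_` prefix choice, and the function names are hypothetical.

```typescript
const E2E_PREFIX = "e2e_";

// Build a name that is unique per run and recognizable as test data.
function makeTestName(label: string, runId: string): string {
  return `${E2E_PREFIX}${runId}_${label}`;
}

function isTestData(name: string): boolean {
  return name.startsWith(E2E_PREFIX);
}

// Hypothetical access to the records the suite can create.
interface RecordStore {
  list(): Promise<string[]>;
  remove(name: string): Promise<void>;
}

// Delete only records that follow the test naming convention.
async function cleanTestData(store: RecordStore): Promise<string[]> {
  const removed: string[] = [];
  for (const name of await store.list()) {
    if (!isTestData(name)) continue; // never touch real data
    await store.remove(name);
    removed.push(name);
  }
  return removed;
}

// Clean leftovers from earlier runs, run the body, then clean again
// even if the body throws.
async function runWithCleanup<T>(
  store: RecordStore,
  body: () => Promise<T>,
): Promise<T> {
  await cleanTestData(store);
  try {
    return await body();
  } finally {
    await cleanTestData(store);
  }
}
```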
Reports
The test suite outputs both console reports and JUnit XML reports.
The console report is useful when running tests locally.
The JUnit report is useful for CI because it can be collected and displayed by the pipeline.
For E2E tests, reporting is very important. A failed E2E test without enough information is hard to debug.
At minimum, the report should help answer:
Which spec failed?
Which case failed?
Which user was used?
Which environment was used?
What was the error?
Was the failure caused by an assertion, a timeout, or setup?
Screenshots, videos, and traces would make this even better, but even basic structured reports are already helpful.
What Works Well
The layered structure works well.
It keeps the test code from becoming too messy and makes business steps reusable.
The most useful parts are:
Base Element for consistent browser operations
Page Object for page and module behavior
Case for reusable business actions
Case Group for composing long flows
Spec for orchestration and cleanup
This structure also makes it easier to add new tests. When adding a new flow, I do not need to start from raw selectors. I can first ask:
Is there already a Page Object?
Is there already a module?
Is there already a reusable case?
Can this flow reuse an existing Case Group?
That saves time.
Current Problems
The biggest problem is still complexity.
The E2E suite depends on many things:
Browser driver
Target environment
Login user
Permission setting
Feature flag
Backend data
Network speed
File system path
Cleanup result
Any one of these can break a test.
Another problem is hidden dependencies between cases.
A case may assume that another case has already created data or opened a page. This makes the flow flexible, but it also means maintainers must understand the full case group before changing one case.
The third problem is local setup.
If the README does not clearly explain environment variables, user data, browser requirements, and single-spec execution, the test suite becomes hard to run for new developers.
Improvements I Want to Make
The first improvement is documentation.
The E2E README should include:
How to run all tests
How to run one spec
How to run in local mode
How to prepare user data
How to choose an environment
How to debug failed tests
How to clean test data
How download validation works
The second improvement is better test data isolation.
Each run should generate unique data names, preferably with a run ID. Cleanup should be safer and easier to repeat.
The third improvement is reducing fixed waits.
Any browser.pause should be reviewed. If it can be replaced by a real condition, it should be replaced.
The fourth improvement is clearer case metadata.
A case could declare what it requires and what it creates:
{
  requiresPage: 'TargetPage',
  requiresData: ['createdRecord'],
  createsData: ['uploadedFile'],
  cleanup: true
}
This would make case groups easier to understand.
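With metadata in that shape, a case group could be checked before it runs. The sketch below assumes each case also carries a `name`; the `CaseMeta` type and the `findMissingRequirements` function are hypothetical and only illustrate what such a check might look like.

```typescript
interface CaseMeta {
  name: string; // assumed extra field, used for error messages
  requiresPage?: string;
  requiresData: string[];
  createsData: string[];
  cleanup: boolean;
}

// Walk the cases in order and report every data key that is required
// before any earlier case has created it.
function findMissingRequirements(cases: CaseMeta[]): string[] {
  const available = new Set<string>();
  const problems: string[] = [];
  for (const c of cases) {
    for (const key of c.requiresData) {
      if (!available.has(key)) {
        problems.push(`${c.name} requires "${key}" but no earlier case creates it`);
      }
    }
    for (const key of c.createsData) {
      available.add(key);
    }
  }
  return problems;
}

// Hypothetical metadata for two cases.
const createRecord: CaseMeta = {
  name: "CreateRecord",
  requiresPage: "TargetPage",
  requiresData: [],
  createsData: ["createdRecord"],
  cleanup: true,
};
const uploadFile: CaseMeta = {
  name: "UploadFile",
  requiresPage: "TargetPage",
  requiresData: ["createdRecord"],
  createsData: ["uploadedFile"],
  cleanup: true,
};
```

Running the check on `[uploadFile, createRecord]` reports the ordering mistake up front, before a browser is even started.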
Summary
This E2E work is much more complex than I expected. It is not only about writing automation scripts. It is about creating a maintainable testing architecture for a real product.
The most important lessons so far are:
Do not put raw selectors everywhere
Do not rely on fixed pauses
Model pages and modules clearly
Make business steps reusable
Treat test data seriously
Clean up after each run
Use real conditions for waiting
Document the local setup
Keep specs readable
A good E2E suite should not only tell us whether the product works today.
It should also help future developers understand how the product is supposed to work.
That is the real value of this work.