Track Test Flakiness and Duration with CircleCI Test Insights (2026)

CircleCI’s Test Insights feature doesn’t just tell you which tests are failing; it also surfaces the gradual creep in flakiness and test duration that quietly erodes your development velocity.

Let’s see it in action. Imagine a simple Node.js project with a few tests.

// package.json
{
  "name": "my-test-project",
  "version": "1.0.0",
  "scripts": {
    "test": "jest"
  },
  "devDependencies": {
    "jest": "^29.0.0",
    "jest-junit": "^16.0.0"
  }
}

// __tests__/example.test.js
describe('Math operations', () => {
  test('adds 1 + 2 to equal 3', () => {
    expect(1 + 2).toBe(3);
  });

  // This test is designed to be flaky
  test('random number is even', () => {
    const num = Math.floor(Math.random() * 10);
    expect(num % 2).toBe(0);
  });

  // This test is designed to be slow
  test('should take a while', async () => {
    await new Promise(resolve => setTimeout(resolve, 5000)); // 5 seconds
    expect(true).toBe(true);
  }, 10000); // raise the per-test timeout; Jest's default is 5 seconds, which this test would hit
});

In your CircleCI configuration (.circleci/config.yml), you’d have a basic test job:

version: 2.1

jobs:
  test:
    docker:
      - image: cimg/node:18.17.0
    steps:
      - checkout
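      - run: npm install
      # Jest does not write JUnit XML by default. This assumes the jest-junit
      # reporter is listed in devDependencies; the variable sets its output folder.
      - run:
          name: Run tests with the JUnit reporter
          command: npm test -- --ci --reporters=default --reporters=jest-junit
          environment:
            JEST_JUNIT_OUTPUT_DIR: test-results
      # This is the crucial step for Test Insights
      - store_test_results:
          path: test-results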

workflows:
  version: 2
  test-and-deploy:
    jobs:
      - test

When this runs on CircleCI, the store_test_results step uploads the report files it finds under the given path. Jest does not write JUnit XML by default, so it needs a reporter such as jest-junit that writes its report into that directory. CircleCI then parses this data.

The real power comes from opening your project in the Insights section of the CircleCI web app and selecting the Tests tab. You’ll see a dashboard showing:

Test Failures: A list of tests that have recently failed, along with their historical failure rates.
Test Durations: A breakdown of how long each test is taking, identifying outliers.
Flaky Tests: Tests that pass and fail intermittently without code changes.
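The flaky-test list can also be pulled programmatically, which is handy for reports or chat notifications. The sketch below targets Node 18 or later (for the global fetch). The endpoint path and the response field names (flaky_tests, test_name, times_flaked) are assumptions based on CircleCI’s API v2 reference and should be checked against the current docs; the two environment variable names are hypothetical.

```javascript
// flaky-report.js
// Assumption: endpoint path and response fields follow CircleCI's API v2
// Insights reference. Verify them before relying on this script.

function buildFlakyTestsUrl(projectSlug) {
  // projectSlug looks like "gh/my-org/my-repo"
  return `https://circleci.com/api/v2/insights/${projectSlug}/flaky-tests`;
}

// Reduce the API payload to a short list, most frequently flaky first.
function summarize(payload) {
  return (payload.flaky_tests || [])
    .map((t) => ({ name: t.test_name, timesFlaked: t.times_flaked }))
    .sort((a, b) => b.timesFlaked - a.timesFlaked);
}

async function fetchFlakyTests(projectSlug, token) {
  const res = await fetch(buildFlakyTestsUrl(projectSlug), {
    headers: { 'Circle-Token': token }, // personal API token
  });
  if (!res.ok) throw new Error(`CircleCI API returned ${res.status}`);
  return summarize(await res.json());
}

// Only call the API when both (hypothetical) environment variables are set.
const { CIRCLECI_API_TOKEN, CIRCLECI_PROJECT_SLUG } = process.env;
if (CIRCLECI_API_TOKEN && CIRCLECI_PROJECT_SLUG) {
  fetchFlakyTests(CIRCLECI_PROJECT_SLUG, CIRCLECI_API_TOKEN)
    .then((tests) => console.table(tests))
    .catch((err) => {
      console.error(err.message);
      process.exitCode = 1;
    });
}
```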

This feature solves the problem of "silent killers" in your CI/CD pipeline. Developers often focus on preventing outright test failures, but they neglect the subtle erosion of confidence caused by flaky tests and the ever-increasing build times from slow tests. Test Insights brings these issues to the forefront, making them visible and actionable.

Internally, CircleCI collects the test reports (typically in JUnit XML format) that your store_test_results step uploads. It then analyzes this historical data, correlating test names with their execution status and duration across multiple runs. This allows it to identify patterns: per CircleCI’s documentation, a test is flagged as flaky when it both passes and fails on the same commit, not when it crosses a fixed failure percentage, and the slowest tests are ranked by how long they take to run. You control what gets reported by ensuring your test runner outputs results in a format CircleCI understands (like JUnit XML) and by configuring the store_test_results step with the correct path to those reports.
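To make this kind of analysis concrete, here is a minimal sketch. It is an illustration, not CircleCI’s implementation: the record shape ({ name, commit, status, durationSec }) is hypothetical, and the flakiness rule used is one possible choice, namely that a test counts as flaky if it both passed and failed on the same commit.

```javascript
// Flag tests that both passed and failed on the same commit.
function findFlakyTests(runs) {
  const seen = new Map(); // test name -> Map(commit -> Set of statuses)
  for (const { name, commit, status } of runs) {
    if (!seen.has(name)) seen.set(name, new Map());
    const byCommit = seen.get(name);
    if (!byCommit.has(commit)) byCommit.set(commit, new Set());
    byCommit.get(commit).add(status);
  }
  const flaky = [];
  for (const [name, byCommit] of seen) {
    for (const statuses of byCommit.values()) {
      if (statuses.has('passed') && statuses.has('failed')) {
        flaky.push(name);
        break; // one mixed commit is enough to flag the test
      }
    }
  }
  return flaky;
}

// Rank tests by mean duration, slowest first.
function slowestTests(runs, limit) {
  const totals = new Map(); // test name -> { sum, count }
  for (const { name, durationSec } of runs) {
    const t = totals.get(name) || { sum: 0, count: 0 };
    t.sum += durationSec;
    t.count += 1;
    totals.set(name, t);
  }
  return [...totals.entries()]
    .map(([name, { sum, count }]) => ({ name, meanSec: sum / count }))
    .sort((a, b) => b.meanSec - a.meanSec)
    .slice(0, limit);
}

// Made-up history for the three example tests.
const runs = [
  { name: 'adds 1 + 2 to equal 3', commit: 'abc123', status: 'passed', durationSec: 0.002 },
  { name: 'random number is even', commit: 'abc123', status: 'failed', durationSec: 0.001 },
  { name: 'random number is even', commit: 'abc123', status: 'passed', durationSec: 0.001 },
  { name: 'should take a while', commit: 'abc123', status: 'passed', durationSec: 5.01 },
  { name: 'should take a while', commit: 'def456', status: 'passed', durationSec: 5.03 },
];

console.log(findFlakyTests(runs)); // [ 'random number is even' ]
console.log(slowestTests(runs, 1)[0].name); // should take a while
```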

The most surprising thing is how Test Insights helps you distinguish between a genuine bug and a test environment issue or a race condition within the test itself. A test that fails only on certain CI runners, or at specific times of day, often points to environmental inconsistencies or timing dependencies that are hard to debug locally. By aggregating data across all your CI runs, Test Insights provides the statistical evidence to pinpoint these subtle problems that would otherwise be dismissed as "random failures."

The next step is to integrate this data into your workflow, perhaps by failing builds if a test exceeds a certain duration threshold or if flakiness crosses an unacceptable percentage.
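A duration gate along those lines can be sketched as a small Node script that reads the JUnit report and exits non-zero when any test is slower than a threshold. The script name check-durations.js, the report path, and the 2-second default are all assumptions for illustration. The regex-based parsing is a shortcut that assumes attribute values contain no ">" or double-quote characters; a real XML parser would be more robust.

```javascript
// check-durations.js
const fs = require('node:fs');

// Extract { name, timeSec } from every <testcase ...> element in the report.
function parseTestcases(xml) {
  const cases = [];
  const tagPattern = /<testcase\b([^>]*)>/g;
  let tag;
  while ((tag = tagPattern.exec(xml)) !== null) {
    const attrs = {};
    const attrPattern = /([\w:-]+)="([^"]*)"/g;
    let attr;
    while ((attr = attrPattern.exec(tag[1])) !== null) {
      attrs[attr[1]] = attr[2];
    }
    cases.push({ name: attrs.name, timeSec: Number(attrs.time) });
  }
  return cases;
}

// Return the tests whose recorded time exceeds the threshold (in seconds).
function findSlowTests(xml, thresholdSec) {
  return parseTestcases(xml).filter((t) => t.timeSec > thresholdSec);
}

// Usage: node check-durations.js test-results/junit.xml 2
const [reportPath, threshold = '2'] = process.argv.slice(2);
if (reportPath) {
  const xml = fs.readFileSync(reportPath, 'utf8');
  const slow = findSlowTests(xml, Number(threshold));
  for (const t of slow) {
    console.error(`SLOW: ${t.name} took ${t.timeSec}s`);
  }
  if (slow.length > 0) process.exitCode = 1; // fail the CI step
}
```

Run as an extra step after the tests, a script like this would fail the job on the 5-second example test while letting the fast ones through.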