Ship Features Safely with Feature Flags in Your DevOps Pipeline (2026)

Feature flags are a powerful tool for managing the rollout of new features, and they are also a surprisingly effective way to test code in production without affecting regular users.

Let’s look at how this works in a typical CI/CD pipeline. Imagine a developer pushes a new feature, new-checkout-flow, behind a feature flag.

# .gitlab-ci.yml
deploy_staging:
  stage: deploy
  script:
    - helm upgrade --install my-app ./charts/my-app --set featureFlags.newCheckoutFlow=true
  environment: staging

deploy_production:
  stage: deploy
  script:
    - helm upgrade --install my-app ./charts/my-app --set featureFlags.newCheckoutFlow=false # Default off in production
  environment: production
  when: manual # Manual approval required for production

In this simplified GitLab CI configuration, the deploy_staging job deploys the application with the newCheckoutFlow flag enabled. The deploy_production job, however, deploys with the flag disabled by default. This means the new code is present in production, but inactive.
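How the Helm value reaches the running application depends on the chart. As a minimal sketch, assume the chart exposes featureFlags.newCheckoutFlow to the container as an environment variable named FEATURE_NEW_CHECKOUT_FLOW (a hypothetical name, as are the helper functions below). The application could then read it like this:

```javascript
// Hypothetical wiring: assumes the Helm chart maps featureFlags.newCheckoutFlow
// to the FEATURE_NEW_CHECKOUT_FLOW environment variable of the container.
function isFlagEnabled(envVarName, env = process.env) {
  // Only the string "true" (case-insensitive) turns the flag on;
  // missing or malformed values leave it off.
  const raw = env[envVarName];
  return typeof raw === 'string' && raw.trim().toLowerCase() === 'true';
}

function selectCheckoutTemplate(env = process.env) {
  // Both code paths ship in the same build; the flag picks one at runtime.
  return isFlagEnabled('FEATURE_NEW_CHECKOUT_FLOW', env)
    ? 'new_checkout.ejs'
    : 'old_checkout.ejs';
}

console.log(selectCheckoutTemplate({ FEATURE_NEW_CHECKOUT_FLOW: 'true' }));  // new_checkout.ejs
console.log(selectCheckoutTemplate({ FEATURE_NEW_CHECKOUT_FLOW: 'false' })); // old_checkout.ejs
```

A flag wired this way is static: changing it means running the deploy job again, and it applies to every user at once.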

Now, let’s say we want to test this new-checkout-flow in production, but only for internal QA engineers. We can use a feature flag management system like LaunchDarkly or Unleash.

// Simplified, illustrative flag configuration (not LaunchDarkly's exact schema)
{
  "name": "new-checkout-flow",
  "description": "Enables the new checkout process.",
  "variations": [
    { "name": "off", "value": false },
    { "name": "on", "value": true }
  ],
  "rules": [
    {
      "clauses": [
        {
          "contextKind": "user",
          "attribute": "email",
          "op": "in",
          "values": ["qa@example.com", "lead.qa@example.com"]
        }
      ],
      "variation": 1 // "on"
    }
  ],
  "defaultVariation": 0 // "off"
}

With this configuration, only users with the specified email addresses will see the new checkout flow. For everyone else, the old flow remains active. This allows QA to thoroughly test the new code in the real production environment, interacting with actual production data and infrastructure, without risking a bad user experience.
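To make the targeting behavior concrete, here is a rough sketch of how a rule like the one above could be evaluated. It is a simplified, hypothetical evaluator written for illustration, not LaunchDarkly's actual algorithm. The function names are assumptions, and only the "in" operator is handled.

```javascript
// Simplified, hypothetical evaluator for the illustrative configuration above.
const flag = {
  variations: [false, true],
  rules: [
    {
      clauses: [
        { attribute: 'email', op: 'in', values: ['qa@example.com', 'lead.qa@example.com'] }
      ],
      variation: 1 // "on"
    }
  ],
  defaultVariation: 0 // "off"
};

function clauseMatches(clause, context) {
  // Only the "in" operator is sketched; anything else never matches.
  if (clause.op !== 'in') return false;
  return clause.values.includes(context[clause.attribute]);
}

function evaluate(flagConfig, context) {
  // The first rule whose clauses all match decides the variation.
  for (const rule of flagConfig.rules) {
    if (rule.clauses.every((clause) => clauseMatches(clause, context))) {
      return flagConfig.variations[rule.variation];
    }
  }
  // No rule matched: fall back to the default variation.
  return flagConfig.variations[flagConfig.defaultVariation];
}

console.log(evaluate(flag, { key: 'u1', email: 'qa@example.com' }));       // true
console.log(evaluate(flag, { key: 'u2', email: 'customer@example.org' })); // false
```

Note that a context with no email attribute also falls through to the default, so anonymous or partially identified users stay on the old flow.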

The core problem feature flags solve is the coupling of deployment and release. Traditionally, deploying new code meant releasing it to all users immediately. Feature flags decouple these, allowing you to deploy code whenever it’s ready, but release it to users on your own schedule.

Internally, feature flags are typically implemented as conditional logic within your application code. When a request comes in, the application queries the feature flag service (or checks a local cache of flag states) to determine if a specific flag is enabled for the current user or context.

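// Example in Node.js with the LaunchDarkly server-side SDK
// (newer SDK releases are published as @launchdarkly/node-server-sdk)
const express = require('express');
const LaunchDarkly = require('launchdarkly-node-server-sdk');

const app = express();
const ldClient = LaunchDarkly.init('YOUR_SDK_KEY');

// Assumes session middleware (e.g. express-session) populates req.session
app.get('/checkout', (req, res) => {
  const user = {
    key: req.session.userId,
    email: req.session.userEmail // Used for targeting rules
  };

  // The third argument (false) is the default used if the flag cannot be evaluated
  ldClient.variation('new-checkout-flow', user, false, (err, isNewFlowEnabled) => {
    if (err) {
      console.error('Error evaluating feature flag:', err);
      // Fallback to old flow in case of error
      return res.render('old_checkout.ejs');
    }

    if (isNewFlowEnabled) {
      res.render('new_checkout.ejs');
    } else {
      res.render('old_checkout.ejs');
    }
  });
});

// Start accepting traffic only once the SDK has loaded its flag data
ldClient.once('ready', () => {
  console.log('LaunchDarkly client is ready.');
  app.listen(3000); // Port chosen arbitrarily for this example
});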

The ldClient.variation() call evaluates the new-checkout-flow flag for the given user. If the flag evaluates to true for this user (based on the rules defined in LaunchDarkly), the new_checkout.ejs template is rendered; otherwise, the old_checkout.ejs template is used. The third argument, false, is the default value, so the old flow also applies whenever the flag cannot be evaluated.

The main levers you control are targeting rules and percentage rollouts. Targeting rules let you specify which users, groups, or attribute values should receive a particular feature. Percentage rollouts let you gradually expose a feature to a small share of your user base (e.g., 1%, 5%, 10%), monitoring for issues before increasing the exposure.
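A percentage rollout is usually built on deterministic bucketing, so a given user gets the same answer on every request. The sketch below shows one generic way to do that with a hash. It is an assumption for illustration, not the algorithm of any particular vendor, and bucketFor and isInRollout are hypothetical names.

```javascript
const crypto = require('crypto');

// Map a (flag, user) pair to a stable bucket in the range [0, 100).
function bucketFor(flagKey, userKey) {
  const digest = crypto.createHash('sha256').update(`${flagKey}:${userKey}`).digest();
  // Interpret the first 4 bytes as an unsigned 32-bit integer and scale it.
  return (digest.readUInt32BE(0) / 0x100000000) * 100;
}

// A user is in the rollout when their bucket falls below the percentage.
function isInRollout(flagKey, userKey, percentage) {
  return bucketFor(flagKey, userKey) < percentage;
}

// Count how many of 10,000 synthetic users land in a 10% rollout.
let exposed = 0;
for (let i = 0; i < 10000; i++) {
  if (isInRollout('new-checkout-flow', `user-${i}`, 10)) exposed += 1;
}
console.log(`exposed ${exposed} of 10000 users`);
```

Because the bucket depends only on the flag key and the user key, raising the percentage from 5 to 10 keeps every previously exposed user in the rollout and only adds new ones. Including the flag key in the hash means different flags expose different slices of users.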

A common misconception is that feature flags are only for user-facing feature releases. In reality, they are also useful for gradually validating new infrastructure or database changes. For instance, you could route 5% of read traffic to a new read replica using a feature flag and monitor its latency and error rates while the other 95% continues to use the existing path. This lets you validate the performance of an infrastructure change under real-world load before committing fully.
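The read replica example could look roughly like the following. This is a sketch under stated assumptions: createReadRouter, the target functions, and the flag check passed in are all hypothetical stand-ins for a real database client and a real flag evaluation.

```javascript
// Sketch: pick a read target per request and record per-target metrics.
// isReplicaEnabled is a hypothetical flag check (e.g. a 5% rollout);
// targets maps a name to a function that runs a query and returns a promise.
function createReadRouter(isReplicaEnabled, targets) {
  const metrics = {
    primary: { requests: 0, errors: 0, totalMs: 0 },
    replica: { requests: 0, errors: 0, totalMs: 0 }
  };

  async function read(userKey, query) {
    const name = isReplicaEnabled(userKey) ? 'replica' : 'primary';
    const started = Date.now();
    metrics[name].requests += 1;
    try {
      return await targets[name](query);
    } catch (err) {
      metrics[name].errors += 1;
      throw err;
    } finally {
      // Latency is recorded for successes and failures alike.
      metrics[name].totalMs += Date.now() - started;
    }
  }

  return { read, metrics };
}

// Usage with stand-in targets and a stand-in flag check.
const router = createReadRouter(
  (userKey) => userKey === 'user-in-rollout',
  {
    primary: async (query) => `primary:${query}`,
    replica: async (query) => `replica:${query}`
  }
);

router.read('user-in-rollout', 'SELECT 1').then((result) => console.log(result)); // replica:SELECT 1
router.read('someone-else', 'SELECT 1').then((result) => console.log(result));    // primary:SELECT 1
```

Keeping separate counters per target is what makes the comparison possible: error rate and average latency for the replica can be checked against the primary before the percentage is raised.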

Once a feature is fully rolled out and stable, you’ll want to clean up the old code paths and remove the feature flag from your codebase and feature flag management system. This prevents technical debt from accumulating and keeps your application lean.