Developing for Production, Not the Next Environment

Wael Rabadi · 2025-09-25

Most projects start the same way: spin up a dev environment, stand up staging, wire in a test framework. It feels safe, methodical, and mature. But what it really does is delay the only environment that matters — production. By the time production enters the picture, it's under deadline pressure. That's when the shortcuts surface. Assumptions snap. Fragility reveals itself.

We hear the same refrain from engineering teams: *"We're getting everything ready for dev, then for stage."* But production isn't the last stop in a sequence — it's the anchor. Development and staging are useful, but only as safety nets. If you're not designing for production from the start, every step is a rehearsal without impact.

The cracks appear fast. You don't need to squint to see them:

  • Hard-coded values masquerading as design:

    One of the most evident signs a team isn't thinking production-first is code that relies on hard-coded values. A developer might embed an API key or a database connection string directly into a class. It works on their machine. It even works in staging. But the moment it hits production, the connection fails — or worse, points at the wrong system entirely. Suddenly every release requires a manual swap, and what should be a clean, automated pipeline turns into a brittle ritual (see the sketch after this list).

  • Staging success that collapses under real data:

    Staging is tidy. Datasets are small, clean, and uniform. Production is messy: missing fields, duplicated entries, sometimes corrupted records. A service that passes in staging may throw errors, time out, or corrupt data in production — all because the system was tested against a simulation, not reality.

  • Manual tweaks propping up fragile environments:

    If each environment requires special handling — a different connection string in staging, a manual config edit in prod, or a restart sequence only the ops team knows — everyone loses confidence. Success depends on tribal knowledge instead of system design. Treating manual steps as "smells" and codifying them into infrastructure-as-code keeps environments reproducible and aligned with production.
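
To make the first smell concrete, here is a minimal C# sketch of the before and after. The class and variable names (OrderRepository, ORDERS_DB_CONNECTION) are hypothetical, not drawn from any particular codebase:

```csharp
using System;

// Anti-pattern: the connection string is baked into the class.
// It "works on my machine" and in staging, then breaks (or points at
// the wrong system) the moment the artifact reaches production.
public class OrderRepositoryHardCoded
{
    private const string ConnectionString =
        "Server=staging-db.internal;Database=orders;User Id=app;Password=hunter2;";
}

// Production-first: the value is injected by the environment, so the
// same compiled artifact runs in dev, staging, and production without edits.
public class OrderRepository
{
    private readonly string _connectionString;

    public OrderRepository()
    {
        _connectionString =
            Environment.GetEnvironmentVariable("ORDERS_DB_CONNECTION")
            ?? throw new InvalidOperationException(
                "ORDERS_DB_CONNECTION is not set for this environment.");
    }
}
```

The second version is boring by design: the artifact never changes between environments, only the injected value does.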


A production-first approach flips that logic. You design for production from day one and make other environments flexible enough to support it. That way, dev and stage aren't destinations—they become the safety nets they were intended to be.

This is the mindset behind the Modern Delivery Stack. We're not asking which tool to buy. We're asking different questions:

  • What are the essential components of delivery?
  • How should they be structured?
  • How do we automate and orchestrate them so every change flows smoothly to production?

The goal isn't speed for its own sake—it's building a process that gets software into production reliably and with confidence so that engineering leaders aren't left explaining why deployments take longer while incidents keep increasing.

What Production-First Development Looks Like: The Systems Approach

1. Design for Production Requirements from Day One

Setting up dev environments, spinning up staging, and building test frameworks all feel like progress, but the path to production is being silently sabotaged. And when that becomes evident, every shortcut and hidden assumption surfaces at once.

Fragility often starts when teams optimize for developer convenience, promising themselves they'll fix it later for production. That "later" always arrives sooner than they think — usually in the form of an outage, a failed deployment, or a 2 a.m. scramble to patch brittle code.

The alternative is deceptively simple: Make your very first deliverable run in production. It doesn't have to be polished. It might be a landing page, a barebones API, or a feature flag that toggles nothing. The value isn't in the feature itself — it's in forcing the team to learn what it takes to ship safely where it matters. Complexity can come later, but first, you have to set the anchor.

In practice, that means:

  • Using configuration, not hard-coded values: Keys, connection strings, and IDs belong in configuration files, environment variables, or secret managers — never in code.

  • Building with production-shaped data in mind: Production datasets aren't small or clean — they're messy, duplicated, and often unpredictable. Systems need to be tested against real-world shapes and volumes to avoid the classic "it worked in staging" surprise.

  • Designing for production security from the start: A super-user account might work in development, but it's a liability in production. Role-based access, least-privilege policies, and secure defaults should be part of the design from day one.

  • Planning for failure modes: Timeouts, retries, and degraded operations aren't optional extras. They're what separates a minor production hiccup from a customer-facing outage (see the sketch after this list).
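
As a sketch of that last point, here is one way a call might bound its timeout, retry with a short backoff, and degrade gracefully. The service URL, class name, and fallback are hypothetical, and a production system might reach for a resilience library rather than a hand-rolled loop:

```csharp
using System;
using System.Net.Http;
using System.Threading.Tasks;

public static class PricingClient
{
    private static readonly HttpClient Http = new HttpClient
    {
        Timeout = TimeSpan.FromSeconds(2) // fail fast instead of hanging a request
    };

    // Tries the pricing service a few times, then degrades gracefully.
    public static async Task<decimal> GetPriceAsync(string sku, decimal cachedFallback)
    {
        for (var attempt = 1; attempt <= 3; attempt++)
        {
            try
            {
                var response = await Http.GetAsync($"https://pricing.internal/api/price/{sku}");
                response.EnsureSuccessStatusCode();
                return decimal.Parse(await response.Content.ReadAsStringAsync());
            }
            catch (Exception)
            {
                if (attempt == 3)
                {
                    // Degraded operation: serve the last known price
                    // rather than failing the customer's order.
                    return cachedFallback;
                }
                // Back off briefly before the next attempt: 200ms, then 400ms.
                await Task.Delay(TimeSpan.FromMilliseconds(200 * attempt));
            }
        }
        return cachedFallback; // unreachable; keeps the compiler satisfied
    }
}
```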

In a systems approach, these aren't patchwork fixes bolted on later; they're structural choices that shape how every component interacts across environments. They ensure that what runs in production is not an accident of convenience, but the intentional outcome of design.

2. Make Environments Configuration, Not Architecture

When teams think in silos, environments quickly become bespoke systems: staging with its own database quirks, dev with its own playbook, and production with manual overrides nobody dares touch. Each environment drifts away from the others, creating hidden failure points that surface only at the worst possible moment.


A production-first systems approach rejects this fragmentation: environments are treated as variations of the same architecture, distinguished only by configuration. Production sets the standard; everything else inherits from it.

That principle reshapes delivery:

  • The same deployment artifact runs everywhere. Build once, and promote it with confidence. The artifact that passes in staging is identical to the one that runs in production.

  • Configuration is the only thing that changes. Endpoints, feature flags, and secrets are injected per environment — ensuring consistency while maintaining flexibility.

  • Infrastructure is a constant. Networking, security, scaling, and orchestration are defined once and applied universally as environment-agnostic foundations.

  • Integration points behave the same way regardless of provider. Neither authentication nor data access breaks when the provider changes; the pattern stays the same whether it's Okta today or Azure AD tomorrow.

The result is systemic clarity: Environments stop being competing realities and start being expressions of the same system. That's how teams build confidence. They're not proving whether the system works in staging; they're proving that the same system will work anywhere it runs. This creates predictability and enhances the scalability of delivery.
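
Here is a minimal sketch of what "configuration is the only thing that changes" looks like in code, assuming hypothetical setting names (PAYMENTS_API_BASE_URL, FEATURE_NEW_CHECKOUT): the compiled artifact never branches on which environment it is running in; it only reads what the environment injected.

```csharp
using System;

// Settings the platform injects per environment (endpoints, flags, secrets).
// There is no environment-specific branching anywhere in application code.
public sealed record DeliverySettings(
    string PaymentsApiBaseUrl,
    bool NewCheckoutEnabled,
    string DbConnectionString);

public static class DeliverySettingsLoader
{
    public static DeliverySettings Load() => new DeliverySettings(
        PaymentsApiBaseUrl: Require("PAYMENTS_API_BASE_URL"),
        NewCheckoutEnabled: Environment.GetEnvironmentVariable("FEATURE_NEW_CHECKOUT") == "true",
        DbConnectionString: Require("DB_CONNECTION_STRING"));

    private static string Require(string name) =>
        Environment.GetEnvironmentVariable(name)
        ?? throw new InvalidOperationException($"{name} must be injected by the environment.");
}
```

The artifact built from this code is promoted unchanged from dev to staging to production; only the injected values differ.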

3. Eliminate Environment-Specific Manual Steps

In fragile systems, releases depend on tribal knowledge: An ops engineer who knows which service to restart, a cobbled-together script that must be run before every deploy, or a checklist taped to someone's desk that explains how to "get production ready."

These steps calcify as hidden dependencies. Soon, confidence in the pipeline isn't about whether the system works, but whether the right people are around to shepherd it safely into production.

A systems approach treats these manual interventions as signals of deeper design flaws. Every step that requires a human handoff introduces variability, delays, and risk. Production-first teams instead codify them into automation or infrastructure-as-code:

  • Restart sequences become orchestration logic: Instead of someone logging in and restarting services in a specific order (e.g., "restart the database first, then the API, then the frontend"), the system encodes that order into an orchestrator such as Kubernetes, or into the dependency graph of infrastructure-as-code tooling like Terraform. Now the sequence is automatic and consistent every time.

  • Config edits become version-controlled templates: Instead of manually editing .env files or tweaking config in production, configuration is stored in templates (e.g., YAML, JSON, or Helm charts) and kept in version control, for example in a Git repository on GitHub. That way, changes are tracked, reviewed, and deployed consistently.

  • Deployment rituals become pipelines that run the same way every time: Instead of following a checklist ("run this script, copy this file, notify ops"), everything is defined in a CI/CD pipeline. The pipeline runs the same way every time, no matter who presses deploy.


When it comes to production deployment, speed is excellent, but investing in reliability is even better. A system that can be deployed the same way, every time, by anyone, scales without depending on particular people. It frees engineers from fighting fires and creates a foundation where production shifts from precarious to predictable.

Case Study: How a Payments Company Used Kisasa's Blueprint to Reach Production with Confidence

One of Kisasa's earliest clients was a payments company. Where they started is a place many of us know well, and their journey toward modernization illustrates why production-first thinking isn't optional.

The Evolution

They began with what seemed like a solid foundation: a monolith deployed on AWS, following Amazon's recommended architecture at the time. On paper, it looked fine. In practice, it was rickety!

Everything was created manually. A single super-user account controlled the system. Roles and policies didn't exist. Staging was a clone of production — sometimes even sharing the same database. The result? A highly fragile system. Everyone knew it worked, but nobody dared touch it. Production had become a glass house.

The Breakthrough

When the need for modernization became urgent, the company explored Kubernetes briefly but abandoned the idea as too complex for a two-person team. The fallback was Elastic Beanstalk, which was simpler and more accessible. But it came with hidden costs. Branch names were tied directly to environments, and the CI/CD pipelines had "staging" and "production" naming hard-coded into them, making the entire setup rigid and hard to evolve. Knowledge still lived in the engineers' minds instead of the system. Elastic Beanstalk made things more stable, but the system remained fragile.

Over time, the team added unit and integration tests to the pipelines, and pull requests couldn't be merged without passing these tests. This improved confidence, but the infrastructure was still a house of cards. Nobody wanted to adjust security groups or IAM roles because nobody knew what depended on them. When issues arose, they were handled with patches: firewall rules stacked on top of firewall rules, IP blocks hastily applied. The system still lacked resilience.

The Turning Point

The shift came with CDK for Terraform (CDKTF). Using the same language the product was written in (C#), their developers didn't need to be Terraform experts. They could work in a language they knew, while CDKTF generated the Terraform manifests.

CDKTF forced discipline: Security had to be explicit, dependencies had to be mapped, and environments had to be reproducible.
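
As a rough sketch of that shape, using CDKTF's C# bindings: the stack name, the environments, and the payments domain below are illustrative, and real resources (VPCs, load balancers, IAM roles) come from generated provider bindings not shown here.

```csharp
using Constructs;
using HashiCorp.Cdktf;

// One stack definition, parameterized by environment. Dependencies and
// security are declared here in C#, not remembered by whoever set them up.
class PaymentsStack : TerraformStack
{
    public PaymentsStack(Construct scope, string id, string environment)
        : base(scope, id)
    {
        // Provider and resource constructs (VPC, load balancers, IAM roles, ...)
        // come from generated bindings such as HashiCorp.Cdktf.Providers.Aws
        // and are configured explicitly per `environment`.
    }
}

class Program
{
    static void Main()
    {
        var app = new App();
        // The same stack, stamped out per environment.
        new PaymentsStack(app, "payments-dev", "dev");
        new PaymentsStack(app, "payments-prod", "prod");
        app.Synth(); // emits the Terraform manifests the pipeline deploys
    }
}
```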

The early days were punishing — deployments took hours, and fixing mistakes often meant tearing everything down and starting over. At Kisasa, we knew what good looked like. We kept our focus on the end goal and guided the development team through the tough times. Soon, a system emerged. Infrastructure became software. Environments became consistent. And production stopped being a brittle exception. It became one of many environments that could be spun up, tested, and deployed with confidence.

The Result

By the end, the team could stand up environments on demand with the same repeatable process:

  • Initialize – Validate the prerequisites: make sure state management exists, profiles are configured, and security policies are in place.

  • Build – Generate Terraform configurations safely, using CDKTF to translate familiar C# code into infrastructure manifests.

  • Deploy – Spin up fully isolated environments—from VPCs to load balancers—consistently and with confidence.

What had once been a glass house became a system they could build on and scale with confidence. With the Modern Delivery Stack in place, production was no longer something to fear.

The Modern Delivery Stack Is What Separates Fragile from Resilient Delivery Systems

Most DevOps conversations collapse into tools.

  • Which CI/CD platform should we use?
  • Which branch promotes to staging?
  • How many runners do we need?

These are tactical questions. They solve local problems, but they don't scale across an engineering org. Tool-first DevOps often fixes today's bottleneck while planting tomorrow's fragility.

Modern delivery requires a different lens — one that prioritizes patterns over playbooks. The questions shift from tool choice to delivery design:

  • How do we structure deployment capabilities across the org?
  • What are the repeatable patterns that get us to production reliably?
  • How do we create composable, flexible platforms that don't collapse when tools evolve?

Patterns are where leverage lives.

Instead of each group reinventing how to handle authentication, observability, or database access, the organization defines a consistent approach once. That pattern is abstracted and codified into the delivery stack — for example, a standard authentication flow that works whether the provider is Okta today or Azure AD tomorrow. Every team consumes the same pattern through configuration, not by rewriting code. The result is less drift between teams, faster adoption of new capabilities, and fewer points of fragility when tools evolve. That's what makes the delivery stack predictable at scale.

What does this look like in practice? Here are three common delivery patterns that highlight how reliability scales when you define once and reuse everywhere.

Patterns in Practice: How Reliability Scales

Authentication patterns: Without patterns, chaos creeps in. One team hard-codes Okta calls, another uses Azure AD SDKs directly, and a third builds a custom login flow. When the provider changes, every team scrambles. With a pattern, authentication is abstracted: applications request "SSO capability" from the platform, and the stack handles the provider details. It doesn't matter if the backend is Okta today, Azure AD tomorrow, or something else in five years — the pattern stays the same.
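
Here is a sketch of that abstraction in application code, with hypothetical type names: callers ask for an SSO capability, and which provider satisfies it becomes a configuration decision rather than a per-team rewrite.

```csharp
using System.Threading.Tasks;

// The capability applications program against. No Okta or Azure AD types leak in.
public interface ISsoTokenProvider
{
    Task<string> GetAccessTokenAsync(string audience);
}

// Provider-specific implementations live in the delivery stack, not in each team's code.
public sealed class OktaTokenProvider : ISsoTokenProvider
{
    public Task<string> GetAccessTokenAsync(string audience) =>
        Task.FromResult("token-from-okta"); // placeholder for the real Okta flow
}

public sealed class AzureAdTokenProvider : ISsoTokenProvider
{
    public Task<string> GetAccessTokenAsync(string audience) =>
        Task.FromResult("token-from-azure-ad"); // placeholder for the real Azure AD flow
}

// Configuration, not code, decides which implementation is wired in:
//   SSO_PROVIDER=okta  |  SSO_PROVIDER=azuread
public static class SsoProviderFactory
{
    public static ISsoTokenProvider FromConfiguration(string provider) =>
        provider == "azuread"
            ? new AzureAdTokenProvider()
            : new OktaTokenProvider();
}
```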

Observability patterns: Teams left to their own devices choose different logging formats, tracing tools, or metrics exporters. Leaders end up with siloed dashboards and inconsistent signals. With a pattern, observability is standardized: Every service emits metrics, logs, and traces in a shared format, and the platform routes them to the chosen backend. Teams focus on what to measure, not how to wire it up, and the org gains systemic visibility.
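
One way to picture that standardization, with hypothetical helper and field names: every service logs through the same shape, so the organization sees one consistent signal regardless of which backend the platform routes it to.

```csharp
using System;
using System.Text.Json;

// A single, shared log shape: service, environment, event, and a trace id.
// Teams decide what to record; the platform decides how it is emitted and routed.
public static class PlatformLog
{
    public static void Emit(string service, string eventName, string traceId, object details)
    {
        var entry = new
        {
            timestamp = DateTimeOffset.UtcNow,
            service,
            environment = Environment.GetEnvironmentVariable("ENVIRONMENT_NAME") ?? "unknown",
            @event = eventName,
            trace_id = traceId,
            details
        };
        // JSON to stdout; the delivery stack ships it to whichever backend is configured.
        Console.WriteLine(JsonSerializer.Serialize(entry));
    }
}

// Usage: PlatformLog.Emit("checkout-api", "payment.authorized", traceId, new { orderId = 42 });
```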

Database access patterns: In organizations with fragmented approaches to development, developers embed direct connection strings into code, each team choosing its own setup. That leads to drift, security risks, and fragile migrations. With a pattern, database access is requested the same way everywhere: through environment variables and configuration, with credentials managed centrally. Teams don't need to know which instance they're on — they just know the pattern works in dev, stage, and production.
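
And a sketch of the same idea for data access, again with hypothetical names: code asks for a logical database, and the environment supplies the host and credentials.

```csharp
using System;

// Applications never embed connection strings; they ask for a logical database
// by name, and the environment supplies the rest (host, credentials, options).
public static class DatabaseConfig
{
    public static string ConnectionStringFor(string logicalName)
    {
        // e.g. ORDERS_DB_HOST / ORDERS_DB_USER / ORDERS_DB_PASSWORD, injected by the
        // platform (environment variables today, a secret manager behind them).
        string Get(string suffix) =>
            Environment.GetEnvironmentVariable($"{logicalName.ToUpperInvariant()}_DB_{suffix}")
            ?? throw new InvalidOperationException(
                $"{logicalName} database is not configured in this environment.");

        return $"Host={Get("HOST")};Database={Get("NAME")};Username={Get("USER")};Password={Get("PASSWORD")}";
    }
}

// The call is identical in dev, staging, and production:
//   var ordersDb = DatabaseConfig.ConnectionStringFor("orders");
```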

Patterns like these shift the burden from teams to the delivery stack. Reliability stops being something each team has to reinvent, and becomes something the org encodes once, at scale. That's how delivery becomes predictable, and that's why the Modern Delivery Stack scales.

Applying the Modern Delivery Stack at Scale

Once the essential components of delivery are mapped, the harder questions emerge — the ones that separate vulnerable organizations from resilient ones:

  • Where is tribal knowledge creating risk, and how do we codify it?

    Critical dependencies often live in the heads of senior engineers. Unless this knowledge is encoded into infrastructure-as-code, automated pipelines, and shared patterns, delivery confidence remains elusive.

  • How do we measure engineering confidence in the pipeline?

    Deployment frequency, rollback rates, and time-to-recover aren't vanity metrics. They tell leaders whether the system can absorb failure and recover predictably — or whether every deployment is a coin toss.

  • Which parts of the pipeline need to be automated end-to-end, and which don't?

    Automation everywhere sounds good, but it isn't strategic. The real strategy is discernment: knowing which paths to production must be codified and which steps can remain manual without adding vulnerabilities.

  • Are we automating local optimizations, or are we solving system-wide bottlenecks?

    Automating a single step may feel like progress, but unless it addresses system-wide flow, it doesn't move the needle on delivery outcomes.

These are the questions leaders with a vision for more resilient systems ask. They shift the conversation from tool choice to system design. They know that modernization doesn't fail for lack of automation — it fails when automation doesn't scale.

Beyond the Toolchain: Designing for the Delivery Lifecycle

It's tempting to reduce modernization to tool selection. Jenkins, GitHub Actions, Tanzu — each promises faster pipelines, smoother automation, and fewer headaches. But as my co-founder, Wael Rabadi, puts it, "It's never about the tool. It's about the lifecycle. It's about confidence. It's about encoding the entire playbook, not just picking another hammer."

That's the real value of modern delivery. A platform like Tanzu can take code and run the entire pipeline — build, test, deploy — behind the scenes. But that power is wasted if the system isn't designed with production-first principles. Without lifecycle thinking, even the most advanced tools get reduced to half-used platforms and broken delivery processes.

We hear it all the time: "We've got Tanzu, we just don't know how to use it." That doesn't reveal a tooling problem. It reveals a design problem. Tools can be accelerators, but only if the underlying system points in the right direction.


Modernization succeeds when lifecycle design comes first and tools follow. That's the foundation of a production-first strategy, and it's the only way tools can ever fully deliver on their promise. In the end, tools don't fix weak systems — they magnify whatever system they're plugged into. A production-first lifecycle ensures that what gets amplified is reliability, not vulnerability.