If you’re confused when you read about “[some software term] as code” or “everything as code,” all you really need to know is that we’re talking about automation: the thing we use to do tedious tasks for us, or to orchestrate tasks when they become too large and complex for manual methods.
Developers have been automating their software delivery processes for a while, but IT infrastructure operators and sysadmins have only caught up in the past decade. Sysadmins did use automation before then, but it was usually a series of manually triggered scripts, some less organized than others. Developers are helping Ops drive this evolution so that devs face fewer bottlenecks and can take care of most of the software production process themselves, without much manual gatekeeping.
So where did everything as code come from? To understand that, let’s look at every stage of software production and see where “*-as-code” automation already exists. We’ll use this article to identify the areas of the software development lifecycle (SDLC) that are already well-automated and abstracted, so that we can spot areas open to novel automation and learn where “everything as code” resides. This might help you sort through the “*-as-code” buzzword noise and distinguish tools that are actually novel and helpful from those that are just trying to sound new and innovative.
This is the phase before coding begins. Requirements are created by business and technical stakeholders, plans for the software are drawn up, and architecture and UI designs are diagrammed and wireframed.
Architecture as code
A note on an emerging usage of the term “architecture as code”: Amazon Web Services and others are starting to use it to describe higher-level infrastructure as code patterns. This is different from the application-level architecture I’ve referenced in this section. AWS’s architecture as code instead refers to an infrastructure architecture, such as container orchestrator configuration, load balancers, security groups, and IAM roles, all defined via reusable templates for particular service types (e.g. an API service behind a load balancer, a service that pulls work from a queue, or a scheduled job).
These templates automate the provisioning of those components with your pre-specified configurations. This version of architecture as code exists in the form of Terraform modules, Kubernetes custom resource definitions, and various constructs from the major cloud vendors.
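To make the idea of a reusable service-type template concrete, here is a minimal Python sketch. It is not the API of AWS, Terraform, or Kubernetes: `ServiceTemplate`, `expand`, and every field name are hypothetical, chosen only to show how one short, high-level definition might fan out into the lower-level components mentioned above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServiceTemplate:
    """Hypothetical template for "an API service behind a load balancer"."""

    name: str
    container_image: str
    port: int = 8080      # pre-specified default (assumption)
    replicas: int = 2     # pre-specified default (assumption)

    def expand(self) -> dict:
        """Expand the high-level template into the components it implies."""
        return {
            "load_balancer": {"name": f"{self.name}-lb", "target_port": self.port},
            "security_group": {
                "name": f"{self.name}-sg",
                "allow_inbound_ports": [self.port],
            },
            "iam_role": {"name": f"{self.name}-role"},
            "container_service": {
                "name": self.name,
                "image": self.container_image,
                "replicas": self.replicas,
            },
        }
```

A team using such a template would only pick a name and an image; the load balancer, security group, and role follow from the template’s defaults.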
Projects as code + Scaffolding
Application environment as code
Virtual machines (VMs), containers, and various other abstractions of a software production environment have been a blessing to developers for many years, and they continue to get better and easier to manage. Why try to debug 100 different things that could go wrong while manually setting up an operational environment when you can automatically spin up template environments that allow you to start coding with almost no issues?
You can think of these tools as providing system dependencies as code. A Dockerfile is a codification and templatization of the dependencies—such as the version of your programming language, or the version of your database—and Docker is the tool that runs and enforces those dependencies.
Version control is an overarching phase that is constantly in use from the first completed lines of code all the way to patches on the final product years later. Even most beginning developers quickly start to understand that modern development is impossible without version control. It’s change tracking, development gatekeeping, race condition management, and a host of other things as code.
Unit testing is almost always codified and run by the development team as one big automated checklist. Integration and UI tests are typically run alongside this bundle of unit tests after a code commit. Some tests will happen before the build, some will happen after. The automatic checks, error catching, and code gatekeeping that test frameworks provide are indispensable in modern software development.
Each programming language ecosystem comes with its own list of testing frameworks, some using the language itself to define the test actions as code, and some using their own particular syntax.
- Popular test frameworks for Java include JUnit and Spock
- Popular test frameworks for Ruby include RSpec and Minitest
- Popular frameworks for browser test automation include Selenium and Cypress
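As an illustration of test actions defined in the language itself, here is a minimal sketch using Python’s built-in `unittest` module. Python is used only for illustration, and `apply_discount` is a hypothetical function invented for this example.

```python
import unittest


def apply_discount(price: float, percent: float) -> float:
    """Hypothetical function under test: reduce a price by a percentage."""
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    return round(price * (1 - percent / 100), 2)


class ApplyDiscountTest(unittest.TestCase):
    """Each method is one item on the automated checklist."""

    def test_applies_percentage(self):
        self.assertEqual(apply_discount(200.0, 25), 150.0)

    def test_rejects_invalid_percent(self):
        with self.assertRaises(ValueError):
            apply_discount(100.0, 150)
```

Saved in a file, these checks can be run with `python -m unittest`, which is the kind of command a pipeline triggers after each commit.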
Build automation & dependency management
Build automation has been around for a very long time. It’s a simple idea that developers were quick to grasp: instead of working through a number of complex manual steps from a checklist to correctly build a working version of a project, you create a tool that runs down your checklist in the form of code and automates the build process without accidental omissions.
This is especially useful when you have complex build environments that might require:
- Code generation
- Metadata customization
- And more
In a Java environment, this might mean creating DAOs from your configuration, or generating class-mapping code such as JAXB.
Build automation also helps catch builds that pass locally but are still missing dependencies, or fail for another reason, when they reach the production environment. These build automation tools provide another form of dependency management by gathering all the right libraries and other pieces for the build.
A developer runs one command or clicks a button. In some cases, they have set up an automated chain of events so that all they have to do is commit code, and the build manager runs automatically.
Most major programming languages have their own community-preferred options for build automation.
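The “checklist in the form of code” idea can be sketched in a few lines of Python. This is not how any particular build tool works internally; `run_build` and the three step names are hypothetical, and the lambdas stand in for real work such as fetching libraries or invoking a compiler.

```python
from typing import Callable, List, Tuple


def run_build(steps: List[Tuple[str, Callable[[], None]]]) -> List[str]:
    """Run each build step in order and return the names of the steps that ran.

    A step signals failure by raising an exception, which aborts the build
    so that later steps never run against a broken intermediate state.
    """
    completed = []
    for name, action in steps:
        action()
        completed.append(name)
    return completed


# Hypothetical steps; each one just records what it would have produced.
artifacts: List[str] = []
steps = [
    ("fetch dependencies", lambda: artifacts.append("libs")),
    ("generate code", lambda: artifacts.append("generated sources")),
    ("compile", lambda: artifacts.append("binary")),
]
```

Calling `run_build(steps)` walks the list from top to bottom, so no step can be forgotten or run out of order.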
Continuous Integration / Continuous Delivery (CI/CD)
The term CI/CD is commonly used today to describe the overarching tools and practices that pull build, test, and deployment automation together so that they can run multiple times a day as code merges to the product’s production branch/trunk become more frequent.
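A rough way to picture this is a list of stages where each stage gates the next. The sketch below assumes a simplified model in which every stage returns `True` or `False`; `run_pipeline` and the stage names are hypothetical and do not correspond to any specific CI/CD product.

```python
from typing import Callable, Dict, List, Tuple


def run_pipeline(stages: List[Tuple[str, Callable[[], bool]]]) -> Dict[str, str]:
    """Run stages in order; once one fails, every later stage is skipped."""
    results: Dict[str, str] = {}
    blocked = False
    for name, stage in stages:
        if blocked:
            results[name] = "skipped"
            continue
        passed = stage()
        results[name] = "passed" if passed else "failed"
        blocked = not passed  # a failure closes the gate for what follows
    return results
```

With stages for build, test, and deploy, a failing test stage leaves deploy marked as skipped, which is the gatekeeping behavior that lets merges happen many times a day without shipping broken code.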
Operations is the phase of software production commonly managed by people with different skill sets than developers (though not always in smaller organizations). These include sysadmins, operations engineers, DevOps engineers, and other titles.
This is the stage where the physical and virtual hardware that the applications will run on are managed and provisioned. These days, many operations professionals only deal with abstractions on top of pools of those physical machines.
This category has a very broad-sounding name, but in most software engineering circles it’s used as a descriptor for tools that codify individual machine definitions. These tools install and manage operating system configurations on existing servers.
Their big advantage is the ability to create immutable infrastructure. Instead of continually updating existing machines’ configurations, operators decommission the machines, spin up brand new ones, and apply updated templates from the configuration management tool to the blank machines, giving them a fresh start every time a change is made. This may sound extreme, but in the complex world of IT operations, it’s better to remove the possibility of strange bugs emerging as compounding updates allow entropy to creep in.
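The replace-instead-of-update idea can be sketched as follows. This is a toy model, not a real configuration management tool: `Server`, `provision`, and `roll_out` are hypothetical names, and a frozen dataclass stands in for a machine whose configuration is never edited after creation.

```python
import itertools
from dataclasses import dataclass
from typing import List

_next_id = itertools.count(1)  # hands out a fresh ID for every new machine


@dataclass(frozen=True)  # frozen: attributes cannot be reassigned after creation
class Server:
    server_id: int
    config_version: str


def provision(config_version: str) -> Server:
    """Create a brand new machine from the given configuration template."""
    return Server(server_id=next(_next_id), config_version=config_version)


def roll_out(fleet: List[Server], config_version: str) -> List[Server]:
    """Return a new fleet of the same size; old servers are discarded, not edited."""
    return [provision(config_version) for _ in fleet]
```

Because every change produces new machines from the same template, no server carries leftovers from earlier updates.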
Infrastructure as Code
The full promise of infrastructure as code came with provisioning tools, which—in addition to providing templates for the configuration of infrastructure components—could also boot up the infrastructure itself. This also became much easier as more organizations started using cloud infrastructure that could be spun up with the push of a button.
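One common way provisioning tools decide what to boot up is to compare the infrastructure you declared with what currently exists. The sketch below assumes that model; `plan` and the resource dictionaries are hypothetical and far simpler than what a real tool does.

```python
from typing import Dict, List


def plan(desired: Dict[str, dict], actual: Dict[str, dict]) -> Dict[str, List[str]]:
    """Compare declared resources with existing ones and list the changes needed."""
    shared = desired.keys() & actual.keys()
    return {
        # declared in code but not running yet
        "create": sorted(desired.keys() - actual.keys()),
        # running, but with settings that differ from the code
        "update": sorted(name for name in shared if desired[name] != actual[name]),
        # running, but no longer declared in code
        "delete": sorted(actual.keys() - desired.keys()),
    }
```

The code is the source of truth here: anything that exists but is not declared gets flagged for removal, and anything declared but missing gets created.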
Container / Cluster orchestrators
Container orchestrators take the benefits of containers and make it easy to scale up their deployment and management. VMs can also benefit from this approach.
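Orchestrators such as Kubernetes are commonly described as reconciling a declared state with the running state. The sketch below assumes that model for a single setting, the replica count; `reconcile` and the container names are hypothetical, and a real orchestrator handles scheduling, health checks, and much more.

```python
from typing import Dict, List, Union


def reconcile(desired_replicas: int, running: List[str]) -> Dict[str, Union[int, List[str]]]:
    """Work out how to move from the running containers to the desired count."""
    if desired_replicas < 0:
        raise ValueError("desired_replicas must not be negative")
    return {
        # how many new containers to start when there are too few
        "start": max(desired_replicas - len(running), 0),
        # which surplus containers to stop when there are too many
        "stop": running[desired_replicas:],
    }
```

Scaling up or down then means changing one number in a definition and letting the loop work out the difference.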
Security & Compliance
Security shouldn’t be treated as a phase after development, but in many organizations it too often is. The general consensus from leaders in the IT security space is that security and compliance should include actions, some manual and some automated, that are baked into your software production lifecycle.
Security as code
Security as code is a bit broad to be taken literally (like many of the “as code” categories here), but it’s helpful to think of it as inserting automated security actions into each area of software production. These automatable actions could include a number of things:
Planning & Design
- Generating initial threat models
- Building templates for automating security test generation
Development & Testing
Deployment & In Production
Using all of the previously mentioned as code practices will also improve security as a byproduct since all of these automations create authorized, repeatable, and auditable actions that take a lot of human error and entropy out of the equation—leaving less chance to create security holes.
Policy as code / Compliance as code
Policy as code is similar to the gates in CI/CD pipelines that stop a code merge if certain tests fail or the build breaks. In fact, most policy as code frameworks integrate right into your version control system. Some policy as code involves managing security and security-related compliance at scale, so in that sense it’s partially security as code. But policy as code can also be used for various other engineering governance rules.
Maybe you’d like to limit the types or amount of infrastructure a certain user can provision. Or you’d like to require engineers to use tags for the resources they’re creating. There are countless possibilities.
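The two rules just mentioned can be sketched as a small policy check. This is not the syntax of any real policy as code framework; `evaluate`, `REQUIRED_TAGS`, and `MAX_RESOURCES_PER_REQUEST` are hypothetical names with made-up values.

```python
from typing import Dict, List

REQUIRED_TAGS = {"owner", "environment"}  # hypothetical tagging rule
MAX_RESOURCES_PER_REQUEST = 3             # hypothetical provisioning limit


def evaluate(resources: List[Dict]) -> List[str]:
    """Return a list of policy violations; an empty list means the request passes."""
    violations = []
    if len(resources) > MAX_RESOURCES_PER_REQUEST:
        violations.append(
            f"requested {len(resources)} resources; limit is {MAX_RESOURCES_PER_REQUEST}"
        )
    for resource in resources:
        missing = REQUIRED_TAGS - set(resource.get("tags", {}))
        if missing:
            violations.append(f"{resource['name']}: missing tags {sorted(missing)}")
    return violations
```

A pipeline gate could run a check like this on every proposed change and block the merge whenever the returned list is not empty.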
“As code” should mean added efficiency, safety, or insight
I have no doubt that we’ll see more technologies designated as “[thing] as code.” The important thing to remember is that it’s all just automation and abstraction. The real understanding of a new tool will come from understanding the exact areas of software production that it automates, and whether or not that’s anything new.
Automating things according to a piece of configuration code can be beneficial in a number of ways. When something is “codified” you get:
- Linting, static analysis, and alerting to enforce code consistency in a large organization
- Testing to make sure the automation works solidly
- A common language to allow anyone with knowledge of the automation language to collaborate on how it’s built
- Separation of concerns to make it easy to give pre-packaged, best-practice modules of automation to other groups who shouldn’t need to be experts to use the automation
- A model to give you an overarching conceptual map of the section of pipeline that you’re automating
Hopefully this article serves as a starting point to think about where you might want to introduce and configure some new automation in your own organization.