Modern infrastructure management and application deployment, both in the traditional data centre/on-prem and cloud spaces, have changed significantly in the last decade. The days of throwing keyboard-mashing meatbags at every problem should be far behind us, given more recent developments in process and tooling. Perhaps the most famous term for how IT changes are now deployed is ‘DevOps’.
Put simply, doing DevOps is the adoption of modern software development practice within the operational space responsible for providing and running applications and servers, backed by appropriate tooling and workflow. To put that another way, this is just how we computer here in the closing stages of the second decade of the 21st century.
Oh, that’s simples, move on! Of course I’d not be writing this unless it were a bit more complex than that, so maybe we should step back a bit and look at some of the business drivers that led us here:
- Human beings are error prone and slow at repetitive work. In fact computers exist precisely because of this.
- Manually building systems consistently within an environment presents … challenges.
- Consistency across multiple (live/non-live | dev/prod) environments has historically been aspirational at best.
- Knowledge about the hows and whys of systems exists only in brains, and isn’t codified anywhere.
- What codified engineering knowledge there is lives in Word documents, usually with complex and difficult to navigate approval processes.
- Delivering technical change is slow, and iteration is hard.
A natural question arises here: how can adopting software development practices in an operations context help?
It’s worth isolating some of the high-level things developers do as they build stuff, which translate into desirable activities in a modern mode of platform development and operation:
- Version control
- Deliberate release process
- Explicit adoption of project management methodology
- Workflow automation
- Dependency management
- Software architecture
- Functional and non-functional software tests
First and foremost, capturing the how and the what of the way things are in an appropriate version control repository is key. I’ve heard this termed turning human capital into organisational capital. In practice, for me, this means getting everything out of people’s heads and into computer-consumable formats.
The scope of what should be captured includes all scripts, code, configuration data, templates and other collateral. Platform development, like software development, has a lifecycle, and the assets used must be managed in a manner that enables change to be tracked, approved and implemented according to whatever process you adopt. Capturing that knowledge puts your organisation on the path to systematising all aspects of both implementing and delivering change. Doing so solves a range of problems at a stroke and opens up a whole range of possibilities:
- Everything is visible and in a place where all the humans know how to get it
- Everything is iterable, and iterations can be subjected to machine controlled workflow
- If it’s code, it’s consistently directly deployable
- When it’s visible, you can use it to drive your own standards
- Every change is visible, and the changeset allows you to evaluate risk sensibly
- You get to make change faster, with a greater assurance of success
- You can tell what’s true, and what isn’t
I cannot stress this enough - capturing, codifying and systematising all your organisational IT development and operational knowledge is the single biggest win achievable. It’s also the step that allows the next set of improvements to happen - and it’s important to remember that the aim is incremental rather than wholesale improvement. Even if you don’t adopt huge amounts of automation immediately, being able to deploy the same stuff in a consistent way, even by hand, is an improvement.
It’s also worth pointing out that if you want to version control something, that something has to be apt for version control. This will lead you automatically down a route that causes you to adopt modern tooling, and applications whose management can also be driven out of content that can be version controlled.
What you capture should encapsulate the whole lifecycle of your applications and systems, from the start of designing architectures and applications through to how systems are managed and operated day to day.
This table shows a (non-exhaustive) set of heads of areas and what to codify:
Once you’ve captured the whats and the hows you move toward an organisational state where what you’re dealing with day to day is code. At that point your assets become testable independently of their implementation. This leads you to a world of workflow, where change progresses towards implementation through functional and non-functional test gates, simultaneously decreasing the risk of any particular change failing or having bugs. In our shiny new world, even if it transpires that there are bugs, the route to fixing those bugs is simply another controlled iteration through a known process leading to a known deployment mechanism.
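To make the idea of test gates a little more concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the `Change` class, the two gate functions and the rules they enforce (a named service, at least two replicas) are hypothetical, not a reference to any particular pipeline tool. The point is only the shape: a change passes through gates in order, and a fix is just another trip through the same gates.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Change:
    """A hypothetical proposed change: a description plus the config it would deploy."""
    description: str
    config: dict


# A gate takes a change and returns (passed, reason).
Gate = Callable[[Change], Tuple[bool, str]]


def functional_gate(change: Change) -> Tuple[bool, str]:
    # Illustrative functional check: the config must name a service to deploy.
    if "service" not in change.config:
        return False, "no service defined"
    return True, "ok"


def non_functional_gate(change: Change) -> Tuple[bool, str]:
    # Illustrative non-functional check: insist on at least two replicas.
    if change.config.get("replicas", 0) < 2:
        return False, "fewer than two replicas"
    return True, "ok"


def run_gates(change: Change, gates: List[Gate]) -> Tuple[bool, List[str]]:
    """Run gates in order and stop at the first failure,
    so a failing change never reaches deployment."""
    log = []
    for gate in gates:
        passed, reason = gate(change)
        log.append(f"{gate.__name__}: {reason}")
        if not passed:
            return False, log
    return True, log


if __name__ == "__main__":
    gates = [functional_gate, non_functional_gate]

    first = Change("add web tier", {"service": "web", "replicas": 1})
    print(run_gates(first, gates))

    # The bug fix is simply another controlled iteration through the same gates.
    fixed = Change("add web tier, resilient", {"service": "web", "replicas": 2})
    print(run_gates(fixed, gates))
```

In a real setup the gates would be whatever functional and non-functional tests your organisation decides matter, and the runner would be your CI tooling rather than a loop in a script.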
In terms of cost, moving away from a ‘suck it and see’ model has some upfront implementation costs, but your organisation should see that as an investment, with the concomitant implication that there will be a return. That return shows up as money not lost to outages, unintended errors and bugs; money saved through human time saved; and money made through increased velocity and the ability to deliver change faster.
Once you see everything as code, you can then move on and start thinking about everything being modellable and testable. Once you start to think of things as being objects with properties, methods, knobs and levers, systems management becomes much more automatable. Also, real models have attributes that enable you to evaluate their health and performance much more accurately.
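As a rough illustration of what thinking in models might look like, here is a small Python sketch. The `ServerModel` class, its fields and the 90% disk threshold are all hypothetical choices made for the example, not a description of any real tool. What it shows is that once a system is an object with properties and methods, questions like ‘what’s missing?’ and ‘is it healthy?’ become things a computer can answer.

```python
from dataclasses import dataclass


@dataclass
class ServerModel:
    """A hypothetical model of a server: desired state, observed state, one metric."""
    name: str
    desired_packages: frozenset
    installed_packages: frozenset
    disk_used_percent: float

    def missing_packages(self) -> frozenset:
        # Drift between what we said we wanted and what is actually there.
        return self.desired_packages - self.installed_packages

    def is_healthy(self, disk_threshold: float = 90.0) -> bool:
        # Healthy means no drift and disk usage below the (assumed) threshold.
        return not self.missing_packages() and self.disk_used_percent < disk_threshold


if __name__ == "__main__":
    web01 = ServerModel(
        name="web01",
        desired_packages=frozenset({"nginx", "openssl"}),
        installed_packages=frozenset({"nginx"}),
        disk_used_percent=40.0,
    )
    print(sorted(web01.missing_packages()))
    print(web01.is_healthy())
```

The attributes you would actually model depend entirely on your platform; the useful habit is deciding what they are and writing them down as code.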
Concurrently, you probably want to think about having a bit of a rearrangement of how (and in many cases, whether) teams collaborate. Specifically, make sure that how a platform is going to be operated is a first-class consideration up front, because this feeds directly into designs and technology choices. Avoid the situation where a development team ships an application that cannot be configured or managed by anything other than a meatbag wielding a mouse.
If we turn our earlier list of the things that were bad about IT on its head after having started doing the DevOps, the benefits we get are:
- Repetition is performed by computers
- Builds are templated and done by computers, and the resultant environments are testable against a written specification
- Everything known is captured, and if it’s not written down it’s not true
- Everything is version controlled and apt for incremental change
- Making change becomes possible at whatever cadence and speed is appropriate
So, that’s some history and some desired outcomes of changing the high-level approaches we should be taking when building out infrastructure and application environments. In my next rambling, I’ll write some words about what that means in practice - i.e. what a process to build things might look like, how people collaborate, and where to keep all this stuff.
Written by Chris Spence - Principal DevOps Consultant