“We’ve built a 70,000-node Mesos cluster for our developers, but they won’t use it. Can you help?” This was the beginning of a conversation with the VP of infrastructure operations at a very large and famous company. Building the cluster was an impressive feat, but it was also by far the largest unused containerized infrastructure setup I had seen. Sadly, it was not an isolated incident.
I’ve talked about this encounter with a large number of customers, analysts, friends, colleagues, partners, venture capitalists, and competitors. They all described similar experiences, and all wanted to know why this happens. After all, if so many resources are being wasted in our industry, we are all risking a great deal by not understanding and solving the problem. If we don’t, the next wave of adopters might start to doubt that containers can help their businesses, and we would all need to start polishing our resumes.
I have to be honest here: I am a developer, an engineer and a technologist who loves to build products and use the latest technologies. So, the first place I looked, in my quest to find an answer to this 70,000-node question, was the technologies used. Was Mesos the wrong technology? Was it implemented the wrong way? Did they use open source or closed source? Was there an SI involved? Questions like these came first to mind. In hindsight I think those were probably the wrong questions.
Sitting at my desk as a developer in a large bank, I remember impeccably dressed salespeople coming in and out of our meeting rooms, courting our VP of infrastructure and his team. They were from VMware, back then the company for virtualized infrastructure. I was just a developer at the bank, but not even my boss or his boss or even his boss’s boss were invited to any of the steakhouse dinner events the VMware people were hosting almost every week. VMware salespeople were only interested in the operations and infrastructure decision makers. Two or three months later, our team was told that a deal with VMware had been signed and that we would be moving our services over to VMs soon. Shortly afterward, the move took place over a couple of weekends.
Then one Monday morning, the services my team was responsible for were running on VMs instead of the old bare-metal servers with flashing blue lights and noisy fans. That was all. Our entire infrastructure was virtualized in a matter of months without much say from the developers. We put on some fake resistance to the change (who likes change, after all?) and grudgingly agreed to be on standby over a couple of weekends, but we couldn’t really tell the difference between the old and the new setup: everything was the same. Our VM servers behaved and felt like “real” servers. I am sure we wouldn’t have been able to tell the difference in a double-blind test if anyone had conducted one.
Remembering those days made me wonder why the new containerized wave of infrastructure change doesn’t work the same way. Why can’t we build a Mesos or Kubernetes cluster over a weekend or two and send a memo to the developers with the subject: “Welcome to the future of infrastructure. You’re welcome!”?
The answer, as we all know, is that containerization is not going to work without developers’ involvement and buy-in. Developers need to build applications for a containerized setup. Beyond that, because APIs such as the Kubernetes API are exposed across the software life cycle, containers inherently require developers and operators to change the way they work and communicate with each other. The reason a shiny 70,000-node cluster runs tumbleweed instead of business applications is that the tools we have built for this transition do not address this fundamental and essential organizational change: the meshing together of devs and ops.
The exciting reality is that setting up containerized infrastructure is getting easier, as there is an abundance of open source solutions that get you up and running with a Kubernetes cluster. If you are already running on a major cloud provider, you are simply a couple of clicks away from having your own containerized cluster, managed, serviced, and billed by the minute. The benefits of running a containerized infrastructure are visible to operations teams: single-configuration servers (no more “snowflakes”), built-in high availability and resilience, and improved resource utilization, to name just a few. Developers also see the value of running in a containerized setup: more influence over the running environment, improved control over libraries and dependencies, and a narrower gap between production and development environments, among others.
Each side of this equation (devs and ops) has its own vendors, tools, and open source projects to help with the move to a containerized world, but that’s not enough. We are still missing the framework for devs and ops to work together to make this a success. There are simply very few, if any, tools and technologies available that facilitate this communication.
We are all so focused on our individual areas of innovation, from network to storage and orchestration, that we can lose focus on our customers’ achievement of their business goals. In such an environment, system integrators, consultancies, and professional services companies do well, as they are the only ones focused on the result and on delivering across the software supply chain; but this is not sustainable. Technologies that require customers to pay so much to consultancies to make them work are not going to be breakthrough technologies. Let’s face it: if virtualization had needed McKinsey to be ever-present on the payroll for it to work, there would be no cloud today.
For us all to benefit from a breakthrough technology like infrastructure containerization, we need to think more broadly than our single-purpose tools or primary focus areas and rethink the way we build products for this industry. This is different from the revolution of virtualization and cloud, and the sooner we realize that, the greater the benefits to our customers.
Devops is not just a bunch of fragmented tools or a fancy “digital transformation” project; it is a method of working collaboratively between functions, enabled by technology. Therefore, any technology aimed at the devops market, specifically around containerization, also needs to address the continuous collaboration mindset before anything else. So, let’s all build products with that in mind, to start and maintain a conversation between developers and operators.
This post was first published here