- 17th Dec 2018
- Juanjo Cerezuela, Guestline
It all started with one goal: We needed a platform to ship our new battalion of microservices.
As heavy users of Azure we could choose among many different options in terms of compute, so six months ago we sat down to evaluate our possibilities and we agreed on one destination: Kubernetes.
Start with Why.
We wanted a place where our applications could benefit from the latest cloud practices in the microservice ecosystem.
Quoting Sarah Wells: As a company, you only have X innovation tokens, so you need to spend them wisely.
Guestline is a cloud-based company. We have been on this journey for a while. Like many in the industry, we started by lifting and shifting our main applications to VMs. After that, we found ourselves slicing the monolith with either MicroFrontEnds or MicroServices, and soon we saw that we could benefit from more innovative technologies.
We did a few POCs on Kubernetes and these are the main reasons why we chose it:
- Containers are the new way of working. As soon as you start packing your application into containers, your environment becomes consistent, so you can carry it from development to production. Kubernetes comes in here by providing an orchestration platform to manage container workloads.
- Horizontal scaling and self-healing. You give Kubernetes a recipe for what your application should look like, and it will do the rest for you. A container goes down? It will restart it for you. Do you need Service Discovery, Load Balancing, Automatic Bin Packing…? You have them too!
- Everything is pluggable, so you build your ecosystem as if you were using Legos. As a result, the community is so big that you can benefit from the many frameworks out there. Do you need a template for deployments? Helm could help you with that. Do you need a way to manage certificates? Let’s Encrypt works like a charm in K8S. A way to log metrics and display them in beautiful charts? Prometheus + Grafana, or OMS if you are running on Azure. Architecture is difficult in the land of microservices, and you need a better way to understand how everything ties together? Weave Scope could do it. There are just so many.
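To make the "recipe" idea concrete, here is a minimal sketch in Python that builds a Deployment manifest as a plain dictionary and prints it as JSON, which `kubectl apply -f` accepts alongside YAML. The application name, image, replica count, and probe settings are illustrative assumptions, not our actual configuration.

```python
import json


def build_deployment(name: str, image: str, replicas: int = 3, port: int = 80) -> dict:
    """Return a Kubernetes apps/v1 Deployment manifest as a plain dict."""
    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name, "labels": labels},
        "spec": {
            # Desired state: Kubernetes keeps this many pods running.
            "replicas": replicas,
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    "containers": [
                        {
                            "name": name,
                            "image": image,
                            "ports": [{"containerPort": port}],
                            # If this probe fails, the container is restarted.
                            "livenessProbe": {
                                "httpGet": {"path": "/", "port": port},
                                "initialDelaySeconds": 5,
                                "periodSeconds": 10,
                            },
                        }
                    ]
                },
            },
        },
    }


if __name__ == "__main__":
    # Hypothetical name and image, for illustration only.
    manifest = build_deployment("hello-world", "example.azurecr.io/hello-world:1.0.0")
    print(json.dumps(manifest, indent=2))
```

The point is that you describe the desired state (three replicas, a health check) rather than the steps to reach it; the cluster works to keep reality matching that description.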
On top of that, we chose AKS over running our own Kubernetes cluster because we wanted a managed system that could help us test the waters. We don’t yet have the people to run our own Kubernetes cluster, so we took a leap of faith as soon as AKS went GA. Our final reasons:
- Being cost-effective. We didn’t want to take on all the maintenance and hassle of keeping everything running smoothly. With AKS, many of those things, such as the master nodes, come for free.
- We needed to integrate with a private Virtual Network with an existing ExpressRoute. AKS lets you do that if you choose Advanced Networking.
How did we do it?
To start off on the right foot we needed a culture shift, so we rebranded a central team. It had been dedicated to many things, from provisioning infrastructure to maintaining SQL servers and legacy applications, managing secrets, creating pipelines… Now it would be solely responsible for providing a platform where teams could ship our new set of applications.
As you can imagine, a culture shift doesn’t happen overnight, so it was challenging to balance the legacy while trying to build our shiny new flagship.
The spirit was to create a team whose whole idea was to make themselves redundant in their current job by automating every task they faced along the way, so that they could soon spread out to other teams and help them with the learning curve. At the end of the day, everything is about reducing your lead time to production. Everything is about bringing maturity and speed.
What did we do?
We are heavily invested in Azure DevOps (the old VSTS), so we took some time automating Task Groups, Variable Groups, and Service Connections. We created a pipeline to provision AKS, and there was plenty of research around deployments and security. At the end of the day:
While provisioning the clusters we encountered so many issues that we felt defeated many times, but every small win counted towards the goal we had set ourselves: to be able to run a workshop by December.
We did a huge amount of research on Helm, our templating choice for deployments, and on how to work effectively with namespaces to isolate teams, resources, and secrets. We chose to combine it with HashiCorp Vault so that each namespace would have its own Tiller and, ideally, you would only deploy to your team's namespace.
We mapped out which things were mandatory and which we could live without in our first release, so we could come back and improve them later.
What was the workshop like?
We started off with a theoretical talk about why we chose Kubernetes and what benefits we see in it.
We built a simple hello world application that teams would need to ship to our Continuous Integration environment. We provided documentation on how to troubleshoot the most common scenarios when deploying, either using Azure DevOps or running some queries in OMS.
Our default pipeline looked like this:
And when the day of the workshop finally arrived, we faced two main issues:
- Writing YAML proved to be a challenge, and the errors we were getting were quite misleading.
- Totally unexpected broken pipes when using tiller-namespaces.
Consequently, not many were able to ship a simple hello world to CI. And if that proved to be so complicated, how could we face real production workloads? How could we ship business-critical applications?
We needed to adapt and change.
So we ran a retrospective.
YAML is about experience, it’s true, but there are many things we can do to help teams help themselves. So we built our own Helm template and started to teach teams how to use it.
We discarded the idea of one Tiller per namespace. We still want to use HashiCorp Vault to manage certificates, but isolating Tiller per namespace was a bit too much. From here, we are evaluating how to be secure without being a bottleneck, while still providing easy ways of deploying.
How can one achieve balance among all those things?
Those are indeed the challenges that our team will need to face in the coming quarters.
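The idea behind a shared template can be sketched in a few lines of Python: teams supply only a handful of values, sensible defaults fill in the rest, and a missing value fails early with a readable message instead of a misleading YAML error. This is an illustration of the principle (Helm itself merges user-supplied values over a chart's defaults), not our actual chart; every key name and default below is a hypothetical example.

```python
from copy import deepcopy

# Hypothetical defaults a platform team might bake into a shared template.
DEFAULTS = {
    "replicas": 2,
    "port": 80,
    "resources": {"cpu": "100m", "memory": "128Mi"},
}

# Hypothetical values every team must provide.
REQUIRED = ("name", "image", "namespace")


def merge(base: dict, override: dict) -> dict:
    """Recursively merge override into a copy of base; override wins."""
    result = deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(result.get(key), dict):
            result[key] = merge(result[key], value)
        else:
            result[key] = deepcopy(value)
    return result


def resolve_values(team_values: dict) -> dict:
    """Validate a team's values and fill in the shared defaults."""
    missing = [key for key in REQUIRED if not team_values.get(key)]
    if missing:
        raise ValueError("missing required values: " + ", ".join(missing))
    return merge(DEFAULTS, team_values)


if __name__ == "__main__":
    values = resolve_values(
        {
            "name": "hello-world",
            "image": "example.azurecr.io/hello-world:1.0.0",
            "namespace": "team-a",
            "resources": {"memory": "256Mi"},
        }
    )
    print(values)
```

A team that only overrides the memory limit still inherits the default CPU request and replica count, which keeps each team's file short and hard to get wrong.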
After the workshop
We managed to ship three applications to Production following the new template, and the teams feel happier using it.
Our next challenges are: improve monitoring and alerting (we are already using Weave Scope, and we love it), spread the knowledge about this new paradigm so teams start shipping containers to K8S, continue improving security, and start working toward chaos monkey scenarios. Quite a lot to keep us busy for a while!
Stay tuned if you would like to know more about our journey!
If this looks appealing to you and you fancy a new challenge, have a look at our open positions or the hints for our interview process. You could join Core Services, the team leading Kubernetes at Guestline, or take any other development role.