Part I: Multi-Tenant Architectures and Vclusters.
Our story today is about George, a certain someone from somewhere who builds houses. It all began with George wanting to construct a house of his own in the middle of nowhere. He put all his effort and creativity into building his dream dwelling.
And a few months later, there it was... a house of his own. Two stories, multiple bedrooms, a backyard garden, a garage, a terrace... the whole shebang. It ticked off all the checkboxes in his requirements list. And even better, he had made it all by himself! (Did I mention he was pretty good at it?)
Now the story doesn't end here. Over time, people started noticing his house and were impressed with how well everything was designed and built. So they started approaching George to create something similar for them too. "We want the same house. But with a few changes here and there, of course," everyone said. At this moment, George decided it was maybe a good idea to monetize this. After building his own house, he already had an idea of what needed to be done, and he assumed he could just do the same for the others. So that's what he did. For each person coming in, he built a completely new, separate house. Each house was a replica of his own, with a few tweaks according to the particular request.
But soon the construction requests started piling up, and people became pickier with their requirements. Some people wanted a smaller house, some wanted more rooms... the list went on. To be honest, George was overwhelmed: keeping costs in check, managing different construction projects, and whatnot was becoming a bit too much to handle.
It was time for a change. Something had to be done to stop this. And the solution was so simple. Apartments. Instead of building separate houses for people, George decided to build apartments. How does this solve anything, you may ask? He built one building. Just one. But it had different apartments in it. So that one building had the capability to house multiple residents. They could easily share the resources of the building: a single area for car parking, a community gym, and a common park accessible to all the residents. It had all the swanky features! But the residents still had the privacy of their own apartments!
Now George did not have to spend a ton of money to build a new house from scratch. Since he had constructed shared spaces (parks, gym, etc.), he didn't need to build them again. Moreover, there was no more managing new projects all the time. Instead, he could just focus on maintaining the one building he had constructed with all the apartments in it. The end!
Now, why did I bother with this entire story?
Because that's the basis of a multi-tenant architecture. Just "a building full of apartments." And that's the concept we are going to explore in this post. Let me tell you more.
Multi-Tenant Architectures Explained
Okay, now how do we translate the George saga into something more sensible? More technical?
So consider this: each "house" can be seen as an application.
In the beginning, you are in the development phase where you just want to get the application up and running for one user. You haven't really thought about scale yet and you are running everything on a single infrastructure setup that is serving only one user.
That's George's house in the beginning. Designed to his own taste and requirements. Only for him.
Once you have the reassurance that your application does indeed work well with a single user (or user group/organization), you start thinking of expanding to include other users.
You decide to offer your application to a small pool of users first as a paid service. (Yes, it's a not-so-subtle way of saying you want a SaaS application now.) The easy option is to just replicate the entire infrastructure stack for each user coming in. That's exactly like George initially building an entire new house for every person coming in.
Now when I say "replicate the infrastructure stack" what I mean is a containerized application running on a Kubernetes cluster on some cloud provider. Replicating this for each user would mean provisioning a new cluster, running installations, and configuring the cluster for every user.
It sounds simple in theory but think about it a little. You are spinning up clusters at will for each user. You are going through all that laborious installation and configuration on your cloud provider. And we know the cloud providers can send you a nice big bill for a Kubernetes cluster. Unless you have a secret stash of gold somewhere...that is pricey.
Okay, assuming you can handle the money, what about maintenance? After a while, keeping track of so many clusters is going to be cumbersome. Picture this: you are responsible for maintaining 50 clusters and ensuring that they are always up and running, and one fine day... 20 of them go down. Well, good luck answering angry calls from your users.
Okay, how do you tackle this? Remember how George just switched to building apartments? That's what you do.
That's where multi-tenancy comes into the picture. It's one building with multiple apartments. People live in separate apartments and are not worried about the other apartments, but in the end they share the same infrastructure, i.e. the building and its amenities.
Now comparing it to our software, you can have one infrastructure stack being shared by different users. The catch here, though, is that each user is completely segregated from the others and is unaware of any other users. There can of course be variations to this, and you can add more nuances in the application design process. But in a nutshell, that's what a multi-tenant architecture is: one infrastructure, multiple users. Multi-tenant architectures are mostly used to build SaaS-based systems that strike a good balance between cost and scale.
Understand it with an Example
To understand the basic concept a bit further, we take an example of a very minimalistic multi-tenant application.
We are going to extend our "George's Apartments" narrative.
After the apartments are all ready and sold, people start to move in. George wants to keep track of the average energy and water consumption for each apartment. So on arrival, each resident household is asked to install an application where they maintain a record of their monthly expenses on utilities.
Now, as the appointed software engineers/architects/experts over here, if we are to make this application, our main requirement is that each household can only maintain and view its own expenses. (No one needs to know why the neighbors had such a high energy bill last month!) At the same time, we want to make this application as cost-effective as possible.
For simplicity's sake, we call each household a "tenant". Let's see how the overall application should work:
- Each tenant moving into a new apartment is assigned a unique identification number called the tenant ID.
- For each tenant ID, we store details like the name, age, and phone number of the primary resident in the household.
- Once the tenant starts using the application and an expense is added, we store the following details:
1. Tenant ID
2. Amount
3. Description
4. Date when the expense was added.
Translating it to Tech!
Once we have the basic requirements in place, we need to translate these into a system architecture. An easy way is to start with a single-tenant system and then take it to the next level.
System Architecture: Simple View
So, a simple architecture will look like this.
One user interacts with a service that exposes a simple API to track expenses and saves the relevant data to a database. We containerize the service and deploy it to a Kubernetes cluster. Simple. Clean. Easy.
But that's not what we are looking for. We need to think about scaling this to multiple users. Here's how.
Incorporating Multi-Tenancy
Next, we start scaling this architecture to a multi-tenant system. Again, it's good to remember that we will be sharing resources, but each tenant needs to be isolated from the others.
The ideal question over here should be:
What resources can we share?
Answer: The Kubernetes Cluster and the Database.
Got it! But then, how do we target tenant isolation if we share resources? First, let us look at the shared Database.
We have two tables. The first, "tenants", stores the tenant details, with the tenant ID as the unique ID for each tenant. The second, "expenses", stores the amount and description of each expense, and it references the "tenant_id". In this way, we segregate the expenses recorded by each tenant.
So in theory, if you query the database via your app saying "Give me the expenses related to Tenant ID XYZ123", the database should only give you records related to Tenant ID XYZ123.
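Here is a small sketch of that idea in Go. The schema in the comment is an assumption (the post only names the two tables and the "tenant_id" reference), and the in-memory slice merely imitates what the WHERE clause would do in a real database.

```go
package main

import "fmt"

// Assumed schema; column names other than tenant_id, amount, and
// description are guesses for illustration:
//
//	CREATE TABLE tenants  (tenant_id TEXT PRIMARY KEY, name TEXT, age INT, phone TEXT);
//	CREATE TABLE expenses (tenant_id TEXT REFERENCES tenants(tenant_id),
//	                       amount NUMERIC, description TEXT, added_on DATE);
//
// Every read is then scoped to a single tenant (placeholder syntax varies by driver):
//
//	SELECT amount, description, added_on FROM expenses WHERE tenant_id = ?;

// Expense mirrors one row of the shared "expenses" table.
type Expense struct {
	TenantID    string
	Amount      float64
	Description string
}

// ExpensesFor imitates the "WHERE tenant_id = ?" filter on an in-memory table.
func ExpensesFor(table []Expense, tenantID string) []Expense {
	var out []Expense
	for _, e := range table {
		if e.TenantID == tenantID {
			out = append(out, e)
		}
	}
	return out
}

func main() {
	// One shared table holding rows for two different tenants.
	table := []Expense{
		{"XYZ123", 42.50, "electricity"},
		{"ABC789", 18.00, "water"},
		{"XYZ123", 12.25, "water"},
	}
	for _, e := range ExpensesFor(table, "XYZ123") {
		fmt.Printf("%s %.2f %s\n", e.TenantID, e.Amount, e.Description)
	}
}
```

The important design choice is that the tenant ID is never optional: there is no code path that reads expenses without naming a tenant.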
That's the first step. But what about sharing the Kubernetes cluster? Let's look at that too.
Now, we need to know how we can isolate tenants on a single Kubernetes cluster. The first possible solution that comes to mind is "separate tenant, separate namespace". Every tenant has their own namespace where a copy of the application is running. It's basically using the namespace as the unit of segregation.
It sounds good in theory, but it is not the best way to achieve isolation. Why? First, a bunch of resources are cluster-wide and independent of namespaces: things like persistent volumes, storage classes, cluster roles, and the custom resource definitions (CRDs) that operators install. If any one of these resources is changed or broken, every tenant on the cluster can be affected, irrespective of namespace.
Next, the Kubernetes control plane (namely etcd, the API server, the scheduler, and the controller manager) is also shared by the whole cluster. If any of these components is affected by a faulty configuration, your entire cluster is affected.
So how do we overcome these challenges?
This is the beginning of an interesting concept called Virtual Clusters.
Virtual Clusters - The Cluster Inception
One "supposed" Kubernetes cluster running in a namespace of another Kubernetes cluster. That's what you'd call a virtual cluster.
In other words, you have a lightweight Kubernetes cluster that ships with its own control plane components (an API server, a controller manager, a backing store such as etcd, and optionally its own scheduler) but runs inside a namespace of an actual cluster, using the host's nodes and networking resources. Like a cluster parasite.
Why use a Vcluster?
Two reasons: Isolation and Cost.
Namespaces in Kubernetes do give you some amount of isolation, but not complete isolation. There is a common control plane, and there are global resources that are available cluster-wide and independent of namespaces. So, if one of the cluster-scoped resources is affected, all the namespaces are affected. With vclusters, resources are created inside the virtual cluster and kept in its own data store, so most of them never touch the host's control plane. Only the low-level workloads, such as pods, are synced to the host namespace so that they can actually run.
Next is the cost. You can run multiple virtual clusters on one single cluster and configure each according to a different tenant's needs. This is definitely cheaper than provisioning, configuring, and maintaining a separate cluster for every tenant.
They are faster to spin up and shut down as compared to provisioning an entire new cluster. Moreover, it's a faster option when you want to quickly test out new developments or have a "playground" environment and need Kubernetes capability for a short time.
Another advantage is the "pause" feature. You can temporarily pause a vcluster and later bring it back up in the exact state it was in before. Think about it: if your "tenant" is a customer who forgot to pay their bills, you can pause their vcluster until the bills are paid. Pretty profitable, right?
Conclusion
Now that we have the basics cleared up, we can actually try out a few things ourselves. I'm talking about writing code and using that beautiful black-and-white terminal! We will use minikube and vcluster to deploy a barebones application in Go, and hopefully convince you of how nifty vclusters can be.