Software as a Service (SaaS) is the delivery method of choice for software vendors looking to offer a turnkey, reliable product experience to their end customers. A company must weigh many considerations when building a SaaS offering, one of which is the framework used to run the SaaS application. Since modern software development relies on containers, a natural and popular choice for running modern SaaS platforms is Kubernetes, the widely used container orchestrator. In this post we go over the fundamentals of choosing an architecture when building a SaaS platform on Google Kubernetes Engine (GKE).
Benefits of using GKE for SaaS applications
GKE is a managed, production-ready environment for deploying containerized applications. It is based on Kubernetes, an open-source system for automating the deployment, scaling, and management of containerized applications. Google donated Kubernetes to the CNCF and is still the primary contributor to the project.
GKE offers a number of benefits for SaaS applications, including:
Globally available IP addresses that can be set up to route to one or more clusters depending upon the location of the inbound request. This enables advanced DR and application routing configurations. (To learn more about this, read about multi-cluster ingress and Google Cloud’s Global Load Balancer.)
Cost optimization: GKE provides cost optimization insights to help you align your infrastructure spend with your usage.
Scalability: GKE can easily scale your applications up or down to meet demand. Current scale limits are 15,000 nodes per cluster, which leads the industry.
Advanced storage options help you access data reliably, securely, and with high performance.
4 popular SaaS GKE architectures
When choosing a SaaS architecture, you must first think about your isolation requirements and the nature of your SaaS application. There is a trade-off between cost and the degree of isolation, which can be applied at the level of the Kubernetes namespace, node, or cluster. Cost increases with each level. In the following section, we outline architectures based on each in more detail, including their pros and cons. In addition to all the methods mentioned below, you can increase security on the host system by using GKE Sandbox. The main GKE Security overview page also covers network security considerations.
1. Flat multi-tenant application
One way to host a SaaS application is to set up single-ingress routing to a Kubernetes namespace that holds one copy of the SaaS application. The ingress router has the intelligence to serve data unique to the authenticated user. This setup is common for SaaS applications that don’t need to isolate users beyond the application’s software layer, and it is often only possible for applications that control tenancy in the software layer of the main SaaS application. CPU, memory, and disk/storage are scaled with the default autoscaler as the application grows, without concern for which user is driving the most usage. Storage can be connected via persistent volume claims that are unique to each pod.
Pros:
The cluster and nodes are managed as a single and uniform resource.
Cons:
The same underlying server is used for multiple tenants, and CPU spikes or networking events caused by one tenant (“noisy neighbors”) may affect other tenants.
The same cluster control plane is used for multiple tenants, which means that any upgrades to the cluster apply to all tenants at the same time.
User data is only isolated at the application layer, meaning problems in the application could expose one user’s data to another user.
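To make the last point concrete, here is a minimal sketch of application-layer tenancy, assuming a single shared datastore. The `TenantScopedStore` class and the tenant names are hypothetical and only illustrate the idea: the tenant filter in the application code is the sole isolation boundary in this pattern.

```python
from dataclasses import dataclass, field


@dataclass
class TenantScopedStore:
    """Hypothetical in-memory stand-in for a datastore shared by all tenants."""

    _rows: list = field(default_factory=list)

    def insert(self, tenant_id: str, record: dict) -> None:
        # Stamp the owner last so a record cannot override its own tenant_id.
        self._rows.append({**record, "tenant_id": tenant_id})

    def list_for(self, tenant_id: str) -> list:
        # This filter is the only thing keeping tenants apart; a bug here
        # would expose one tenant's rows to another.
        return [row for row in self._rows if row["tenant_id"] == tenant_id]


store = TenantScopedStore()
store.insert("acme", {"doc": "invoice-1"})
store.insert("globex", {"doc": "invoice-2"})
acme_rows = store.list_for("acme")
```

In a real application the `tenant_id` would come from the authenticated request rather than from the caller, so that no code path can query without it.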
2. Namespace-based isolation in a multi-tenant cluster
In this pattern, you set up single-ingress routing that uses the host path to route to the appropriate namespace, which holds a copy of the application dedicated to a given customer. This setup is common for vendors who require highly efficient resource isolation for their customers. Each namespace can be given a CPU and memory allocation and can share excess capacity during spikes. Storage can be connected via persistent volume claims that are unique to each pod.
Pros:
Tenants can share resources in an isolated environment to boost efficiency and increase security.
The cluster and nodes are managed as a single and uniform resource.
Cons:
The same underlying server is used for multiple tenants, so CPU spikes or networking events caused by one tenant may affect others.
The same cluster control plane is used for multiple tenants, which means that any cluster upgrades apply to all tenants at the same time.
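As an illustration of the per-namespace CPU and memory allocation described above, the following sketch builds a `Namespace` and a `ResourceQuota` manifest for one tenant as plain Python dictionaries. The `tenant-<name>` naming scheme, the quota name, and the sizes are assumptions made for this example. The quota caps resource requests only, which is one way to let tenants burst into spare capacity through higher container limits.

```python
import json


def tenant_namespace_manifests(tenant: str, cpu: str, memory: str) -> list:
    """Build a Namespace and a ResourceQuota for one tenant (illustrative)."""
    namespace = f"tenant-{tenant}"  # hypothetical naming convention
    return [
        {
            "apiVersion": "v1",
            "kind": "Namespace",
            "metadata": {"name": namespace, "labels": {"tenant": tenant}},
        },
        {
            "apiVersion": "v1",
            "kind": "ResourceQuota",
            "metadata": {"name": "tenant-quota", "namespace": namespace},
            # Cap the sum of requests in the namespace; limits are left
            # unconstrained here so pods may use idle capacity during spikes.
            "spec": {"hard": {"requests.cpu": cpu, "requests.memory": memory}},
        },
    ]


manifests = tenant_namespace_manifests("acme", cpu="4", memory="8Gi")
print(json.dumps(manifests, indent=2))
```

Generating the manifests from a function like this keeps every tenant namespace identical apart from its name and quota, which makes onboarding a new tenant a single call.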
3. Node-based isolation
As in the previous pattern, you set up single-ingress routing that uses the host path to route to the appropriate namespace, which contains a copy of the application dedicated to a given tenant. However, the containers running the application are pinned to specific nodes using labels. This provides the application with node-level isolation in addition to namespace isolation. This type of deployment is used when applications are very resource-intensive.
Pros:
Tenants have dedicated resources in an isolated environment.
The cluster and nodes are managed as a single and uniform resource.
Cons:
Each tenant gets their own node and will consume infrastructure resources regardless of whether the tenant is using the application.
The same cluster control plane is used for multiple tenants, which means that any upgrades to the cluster apply to all tenants at the same time.
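Pinning with labels can be sketched as a Deployment whose pod template carries a `nodeSelector`. The example below assumes that the tenant's nodes carry a hypothetical `tenant=<name>` label and a matching `NoSchedule` taint. The selector pulls the tenant's pods onto those nodes, while the taint (set on the nodes, not shown here) is what keeps other tenants' pods off them.

```python
def pinned_deployment(tenant: str, image: str, replicas: int = 1) -> dict:
    """Build a Deployment pinned to one tenant's nodes (illustrative)."""
    labels = {"app": "saas-app", "tenant": tenant}
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": "saas-app", "namespace": f"tenant-{tenant}"},
        "spec": {
            "replicas": replicas,
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    # Schedule only onto nodes labeled for this tenant.
                    "nodeSelector": {"tenant": tenant},
                    # Tolerate the assumed tenant taint on those nodes.
                    "tolerations": [
                        {
                            "key": "tenant",
                            "operator": "Equal",
                            "value": tenant,
                            "effect": "NoSchedule",
                        }
                    ],
                    "containers": [{"name": "app", "image": image}],
                },
            },
        },
    }


# The image path is a placeholder, not a real registry location.
deployment = pinned_deployment("acme", image="example.com/saas-app:1.0")
pod_spec = deployment["spec"]["template"]["spec"]
```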
4. Cluster-based isolation
The final pattern is to use a unique ingress route for each cluster, where each cluster hosts a specific version of the application that is dedicated to a given customer. This type of deployment is used when applications are very resource-intensive and also require the strictest security standards.
Pros:
Tenants have dedicated resources in completely isolated environments with a dedicated cluster control plane.
Cons:
Each tenant gets their own cluster and will consume infrastructure resources regardless of whether they are using the application.
Clusters have to be updated independently, which can add a substantial operational burden.
Once you’ve selected your baseline architecture, runtime storage is the next piece you’ll need to add to your setup. There are many different options for container-based storage, and the best one will depend on your application’s specific requirements. Latency and persistence are perhaps the most important factors to consider in this decision. You can use the overview below to help guide you through the available options.
Local SSD (ephemeral storage):
Minimum capacity: 375 GB per disk
Use local SSDs if your applications:
Download and process data, such as AI or machine learning, analytics, batch processing, local caching, and in-memory databases.
Have specialized storage needs and need raw block access on high-performance local storage.
Alternatively, you may want to run specialized data applications and operate local SSDs like a node-level cache for your Pods. You can use this approach to drive better performance for in-memory database applications such as Aerospike or Redis.
Persistent storage disk:
Minimum capacity: 10 GB
Use Persistent Disk storage if your clusters require access to high-performance, highly available, durable block storage. A Persistent Disk volume is typically attached to a single Pod, and this storage option supports the ReadWriteOnce access mode. GKE supports configuring Persistent Disk volumes with a range of latency and performance options. These disks are best used for general workloads or databases that require performant persistent block storage. Additionally, they can be set up with regional redundancy.
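A Persistent Disk volume is usually requested through a PersistentVolumeClaim. The sketch below builds one with the ReadWriteOnce access mode mentioned above. The storage class name `standard-rwo` is an assumption; run `kubectl get storageclass` to see which classes your cluster actually offers. The size check mirrors the 10 GB minimum listed above and treats it as 10 Gi for simplicity.

```python
def persistent_disk_claim(
    name: str,
    namespace: str,
    size_gib: int,
    storage_class: str = "standard-rwo",  # assumed class name, verify in your cluster
) -> dict:
    """Build a PersistentVolumeClaim for a Persistent Disk volume (illustrative)."""
    if size_gib < 10:
        # Mirrors the minimum capacity listed for Persistent Disk above.
        raise ValueError("Persistent Disk volumes need at least 10 GB")
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            # ReadWriteOnce: mounted read-write by a single node at a time.
            "accessModes": ["ReadWriteOnce"],
            "storageClassName": storage_class,
            "resources": {"requests": {"storage": f"{size_gib}Gi"}},
        },
    }


claim = persistent_disk_claim("app-data", "tenant-acme", size_gib=50)
```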
Cloud Storage buckets:
You can also connect Cloud Storage buckets to your applications. You can even mount storage buckets in your applications and interact with the remote files locally by using the FUSE CSI driver. This is a great option if you have pipelines that hydrate your storage outside of your core application.
Filestore:
Minimum capacity: 1 TB
Filestore provides centralized NFS storage that multiple pods can connect to for read/write operations. This is a good option when multiple services in your cluster need read/write access to a central storage facility.
Finally, many GKE users connect their SaaS applications to hosted services such as Cloud Spanner, Cloud SQL, or Redis. This is a great option if you don’t need self-managed data services and can take advantage of Google Cloud managed services to lower your operational burden.
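The in-cluster storage options above can be summarized as a small lookup, sketched here under simplifying assumptions. The `StorageOption` fields and the `candidates` helper are hypothetical, the capacities are the minimums listed above, and latency is deliberately left out, so treat the result as a starting shortlist rather than a recommendation.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class StorageOption:
    name: str
    persistent: bool               # data survives beyond the node's lifetime
    multi_pod_read_write: bool     # many pods can read/write the same data
    min_capacity_gb: Optional[int]  # None where no minimum is listed above


OPTIONS = [
    StorageOption("Local SSD", False, False, 375),
    StorageOption("Persistent Disk", True, False, 10),
    StorageOption("Cloud Storage (FUSE CSI driver)", True, True, None),
    StorageOption("Filestore", True, True, 1000),  # 1 TB
]


def candidates(needs_persistence: bool, needs_shared_access: bool) -> list:
    """Return the names of the options that satisfy both requirements."""
    return [
        option.name
        for option in OPTIONS
        if (option.persistent or not needs_persistence)
        and (option.multi_pod_read_write or not needs_shared_access)
    ]


shared_and_durable = candidates(needs_persistence=True, needs_shared_access=True)
```

For example, a workload that needs durable data shared by several services narrows the list to Cloud Storage and Filestore, while a scratch-space workload with no such requirements leaves all four options open.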
“Building Weaviate Cloud Services (WCS) on Google GKE enables us to scale our AI-native vector database elastically to meet our customers’ needs.”
– Etienne Dilocker, Co-Founder & CTO, Weaviate
“By building EDB BigAnimal™ on Google GKE, we were able to leverage EDB’s existing Kubernetes expertise to deliver an intelligent managed database platform. GKE lets us embed automation in the platform with our operators while supporting a wide range of compute and storage options to our customers.”
– Benjamin Anderson, SVP of Technology and Strategy, EnterpriseDB
“Our next-generation serverless offering is tailored to meet our customers’ varied needs across their use cases for Observability, Security and Search, with zero administration overhead. For Google Cloud, we’ve chosen GKE, which enables us to utilize the latest compute and storage options in Google Cloud and achieve the most optimal combination between cost, performance and ease of management.”
– Steve Kearns, Vice President, Product Management, Elastic
If your organization is considering delivering its SaaS applications on GKE, you have a lot to think about: the benefits you hope to gain from doing so, the pros and cons of the various architectures, and your many storage options. You may also want to familiarize yourself with GKE, GKE security, and how to distribute and monetize your SaaS app from within Google Cloud.