Last week, we published our second episode of VM End-to-End, a series of curated conversations between a “VM skeptic” and a “VM enthusiast”. Join Brian and Carter as they explore why VMs are some of Google’s most trusted and reliable offerings, and how VMs benefit companies operating at scale in the cloud. Here’s a transcript:
Carter Morgan: Welcome to VM End to End, a show where we have a VM skeptic and a VM enthusiast try and hash out are VMs still amazing and awesome? So Brian, thanks for coming in today. I appreciate you.
Brian Dorsey: Happy to be here. Let’s get to it.
What a Cloud VM is
Carter: Yes. Yes. Last time, we talked about whether VMs are still relevant in a cloud-native future. You said, “definitely yes.” So today I want to go a little bit deeper and find out what exactly is a cloud VM? Can you help me out?
Brian: Absolutely. Let’s get into it. And first, it’s a virtual machine. I think most of what people know about virtual machines is all true for cloud VMs. And the thing I hope we get out of this today is that there’s a bunch of extra stuff you might not know about or might not think to look into.
Carter: Okay. All right. It’s like a regular VM, but not. So a regular VM. We said it’s a machine on a computer. You said a cloud VM is a little bit different. What are the specific differences with the real parts, the physical parts?
Brian: Yeah. In the end, it’s all running on real computers somewhere. So when the VM is up and running, you’ve got a real CPU. You’ve got physical memory. And I think the parts that are the most fluid are probably the disk and the network. And in Google Cloud, those are both software-defined services that give you a lot of flexibility. So I think the takeaway here is instead of thinking about it as a slice of one computer, think about it as a slice of a data center’s worth of computers.
How we interface with a “slice of the data center”
Carter: Wow. All right. So that’s an interesting thought to me. And what I’m curious about is if I have a slice of a data center, how is that manageable or usable? If I wanted to make a video game, what can I do with a slice of a data center?
Brian: I think that’s a great example. So it’s down to what are you trying to do? And we then take a bunch of the variables that are possible and group them up. So we group them up into compute optimized, memory optimized, accelerator optimized, and then general purpose. So the compute ones are for where you need the lowest latency, single-threaded computation possible. Memory is where you need the maximum memory you can possibly stuff in a machine. You’re running an in-memory database or something. And the accelerator ones are for when you need a GPU or one of our tensor processing units. And then for everything else, general purpose.
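Editor's note: the grouping Brian describes can be read as a simple decision rule. The sketch below is only an illustration of that rule. The function name, the flags, and the order in which the checks run are assumptions made for this example, not part of any Google Cloud API.

```python
from enum import Enum


class MachineFamily(Enum):
    """The four groupings named in the conversation."""
    GENERAL_PURPOSE = "general purpose"
    COMPUTE_OPTIMIZED = "compute optimized"
    MEMORY_OPTIMIZED = "memory optimized"
    ACCELERATOR_OPTIMIZED = "accelerator optimized"


def pick_family(needs_accelerator=False,
                needs_max_memory=False,
                needs_fast_single_thread=False):
    """Hypothetical helper: map workload traits to a machine family.

    The priority order below is an assumption for illustration only.
    """
    if needs_accelerator:          # GPU or TPU workloads
        return MachineFamily.ACCELERATOR_OPTIMIZED
    if needs_max_memory:           # e.g. an in-memory database
        return MachineFamily.MEMORY_OPTIMIZED
    if needs_fast_single_thread:   # lowest-latency single-threaded compute
        return MachineFamily.COMPUTE_OPTIMIZED
    return MachineFamily.GENERAL_PURPOSE  # everything else
```

Calling `pick_family()` with no flags set falls through to general purpose, which matches Brian's "for everything else" description.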
Carter: General purpose. It’s like most people just need a laptop. So, okay. You have these specific groupings, maybe families of machines ordered. Within those families, can I specialize, like if I need a high-end gaming laptop versus just a low-end gaming laptop?
Brian: Absolutely. And the first knob you have is just how big it is. So how many CPUs and memory. So you can have a two-core machine or a 20 or a 460-core machine.
Brian: Yeah, really. And up to 12 terabytes of memory right now. And those numbers keep getting bigger over time. So it depends when you see this, it might be different. And by default, those come in a preset ratio. So they grow together. But part of the main reason people wanted to use containers at all is that not everything fits in that exact shape. So you end up orphaning some of the capacity and you’re wasting money. So we also allow you with the general purpose machines to pick your own ratio. So if you’re like, “Oh, I know this is a really CPU-heavy thing, but I don’t need that much memory.” You can make a computer that’s that shape, and you’ll only pay for what you actually need.
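Editor's note: to see why a preset ratio can orphan capacity, here is a small arithmetic sketch. The ratio of 4 GB of memory per vCPU is a made-up number chosen for the example, and the function names are hypothetical. Real machine types and their ratios vary.

```python
import math

# Assumption for illustration: preset shapes grow at 4 GB of memory per vCPU.
PRESET_GB_PER_VCPU = 4


def preset_shape(vcpus_needed, memory_gb_needed):
    """Smallest fixed-ratio shape that covers both requirements."""
    vcpus = max(vcpus_needed,
                math.ceil(memory_gb_needed / PRESET_GB_PER_VCPU))
    return vcpus, vcpus * PRESET_GB_PER_VCPU


def orphaned_capacity(shape, vcpus_needed, memory_gb_needed):
    """Capacity you would pay for but not use: (spare vCPUs, spare GB)."""
    vcpus, memory_gb = shape
    return vcpus - vcpus_needed, memory_gb - memory_gb_needed


# A CPU-heavy job that needs 16 vCPUs but only 16 GB of memory.
shape = preset_shape(16, 16)
spare = orphaned_capacity(shape, 16, 16)
```

Under this assumed ratio, the CPU-heavy job lands on a 16 vCPU, 64 GB shape and leaves 48 GB unused. A custom shape of exactly 16 vCPUs and 16 GB would leave nothing orphaned, which is the saving Brian is pointing at.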
Where the physical disks live in the data center
Carter: Oh, that’s really cool. Okay. So if you can make your own shape, somewhere there has to be physical memory. So where do we get this in a cloud VM?
Brian: Yep. So when you go to set one of these up, we find one of the machines out there that has space for the shape you want and start it up there. And so this Tetris problem becomes not your problem. And we’re big enough that there’s almost always a good spot for that to fit.
Carter: Yeah. And so these are all on one machine, it just sounded like there.
Brian: Oh. So there’s a data center worth of machines. And so when you go to start yours up, we find the right spot for it, find a computer that has open space.
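Editor's note: the "Tetris problem" Brian mentions is a placement problem. The sketch below uses a plain first-fit strategy to show the idea. It is not a description of how Google's scheduler actually works, and the `Host` class and `place_vm` function are hypothetical names invented for this example.

```python
from dataclasses import dataclass


@dataclass
class Host:
    """A physical machine with some unallocated capacity (hypothetical model)."""
    name: str
    free_vcpus: int
    free_memory_gb: int


def place_vm(hosts, vcpus, memory_gb):
    """First-fit placement: use the first host with room for the requested shape.

    Returns the chosen host's name, or None if nothing fits.
    """
    for host in hosts:
        if host.free_vcpus >= vcpus and host.free_memory_gb >= memory_gb:
            # Reserve the capacity so later requests see the updated free space.
            host.free_vcpus -= vcpus
            host.free_memory_gb -= memory_gb
            return host.name
    return None
```

With a large enough pool of hosts, some host almost always has a gap of the right size, which is why the person creating the VM never has to think about where it lands.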
Carter: So tell me a little bit more about disk then in this slice of a data center world.
Brian: So if we can just turn it on on any one of these computers in a data center, how do the disks work? Where’s my data? And so by default, our disks are almost always network-attached storage. And so instead of a physical disk plugged into one computer… I mean, those still exist, but we provide a virtual disk made up of hundreds or thousands of disks plugged into a whole bunch of computers, and then assemble the blocks together virtually, which gives you… You actually get higher performance than you might get normally. So the throughput is very fast. And then you get a bunch of the features that you might expect from a SAN, a storage area network.
Carter: So you’re going to attach and detach on the fly. You can-
Carter: Okay. That’s really cool.
Brian: So you can do that. You can resize it and use the snapshots to take backups. But the resize thing is amazing. If you run out of disk space, you can actually make the disk bigger while the computer’s running.
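Editor's note: one way to picture a virtual disk assembled from blocks on many physical disks is a block map. The sketch below uses simple round-robin striping, which is an assumption made for illustration and not a description of how Google's persistent disks are implemented. The `VirtualDisk` class and its methods are hypothetical.

```python
class VirtualDisk:
    """Toy model: logical blocks striped round-robin over physical disks."""

    def __init__(self, num_blocks, num_physical_disks):
        self.num_blocks = num_blocks
        self.num_physical_disks = num_physical_disks

    def locate(self, block):
        """Return (physical disk index, block offset on that disk)."""
        if not 0 <= block < self.num_blocks:
            raise IndexError("block is outside the virtual disk")
        return block % self.num_physical_disks, block // self.num_physical_disks

    def resize(self, new_num_blocks):
        """Grow the disk. Existing blocks keep their locations."""
        if new_num_blocks < self.num_blocks:
            raise ValueError("this sketch only supports growing the disk")
        self.num_blocks = new_num_blocks
```

Because neighbouring blocks sit on different physical disks, sequential reads can be served in parallel, which is one intuition for the higher throughput Brian mentions. Growing the disk only extends the map, so nothing already written has to move while the machine keeps running.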
What OSes are allowed to run on a Cloud VM
Carter: Brian, you’re blowing my mind right now. I’ve got to act skeptical. I’m going to be skeptical. But you’re blowing my mind. Here’s something I’m curious about: When I’m using containers, I can select an operating system. And so the benefit of that is I can write applications to operating systems that I know and love. I can only use what I need. Is there that same concept in the VM world, or am I stuck… Am I forced to use certain types of operating systems to use a cloud VM?
Brian: Yeah. Very similar concept. Whereas in a container, it’s mostly just the files that make up that system per runtime, whereas here, we have the whole operating system running. But other than that, it’s really pretty much the same concept. Most workloads these days are running on Linux or Windows. And so we provide pre-made images of Debian, CentOS, CoreOS, Ubuntu, Red Hat Enterprise, SUSE, and Windows Server Datacenter, a bunch of different versions. So when you create a VM, you say, “Okay, I want it to run this OS,” and it makes a copy of that image on the fly and boots your machine off of that.
Carter: Okay. That’s really cool. Can you use your own at all?
Brian: Yeah. Two ways to do it. So one, if you want to use one of these OSes and just put your flavor on it, add some tools, configure it the way you want it to be, you can boot off of one of them and then make a new image based off of the work you did. So that’s a really common thing. And then if you want to, it’s a block device. And so you can make a customized version of an OS or develop a whole new OS. And as long as that runs in a virtual machine, you can boot off of it and go.
What you can actually *do* with Cloud VMs
Carter: I got to be honest. It sounded like there’s a lot of flexibility. All these things, I’m like, “Well, in containers you can do this.” And you’re like, “Yes, you can do this in the VM world too.”
Brian: And a lot of it’s based on… So this is a high-level version of what a cloud VM is. You can basically run anything that runs on a computer.
Carter: Okay. All right. We just specified really quick. There’s some virtual parts, there’s some physical parts. Your disks are going to be spread out over a wide data center, pooled together to give you more reliability, more consistency. A lot of times you said it’s even faster throughput. This is really cool. What I’m curious about is what are actual things that are non-obvious extensions of this? What can I do with this?
Brian: I think one of the underused or underknown things is actually running a container, one container per VM as a pattern.
Carter: Yeah. Why would I do that?
Brian: So a couple of different reasons. One, containers have become a distribution format. So a lot of software’s already ready to go in a container. And sometimes, you don’t want to deal with setting up a whole cluster or managing some other stuff. You just want to run one thing. So that’s a reason to do it. And then sometimes, there’s constraints. Like that particular thing, it might make sense to run it on a very large machine for a short amount of time, or it needs some particular configuration that you might not have otherwise. So it may make sense to run-
Carter: Right. And still use that packaging of containers.
Brian: Yeah. One-to-one.
Carter: Okay. That makes a lot of sense. All right. But I mean, in theory, I could still run containers for a lot of this. What are some other features of cloud VMs that you’re excited about?
Brian: Yeah. So one, I think, is it’s really commonly used in almost all architectures, and pretty much everybody has a load balancer when you have multiple machines, right?
Carter: Mm-hmm (affirmative).
Brian: And the non-obvious cool thing is that yes, we have a load balancer, but it’s a load balancer service that is provided at the data center level. It’s not some particular computer that has a certain capacity that as soon as you hit, things start slowing down or getting overdrawn. So you’re actually configuring the data center level load balancer that Google uses for YouTube and Gmail to run your own machines.
Why developers operate at the VM / IaaS level
Carter: So one, that’s just really cool, thinking about that concept. But what I’m blown away right now is thinking that in Kubernetes, I use services all the time. And if I’m using GKE (aka Google Kubernetes Engine), the load balancer that’s provided is the cloud load balancer, the Google one. So even then, I’m using the Google Cloud load balancer. My question though is I can still access this load balancer. It sounds like it’s configured already for me through something like Kubernetes. Is there a reason to go lower? Is there a reason to go to this level?
Brian: So if you’re already using Kubernetes, use Kubernetes. Those patterns are super useful, but not all software is set up to run containers. So if you want to use those same patterns-
Carter: That pattern of having a single end point that you can communicate with.
Brian: Yeah. There’s this idea of having a known endpoint that provides this service. And then there’s a group of containers usually in Kubernetes, but a group of computers, in this case, that do that. And once you do that, you have a known endpoint and a group of computers. And we call that group in Compute Engine a managed instance group. Then you can put a bunch of logic on that. So it’s actually a service in and of itself. So it handles starting up new machines when they’re needed. If you turn the dial up and you’re like, “Oh, I have five now and I want to have 10,” it starts the new ones. That can be set up to be run automatically depending on the load you get. And then you can spread those out across the world. You can have some of them running in one country, some of them running somewhere else, and route the traffic to the closest one for your users, that sort of thing.
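Editor's note: the two behaviours Brian describes for a managed instance group, scaling with load and routing users to the closest group, can be sketched in a few lines. This is a simplified model under stated assumptions, not the Compute Engine autoscaler or load balancer. The function names, the load units, and the latency table are all hypothetical.

```python
import math


def target_size(total_load, load_per_instance, min_size, max_size):
    """How many instances the group should run for the current load.

    Assumes each instance comfortably handles `load_per_instance` units
    (for example, requests per second) and clamps to the group's limits.
    """
    wanted = math.ceil(total_load / load_per_instance)
    return max(min_size, min(max_size, wanted))


def closest_region(latency_ms_by_region):
    """Pick the region with the lowest measured latency for a user."""
    return min(latency_ms_by_region, key=latency_ms_by_region.get)
```

For example, with a floor of 5 instances, a ceiling of 20, and 100 units of load per instance, a load of 950 units gives a target of 10 instances, which is the "I have five now and I want to have 10" case handled automatically.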
Carter: I’m going to have to find out more about this. I’m going to have to dig in deeper because I want to be skeptical. And I’m like, “This all sounds amazing.” Further, I think… I don’t want this conversation to go too long, but I’m definitely going to want to dig in deeper here. In fact, maybe we can have an episode… Would you count this as an admin infrastructure networking? What is this that we’re talking about?
Brian: Yeah. I think we should dive into networking a bit more next and how that actually works. And when I say it’s not running on one box, how does, “Wait, what? How do you distribute traffic if it’s not going through one machine?” So let’s do networking. And then I love the discs, and there’s a lot more to talk about there. Let’s do that. What else do you want to hear about?
Carter: There is, for sure. I want to hear about costs. I’m going to have to do some of my own homework after hearing about machine families and all of this. I need to go start and create a VM. And I hope people listening at home do the same thing. Yes. Because I’m going to be more skeptical next episode. Okay? I’m going deeper. But this episode, I have to admit, cloud VMs sound pretty cool.
Brian: They are. Give it a try.
Carter: All right. Well, thank you, Brian. And I’ll catch up with you next time.
Brian: See you soon.
So there you have it: we learned what cloud VMs were last time, but this time we focused on what cloud VMs are made of. Since cloud VMs are a slice of a data center, they have some benefits over traditional VMs: for example, disks can be fit to exactly the workload you’re trying to run. In other instances, cloud VMs behave like traditional VMs and, as Brian stated, can “run anything a computer can run.”
If you want to learn more about GCE, be sure to check it out here: https://cloud.google.com/compute