Zone vs Region Redundancy

Welcome to our latest video in the series on High Availability and Disaster Recovery. Today, we’re diving into a crucial topic that affects anyone using cloud services: the difference between zone and region redundancy. Whether you’re leveraging Azure or any other cloud provider, understanding these concepts is essential for safeguarding your data against disasters. We’ll explore how data redundancy works across different geographic locations and data centers, discuss various redundancy options like local, zone, and global, and explain the costs and logistics of each. So stay tuned as we break down these complex topics to help you make informed decisions for your critical workloads. Let’s get started!

So we’re continuing our series on high availability and disaster recovery. What we’re going to drill into a little bit is zone versus region redundancy. Let’s say you’re using Azure; other cloud providers offer similar options. You have different geographical locations, and in Azure we call them regions. Within a region, you have one or more data centers. For example, in UK South there are three availability zones, so three data centers. In UK West, there is actually only one. When your data is written to more than one computer in the same data center, that’s called locally redundant storage. This is the cheapest option and is available nearly everywhere. But if you want to ensure that your data is written to multiple computers across multiple data centers in the same region, you need zone-redundant storage. That means that if one of these data centers is wiped out, say someone cuts through the fiber optic cables, you’re still okay because you have two other data centers to carry on.
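To make the difference between the redundancy levels concrete, here is a minimal Python sketch. It is an illustrative model only, not an Azure API: the `Replica` class, the `survives` helper, and the machine names are all made up for the example, and the region labels are just placeholders for a zoned region and a second region.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple


@dataclass(frozen=True)
class Replica:
    """One copy of the data (hypothetical model, not a cloud SDK type)."""
    region: str
    zone: str      # data center (availability zone) within the region
    machine: str


def survives(
    replicas: Sequence[Replica],
    failed_region: Optional[str] = None,
    failed_zone: Optional[Tuple[str, str]] = None,  # (region, zone)
    failed_machine: Optional[str] = None,
) -> bool:
    """Return True if at least one replica sits outside the failed component."""
    for r in replicas:
        if failed_region is not None and r.region == failed_region:
            continue
        if failed_zone is not None and (r.region, r.zone) == failed_zone:
            continue
        if failed_machine is not None and r.machine == failed_machine:
            continue
        return True
    return False


# Locally redundant: several machines, one data center.
locally_redundant = [Replica("uksouth", "1", m) for m in ("a", "b", "c")]
# Zone redundant: one copy in each of three data centers in one region.
zone_redundant = [Replica("uksouth", z, f"m{z}") for z in ("1", "2", "3")]
# Globally redundant: local copies plus copies in a second region.
geo_redundant = locally_redundant + [Replica("ukwest", "1", m) for m in ("x", "y", "z")]

print(survives(locally_redundant, failed_machine="a"))         # True
print(survives(locally_redundant, failed_zone=("uksouth", "1")))  # False
print(survives(zone_redundant, failed_zone=("uksouth", "1")))  # True
print(survives(zone_redundant, failed_region="uksouth"))       # False
print(survives(geo_redundant, failed_region="uksouth"))        # True
```

Each level only protects you against failures smaller than the thing it spreads copies across: machines, then data centers, then regions.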

Globally redundant storage (geo-redundant storage, in Azure’s naming) is where your data gets replicated to another region. These options get more expensive as you go through them. One thing about global redundancy with storage is that the replication is always, or almost always, asynchronous. In other words, if a region went down, there is a possibility that the most recent data hasn’t actually been written to the other location yet. That’s something to bear in mind. There are only certain cases where you can ensure it is synchronous. If you have your whole setup in another region as well, you have options there too. You can have a cold standby; in other words, you have DevOps scripts that can spin up your application in another region. You aren’t charged for those resources, because they don’t exist until you need them. It takes some time, maybe a few minutes, maybe a couple of hours, to bring up your application in another region. But that’s the cheaper option: you have to write the DevOps scripts and all that, but you’re not paying for any resources in the other region. You could have a warm standby, where everything’s deployed but not being used, and all you have to do is flip a switch.
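The point about asynchronous replication can be shown with a toy model. This is a sketch under simple assumptions, not how any particular storage service is implemented: the `AsyncReplicatedStore` class and its methods are hypothetical, and replication is triggered by hand so the lag is easy to see.

```python
from collections import deque


class AsyncReplicatedStore:
    """Toy model: the primary region acknowledges a write immediately and
    ships it to the secondary region later."""

    def __init__(self):
        self.primary = {}
        self.secondary = {}
        self._pending = deque()  # writes not yet copied to the secondary

    def write(self, key, value):
        self.primary[key] = value
        self._pending.append((key, value))

    def replicate(self, max_items=None):
        """Copy up to max_items pending writes to the secondary (all if None)."""
        count = len(self._pending) if max_items is None else min(max_items, len(self._pending))
        for _ in range(count):
            key, value = self._pending.popleft()
            self.secondary[key] = value

    def fail_over(self):
        """The primary region is lost: return keys whose writes never arrived."""
        lost = sorted({key for key, _ in self._pending})
        self.primary = {}
        self._pending.clear()
        return lost


store = AsyncReplicatedStore()
store.write("order-1", "paid")
store.write("order-2", "paid")
store.replicate(max_items=1)     # only order-1 reaches the secondary
store.write("order-3", "paid")
lost = store.fail_over()         # region goes down before replication catches up

print(lost)                      # ['order-2', 'order-3']
print(sorted(store.secondary))   # ['order-1']
```

The writes that were acknowledged but still queued are exactly the data you can lose when the region goes down, which is why the replication lag matters when you plan for a regional failure.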

Or it could be a hot standby. In other words, it’s already processing data. Maybe you’re processing across both regions simultaneously, and they’re working all the time. You’re load balancing between regions, and that’s hot. Now, that’s the most expensive, but it means you don’t lose a beat when you actually want to flip over. So, a few options there. But something to bear in mind is that different regions have different numbers of data centers. Zone-redundant storage or zone-redundant services are not always available, because some regions only have one availability zone. Also, not all services are available in all regions. So you might be running something here and want to fail over to a different region, but that specific part of Azure might not be supported there. You need to check that. Now, if you just lose some servers or someone cuts a cable, zone redundancy is usually alright. For most applications, apart from the most heavily loaded and the most valuable, making it zone redundant (running in one Azure region but across multiple data centers) is usually okay. But there are times when, for your really critical workloads, you might want to ensure that you can fail over to another region.
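The "you need to check that" step can be sketched as a simple pre-flight check. Everything here is a placeholder: the region names, zone counts, and service names in `REGION_CAPABILITIES` are invented for illustration, and in practice you would fill the table from your cloud provider's documentation rather than hard-code it.

```python
# Hypothetical capability table. The values are made up for this example.
REGION_CAPABILITIES = {
    "primary-region": {"zones": 3, "services": {"app-service", "logic-apps", "storage"}},
    "candidate-a": {"zones": 1, "services": {"app-service", "storage"}},
    "candidate-b": {"zones": 3, "services": {"app-service", "logic-apps", "storage"}},
}


def failover_gaps(region, required_services, min_zones=1):
    """List the reasons a region cannot host the workload (empty list = OK)."""
    caps = REGION_CAPABILITIES[region]
    gaps = [
        f"service not available: {name}"
        for name in sorted(set(required_services) - caps["services"])
    ]
    if caps["zones"] < min_zones:
        gaps.append(f"only {caps['zones']} availability zone(s), need {min_zones}")
    return gaps


workload = {"app-service", "logic-apps", "storage"}

print(failover_gaps("candidate-a", workload, min_zones=3))
# ['service not available: logic-apps', 'only 1 availability zone(s), need 3']
print(failover_gaps("candidate-b", workload, min_zones=3))
# []
```

Running a check like this before you commit to a failover region is cheaper than discovering the gap in the middle of an outage.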

And why is that? Well, one reason is that Azure rolls out its upgrades region by region. So let’s say you have a service, like an app service that we run logic apps on, and it undergoes an upgrade that breaks all your code. Well, that’s going to be across the whole region, potentially. So you need to be able to operate in a region that hasn’t been upgraded yet. It’s things like that which mean that, to keep your most critical workloads available when you need them, the answer can be, “I need to run those in a different region.” Just some thoughts on high availability, more to follow.
