AWS Inter-Regional Network Design

To support a globally deployed SaaS application, I've recently been tasked with building the foundation: a new, scalable, redundant interconnection of geo-dispersed private networks. I can hear you now…"Yeah, and? We solved that ages ago with MPLS, or at the very least static L2L VPN tunneling!" Well, this particular application environment happens to run entirely in Amazon's AWS cloud in separate VPCs spanning multiple regions worldwide. The thing is, Amazon doesn't provide any MPLS-style product to do this seamlessly, not even for VPCs that exist within the same region (and possibly even within the same datacenter). So, what now?




As a concrete example, let's say you have five different VPCs, all with unique private IP addressing, in five different AWS regions across the globe. How do you interconnect those networks with the same level of reliability you've come to expect from an MPLS provider or the like? After all, they've always managed those 'cloudy' little aspects for you, haven't they?
Before digging in, here's a quick overview for those of you unfamiliar with Amazon's services and nomenclature:
  • Virtual Private Cloud [VPC] is supposed to be similar to a physical datacenter, or at the very least an aggregation zone within one. You create a VPC by first defining a large private IP block, which you then carve into smaller subnets to eventually place EC2 hosts in. Each VPC comes with what is essentially an integrated, ubiquitous L3 router, so you can route between those subnets and optionally specify static routes. For fault tolerance, Amazon lets you place each subnet, at creation time, into any of the region's Availability Zones (A, B, C, etc.).
  • Availability Zone [AZ] is (best I can tell) similar to a network aggregation block or suite within a datacenter, which Amazon has engineered with a level of network and services independence. It's you getting the warm and fuzzy that if failures do occur within a region, they'll likely be isolated to a single AZ. I've been told that AZs span Amazon's regional datacenters, or are even deployed as independent DCs themselves, but I'm not positive on that.
  • VPC VPN is the ability to connect to your VPC remotely through AWS infrastructure. You do so by creating a Virtual Gateway [VGW] within a VPC, which gives you access to terminate on Amazon's VPN concentrators. From there, traffic hauls back to your VPC router through an opaque VRF config. Optionally, you can peer with the VGW via BGP over the IPsec tunnel for dynamic routing exchange, which is actually quite useful but not that scalable, as we'll see.
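To make those moving parts a bit more concrete, here's a minimal sketch of the same plumbing driven through the EC2 API. I'm assuming Python with boto3 purely for illustration; the VPC ID, public IP, and ASN below are placeholders, not values from this project.

```python
# Rough sketch (boto3 assumed): create a VGW, attach it to a VPC, and build a
# BGP-capable VPN connection toward a customer gateway (e.g., a vRouter).
# All IDs, IPs, and ASNs are illustrative placeholders.
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

# The Virtual Gateway (VGW) lives on the Amazon side of the VPN.
vgw = ec2.create_vpn_gateway(Type="ipsec.1")["VpnGateway"]
ec2.attach_vpn_gateway(VpnGatewayId=vgw["VpnGatewayId"], VpcId="vpc-11111111")

# The Customer Gateway (CGW) represents your end of the tunnel -- here, a
# vRouter with a public IP and its own private ASN for the BGP peering.
cgw = ec2.create_customer_gateway(
    Type="ipsec.1", PublicIp="203.0.113.10", BgpAsn=65010
)["CustomerGateway"]

# The VPN connection itself; leaving static routing off means BGP is used to
# exchange routes dynamically with the VGW (Amazon's side peers as ASN 7224).
vpn = ec2.create_vpn_connection(
    Type="ipsec.1",
    CustomerGatewayId=cgw["CustomerGatewayId"],
    VpnGatewayId=vgw["VpnGatewayId"],
    Options={"StaticRoutesOnly": False},
)["VpnConnection"]
print(vpn["VpnConnectionId"])
```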
A few other things worth mentioning:
  • You can't dictate physical connectivity paths between Amazon regions and datacenters, nor can you get very far figuring out what the routes actually are, since Amazon does a pretty decent job of disguising transit paths. What does that mean exactly? Just because you created a VPN between Singapore and Europe doesn't mean the traffic isn't routing back across the US regions in transit. You also don't know what travels over AWS private links versus the regular Internet.
  • Somewhat off-topic: AWS unsurprisingly has no support for multicast in EC2, so say goodbye to the idea of an IGP or first-hop redundancy protocols for EC2 hosts. Even if you could manage it, the VPC route table is limited to instances as next hops, not arbitrary IP addresses. In fact, Amazon's proposed solution for first-hop redundancy is to write a script.
So what are our options? Well, build an overlay of course! But what's the best way to do that? How do we connect it all together? Should we use VGWs? How do I connect one VPC to another via VPN?
Let’s break it down.
Interconnecting the Regions
This can be a bit tricky because it depends completely on your own usage; however, let's continue the example with the following configuration footprint in AWS:
  • 1 VPC in US-East-1 (Ashburn)
  • 1 VPC in US-West-1 (Sunnyvale)
  • 1 VPC in US-West-2 (Portland)
  • 1 VPC in EU-West-1 (Dublin)
  • 1 VPC in AP-Southeast-1 (Singapore)
So what do we do? Daisy-chain those regions together? Full VPN mesh? Static routes? Well…No.
Age-old Reason Not To “Full Mesh”
For the same reason I don't want a full mesh of anything: it's a configuration headache. Also, think about what happens when your business grows the VPC count in each region to 2, then 3, then 4. Using the n(n-1)/2 formula, we can see how much there'd be to keep up with manually if we needed to scale:
  • 1 VPC per region = 10 total tunnels+BGP adjacencies
  • 2 VPCs per region = 45 total tunnels+BGP adjacencies
  • 3 VPCs per region = 105 total tunnels+BGP adjacencies
  • 4 VPCs per region = 190 total tunnels+BGP adjacencies
Don’t get me wrong, a full mesh might get the job done here. However once our little network starts to grow up we’re going to run into some serious scale challenges.
(By the way, that's with only a single router per VPC rather than a redundant pair.)
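If you want to sanity-check those numbers, a few lines of Python reproduce them (assuming the five-region example above and a single vRouter per VPC):

```python
# Full-mesh adjacency count: n(n-1)/2, where n is the total number of VPCs.
regions = 5
for vpcs_per_region in range(1, 5):
    n = regions * vpcs_per_region
    tunnels = n * (n - 1) // 2
    print(f"{vpcs_per_region} VPC(s) per region: {n} VPCs -> {tunnels} tunnels + BGP adjacencies")
```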
Avoid “Chaining” VPCs via VGWs
Another option is to connect one VPC to another with a virtual router instance (vRouter) by way of the Amazon VPC VPN covered earlier. Again, this might work at small scale, but we quickly run into limitations when we try to use a VPC as a transit provider. Take a look:
[Diagram: VPCs chained vRouter-to-VGW across regions, with every VGW using Amazon ASN 7224]
This example illustrates the issue. In a design that chains vRouter-to-VGW, your realm of connectivity is limited to no more than three AS-path hops away. That means APAC and EU would not be able to use the US-East and US-West regions as transit paths to reach each other. Why the limitation? Because the Amazon-side VGW neighbor in every region uses the same Amazon ASN, 7224. AS_PATH loop prevention therefore stops a route from propagating through more than one VGW termination: once a prefix carries 7224 in its path, the next VGW to receive it sees its own ASN and drops it. That sucks.
I should probably clarify by now that I've come to call any EC2 host capable of BGP and IPsec termination a 'vRouter'. Probably the best virtualized router solution I've used so far is Vyatta/Brocade's aptly named vRouter, which is what I'm using in AWS for this project. You could bake your own Quagga + ipsec-tools solution if you enjoy constant firefighting, but that's another topic.
Direct vRouter-to-vRouter Mesh
The method I chose was to connect VPCs together as if each were an individual datacenter, treating a vRouter as if it were a real core or edge router in that datacenter. Each VPC (datacenter) gets its own ASN, each region has a designated hub VPC, and each hub VPC has inter-regional connectivity to at least two other regions in case one fails. The mesh topology is ultimately determined by the geographical locations of the regions, but this method generally shouldn't be too far off from how you'd design a private internetwork between physical datacenters. Here's a bit of visualization:
[Diagram: designated hub VPCs in each region, meshed vRouter-to-vRouter, each VPC with its own ASN]
Granted, there's an accepted level of risk with this topology. Each region must rely on a "hub" VPC, where the "backbone" inter-region tunnels terminate. Any new VPC launched within that region needs to peer with that designated hub VPC, which implies that if the hub VPC or its vRouter pair goes down, so does the rest of the region. Starting to sound sort of like a campus network, no?
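To make the hub-and-spoke idea concrete, here's a rough sketch of the peering plan expressed as data. The ASNs and the choice of which hubs back each other up are placeholders of mine, not anything prescribed by AWS; the point is simply that backbone tunnels only run hub-to-hub, while any additional VPC in a region peers only with its local hub, so session count grows linearly rather than as n(n-1)/2.

```python
# Illustrative peering plan for the hub-per-region design (ASNs are placeholders).
# Backbone: each region's hub VPC peers with at least two other regions' hubs.
hubs = {
    "us-east-1":      65001,
    "us-west-1":      65002,
    "us-west-2":      65003,
    "eu-west-1":      65004,
    "ap-southeast-1": 65005,
}
backbone = [
    ("us-east-1", "us-west-2"), ("us-east-1", "eu-west-1"),
    ("us-west-1", "us-west-2"), ("us-west-1", "ap-southeast-1"),
    ("us-west-2", "ap-southeast-1"), ("eu-west-1", "us-west-1"),
]

# Spokes: any additional VPC in a region peers only with its regional hub.
spokes = {"us-east-1": ["vpc-app-2", "vpc-app-3"]}

sessions = len(backbone) + sum(len(v) for v in spokes.values())
print(f"{sessions} BGP sessions / IPsec tunnels total")  # vs. 21 for a full mesh of these 7 VPCs
```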
For additional redundancy we spin up at least two vRouters per VPC and place those instances in different availability zones. Here’s a closer look with redundancy configured, zoomed in:
[Diagram: a single region's hub VPC with redundant vRouters placed in separate Availability Zones]
Since we cannot use first-hop redundancy protocols, we achieve fault tolerance in EC2 with one of two methods:
  • Point external routes for instances in subnets of a given AZ toward the vRouter in the same AZ. The idea is that you should deploy your application redundantly across the AZs available to you; if one AZ goes down, the other should theoretically pick up for it.
  • Point all routes to the same vRouter, embrace Amazon's script ideology, and write something that monitors the vRouters and adjusts the VPC route table (via the API) if the vRouter or its AZ goes down (a rough sketch of this follows below).
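For that second method, the script boils down to watching the primary vRouter and swinging the route table's default route to the standby vRouter when the primary stops responding. Here's a rough sketch of the idea, assuming Python with boto3 and placeholder instance and route-table IDs; the health check shown is just the EC2 instance status, which you'd want to replace with a real reachability probe through the vRouter.

```python
# Minimal failover-monitor sketch (boto3 assumed; all IDs are placeholders).
# Watches the primary vRouter instance and, if it looks unhealthy, points the
# VPC route table's default route at the standby vRouter instead.
import time
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

ROUTE_TABLE_ID = "rtb-11111111"
PRIMARY_VROUTER = "i-0aaaaaaaaaaaaaaaa"
STANDBY_VROUTER = "i-0bbbbbbbbbbbbbbbb"

def vrouter_healthy(instance_id: str) -> bool:
    # Crude health check: is the instance running and passing EC2 status checks?
    # A real deployment should probe reachability through the vRouter itself.
    statuses = ec2.describe_instance_status(
        InstanceIds=[instance_id], IncludeAllInstances=True
    )["InstanceStatuses"]
    return (
        bool(statuses)
        and statuses[0]["InstanceState"]["Name"] == "running"
        and statuses[0]["InstanceStatus"]["Status"] == "ok"
    )

while True:
    if not vrouter_healthy(PRIMARY_VROUTER):
        # Repoint the default route at the standby vRouter via the EC2 API.
        ec2.replace_route(
            RouteTableId=ROUTE_TABLE_ID,
            DestinationCidrBlock="0.0.0.0/0",
            InstanceId=STANDBY_VROUTER,
        )
    time.sleep(10)
```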
Final Thoughts
Quite honestly, I haven't totally figured out the best use of first-hop redundancy inside of AWS. Amazon's proposed script is designed to be a heartbeat between two network nodes, which may not work in the case of a virtual appliance. Perhaps the best method here is instead to run active-active and balance your AZs across their respective vRouters (Amazon gives you the ability to modify route tables on a per-AZ basis, which is handy for this). I'll be doing more work on that part soon, so maybe a follow-up post is needed.
Suffice it to say, designing connectivity around cloud providers like AWS, Rackspace, and Google is new territory for the traditional network and systems engineer used to having the old bag of tricks handy, not to mention access to physical hardware. That said, I'm ready to see one of those providers step up and drop in a native solution. Rackspace already uses Open vSwitch for its cloud; where's the SDN solution?
Until then the problem remains and the game is slightly different (certainly more confined), forcing you to come up with clever solutions in some cases. When was the last time that you took HSRP/VRRP for granted?
Have you come across these issues before and perhaps figured out better ideas? Let's hear about it!
