GCP Network Design: The Basics


This is a write-up of networking best-practice basics for Google Cloud Platform, based on what I've learned over the last year while working on a large company's migration to the cloud from AWS. I hope it comes in handy for anyone new to designing networking in (and to) Google Cloud, or for anyone just generally interested in the details of the networking architecture stitching their projects together.

Overview

The Google Cloud Platform (the AWS-like division of the larger "Google Cloud", hereafter referred to as just 'GCP') organizational resource layout generally looks like this:

Created cloud resources live in a VPC, which is part of a project, which is organized into folders or subfolders.

The flow is... create a folder for each BU or department in your organization. From there, create a project for nearly everything else. This includes creating a dedicated project for the Shared VPC to live in. Pretty much every distinct use case gets its own project, even the resources specific to Interconnect. I mean everything!
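
To make that concrete, here's a rough sketch of the folder-then-project flow using the Python resourcemanager_v3 client; the organization ID, folder name, and project IDs are made-up examples, not anything prescriptive:

```python
from google.cloud import resourcemanager_v3

folders = resourcemanager_v3.FoldersClient()
projects = resourcemanager_v3.ProjectsClient()

# One folder per BU/department (hypothetical org ID).
folder = folders.create_folder(
    folder=resourcemanager_v3.Folder(
        parent="organizations/123456789012",
        display_name="network-engineering",
    )
).result()

# A dedicated project just for the Shared VPC to live in (hypothetical ID).
projects.create_project(
    project=resourcemanager_v3.Project(
        project_id="shared-vpc-host-prod",
        parent=folder.name,  # e.g. "folders/9876543210"
        display_name="Shared VPC host",
    )
).result()
```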

Shared VPC

First and foremost, Google recommends using their 'Shared VPC' architecture when laying the foundational architecture. To understand what that means, imagine that all of your networking components (VPC subnets, routing, firewall rules, VPN terminations, physical interconnects, NAT gateways) live in a single console view and/or Terraform workspace that only your network administration team controls. Individual team projects are then created independently while "sharing" the subnets your network team has already created and designated for their use.
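
Operationally, that attachment is a two-step dance: enable the host project, then add each team project as a service project. Here's a minimal sketch against the Compute Engine v1 API (the XPN methods are the Shared VPC API); the project IDs are hypothetical:

```python
from googleapiclient import discovery

compute = discovery.build("compute", "v1")

# Mark the network team's project as the Shared VPC host project.
compute.projects().enableXpnHost(project="shared-vpc-host-prod").execute()

# Attach a team's project as a service project so it can consume the shared subnets.
compute.projects().enableXpnResource(
    project="shared-vpc-host-prod",
    body={"xpnResource": {"id": "team-a-prod", "type": "PROJECT"}},
).execute()
```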

Advantages of Shared VPC

  • No need to create new subnets for every team/project. Teams instead share the network you've created in the shared VPC by being 'attached' to it as what's referred to as a 'service' project
  • Central point of control for network teams

Disadvantages of Shared VPC

  • GCP VPC peering currently has a limitation where the limits on the number of forwarding rules and VMs apply across a network and all of its directly peered networks combined; the limits stay the same as for a single network and cannot be exceeded
  • Some services need subnets that aren't shared VPC subnets

Unlike Amazon, a VPC is a global construct and not a regional one. There's also no supernet that you need to specify for it during creation. That means you can create subnets for the VPC in whatever regions you want, using whatever ranges you want, and they'll be able to communicate with each other using Andromeda as the magical, ubiquitous backend data plane.
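
As an illustration of that global-network, regional-subnet model, here's a minimal sketch using the google-cloud-compute (compute_v1) client; the project ID, network name, and CIDR ranges are invented for the example:

```python
from google.cloud import compute_v1

project = "shared-vpc-host-prod"  # hypothetical host project

# The VPC itself is global and carries no CIDR of its own; custom mode
# means we pick subnets per region ourselves.
compute_v1.NetworksClient().insert(
    project=project,
    network_resource=compute_v1.Network(
        name="shared-vpc",
        auto_create_subnetworks=False,
    ),
).result()

# Subnets are regional and can use whatever ranges you like; they reach
# each other over Google's backbone with no extra routing configuration.
subnets = compute_v1.SubnetworksClient()
for region, cidr in [("us-west1", "10.10.0.0/20"), ("europe-west2", "10.20.0.0/20")]:
    subnets.insert(
        project=project,
        region=region,
        subnetwork_resource=compute_v1.Subnetwork(
            name=f"shared-{region}",
            network=f"projects/{project}/global/networks/shared-vpc",
            ip_cidr_range=cidr,
        ),
    ).result()
```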

Dedicated Interconnects

Dedicated Interconnect locations will connect you not just to that region but to the whole continent. For example, if you connect with GCP via Dedicated Interconnect in Equinix SV1 (Silicon Valley), you don't just have private IP connectivity to us-west1 but to every region in North America. This stands in contrast to AWS, where the opposite is true: where you connect is what you have access to (unless you're using Direct Connect gateways, but I digress).

Public peering is also a little strange with GCP. Unlike AWS, there's no concept of a public VIF to peer with Google publicly; GCP only really supports private VPC connectivity over an Interconnect. So, in order to access public GCP services (such as Google Cloud Storage), you either have your Cloud Router advertise Google's dynamic, ever-changing public IP ranges down to you, or advertise a single public /30 used by their 'restricted' Google APIs gateway together with a pretty tricky DNS CNAME trick, the latter being what we ended up doing. This is something that Google clearly needs to work more on. If you qualify to publicly peer with Google properly at an IX, then that's almost certainly the easier way to go.
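
For the /30 approach, the Cloud Router side looks roughly like the sketch below, again using the compute_v1 client; the project, region, router name, and ASN are made up, and the DNS side (CNAMEing *.googleapis.com to restricted.googleapis.com) isn't shown. 199.36.153.4/30 is the range behind the restricted Google APIs gateway:

```python
from google.cloud import compute_v1

# Patch the Cloud Router on the Interconnect attachment so it advertises the
# restricted Google APIs range toward on-prem alongside the shared subnets,
# instead of Google's full (and changing) public prefix list.
compute_v1.RoutersClient().patch(
    project="shared-vpc-host-prod",    # hypothetical
    region="us-west1",
    router="interconnect-router",      # hypothetical
    router_resource=compute_v1.Router(
        bgp=compute_v1.RouterBgp(
            asn=64512,                          # hypothetical private ASN
            advertise_mode="CUSTOM",
            advertised_groups=["ALL_SUBNETS"],  # keep advertising the VPC subnets too
            advertised_ip_ranges=[
                compute_v1.RouterAdvertisedIpRange(range="199.36.153.4/30"),
            ],
        ),
    ),
).result()
```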

High Availability

When it comes time to design how you will physically interconnect your network with Google, you have a few standard options:

99.99%

"Four nines" availability for Dedicated InterConnect [generally speakingtranslates to 52.60 minutes of permitted downtime per year, or 4.38 minutes per month, 1.01 minutes per week, 8.64 seconds per day.

This is Google's recommended design for hosting pretty much anything serious and mission-critical. It implies that you already operate a network with PoPs in two different geographical regions, have redundant routers at each PoP, and connect them through some sort of backbone infrastructure.



99.9%

"Three nines" availability for Dedicated InterConnect translates to 8.77 hours downtime per year, or 43.83 minutes per month, 10.08 minutes per week, 1.44 minutes per day. Per Google, this design is only recommended if A.) you just don't have the PoPs or network reach yet or if B.) you're hosting non mission-critical projects.



Conclusion

While there are some things that still need further development, I believe they will smooth themselves out over the next few years as this impressive cloud continues to grow and catch up with Amazon and Azure. I also appreciate the amazingly deep technical concepts that GCP has introduced, which offer some of the greatest technical flexibility out there.
