Inside Amazon’s Massive Data Center

Amazon Web Services is an unrivalled colossus
of the internet age, providing the computational horsepower underpinning countless organisations
from Netflix to BMW, Disney to GE, Tinder to the CIA. It earns tens of billions of dollars a year,
and if you’ve used the internet at all lately, you’ve probably used Amazon Web Services
without even knowing it. But where is it? What is it? And how does it all work? Join us now as we enter the belly of the beast
and take a sneak peek inside an AWS data centre. In stark contrast to its cardboard boxes,
vans, trucks, jet planes and warehouses, Amazon prefers not to brand its vast portfolio of
gargantuan data centres.

But these monumental server farms are the
very guts of the firm’s wildly successful AWS – that’s Amazon Web Services – division. AWS began life at a 2003 meeting in the lakehouse
of CEO Jeff Bezos. The firm had just emerged strong from the
Dotcom bust, but Bezos’ fast-growing retailer was having difficulty setting up and rolling
out its own internal server capacity. A scheme promising fast, scalable server infrastructure
was duly hashed out, and was swiftly identified, by the more enterprising
young guns in the room at least, as a potential spinoff service Amazon could sell on to other
companies. Fast forward to today and AWS rakes in a whopping
60% of all Amazon profits, on revenue of about $50 billion a year. How? By taking on other companies’ tech problems,
in the ‘cloud’. If you’re a startup in 2021, you no longer
need to invest precious capital in vast banks of noisy, expensive servers and battalions
of nerds to maintain them.

You just rent time on Amazon’s gear instead. That’s the cloud. When you’re busy, yay, you hire more gear. If you’re quiet, whatever, you hire less
gear. The actual gear in question is jealously hidden
from prying eyes by Amazon. Even clients aren’t allowed to poke around
in the data centres they pay for, and the sites’ addresses aren’t made public. But here’s what we know.

Take a typical site, in Loudoun County, Northern
Virginia. AWS is said to run at least 50 individual
data centres in this region alone, covering several million square feet. Believe it or not, Amazon is planning on adding
still more capacity in Northern Virginia, for instance a 100-acre site near Dulles airport
for which the firm reportedly just shelled out a cool $73 million. On entry to one such data centre, let’s
say this giant nondescript box next door to a pet resort in Sterling, the first thing
we notice is that security is the overriding concern for AWS. There are high fences, guards, and several
layers of intrusion detection systems, most obviously cameras. When access is permitted, to a very limited
number of people for very limited periods, two-factor authentication is required on at least two
occasions during the visit. If a visit is sanctioned – this is extremely
rare – visitors are accompanied at all times by at least one authorised member of AWS staff. Once this so-called ‘perimeter layer’
is passed, the next stage of the data centre is known as the ‘infrastructure layer’.

This is where crucial systems like backup
power, fire suppression and, most importantly, HVAC (heating, ventilation and air conditioning) are situated. To maintain market dominance, AWS data centres
cannot be allowed to fail, ever, for any reason. As such, water, power, telecommunications and
internet links are all designed with full redundancy. And keeping all those servers – as many
as 80,000 in a single data centre – running requires constant monitoring of ambient temperature
and humidity to prevent overheating. If the worst should happen and a fire breaks
out, a suite of water sprinkler and gaseous suppression systems is rigorously maintained
and good to go in a crisis. Past the infrastructure layer and we’re
finally where the real magic happens – the so-called data layer.

Security is even tighter here, with a secretive
array of intrusion detection systems and an even stricter review process for anybody going
in or out. Even if something minor happens, for instance
an internal door is held open slightly longer than usual, alarms are triggered. All access points are fortified with multi-factor
electronic control devices. If a data breach, physical or virtual, is
even so much as suspected, the relevant server is automatically disabled and shut down. These security systems are audited throughout
the year according to a robust 2,600-point checklist supervised by external bodies. These external auditors can request access
to anything in the data centre, from humdrum logs to security footage or even the very
cameras themselves.

AWS staff are sometimes plucked at random
from their shifts and interviewed about topics like the disposal of media, itself a process
that’s held to exacting and robust compliance metrics. Let’s look at the hardware itself. In the midst of all this hyper security, it’s
maybe easy to forget that AWS data centres are fundamentally just enormous banks of servers. A typical AWS data centre will hold anywhere
from 50,000 to 80,000 servers, running on a combined 25-30 megawatts of power. It’s said that AWS could quite easily double
this size and make even bigger data centres. However, like everything Amazon related, it’s
all a question of finely-calibrated scale. Distinguished engineer and AWS VP James Hamilton
told attendees at the company’s re:Invent conference in 2016 that beyond a certain point,
increasing the number of servers and racks per centre no longer lowers the incremental
cost. ‘At some point, the value goes down and
the costs go up, and in our view, this is around the right number,’ Hamilton said
of the data centre size AWS usually goes with. Thrillingly, Hamilton employed the term ‘blast
radius’ to underscore the impact a failure might have in a much larger data centre,
hence the relatively modest dimensions.
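
For a rough sense of scale, the server counts and power figures quoted above can be turned into a per-server power budget. The Python sketch below is back-of-envelope arithmetic only: the function name is ours, and it assumes the 25-30 megawatts is spread evenly across every server, with cooling and other overheads lumped in.

```python
# Back-of-envelope sketch: facility power divided by server count.
# Inputs are the ranges quoted in the narration; the even split across
# servers (overheads included) is an assumption made for illustration.

def watts_per_server(facility_megawatts: float, servers: int) -> float:
    """Average facility power per server, in watts."""
    return facility_megawatts * 1_000_000 / servers


low = watts_per_server(25, 80_000)   # least power, most servers
high = watts_per_server(30, 50_000)  # most power, fewest servers
print(f"{low:.1f} W to {high:.1f} W per server")
```

That works out at somewhere between roughly 300 and 600 watts per server, overheads included.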

Still, 80,000 servers is plenty, designed
to take in – according to Hamilton – some 102 terabits of data each and every second. Significantly, the engineer also hinted that
bandwidth within the centre is ‘wildly higher’ even than that. Ever evangelists for vertical integration,
Amazon have made great strides in streamlining and perfecting the servers themselves over
the years. Unlike other colocation server infrastructure
providers, AWS has a much clearer understanding of who will be using its hardware, and for
what. Removing this requirement to be versatile
and generalist has enabled the firm to devise ever more efficient server architectures. When mundane software operations are moved
into the hardware, or even the very silicon of the chips, this can shave nanoseconds off
processing times. And at Amazon scale, that makes a big difference. Initially AWS bought servers ‘off the peg’
as it were from traditional suppliers. But steadily over time the firm began developing
its own single-depth servers, exquisitely optimised for airflow in AWS buildings running
AWS software.

AWS has even developed its own so-called ‘Graviton’
chips since acquiring Israeli firm Annapurna Labs. These Graviton chips, currently in their second
generation with a third on the way, use the power-efficient ARM architecture. This represents a break from long-term Amazon
partner Intel and its x86 architecture, and is said to give AWS clients 40% better
price performance – essentially putting supercomputer-level power in the hands of
regular commercial users. In terms of storage, a single standard-sized
rack in an AWS data centre can hold 11 petabytes – a petabyte being a million gigabytes – of data
on 1,100 disks. Multiplied, of course, many thousands of times,
across many hundreds of data centres.
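
The rack figures above imply a capacity per disk, which is easy to check. The Python sketch below is illustrative arithmetic only: the function and constant names are ours, and it assumes decimal units (a petabyte taken as 1,000 terabytes, or a million gigabytes) and identical disks.

```python
# Illustrative arithmetic: capacity per disk implied by the rack figures.
# Assumes decimal storage units and that all disks are the same size.

TB_PER_PB = 1_000
GB_PER_PB = 1_000_000


def terabytes_per_disk(rack_petabytes: float, disks: int) -> float:
    """Average capacity of each disk in the rack, in terabytes."""
    return rack_petabytes * TB_PER_PB / disks


per_disk = terabytes_per_disk(11, 1_100)
rack_gigabytes = 11 * GB_PER_PB
print(f"{per_disk:.0f} TB per disk, {rack_gigabytes:,} GB per rack")
```

In other words, about 10 terabytes per disk and 11 million gigabytes per rack.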

All around the world. AWS routers – here comes the science, kids
– run custom ASICs, or Application-Specific Integrated Circuits, that support 128 ports
with 25 Gigabit Ethernet connectivity. This diverges, fun fact fans, from the industry
standard 10 GbE and 40 GbE networking speeds – but apparently, according to James Hamilton,
offers economies of scale when paired together at 50 GbE. Networking between data centres – essential
to provide security and backup for crucial client info – happens over Amazon’s own
private 100 gigabit Ethernet network, which connects the 25 international regions and
80 so-called ‘Availability Zones’ across the vast AWS worldwide network. AWS even lays its own private undersea cables. But back to our data centre. Cooling is an overriding priority, and to
minimise environmental impact, AWS, in particular in Loudoun County, Virginia, has set out to use
as much reclaimed water as possible. This stops the company hogging all the precious
local drinking water, which would obviously be a PR nightmare. Loudoun Water, the local utility, are happy
to provide this reclaimed water at a competitive rate – AWS pay serious taxes here – but
it’s worth noting that elsewhere AWS has installed its own on-site water treatment systems.

These prevent pipes clogging up with mineral
sediment, which would in turn cause costly hassle and delay. And if you think that’s crazy, the company
has even started developing its own power substations. All those servers use a fair bit of juice,
so instead of waiting for slow-moving traditional power companies to develop infrastructure
to keep those all-important racks humming at all costs, AWS simply builds electric
substations itself. Day to day operations within the data centre
are described by some ex-technicians as being akin to an ER department for computers.

With that much hardware, inevitably things
seize up, stop pinging or require a reboot. Fault finding and upgrading equipment are left
to a small army of techs who follow rigorously drilled playbooks that cover almost any conceivable
eventuality. Metrics are set by supervisors encouraging
teams to complete tasks quickly and efficiently. Some have complained that the working environment
is cold, and it’s certainly very loud. In future we can expect to see many more,
and even louder, AWS data centres come online. From early 2020, a new AWS centre opened in
the Northern Virginia area alone every three months, totalling around one million square
feet in footprint.

And that’s just one area. All in all, it’s quite the numbers game. What do you think? Do these vast data centres get the credit
they deserve for underpinning so much of modern life? Let us know in the comments.

