AWS re:Invent 2019 – Monday Night!

Today at AWS re:Invent 2019 Monday Night, Peter DeSantis, VP of Global Infrastructure, AWS, focused on High Performance Computing (HPC) and its use for Machine Learning (ML), and discussed “Amazon Inferentia”, a high-performance machine learning inference chip custom designed by AWS.

The main agenda for Monday Night was to talk about the following four points:

  1. Build high speed, low latency, large-capacity data center placement group network
  2. Move all virtualization to optimized AWS chips and hardware
  3. Develop hardware optimized, kernel bypass networking stack
  4. Integrate with popular HPC libraries and applications

Today’s Monday Night featured fewer brand-new technology announcements than the 2018 Monday Night. It focused much more on the improvement, optimization, and scalability work being done on the existing platform, with an emphasis on the hardware side. According to AWS, supercomputers are going to be a small part of an AWS placement group, as they have increased their DC capacity by 20X in the last 6 years.

They significantly improved their data center (AZ) performance on the throughput and latency side, gaining roughly 20X compared to the stats from 6 years ago. In 2013, each data center had a capacity of 460 Tbps and supported up to 4,600 100Gb servers with 12-microsecond latency, whereas in 2019 each DC supports 10,600 Tbps and up to 106,000 100Gb servers with 7-microsecond latency.
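The scale-up in those numbers can be sanity-checked with a few lines of arithmetic (a quick illustration using only the figures quoted above):

```python
# DC figures quoted in the talk: total capacity (Tbps), 100Gb servers, latency (µs)
dc_2013 = {"capacity_tbps": 460, "servers_100gb": 4600, "latency_us": 12}
dc_2019 = {"capacity_tbps": 10600, "servers_100gb": 106000, "latency_us": 7}

# Total capacity grew roughly 23x (the talk rounds this to ~20X over 6 years)
capacity_growth = dc_2019["capacity_tbps"] / dc_2013["capacity_tbps"]
print(f"capacity growth: {capacity_growth:.1f}x")  # ≈ 23.0x

# Per-server bandwidth is consistent in both years: 100 Gb/s per server
per_server_2013 = dc_2013["capacity_tbps"] * 1000 / dc_2013["servers_100gb"]
per_server_2019 = dc_2019["capacity_tbps"] * 1000 / dc_2019["servers_100gb"]
print(per_server_2013, per_server_2019)  # 100.0 100.0

# Latency dropped from 12 µs to 7 µs, about a 42% reduction
latency_cut = 1 - dc_2019["latency_us"] / dc_2013["latency_us"]
print(f"latency reduction: {latency_cut:.0%}")
```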

The above spec represents just 5% of the total supported resource requirement in an AWS DC.

They also focused on the c5n network-optimized instance, which will be used for HPC in AWS ParallelCluster. There they have reached 100Gbps using the Elastic Fabric Adapter (EFA), which can be added to new AWS instances during provisioning and attached to specially supported instance types in an HPC cluster.
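As a sketch of what attaching an EFA at provisioning time can look like, the snippet below builds the parameters for an EC2 `run_instances` call with `InterfaceType='efa'`. The AMI, subnet, security group, and placement group names here are hypothetical placeholders, not values from the session:

```python
# Sketch: parameters for launching a c5n.18xlarge with an EFA network interface.
# All resource IDs below are hypothetical placeholders.
run_instances_params = {
    "ImageId": "ami-0123456789abcdef0",         # placeholder HPC-ready AMI
    "InstanceType": "c5n.18xlarge",             # EFA-supported, 100Gbps-capable
    "MinCount": 1,
    "MaxCount": 1,
    "Placement": {"GroupName": "my-hpc-placement-group"},  # cluster placement group
    "NetworkInterfaces": [{
        "DeviceIndex": 0,
        "InterfaceType": "efa",                 # request an Elastic Fabric Adapter
        "SubnetId": "subnet-0123456789abcdef0",
        "Groups": ["sg-0123456789abcdef0"],
    }],
}

# With AWS credentials configured, this would be passed to boto3:
#   import boto3
#   ec2 = boto3.client("ec2")
#   ec2.run_instances(**run_instances_params)
print(run_instances_params["NetworkInterfaces"][0]["InterfaceType"])  # efa
```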

For IT infrastructure consultants, the new protocol to review is SRD (Scalable Reliable Datagram). AWS is taking a totally new approach to gain performance on the network stack: bypassing the TCP stack in the operating system and communicating with adjacent nodes in an AWS placement group using the SRD protocol, which has better latency management.
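The key idea of reliable delivery without TCP's strict in-order constraint can be illustrated with a toy simulation. To be clear, this is my own simplified sketch of the concept, not AWS's actual SRD implementation: packets are sequenced, sprayed across many paths (so they may arrive out of order), and reordered by sequence number at the receiving endpoint.

```python
import random

def srd_like_send(message: bytes, num_paths: int = 8, chunk: int = 4):
    """Toy sender: split the message into sequenced packets sprayed across paths."""
    packets = []
    for seq, i in enumerate(range(0, len(message), chunk)):
        path = random.randrange(num_paths)  # each packet may take a different path
        packets.append((seq, path, message[i:i + chunk]))
    random.shuffle(packets)  # different path delays => out-of-order arrival
    return packets

def srd_like_receive(packets):
    """Toy receiver: reassemble by sequence number, no in-order delivery required."""
    return b"".join(data for seq, path, data in sorted(packets))

msg = b"scalable reliable datagram"
print(srd_like_receive(srd_like_send(msg)) == msg)  # True
```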

I am pasting below all the relevant screenshots taken during the session and listing the key points discussed:

AWS talked about the Climate Pledge and the Paris Agreement goal of achieving net-zero carbon emissions: they committed to reaching it by 2040 instead of the Agreement's 2050 target, putting AWS ahead by 10 years. They also gave an update on their six new renewable energy projects, including a renewable energy location recently launched in Australia.

Climate Pledge by AWS
1273 MW Current Renewable Energy Capacity of AWS

My last blog, AWS re:Invent 2018 – AWS Network Architecture, covered AWS infrastructure locations and connectivity stats from last year; the following are some of the latest updates on it as of December 2019.

As of today, AWS has 22 Regions (4 new Regions announced), 69 AZs (13 new AZs announced), and 97 AWS Direct Connect locations worldwide, and all traffic in between is encrypted.

2019 Network Links
2018 Network Links

They again highlighted their AWS Nitro technology and also presented CPU vs. GPU comparisons for various ML models.

I will be writing separately on SRD and EFA — Why did AWS remove TCP from the operating system with the SRD protocol? — as the in-depth details are interesting to read: how AWS is bypassing the TCP network stack in the OS to decrease network latency, and how the new EFA network interface delivers up to 100Gbps throughput.

I will be covering 3 more upcoming sessions of AWS re:Invent 2019 and documenting them on my blog, so keep an eye on it; the source YouTube reference is available here.