Interview: Peter Sankauskas @ CloudNative.io

I had a chance to catch up with Peter Sankauskas, one of the guys behind CloudNative.io, an emerging consultancy that is creating interesting toolsets for the next generation of Web Forward architectures. He had some sharp insights around building for pure cloud formats, and some of the tools they are working on are worth a close look as well.

  1. Describe your AWS experience and which parts of an application you work on most often (Backend, Data Layer, Security, Orchestration, etc…)?

Being a startup founder and a consultant means I get to work on all parts of the infrastructure. I’ve been lucky enough to be using AWS since 2008, was an AWS Startup Challenge Finalist in 2009 via Motally, wrote the EC2 inventory plugin for Ansible in 2012, won the NetflixOSS Cloud Prize in 2013 and was made an AWS Community Hero in 2014 by Dr Werner Vogels and team. Today I run the Advanced AWS meetup here in San Francisco and adore the community surrounding AWS.

The term “full stack engineer” truly applies here. I have written React and Android applications, Java, PHP, and Rails web applications, data processing pipelines, and star-schema analytics interfaces, down to countless Ansible playbooks, and even an Arduino-based toy microwave/clock/night light for my son.

  1. Do you typically work on new builds or transformative situations?

As consultants specializing in AWS, CloudNative has done both. We enjoy green field projects where we are designing from the ground up just as much as migration projects where there are many moving pieces, and downtime is not an option.

In either case, having a deep knowledge of all of AWS’s services is a necessity. Seemingly trivial design choices can drastically alter which services result in the most efficient architecture. Having that knowledge and experience really cuts down on trial and error, and gets our clients on the right path quickly.  

  1. What are the keys to designing an application architecture/infrastructure model for cloud-based deployments from scratch?

There are probably two things we pay close attention to. First is the way the company or team is structured and how they want to deploy code and maintain the new system. Here, Conway’s law comes into effect, and it is important to know where the boundaries and responsibilities will lie. The second is the volume, variety and velocity of data the new system will need to handle. Once those two pieces are understood, coupled with the business objectives, we can come up with a few different designs and discuss the tradeoffs.

  1. What is most underestimated about designing and developing an application for AWS/cloud architectures?

Everything is dynamic now. I cannot repeat this enough. EVERYTHING is DYNAMIC.

If your organization, operations or engineers are used to running in a data center, they expect things to change infrequently. In a well-designed cloud application, a server may only exist for a few hours or even minutes. A new set of subnets across multiple availability zones can be provisioned in seconds. A copy of a multi-terabyte database is just a single API call away.

In this environment, your application needs to handle changes in the underlying infrastructure gracefully. If it cannot, you may be running in the cloud, but your hands are tied behind your back.
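One way to picture "handling change gracefully" is a client that looks up the live fleet before every attempt instead of remembering a server address. The sketch below is only an illustration under assumptions: `discover` and `request` are hypothetical callables standing in for whatever service discovery and transport an application actually uses, and nothing here is taken from CloudNative's own code.

```python
import random
from typing import Callable, List, Optional


class NoHealthyBackend(Exception):
    """Raised when every attempt against the current fleet fails."""


def call_with_rediscovery(
    discover: Callable[[], List[str]],
    request: Callable[[str], str],
    attempts: int = 3,
) -> str:
    """Call a backend, looking up the live fleet again before every attempt."""
    last_error: Optional[Exception] = None
    for _ in range(attempts):
        # Hypothetical discovery hook: never cache this list across attempts,
        # because servers may have been replaced since the last call.
        backends = discover()
        if not backends:
            continue
        host = random.choice(backends)
        try:
            return request(host)
        except ConnectionError as err:
            # The chosen host may have been terminated; try again with a
            # freshly discovered fleet.
            last_error = err
    raise NoHealthyBackend(f"gave up after {attempts} attempts") from last_error
```

The design choice is simply that addresses are treated as short-lived facts: a failed call triggers a fresh lookup rather than a retry against a host that may no longer exist.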

  1. Are there any tools that you pay for to help you build these sites? Any tools that you recommend your clients buy that help them manage them most effectively? Or is it OSS all the way down?

The basic rule we follow is: don’t run anything yourself that isn’t going to move the needle. Typical examples of services worth paying for are logging, monitoring and alerting. Unless you are Splunk, there really isn’t much of a need to run your own logging service and spend cycles on maintenance and keeping it online 24/7. Sure, ELK clusters have made this a lot easier, but it’s even easier if you let someone else do it. Go focus on your customers’ needs instead.

For monitoring, my personal favorite at the moment is SignalFx. They have an amazing analytics engine. In the old days, you might set static thresholds (alert me when CPU is over 80%), which over time trains you and others to ignore the alerts completely. It becomes noise. Now you can alert when the 98th percentile changes by more than 2 standard deviations. That is powerful stuff.
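The difference between the two alerting styles can be sketched in a few lines. This is not SignalFx's engine or API; it is a plain-Python illustration that assumes metrics arrive as fixed windows of samples, uses a nearest-rank percentile, and compares the latest window's 98th percentile against the mean and standard deviation of earlier windows. The function names and window layout are hypothetical.

```python
import math
import statistics
from typing import Sequence


def percentile(samples: Sequence[float], pct: float) -> float:
    """Nearest-rank percentile; good enough for a sketch."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def static_alert(samples: Sequence[float], threshold: float = 80.0) -> bool:
    """Old style: fire whenever any sample crosses a fixed line."""
    return max(samples) > threshold


def deviation_alert(
    history: Sequence[Sequence[float]],
    current: Sequence[float],
    pct: float = 98.0,
    sigmas: float = 2.0,
) -> bool:
    """Fire when the current percentile moves more than `sigmas` standard
    deviations away from the mean of the historical percentiles."""
    baseline = [percentile(window, pct) for window in history]
    mean = statistics.mean(baseline)
    spread = statistics.stdev(baseline)  # needs at least two history windows
    return abs(percentile(current, pct) - mean) > sigmas * spread


# A busy but healthy service: CPU always sits in the high 80s.
history = [[85, 86, 87, 88], [86, 87, 88, 89], [84, 85, 86, 87], [85, 86, 88, 88]]
normal = [85, 86, 87, 88]
spike = [85, 86, 87, 99]
```

With this data, `static_alert` fires for both `normal` and `spike`, which is exactly the noise described above, while `deviation_alert` stays quiet for `normal` and fires only for `spike`.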

Within our applications, we utilize as much OSS as we can. Never write your own encryption library, or hand-code CloudFormation templates in JSON. We love the combination of Python, boto and troposphere to handle bringing up and tearing down our infrastructure. I’ve tried other solutions, but always come back to this.
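To show why generating templates from code beats hand-writing JSON, here is a minimal standard-library sketch. It deliberately does not use troposphere or boto, so none of these function names belong to those libraries; troposphere provides typed classes for the same job. The app name, environment list and bucket naming scheme are made up for illustration.

```python
import json
from typing import Dict, List


def bucket(name: str) -> Dict:
    """Return one CloudFormation S3 bucket resource as a plain dict."""
    return {"Type": "AWS::S3::Bucket", "Properties": {"BucketName": name}}


def build_template(app: str, environments: List[str]) -> Dict:
    """Build a template with one log bucket per environment.

    The loop replaces what would otherwise be copy-pasted JSON blocks.
    """
    resources = {
        f"{env.capitalize()}LogBucket": bucket(f"{app}-{env}-logs")
        for env in environments
    }
    return {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Description": f"Log buckets for {app}",
        "Resources": resources,
    }


# Hypothetical app and environments, serialized to the JSON that
# CloudFormation consumes.
template_json = json.dumps(build_template("example-app", ["dev", "prod"]), indent=2)
```

Adding a third environment is a one-word change to the list, and the resulting JSON is always syntactically valid because it is serialized rather than typed by hand.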

  1. I know that you guys are working on some interesting technology at CloudNative.io. In particular, what can you tell people about Yeobot and other projects?

Yeobot is the harmonious relationship between AWS and Slack. Yeobot’s ChatOps interface enables people to query their AWS infrastructure, across accounts and regions, to find detailed information about their infrastructure – from health to security and economics.

It is almost exclusively built on top of AWS Lambda and many other AWS services. We are living in a pretty golden time. The only “servers” we have are in the small ECS cluster that handles the websockets Slack bots require. That reduces the amount of operations work significantly. Of course, this is a double-edged sword as well. Working with such a cutting-edge set of technologies, we have had our share of cuts and bloody hands. As part of our Lambda deployments we have created our own open source tooling to help, namely Cruddy and Kappa.

  1. What is the most-advanced/cool/crazy architecture at scale that you know about on AWS and what interesting insights do you take from it?

AdRoll has developed a globally distributed, eventually consistent counter system using only AWS Lambda and DynamoDB. That system has been in production for a while now, and when I caught up with the engineer behind it, he had to take a second to remember what he did. Why? Because once deployed, it was rock solid. He hadn’t looked at the code in months. Seems like a pretty good place to set the bar for all future applications, doesn’t it?

[Editor’s Note: Yes, yes it does.]
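For readers wondering what an eventually consistent counter can look like, here is a generic sketch. The interview does not describe AdRoll's actual design, so this is not their method: it is a textbook grow-only counter (G-Counter) kept in memory, with no Lambda or DynamoDB involved, and the region names are placeholders.

```python
from typing import Dict


class GCounter:
    """Grow-only counter: each region only ever bumps its own slot."""

    def __init__(self, region: str) -> None:
        self.region = region
        self.counts: Dict[str, int] = {}

    def increment(self, amount: int = 1) -> None:
        if amount < 0:
            raise ValueError("a grow-only counter cannot be decremented")
        self.counts[self.region] = self.counts.get(self.region, 0) + amount

    def merge(self, other: "GCounter") -> None:
        """Take the per-region maximum, so merging twice is harmless."""
        for region, count in other.counts.items():
            self.counts[region] = max(self.counts.get(region, 0), count)

    def value(self) -> int:
        return sum(self.counts.values())


# Two replicas count independently, then exchange state in any order.
us = GCounter("us-west-2")
eu = GCounter("eu-west-1")
us.increment(3)
eu.increment(2)
us.merge(eu)
eu.merge(us)
```

After the exchange both replicas report the same total, and repeating a merge changes nothing. That tolerance for duplicated or reordered updates is what lets this kind of counter converge without coordination between regions.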
