ReadOn Tech

CloudCheckr’s Amazon Web Services Survey Results – March 2013

Amazon's cloud is known for emphasizing performance over low prices, at least as far as we know so far. However, I have always wondered whether changing the way AWS is used could increase savings. Recently, CloudCheckr conducted an extensive study of how well AWS customers were using the service. The results were extremely surprising, though not for CloudCheckr. We would like to release the Amazon Web Services survey results to the general public on behalf of CloudCheckr. You can also find an interesting infographic that analyzes AWS performance, price, and error issues.

CloudCheckr:
We were heartened when AWS made Trusted Advisor free for the month of March. This was an implicit acknowledgement of what many have long known: AWS is extremely complex and it is challenging for users to provision and control their AWS infrastructure properly.

We took the AWS announcement as an opportunity to conduct an internal survey of our customers’ usage. We compared the initial assessments of 400 of our users’ accounts against our 125+ best practice checks for proper configurations and policies. Our best practice checks span 3 key categories: Cost, Availability, and Security.  We limited our survey to users with 10 or more running EC2 instances. In aggregate, the users were running more than 16,000 EC2 instances.



We were surprised to discover that nearly every customer (99%) experienced at least one serious exception. Beyond this top-level takeaway, our primary conclusion was that controlling cost may grab the headlines, but users also need to button up a large number of availability and security issues.

When considering availability, there were serious configuration issues that were common across a high percentage of users. Users repeatedly failed to optimally configure Auto Scaling and ELB. The failure to create sufficient EBS snapshots was an almost universal issue.

Although users passed more of our security checks, the exceptions that did arise were serious. Many of the most common security issues were found in configurations for S3, where nearly 1 in 5 users allowed unfettered access to their buckets through “Upload/Delete” or “Edit Permissions” set to everyone. As we explained in an earlier whitepaper, anyone using a simple bucket finder tool could locate and access these buckets.
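As a rough illustration of the kind of S3 check described above, the Python sketch below scans a bucket ACL for risky grants to everyone. It assumes the ACL is a dictionary shaped like the response of boto3's `get_bucket_acl` call, and it assumes that the console labels “Upload/Delete” and “Edit Permissions” correspond to the `WRITE` and `WRITE_ACP` ACL permissions. The function name `open_write_grants` is hypothetical, and this is not CloudCheckr's actual check.

```python
# URI that S3 ACLs use for the "Everyone" (anonymous) grantee group.
ALL_USERS_URI = "http://acs.amazonaws.com/groups/global/AllUsers"

# Assumed mapping from ACL permission to the console label used in the survey.
RISKY_PERMISSIONS = {
    "WRITE": "Upload/Delete",
    "WRITE_ACP": "Edit Permissions",
    "FULL_CONTROL": "Upload/Delete and Edit Permissions",
}


def open_write_grants(acl):
    """Return the sorted risky permissions that an ACL dict grants to everyone."""
    found = set()
    for grant in acl.get("Grants", []):
        grantee = grant.get("Grantee", {})
        is_everyone = (
            grantee.get("Type") == "Group"
            and grantee.get("URI") == ALL_USERS_URI
        )
        if is_everyone and grant.get("Permission") in RISKY_PERMISSIONS:
            found.add(grant["Permission"])
    return sorted(found)


if __name__ == "__main__":
    # Hand-written sample ACL for illustration only; not survey data.
    sample_acl = {
        "Grants": [
            {"Grantee": {"Type": "CanonicalUser", "ID": "owner-id"},
             "Permission": "FULL_CONTROL"},
            {"Grantee": {"Type": "Group", "URI": ALL_USERS_URI},
             "Permission": "WRITE"},
        ]
    }
    for permission in open_write_grants(sample_acl):
        print(permission, "->", RISKY_PERMISSIONS[permission])
```

In practice the dictionary would come from an API call per bucket; the sketch keeps the check as a pure function so it can be tried without AWS credentials.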

Beyond the numbers, we also interviewed customers to gather qualitative feedback on some of the more interesting data points.

Conclusions by Area

Conclusions based upon Cost Exceptions:

As noted, our sample comprised 16,047 instances. The sample group spent a total of $2,254,987 per month on EC2 (and its associated costs), for an average monthly cost per customer of $7,516. We also noted a mismatch between quantity and cost: spot instances represent 8% of the instances but only 1.4% of the cost. This is due to the significantly lower price of spot instances compared to on-demand instances.
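To make the quantity versus cost mismatch concrete, here is a back-of-the-envelope Python sketch using only the figures quoted above. It assumes simple fleet-wide averages and ignores differences in instance size and usage hours, so the results are illustrative only. The function names are hypothetical.

```python
# Figures quoted in the survey text above.
TOTAL_INSTANCES = 16_047
TOTAL_MONTHLY_SPEND_USD = 2_254_987
SPOT_SHARE_OF_INSTANCES = 0.08
SPOT_SHARE_OF_COST = 0.014


def average_cost_per_instance(total_spend, instance_count):
    """Average monthly spend per instance across the whole sample."""
    return total_spend / instance_count


def relative_unit_cost(share_of_cost, share_of_quantity):
    """Average cost per instance of one group relative to all other instances."""
    group_unit_cost = share_of_cost / share_of_quantity
    rest_unit_cost = (1 - share_of_cost) / (1 - share_of_quantity)
    return group_unit_cost / rest_unit_cost


if __name__ == "__main__":
    per_instance = average_cost_per_instance(
        TOTAL_MONTHLY_SPEND_USD, TOTAL_INSTANCES
    )
    spot_ratio = relative_unit_cost(SPOT_SHARE_OF_COST, SPOT_SHARE_OF_INSTANCES)
    print(f"Average monthly spend per instance: ${per_instance:,.2f}")
    print(f"Spot cost per instance relative to the rest: {spot_ratio:.0%}")
```

Under these assumptions the sample averages about $140 per instance per month, and a spot instance accounts for roughly one sixth of the spend of a non-spot instance.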

When we looked at the Cost Exceptions, we found that 96% of all users experienced at least 1 exception (with many experiencing multiple exceptions). In total, we found that users who adopted our recommended instance sizing and purchasing type were able to save an average of $3,974 per month for an aggregate total of $1,192,212 per month.

This suggested that price optimization remains a large hurdle for AWS users who rely on native AWS tools. Users consistently fail to optimize purchasing and also fail to optimize utilization. These combined issues meant that the average customer paid nearly twice as much as necessary for resources to achieve proper performance for their technology.
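The "twice as much as necessary" figure can be checked against the averages quoted above. The Python sketch below is a minimal illustration that assumes the optimized cost is simply the average cost minus the average savings. The function names are hypothetical.

```python
# Figures quoted in the survey text above (USD per month).
AVG_MONTHLY_COST = 7_516
AVG_MONTHLY_SAVINGS = 3_974
TOTAL_MONTHLY_SPEND = 2_254_987
TOTAL_MONTHLY_SAVINGS = 1_192_212


def savings_fraction(cost, savings):
    """Share of current spend that the recommended changes would remove."""
    return savings / cost


def overpayment_factor(cost, savings):
    """How many times the optimized cost the customer currently pays."""
    return cost / (cost - savings)


if __name__ == "__main__":
    fraction = savings_fraction(AVG_MONTHLY_COST, AVG_MONTHLY_SAVINGS)
    factor = overpayment_factor(AVG_MONTHLY_COST, AVG_MONTHLY_SAVINGS)
    aggregate = savings_fraction(TOTAL_MONTHLY_SPEND, TOTAL_MONTHLY_SAVINGS)
    print(f"Average savings as a share of spend: {fraction:.1%}")
    print(f"Current spend versus optimized spend: {factor:.2f}x")
    print(f"Aggregate savings as a share of spend: {aggregate:.1%}")
```

With these numbers, savings come to a little over half of current spend, both per customer and in aggregate, which works out to paying roughly 2.1 times the optimized cost.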

To further examine this behavior, we interviewed a number of customers, both those who exclusively purchased on-demand instances and those who used multiple purchasing types.

Here were their answers (summarized and consolidated):

Conclusions based upon Availability Exceptions:

We compared our users against our Availability best practices and found that nearly 98% suffered from at least 1 exception. We hypothesized that this was due to the overall complexity of AWS and interviewed some of our users for confirmation. Here is what we found from those interviews:

Conclusions based upon Security Exceptions:

Finally, we looked at security. Here we found that 44% of our users had at least one serious exception present during the initial scan. The most serious and common exceptions occurred within S3 usage and bucket permissions. Given the differences between cloud and data center architecture, this was not entirely surprising. We interviewed our users about this area and here is what we found:

Underlying Data Summary

Cost: Any exception 96%

The total of 16,047 instances was broken down into the following categories:

The instance purchasing was broken down as follows:

Common Cost Exceptions we found:

Availability: Any exception 98%
Here, broken out by service, are some highlights of common and serious exceptions that we found:

Service Type:     Customers with Exceptions

EC2:              Any exception 95%

Auto Scaling:     Any exception 66%

ELB:              Any exception 42%

Security: Any exception 46%
These were the most common exceptions that we found: