Data Exfiltration from AWS S3 Buckets

By now you have no doubt heard about the recent Booz Allen Hamilton breach on Amazon Web Services. In short, a shocking collection of 60,000 sensitive government files was left in a public S3 bucket (S3 is the file storage service in Amazon Web Services) for all to see. Most of us are probably too overwhelmed to care, given all the breaches in the news lately. But this breach was different: it involved a trusted, appointed contractor whose job it was to follow the security policies put in place to avoid such incidents. So was this incident accidental or malicious? More later on the tools we can use to tell the difference between the two. First, let’s recap what happened.

The Incident

According to Gizmodo, the 28GB of leaked data contained not only sensitive information on recent government projects, but also at least half a dozen unencrypted passwords belonging to government contractors with Top Secret clearance. Anyone who got their hands on these passwords could in turn move through security controls to reach further sensitive data. So now we’re not just talking about 60,000 files sitting in an S3 bucket, but about the potential for a bigger breach.

When so much sensitive data is considered potentially compromised in a breach, where do you start your investigation?

Scouring the maze of data in the cloud

Amazon Web Services (AWS) has a couple of basic native logging and monitoring tools that help you search for data in its public cloud. AWS CloudTrail records the API calls made to AWS services in your account. Amazon CloudWatch can be used to monitor and alert on particular events you designate. Both tools are good for a basic level of analysis of a security event. With a small, finite set of events and logs, you can get some sense of what transpired in a security incident. Where these tools fall short is in scale and visualization. Most data lakes are vast and consolidate data from many different sources. Gaining insight into a particular incident usually means scouring a vast amount of data over and over: each pass brings you one step closer to a potential answer, or sometimes yields nothing at all. All the while, you are running out of time and the pressure is on to find the smoking gun.
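To make the manual approach concrete, here is a minimal Python sketch of one such pass over CloudTrail data: picking out the S3 bucket permission changes. It assumes the records have already been loaded from CloudTrail log files into dictionaries. The field names (eventSource, eventName, requestParameters, userIdentity) follow the CloudTrail record format, but the function name, the sample records, and the bucket and user names are hypothetical and only for illustration.

```python
from typing import Dict, Iterable, List

# CloudTrail event names for changes to a bucket's ACL or bucket policy.
PERMISSION_EVENTS = {"PutBucketAcl", "PutBucketPolicy"}


def s3_permission_changes(records: Iterable[dict]) -> List[Dict]:
    """Return a short summary of every S3 bucket ACL or policy change."""
    changes = []
    for record in records:
        if record.get("eventSource") != "s3.amazonaws.com":
            continue
        if record.get("eventName") not in PERMISSION_EVENTS:
            continue
        params = record.get("requestParameters") or {}
        identity = record.get("userIdentity") or {}
        changes.append({
            "time": record.get("eventTime"),
            "event": record["eventName"],
            "bucket": params.get("bucketName"),
            "actor": identity.get("arn"),
            "source_ip": record.get("sourceIPAddress"),
        })
    return changes


# Hypothetical sample records shaped like entries in a CloudTrail log file.
sample_records = [
    {
        "eventTime": "2017-01-01T10:00:00Z",
        "eventSource": "s3.amazonaws.com",
        "eventName": "PutBucketAcl",
        "sourceIPAddress": "198.51.100.7",
        "userIdentity": {"arn": "arn:aws:iam::123456789012:user/demo-contractor"},
        "requestParameters": {"bucketName": "example-bucket"},
    },
    {
        "eventTime": "2017-01-01T10:05:00Z",
        "eventSource": "ec2.amazonaws.com",
        "eventName": "DescribeInstances",
        "sourceIPAddress": "198.51.100.7",
        "userIdentity": {"arn": "arn:aws:iam::123456789012:user/demo-contractor"},
        "requestParameters": None,
    },
]

for change in s3_permission_changes(sample_records):
    print(change)
```

A filter like this answers one narrow question per pass. Every new question (who downloaded what, from where, with which credentials) needs another pass over the same data, which is exactly the repetition described above.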

Finding The Smoking Gun

[Screenshot: Data Exfiltration from AWS S3 Buckets]

Sift Security lets you not only search your data lake manually, but also get starting points for investigations from anomaly detection and visualize your data in our graph.

The screenshot above shows an example of the type of visualization available in Sift Security’s CloudHunter product. We ingest your CloudTrail data and, with no configuration needed, run anomaly detection to point out behavior that may be malicious. In this case, we alerted that the user launched an EC2 instance with an AMI that had never been used before. By adding that alert to the visualization, we can see that the “demo-contractor” user was able to retrieve S3 credentials from the EC2 instance and download the “github-credentials.txt” file.
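As a rough illustration of what a “never used before” alert means, the Python sketch below compares the AMIs in new RunInstances events against those seen in a baseline period. This is not Sift Security’s detection logic, which the post does not describe; it is a simple first-seen check. The function names and sample data are hypothetical, and the location of the AMI ID under requestParameters.instancesSet.items is an assumption about the RunInstances record layout that you should verify against your own logs.

```python
from typing import Dict, Iterable, List


def launched_amis(record: dict) -> List[str]:
    """Extract AMI IDs from a RunInstances record (assumed layout)."""
    if record.get("eventName") != "RunInstances":
        return []
    params = record.get("requestParameters") or {}
    items = (params.get("instancesSet") or {}).get("items") or []
    return [item["imageId"] for item in items if "imageId" in item]


def first_seen_ami_alerts(baseline: Iterable[dict],
                          new_records: Iterable[dict]) -> List[Dict]:
    """Flag launches that use an AMI absent from the baseline period."""
    known = set()
    for record in baseline:
        known.update(launched_amis(record))

    alerts = []
    for record in new_records:
        for ami in launched_amis(record):
            if ami not in known:
                identity = record.get("userIdentity") or {}
                alerts.append({
                    "time": record.get("eventTime"),
                    "actor": identity.get("arn"),
                    "ami": ami,
                })
                known.add(ami)  # alert once per new AMI
    return alerts


def run_instances(time: str, user: str, ami: str) -> dict:
    """Build a hypothetical RunInstances record for the example."""
    return {
        "eventTime": time,
        "eventSource": "ec2.amazonaws.com",
        "eventName": "RunInstances",
        "userIdentity": {"arn": f"arn:aws:iam::123456789012:user/{user}"},
        "requestParameters": {"instancesSet": {"items": [{"imageId": ami}]}},
    }


baseline = [run_instances("2017-01-01T09:00:00Z", "ops-admin", "ami-11111111")]
new_records = [
    run_instances("2017-01-02T09:00:00Z", "ops-admin", "ami-11111111"),
    run_instances("2017-01-02T09:30:00Z", "demo-contractor", "ami-99999999"),
]

for alert in first_seen_ami_alerts(baseline, new_records):
    print(alert)
```

A first-seen check like this is cheap but noisy on its own. Its value comes from using the alert as a starting point and then following the related activity, as in the visualization described above.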

If Booz Allen Hamilton had been using Sift, they would have been alerted immediately when the permissions on the S3 bucket were changed, and even if that alert had been ignored, they would have been alerted again when somebody downloaded files from the bucket.
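The two alerts described here can be sketched as a simple correlation over CloudTrail records: first flag a bucket permission change, then flag any later download from that bucket. The Python below is only an illustration under stated assumptions, not how Sift produces its alerts. The function name and sample data are hypothetical. Note also that GetObject is an S3 data event, which CloudTrail records only when data event logging is enabled on the trail.

```python
from typing import Iterable, List, Optional, Tuple

# CloudTrail event names for changes to a bucket's ACL or bucket policy.
PERMISSION_EVENTS = {"PutBucketAcl", "PutBucketPolicy"}

Alert = Tuple[str, Optional[str], Optional[str]]  # (kind, bucket, object key)


def permission_change_then_download(records: Iterable[dict]) -> List[Alert]:
    """Alert on bucket permission changes and on later downloads from them."""
    changed_buckets = set()
    alerts: List[Alert] = []
    # ISO 8601 UTC timestamps in one format sort correctly as strings.
    for record in sorted(records, key=lambda r: r.get("eventTime", "")):
        if record.get("eventSource") != "s3.amazonaws.com":
            continue
        params = record.get("requestParameters") or {}
        bucket = params.get("bucketName")
        name = record.get("eventName")
        if name in PERMISSION_EVENTS:
            changed_buckets.add(bucket)
            alerts.append(("permission_change", bucket, None))
        elif name == "GetObject" and bucket in changed_buckets:
            alerts.append(("download", bucket, params.get("key")))
    return alerts


def s3_event(time: str, name: str, bucket: str, key: Optional[str] = None) -> dict:
    """Build a hypothetical S3 CloudTrail record for the example."""
    params = {"bucketName": bucket}
    if key is not None:
        params["key"] = key
    return {
        "eventTime": time,
        "eventSource": "s3.amazonaws.com",
        "eventName": name,
        "requestParameters": params,
    }


events = [
    s3_event("2017-01-01T09:00:00Z", "GetObject", "example-bucket", "report.pdf"),
    s3_event("2017-01-01T10:00:00Z", "PutBucketAcl", "example-bucket"),
    s3_event("2017-01-01T11:00:00Z", "GetObject", "example-bucket",
             "github-credentials.txt"),
]

for alert in permission_change_then_download(events):
    print(alert)
```

In this sample, the download that happens before the ACL change is not flagged, while the change itself and the download that follows it are.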

The Bottom Line

Humans make mistakes, and cloud infrastructure can change from one minute to the next. As we continue to rely on cloud infrastructure, we will continue to encounter cases like this. Whether a leak like this is accidental or malicious, the job of a security incident responder will only get harder. It’s time to revolutionize the tools that help us do our job. Sift Security recognizes this problem and is leading the revolution by reinventing the way we search for data and respond to security incidents in the cloud. Learn more about what Sift Security and CloudHunter can do for you.
