An Amazon Web Services (AWS) engineer last week inadvertently made public almost a gigabyte’s worth of sensitive data, including their own personal documents as well as passwords and cryptographic keys to various AWS environments.
While leaks of this kind are not unusual, what is noteworthy here is how quickly the employee’s credentials were discovered by a third party, who, perhaps to the employee’s good fortune, immediately warned the company.
On the morning of January 13, an AWS employee, identified as a DevOps Cloud Engineer on LinkedIn, committed nearly a gigabyte’s worth of data to a personal GitHub repository bearing their own name. Roughly 30 minutes later, Greg Pollock, vice president of product at UpGuard, a California-based security firm, received a notification about a potential leak from a detection engine pointing to the repo.
An analyst began working to verify what specifically had triggered the alert. Around two hours later, Pollock was convinced the data had been committed to the repo inadvertently and might pose a threat to the employee, if not AWS itself. “In reviewing this publicly accessible data, I have come to the conclusion that data stemming from your company, of some level of sensitivity, is present and exposed to the public internet,” he told AWS by email.
AWS responded gratefully about four hours later, and the repo abruptly went offline.
Since UpGuard’s analysts didn’t test the credentials themselves—which would have been illegal—it’s unclear what precisely they grant access to. An AWS spokesperson told Gizmodo on Wednesday that all of the files were personal in nature and unrelated to the employee’s work. No customer data or company systems were exposed, they said.
At least some of the documents in the cache, however, are labeled “Amazon Confidential.”
Alongside those documents are AWS and RSA key pairs, some of which are marked “mock” or “test.” Others, however, are marked “admin” and “cloud.” Another is labeled “rootkey,” suggesting it provides privileged control of a system. Other passwords are connected to mail services. And there are numerous auth tokens and API keys for a variety of third-party products.
AWS did not provide Gizmodo with an on-the-record statement.
It is possible that GitHub would have eventually alerted AWS that this data was public. The site itself automatically scans public repositories for credentials issued by a specific list of companies, just as UpGuard was doing. Had GitHub been the one to detect the AWS credentials, it would have, hypothetically, alerted AWS. AWS would have then taken “appropriate action,” possibly by revoking the keys.
But not all of the credentials leaked by the AWS employee are detected by GitHub, which only looks for specific types of tokens issued by certain companies. The speed with which UpGuard’s automated software was able to locate the keys also raises concerns about what other organizations have this capability; surely many of the world’s intelligence agencies are among them.
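To give a rough sense of how this kind of pattern-based detection can work, here is a minimal Python sketch. It is not GitHub’s or UpGuard’s actual method, neither of which is described in detail here. The function name `scan_text` and the two patterns are illustrative assumptions: one matches the widely documented shape of an AWS access key ID (a four-letter prefix such as `AKIA` followed by 16 uppercase letters or digits), and the other matches the header line of a PEM-encoded private key.

```python
import re

# Illustrative patterns only (assumed for this sketch); real scanners
# maintain far larger rule sets and often verify matches with the issuer.
PATTERNS = {
    "aws_access_key_id": re.compile(r"\b(?:AKIA|ASIA)[0-9A-Z]{16}\b"),
    "pem_private_key": re.compile(
        r"-----BEGIN (?:RSA |EC |OPENSSH )?PRIVATE KEY-----"
    ),
}


def scan_text(text):
    """Return a (line_number, pattern_name) pair for every pattern hit."""
    findings = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                findings.append((lineno, name))
    return findings


if __name__ == "__main__":
    # AKIAIOSFODNN7EXAMPLE is the placeholder key ID used in AWS documentation.
    sample = (
        "user = alice\n"
        "aws_access_key_id = AKIAIOSFODNN7EXAMPLE\n"
        "-----BEGIN RSA PRIVATE KEY-----\n"
        "mail_password = hunter2\n"
    )
    for lineno, name in scan_text(sample):
        print(f"line {lineno}: possible {name}")
```

Note that the mail password on the last line of the sample goes unflagged. A scanner of this type only catches secrets with a recognizable format, which is one reason a provider-specific list can miss some of what a leak contains.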
GitHub’s efforts to identify the leaked credentials its users upload—which began in earnest around five years ago—received scrutiny last year after a study at North Carolina State University (NCSU) unearthed over 100,000 repositories hosting API tokens and keys. (Notably, the researchers only examined 13 percent of all public repositories, which alone included billions of files.)
While Amazon access key IDs and auth tokens were among the data examined by the NCSU researchers, a majority of the leaked credentials were linked to Google services.
GitHub did not respond to a request for comment.
UpGuard says it chose to make the incident known to demonstrate the importance of early detection and underscore that cloud security is not invulnerable to human error.
“Amazon Web Services is the largest provider of public cloud services, claiming about half of the market share,” Pollock said. “In 2019, a former Amazon employee allegedly stole over a hundred million credit applications from Capital One, illustrating the scale of potential data loss associated with insider threats at such large and central data processors.”
In this case, Pollock added, there’s no evidence that the engineer acted maliciously or that any customer data was affected. “Rather, this case illustrates the value of rapid data leaks detection to prevent small accidents from becoming larger incidents.”