3,000 Permanent Account Number (PAN) cards and National ID (Aadhaar) cards leaked from India

Summary

In the normal course of scanning for open, exposed, or vulnerable Amazon S3 buckets, I discovered a bucket containing roughly 3,000 images of Permanent Account Number (PAN) cards and National ID (Aadhaar) cards from India.

After concluding that the images and PDFs appeared to be sensitive identification documents, I immediately attempted to contact India's CERT team to notify them of this data leak. Roughly 3 weeks after I provided notice, I can confirm that action has been taken to secure the S3 bucket.

It is unclear who the owner of the Amazon S3 bucket is or was. Judging from the types of documents being added, how frequently they were added, and how some of the images look, my best guess is that some type of scanner was set to upload (or back up) files to this S3 bucket.

I would like to thank India's CERT team for their assistance in securing this S3 bucket.

Incident Timeline:

| Date | Event |
| --- | --- |
| December 1, 2018 | Open bucket discovered. |
| December 2, 2018 | Email sent to India's CERT team. |
| December 2, 2018 | Acknowledgement email from India's CERT team with a reference number. |
| December 5, 2018 | Follow-up email sent. |
| December 5, 2018 | CERT team indicated they are in touch with Amazon. |
| December 21, 2018 | Asked a contact for assistance with resolving this issue. |
| December 22, 2018 | CERT team confirms the S3 bucket is now secured. |

What's in the bucket:

The S3 bucket contained more than 4,800 files (a rough count). Most were images, but there were also PDFs. A representative object path:

/private-documents/absolutegenericdocument-1/2018/7/7/JPEG_2018_06_29_17_20_27_-90041938.jpg

There were 4 folders at the "absolutegenericdocument-" level, each containing a distinctly different type of document.

Within each of the "absolutegenericdocument-" level folders the documents were organized by year/month/day/file.
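To make that layout concrete, here is a minimal Python sketch that splits an object path like the one above into its parts and tallies paths per folder. The names `DocumentKey`, `parse_key`, `count_by_folder`, and the field `path_date` are my own, made up for illustration. I am also assuming every path has exactly six segments (prefix, folder, year, month, day, file), which is only what the example path suggests.

```python
import datetime
from collections import Counter
from typing import Iterable, NamedTuple


class DocumentKey(NamedTuple):
    """Parts of one object path (hypothetical structure, for illustration)."""
    prefix: str               # e.g. "private-documents"
    folder: str               # e.g. "absolutegenericdocument-1"
    path_date: datetime.date  # the year/month/day segments of the path
    filename: str


def parse_key(key: str) -> DocumentKey:
    """Split a path of the assumed form /prefix/folder/year/month/day/file."""
    parts = key.strip("/").split("/")
    if len(parts) != 6:
        raise ValueError(f"unexpected path layout: {key!r}")
    prefix, folder, year, month, day, filename = parts
    path_date = datetime.date(int(year), int(month), int(day))
    return DocumentKey(prefix, folder, path_date, filename)


def count_by_folder(keys: Iterable[str]) -> Counter:
    """Count how many paths fall under each second-level folder."""
    return Counter(parse_key(k).folder for k in keys)


if __name__ == "__main__":
    example = (
        "/private-documents/absolutegenericdocument-1/2018/7/7/"
        "JPEG_2018_06_29_17_20_27_-90041938.jpg"
    )
    print(parse_key(example))
    print(count_by_folder([example]))
```

Run against a full listing of paths, `count_by_folder` would give per-folder totals of the kind reported below. Note that the date in the path (2018/7/7) is not the same as the date-like digits embedded in the file name, so the sketch keeps the file name as an opaque string rather than guessing what those digits mean.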

Side note: The irony of an S3 bucket full of sensitive data using names like "private-documents" and "absolutegenericdocument" is not lost on me. They could have named the folders something like "totally-meant-to-leak-this".

The "absolutegenericdocument-1" folder appeared to contain primarily images of Permanent Account Number (PAN) cards, along with some PDFs of images.
Document count: 1,142 images.

The "absolutegenericdocument-2" folder appeared to contain images of India's national ID card, known as "Aadhaar".
Document count: 1,109 images.

The "absolutegenericdocument-3" folder appeared to contain miscellaneous images of people, mostly "head shots" like the type of photo you'd use for an ID.
Document count: 1,082 images.

The "absolutegenericdocument-4" folder appeared to contain PDFs of "pay in slips".
Document count: 942 PDFs.

The S3 bucket was still being actively updated with new images as of December 21, 2018, the date I asked a contact for help escalating this report.