Fingerprinting an Organization Using Metadata of Public Documents

Karl Mendelman
Many companies and organizations use the Internet for their business activities to make information about their products and services more readily available to customers. These organizations often share electronic documents on their websites, such as manuals, whitepapers, guidelines, templates, and other documents considered important to share. Documents uploaded to an organization's website can contain extra information, such as metadata.
Metadata is defined as data that describes other data. Metadata associated with documents can include the names of authors, information about the creator, general document properties, the name of the server, or the path where the document was modified. Metadata is added to documents mainly by automated processes when a document is created, and if it is not properly removed before sharing, it can contain sensitive information. People are usually unaware that documents contain metadata and can unintentionally leak information about their organization or about themselves. This information can be used as a basis for fingerprinting or for conducting cyber attacks.
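As an illustration of the kind of metadata described above, the following minimal sketch reads the core properties stored inside an Office Open XML document (for example a .docx file), which is a ZIP archive containing a docProps/core.xml part. It is not the method or tooling used in the thesis, which is not specified here. The function names read_core_properties and make_sample_docx are hypothetical, and the sample archive is a stripped-down stand-in for a real document.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Location of the core-properties part inside an Office Open XML package.
CORE_PATH = "docProps/core.xml"


def read_core_properties(data: bytes) -> dict:
    """Return {property_name: text} for non-empty core properties.

    Hypothetical helper: returns an empty dict when the package has no
    core-properties part or when every property is empty.
    """
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        if CORE_PATH not in zf.namelist():
            return {}
        root = ET.fromstring(zf.read(CORE_PATH))
    props = {}
    for child in root:
        # Tags look like "{namespace}creator"; keep only the local name.
        name = child.tag.split("}", 1)[-1]
        if child.text and child.text.strip():
            props[name] = child.text.strip()
    return props


def make_sample_docx(creator: str, last_modified_by: str) -> bytes:
    """Build a minimal in-memory archive holding only core.xml (illustrative)."""
    core = (
        '<?xml version="1.0" encoding="UTF-8"?>'
        "<cp:coreProperties "
        'xmlns:cp="http://schemas.openxmlformats.org/package/2006/metadata/core-properties" '
        'xmlns:dc="http://purl.org/dc/elements/1.1/">'
        f"<dc:creator>{creator}</dc:creator>"
        f"<cp:lastModifiedBy>{last_modified_by}</cp:lastModifiedBy>"
        "</cp:coreProperties>"
    )
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr(CORE_PATH, core)
    return buf.getvalue()


if __name__ == "__main__":
    sample = make_sample_docx("jdoe", "asmith")
    print(read_core_properties(sample))
```

Usernames recovered this way, such as the creator and last-modified-by fields, are one example of the information that could contribute to fingerprinting if it is left in a published document.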
In this thesis, the metadata of electronic documents shared on the websites of Estonian governmental organizations was analyzed. More specifically, the metadata of public documents from three institutions was examined in order to identify metadata vulnerabilities that can be used for fingerprinting purposes. To achieve this, a fingerprinting method was developed and applied to the observed websites. The thesis is divided into two stages: the first describes the developed fingerprinting method, and the second presents the outcomes of the metadata analysis carried out with that method.
The results of the research showed that almost all analyzed documents contained information that could be used for fingerprinting purposes. We processed 2643 documents, of which only 12 had their metadata properly removed. All other documents contained pieces of information that describe the environment in which the document was created and additionally exposed information that could be used for conducting cyber attacks.
Graduation Thesis language
Graduation Thesis type
Master - Cyber Security
Olaf Manuel Maennel, Raimundas Matulevicius
Defence year