Information Security: Finding Document Metadata

 


The tools that organizations use to create documents embed a large amount of information in the resulting files. Much of it is, naturally, the content of the file itself. But files also carry a significant amount of metadata, that is, data that describes other data. Much of this metadata governs the formatting and presentation of the content. Beyond formatting, many applications that create and edit files store additional metadata items that penetration testers may find helpful during the reconnaissance stage, for example:

  1. Penetration testers frequently need usernames for password-guessing and exploitation attacks, and usernames may be embedded in the metadata.
  2. The full path of the original file at the time it was created can provide valuable insights. It may hint at the directory structure, important folders, frequently mounted file servers, and typical user behavior.
  3. If spear phishing tests (sending emails to target personnel to see whether they click links or open attachments) are part of the penetration test scope, email addresses found in metadata may be helpful. Such tests should be conducted only when they are explicitly permitted, and only against the specific people covered by the test’s scope.
  4. Since client-side exploitation is a prevalent attack vector, penetration testers may find it useful to identify which client-side software is in use, such as the operating system, office suite, and PDF-generation tool. Metadata frequently reveals version numbers as well, although these reflect the software in use when the document was written or last changed and may no longer be current.
  5. Additional details: other helpful information is frequently attached to document content that isn’t visible on the application’s screen, such as undo information, earlier edits, and hidden or obscured data (for example, a hidden spreadsheet column or hidden text in a document).
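To make the list above concrete, here is a minimal Python sketch that scans metadata values for a few of these items: usernames implied by Windows profile paths, file server names in UNC paths, and email addresses. The function name `summarize_metadata_values` and the regular expressions are illustrative assumptions, not part of any particular tool, and real metadata is messier than these patterns allow for.

```python
import re

# Illustrative patterns only (assumptions, not an exhaustive parser).
# A Windows profile path such as C:\Users\<name>\... implies a username.
USER_DIR = re.compile(
    r"[A-Za-z]:\\(?:Users|Documents and Settings)\\([^\\\r\n]+)\\",
    re.IGNORECASE,
)
# A UNC path such as \\server\share\... names a file server.
UNC_SERVER = re.compile(r"\\\\([A-Za-z0-9._-]+)\\")
# A deliberately simple email pattern.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)+")


def summarize_metadata_values(values):
    """Collect usernames, file servers, and emails seen in metadata values."""
    found = {"usernames": set(), "file_servers": set(), "emails": set()}
    for value in values:
        found["usernames"].update(USER_DIR.findall(value))
        found["file_servers"].update(UNC_SERVER.findall(value))
        found["emails"].update(EMAIL.findall(value))
    # Sort so the result is stable and easy to read in a report.
    return {key: sorted(items) for key, items in found.items()}


if __name__ == "__main__":
    sample = [
        r"C:\Users\jsmith\Documents\report.docx",
        r"\\fileserv01\shared\templates\Normal.dotm",
        "Author: Jane Doe <jane.doe@example.com>",
    ]
    print(summarize_metadata_values(sample))
```

The sample values are made up. In practice the input would be whatever metadata values an extraction tool reports for the collected documents.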

A penetration tester must first retrieve files before analyzing their metadata. These documents can be gathered in a variety of ways.
 
For instance, during the preparation stages of the testing project, the penetration tester might already have obtained some documents created or modified by personnel at the target organization, such as contracts, policies and procedures, schematics, non-disclosure agreements, Rules of Engagement agreements, and other material.

Document metadata can be extracted using a number of different tools. Among the most capable are FOCA and ExifTool.

These free tools are designed to extract specific, enumerated metadata fields from specific kinds of files. In other words, they extract structured metadata from documents that follow a known format, with known metadata tags and/or locations.
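As an illustration of what "structured metadata in known locations" means, here is a minimal Python sketch for one format. Office Open XML files (.docx, .xlsx, .pptx) are ZIP archives that usually carry author fields in `docProps/core.xml` and application details in `docProps/app.xml`. The function name `read_ooxml_properties` is a hypothetical helper for this post. It is not how FOCA or ExifTool work internally.

```python
import zipfile
import xml.etree.ElementTree as ET

# The two archive members where OOXML documents usually keep their properties.
PROPERTY_PARTS = ("docProps/core.xml", "docProps/app.xml")


def read_ooxml_properties(source):
    """Return {tag: text} from the property parts of a .docx/.xlsx/.pptx.

    `source` may be a file path or a binary file-like object.
    """
    props = {}
    with zipfile.ZipFile(source) as archive:
        names = set(archive.namelist())
        for part in PROPERTY_PARTS:
            if part not in names:
                continue  # Some files omit one or both parts.
            root = ET.fromstring(archive.read(part))
            for element in root.iter():
                text = (element.text or "").strip()
                if text:
                    # Drop the "{namespace}" prefix ElementTree puts on tags.
                    props[element.tag.rsplit("}", 1)[-1]] = text
    return props
```

Called as `read_ooxml_properties("report.docx")` (a hypothetical file name), it would typically return keys such as `creator`, `lastModifiedBy`, `Application`, and `AppVersion` when the document contains them.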

The strings command, by contrast, extracts every printable string from a specified file regardless of its structure. It is not focused on metadata: it can pull many kinds of data out of any type of file, even one whose structure is unknown.
Amid a jumble of other interesting output, the strings command will frequently turn up metadata that the other tools cannot find because they do not understand the structure of the file. Although searching through its output can be challenging, it frequently yields some helpful nuggets.
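The idea behind strings is simple enough to sketch in a few lines of Python. The following is a rough approximation, not a reimplementation of the real command: it returns every run of printable ASCII characters that is at least `min_length` bytes long. The function name `extract_strings` and the default of 4 are choices made for this sketch.

```python
import re


def extract_strings(data, min_length=4):
    """Return runs of printable ASCII of at least min_length bytes in data."""
    # Bytes 0x20 through 0x7e are the printable ASCII characters.
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_length)
    return [match.decode("ascii") for match in pattern.findall(data)]


if __name__ == "__main__":
    # Made-up binary blob with two readable strings buried in it.
    blob = b"\x00\x01Author: jsmith\x00\xff\x10abc\x00C:\\Users\\jsmith\\report.doc\x00"
    for text in extract_strings(blob):
        print(text)
```

Because the function knows nothing about file formats, it works on any file read in binary mode, which is exactly why its output needs manual sifting.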

ExifTool is designed to read, write, and manipulate metadata in over a hundred different file types, such as PDFs, office documents (doc, dot, xls, ppt, and more), audio files, videos, photos, and many more.
By default, ExifTool processes the file or files supplied on its command line. Alternatively, it can be told to recursively traverse a directory structure and analyze every file it encounters, processing entire directory trees on the local workstation where it runs.
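Both modes can be driven from a script. The sketch below assumes the `exiftool` executable is installed and on the PATH. It uses ExifTool's `-json` option for machine-readable output and `-r` to recurse into directories. The helper names `build_exiftool_command` and `run_exiftool` are made up for this example.

```python
import json
import subprocess


def build_exiftool_command(path, recursive=False):
    """Build the argument list for one exiftool invocation."""
    command = ["exiftool", "-json"]
    if recursive:
        command.append("-r")  # Descend into subdirectories.
    command.append(str(path))
    return command


def run_exiftool(path, recursive=False):
    """Run exiftool and return its output as a list of dicts, one per file.

    Assumes exiftool is installed; raises FileNotFoundError otherwise.
    """
    result = subprocess.run(
        build_exiftool_command(path, recursive),
        capture_output=True,
        text=True,
        check=False,  # Keep any partial output if some files caused errors.
    )
    return json.loads(result.stdout) if result.stdout.strip() else []
```

For example, `run_exiftool("collected_docs", recursive=True)` (a hypothetical directory name) would return one dictionary per analyzed file, each mapping tag names to values.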

