In the previous article (Part I), we discussed the context around open source intelligence (OSINT) research – how OSINT fits into the intelligence cycle, the explosion in social media use, and background information about the internet & browsing.
In this second half of the series, we will now discuss tools and concepts that are central to the duties of an OSINT researcher. Highlighting these tools and concepts serves two purposes for readers. First, it brings useful research methods to our attention. Second, it opens our eyes to ways in which a potentially malicious actor could collect OSINT about our organizations or assets (people, facilities, information, reputation, etc.). As an exercise, following each section of this article, readers should ask themselves this question: “How could these collection methods be used to collect information about my organization and its assets?”
The Value Of A Digital Image
Digital images are helpful for online researchers for two reasons: many image files carry EXIF metadata, and any digital image can be run through a reverse image search. First, EXIF data is information about the image file itself. This data may include information about the camera/smartphone that was used to capture the image (make, model, serial number, date/time stamp, camera settings, device name), programs used to manipulate/produce the image, GPS coordinates of the location where the picture was captured, and much more. Not every image contains all of these fields, and some contain none of them, but it is always worth checking.
To illustrate this point, a screenshot of EXIF data from an image taken from Flickr is pictured here (using “Jeffrey’s Image Metadata Viewer”). You can see that the image originated on a Nokia Lumia 625 smartphone, that it was taken on 8/13/16 at 13:11, and that it was manipulated in Microsoft Paint.
*It should be noted that some online platforms (such as Facebook and Instagram) automatically remove EXIF data from images that you upload to their sites, but others do not. For example, Flickr, one of the largest online repositories dedicated to photos, does not automatically remove EXIF data.
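To make the GPS point concrete: EXIF stores latitude and longitude as degrees, minutes, and seconds (each a numerator/denominator pair) plus a hemisphere reference, so a researcher usually has to convert them to decimal degrees before pasting them into a mapping tool. The following is a minimal sketch of that conversion using only the Python standard library. The `sample_gps` values are hypothetical and were not taken from the image described above; in practice they would come from whichever metadata viewer or library you use to read the file.

```python
from fractions import Fraction


def dms_to_decimal(dms, ref):
    """Convert EXIF-style (degrees, minutes, seconds) rationals plus a
    hemisphere reference ('N', 'S', 'E', 'W') into signed decimal degrees."""
    degrees, minutes, seconds = (Fraction(num, den) for num, den in dms)
    decimal = degrees + minutes / 60 + seconds / 3600
    # Southern and western hemispheres are negative in decimal notation.
    if ref.upper() in ("S", "W"):
        decimal = -decimal
    return float(decimal)


# Hypothetical GPS fields, each value stored as (numerator, denominator).
sample_gps = {
    "GPSLatitude": ((34, 1), (3, 1), (8, 1)),
    "GPSLatitudeRef": "N",
    "GPSLongitude": ((118, 1), (14, 1), (37, 1)),
    "GPSLongitudeRef": "W",
}

lat = dms_to_decimal(sample_gps["GPSLatitude"], sample_gps["GPSLatitudeRef"])
lon = dms_to_decimal(sample_gps["GPSLongitude"], sample_gps["GPSLongitudeRef"])
print(f"{lat:.5f}, {lon:.5f}")
```

The printed pair can be pasted directly into most online maps, which is exactly why a photo with intact GPS fields can reveal where it was taken.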
Reverse image searching is another benefit that digital images afford online researchers. What is reverse image searching? Services such as Google and TinEye allow the user to upload an image, and the service then searches its database for similar pictures. Basically, their computers analyze the shapes, colors, and patterns of your image and try to find similar ones. This feature is especially useful when you already have one of a subject’s profile images and want to locate their other social media profiles. However, there are many other applications for this technique.
*As a warning, it is risky to upload an image to a reverse image search engine when the image is not already publicly available. If you upload a sensitive image or an image that is not already online, you must realize that the image is now out of your hands and that those services may store it.
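To give a feel for how a computer can judge two pictures to be "similar", here is a toy sketch of one well-known technique called an average hash. This is not a description of how Google or TinEye actually work (their methods are not covered in this article); it is only an illustration under simplifying assumptions. Each image is represented as a tiny, hypothetical grid of grayscale values (0 to 255), and every name in the code is made up for this example.

```python
def average_hash(pixels):
    """Return a bit string with '1' wherever a pixel is brighter than
    the average brightness of the whole image."""
    flat = [value for row in pixels for value in row]
    mean = sum(flat) / len(flat)
    return "".join("1" if value > mean else "0" for value in flat)


def hamming_distance(hash_a, hash_b):
    """Count the positions where two equal-length hashes differ."""
    return sum(a != b for a, b in zip(hash_a, hash_b))


# Hypothetical 4x4 grayscale "images".
original = [
    [10, 10, 200, 200],
    [10, 10, 200, 200],
    [10, 10, 200, 200],
    [10, 10, 200, 200],
]
# The same picture after a brightness adjustment.
brightened = [[min(value + 30, 255) for value in row] for row in original]
# A different picture (checkerboard pattern).
unrelated = [
    [200, 10, 200, 10],
    [10, 200, 10, 200],
    [200, 10, 200, 10],
    [10, 200, 10, 200],
]

same = hamming_distance(average_hash(original), average_hash(brightened))
different = hamming_distance(average_hash(original), average_hash(unrelated))
print(same, different)
```

A small distance suggests the two images share the same overall pattern even after edits such as brightening, while a large distance suggests they are different pictures. That tolerance for minor edits is what lets a re-cropped or re-filtered profile picture still be matched.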
When we use the term geotagging, we generally mean the attachment of geographical data to social media posts, images, etc. As an example, Instagram users can include geographical information when they post a new image to their account. Unfortunately (or maybe fortunately), over the past five or more years, internet users have become more privacy conscious. As a result, they geotag their Instagram, Twitter, and other posts far less often than they used to. However, searching for and exploiting geotagged information should not be neglected completely. Here’s one example (pictured on the right) worth exploring, courtesy of Twitter: https://onemilliontweetmap.com/
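Once a set of geotagged posts has been collected, a common next step is to keep only those made near a location of interest, such as a facility. The sketch below does this with the haversine formula, which estimates the distance between two latitude/longitude points on a sphere. The post records, their field names, and the coordinates are all hypothetical; no real platform's data format is implied.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, a common approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = math.radians(lat2 - lat1)
    d_lambda = math.radians(lon2 - lon1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def posts_within(posts, center, radius_km):
    """Return the posts whose geotag lies within radius_km of center."""
    lat0, lon0 = center
    return [
        post for post in posts
        if haversine_km(lat0, lon0, post["lat"], post["lon"]) <= radius_km
    ]


# Hypothetical geotagged posts and a hypothetical facility location.
posts = [
    {"id": "post-1", "lat": 34.0530, "lon": -118.2450},
    {"id": "post-2", "lat": 34.1000, "lon": -118.3000},
    {"id": "post-3", "lat": 40.7128, "lon": -74.0060},
]
facility = (34.0522, -118.2437)

nearby = posts_within(posts, facility, radius_km=1.0)
print([post["id"] for post in nearby])
```

Turning the exercise around, the same filter shows how little effort it would take for someone else to list every geotagged post made within a kilometer of your own facility.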
Suppose you wanted to look at the previous version of a website or even a website that previously existed, but has since been deleted. How could you go about it?
There are two options: (1) Google’s cached version of the website or (2) the Internet Archive’s Wayback Machine. What is a cached version of a site? Every time you search something on Google, you see the results in a specific format: title of website, URL (with a tiny green arrow after it), and a website description. If you click the tiny green arrow and choose “Cached”, then Google will show you the website as it appeared the last time Google crawled it. Note that Google has since retired its cached links, so this option may no longer be available. Alternatively, a researcher can use the Wayback Machine to view dated historical snapshots of a particular website, often going back many years.
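For researchers who check archived pages often, the Wayback Machine lookups can be expressed as plain URLs. The sketch below only builds the URLs and makes no network requests. It assumes the publicly documented patterns: `web.archive.org/web/<timestamp>/<url>` for the capture closest to a date, `web.archive.org/web/*/<url>` for the calendar of all captures, and the `archive.org/wayback/available` endpoint for a machine-readable answer. The function names and the example domain and date are ours, chosen for illustration.

```python
from datetime import date
from urllib.parse import urlencode

WAYBACK_BASE = "https://web.archive.org/web"
AVAILABILITY_API = "https://archive.org/wayback/available"


def snapshot_url(target_url, when):
    """URL requesting the archived capture closest to the given date."""
    return f"{WAYBACK_BASE}/{when:%Y%m%d}/{target_url}"


def calendar_url(target_url):
    """URL for the calendar view listing all captures of a site."""
    return f"{WAYBACK_BASE}/*/{target_url}"


def availability_query(target_url, when):
    """URL for the JSON availability lookup (closest capture to a date)."""
    params = urlencode({"url": target_url, "timestamp": f"{when:%Y%m%d}"})
    return f"{AVAILABILITY_API}?{params}"


# Hypothetical target and date of interest.
target = "example.com"
day = date(2016, 8, 13)

print(snapshot_url(target, day))
print(calendar_url(target))
print(availability_query(target, day))
```

Opening the first URL in a browser should redirect to the nearest capture the archive actually holds, which may be weeks or months away from the requested date.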
Google Search Operators
Yes, for many readers this is research 101, but I have to mention it! Google search operators are a researcher’s best friend because they allow us to make highly specific queries within Google (and other search engines). This is only an abbreviated introduction.
Below are 3 examples that are highly useful:
1. Using quotes to search a specific phrase.
Suppose you type into Google the words protective intelligence. When you hit return, Google will interpret this query as you asking it to search for web pages that contain the word protective and the word intelligence. However, if you used quotes around the phrase, such as this “protective intelligence” then Google would interpret this as you asking it to find web pages with the whole phrase protective intelligence, rather than the two individual words anywhere on the page.
2. Searching a specific site.
Suppose you wanted to search for the name John Smith on the LA Times website. In this case, you could use the site operator. You would type into the search bar site:www.latimes.com “john smith”. Here we combined two search operators. First, we are telling Google to limit the search to the website www.latimes.com, and then we are telling Google to search for the exact phrase john smith.
3. Searching for files.
Suppose you wanted to find online files associated with a particular phrase. In this case, we would use the filetype operator. If we entered filetype:pdf “john smith” resume into the search bar, how would Google interpret this? Google would return only PDF files that contain the exact phrase john smith and also the word resume.
I encourage readers to seek out more information about Google search operators and become proficient with them. They will save you time, and they can be used in other places too (such as Google Alerts and various search platforms).
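If you run the same kinds of searches repeatedly, the three operators above can be assembled programmatically. The following is a small sketch with helper names of our own invention (`build_query`, `search_url`); it simply composes the query text described in the examples and URL-encodes it for a browser. It does not call any Google API.

```python
from urllib.parse import quote_plus


def build_query(terms=(), phrase=None, site=None, filetype=None):
    """Compose a search string from the operators covered above."""
    parts = []
    if site:
        parts.append(f"site:{site}")          # limit results to one website
    if filetype:
        parts.append(f"filetype:{filetype}")  # limit results to one file type
    if phrase:
        parts.append(f'"{phrase}"')           # exact-phrase match
    parts.extend(terms)                       # ordinary keywords
    return " ".join(parts)


def search_url(query):
    """Turn a query string into a URL that can be opened in a browser."""
    return "https://www.google.com/search?q=" + quote_plus(query)


site_query = build_query(phrase="john smith", site="www.latimes.com")
file_query = build_query(terms=["resume"], phrase="john smith", filetype="pdf")

print(site_query)
print(file_query)
print(search_url(site_query))
```

Note that the code uses straight quotation marks around the phrase, which is what you should type into the search bar as well.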
The 10 Keys Of OSINT Research
To conclude this introduction about OSINT research, we will leave you with a graphic that lists a broad set of ideas that guide online researchers.
Collectively, that was a lot of information and it was still far from being exhaustive. However, you should be familiar with some of the most important, broad concepts and tools that OSINT researchers use on a daily basis, as well as foundational information about internet research. The methods of OSINT research are of critical importance because they serve the researcher in their investigations and they highlight how a malicious actor might collect information about our organizations and assets.
Author Credit: This article was written by the Protective Intelligence contributor, Travis Lishok.
Author Bio: Travis Lishok, CPP
The Protective Intelligence team is excited to feature a written piece from Travis Lishok, one of the newest Ontic team members. Travis has nearly 10 years of experience in public and private sector security, to include conducting intelligence research and supporting executive protection teams in GSOC operations. As a professional project, Travis creates protective security related content via EP Nexus, some of which specifically focuses on OSINT, travel risk, and related topics. As you’ve seen in this article, investigative research is a topic that he is enthusiastic about sharing, and that’s why we invited him to contribute this valuable piece.