Forensic Keyword Crawl:
Crawling Pages / Sites with Non-Text Information: Webpages that contain non-text information can be problematic for the Forensic Keyword Crawler function. To avoid the long delays associated with downloading and analyzing pages that contain audio, video, or application files, the crawler checks each page's content type before downloading it. The results file lists every page that was queried and flags those with non-text content types by displaying the content type indicator in red text. On some websites, such as file repositories, entire webpages (URLs) may contain nothing but links to non-text files. In these situations, it is often useful to list those URLs in the “Deny Domain Names and URL(s) String” field so that the crawler avoids them altogether and focuses the keyword search on more relevant pages.
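
The check-before-download behavior described above can be illustrated with a short Python sketch. This is not the crawler's actual code; it is a minimal illustration under stated assumptions: that the content type is obtained with an HTTP HEAD request, that "text" means any text/* type plus XHTML, and that deny strings are matched as case-insensitive substrings of the URL. All function names here are hypothetical.

```python
from urllib.request import Request, urlopen

# Assumption: these Content-Type values count as "text" for keyword searching.
TEXT_TYPES = ("text/", "application/xhtml+xml")


def is_text_content_type(content_type):
    """Return True if a Content-Type header value looks like text."""
    if not content_type:
        return False
    # Drop parameters such as "; charset=utf-8" and normalize case.
    media_type = content_type.split(";", 1)[0].strip().lower()
    return media_type.startswith(TEXT_TYPES)


def is_denied(url, deny_strings):
    """Return True if the URL contains any deny string.

    Assumption: deny entries are matched as case-insensitive substrings.
    """
    lowered = url.lower()
    return any(s.lower() in lowered for s in deny_strings if s)


def fetch_content_type(url, timeout=10):
    """Request headers only (HTTP HEAD) and return the Content-Type value."""
    request = Request(url, method="HEAD")
    with urlopen(request, timeout=timeout) as response:
        return response.headers.get("Content-Type", "")


def should_download(url, deny_strings, content_type_lookup=fetch_content_type):
    """Decide whether a page is worth downloading for keyword analysis."""
    # Denied URLs are skipped without contacting the server at all.
    if is_denied(url, deny_strings):
        return False
    # Otherwise, download only pages whose content type is text.
    return is_text_content_type(content_type_lookup(url))
```

In this sketch, a URL matching a deny string is never queried, which mirrors why listing a file repository's URLs in the deny field saves time: the crawler skips even the content type check for those pages. The content_type_lookup parameter exists only so the decision logic can be exercised without network access.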