Forensic Keyword Crawl:

Using the Deny Domain Names and URL(s) Strings Field: To illustrate the Deny Domain Names and URL(s) Strings field, we'll construct a crawl that finds 10 off-site links whose pages contain a specified keyword. We'll reuse the Simple Keyword Crawl example, with exactly the same starting URL and keyword, except that this time we'll specify "fuzionathletics.com" in the Deny Domain Names and URL(s) Strings field. This tells the crawler not to follow any link with "fuzionathletics.com" in its URL. Note that a string in this field can match anywhere in a URL, not just in the domain name. So if we also specified "polevault" in the deny field, and "polevault" appeared in some other URL, for example as part of a web page name, then that URL would NOT be followed or crawled. With "fuzionathletics.com" specified as a deny string, the crawler effectively follows no links within the fuzionathletics.com website, only links that lead off the website.
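The match-anywhere behavior of the deny field can be sketched in a few lines of Python. This is only an illustration of the rule described above, not the crawler's actual code: the function name is_denied and the sample URLs are hypothetical, and we assume a plain, case-sensitive substring test.

```python
def is_denied(url, deny_strings):
    """Return True if any deny string occurs anywhere in the URL.

    Assumption: a plain, case-sensitive substring test. The real
    crawler may normalize URLs or ignore case.
    """
    return any(s in url for s in deny_strings)


# Hypothetical deny strings and links, for illustration only.
deny = ["fuzionathletics.com", "polevault"]
links = [
    "http://www.fuzionathletics.com/shop",         # denied: contains the domain string
    "https://www.instagram.com/fuzionathletics",   # kept: no ".com" after "fuzionathletics"
    "https://example.com/polevault/results.html",  # denied: "polevault" appears in the path
    "https://www.youtube.com/watch?v=abc",         # kept: no deny string present
]

# Only links that contain none of the deny strings are followed.
followed = [u for u in links if not is_denied(u, deny)]
print(followed)
```

Note that the Instagram link is kept even though it contains "fuzionathletics", because the deny string is "fuzionathletics.com" and the whole string must appear in the URL.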

For this example, enter the following.

  • Starting URL: "http://www.fuzionathletics.com"
  • Keyword: "fuzion"
  • Deny Domain Names and URL(s) Strings: "fuzionathletics.com"
  • Keep all other input fields with their default values.

For this crawl, the output shows every instance where the keyword "fuzion" matches text displayed on web pages that are linked from the http://www.fuzionathletics.com page (no more than 1 level below it) and whose URLs do not contain "fuzionathletics.com" anywhere. The results will therefore show keyword matches on websites such as Instagram, Twitter, and YouTube that the Fuzion website links to, where the Fuzion-related content on those sites happens to use the keyword "fuzion". The results are limited to the first 10 pages found.
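To make the overall behavior of this crawl concrete, here is a minimal Python sketch of the filtering logic. It is not the crawler's implementation. The function keyword_crawl, its parameters, and the sample pages are all hypothetical, and we assume that keyword matching ignores case and that the 10-page limit counts pages actually crawled. Page text comes from an in-memory dictionary, so the sketch runs without network access.

```python
def keyword_crawl(links, fetch_text, keyword, deny_strings, max_pages=10):
    """Scan linked pages for a keyword, skipping denied URLs.

    Assumptions (not confirmed by the tool's documentation):
    matching is case-insensitive, and max_pages counts pages
    that were actually crawled, not pages that were skipped.
    """
    results = []
    crawled = 0
    for url in links:
        if crawled >= max_pages:
            break
        # A deny string may match anywhere in the URL.
        if any(s in url for s in deny_strings):
            continue
        crawled += 1
        text = fetch_text(url)
        hits = text.lower().count(keyword.lower())
        if hits:
            results.append((url, hits))
    return results


# Hypothetical links found on the starting page, with made-up page text.
pages = {
    "http://www.fuzionathletics.com/about": "Fuzion Athletics home",
    "https://www.instagram.com/fuzionathletics": "Fuzion on Instagram: fuzion gear",
    "https://twitter.com/fuzionathletics": "Latest news",
    "https://www.youtube.com/user/fuzion": "Fuzion videos",
}

results = keyword_crawl(list(pages), pages.get, "fuzion", ["fuzionathletics.com"])
for url, hits in results:
    print(url, hits)
```

In this sketch the on-site page is never fetched, the Twitter page is fetched but reports nothing because its text lacks the keyword, and the Instagram and YouTube pages appear in the results.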