Index halved, 9M docs lost, 22,000 hours of post-process

Post by davide » Tue Jul 21, 2015 11:25 am

My index, previously 18M records and counting, dropped to 9M for no known reason.
At the same time, the post-process ETA rose from 0 minutes to 22,000 hours, and that figure has stayed constant for days.
Disk activity is maxed out.

This has been going on for 3 days now, since I found my index halved.

The only unusual actions I took that day were:
  • clicking "Delete Load Errors" on http://192.168.1.109:8090/HostBrowser.html?admin=true&hosts=
  • clicking "Re-load load-failure docs (404s etc)" on http://192.168.1.109:8090/HostBrowser.html?admin=true&path=example&facetcount=1632682 for a few (3 to 5) domains.

Last few log lines, newest first (more logs here: https://pastebin.mozilla.org/8840113):

Code:
I 2015/07/21 11:44:29 REJECTED http://www.newegg.com/Product/Product.aspx?Item=9SIA3912D67427&SortField=0&SummaryType=0&PageSize=10&SelectedRating=-1&VideoOnlyMark=False&IsFeedbackTab=true - cannot load: load error - java.io.IOException: CRAWLER Redirect of URL=http://www.newegg.com/Product/ProductReview.aspx?Item=9SIA3912D67427&nm_mc=OTC-RSS to http://www.newegg.com/Product/Product.aspx?Item=9SIA3912D67427&SortField=0&SummaryType=0&PageSize=10&SelectedRating=-1&VideoOnlyMark=False&IsFeedbackTab=true#scrollFullInfo placed on crawler queue for double-check

I 2015/07/21 11:44:29 LOADER CRAWLER ..Redirecting request to: http://www.newegg.com/Product/Product.aspx?Item=9SIA3912D67427&SortField=0&SummaryType=0&PageSize=10&SelectedRating=-1&VideoOnlyMark=False&IsFeedbackTab=true#scrollFullInfo

I 2015/07/21 11:44:29 LOADER CRAWLER Redirection detected ('HTTP/1.1 301 Moved Permanently') for URL http://www.newegg.com/Product/ProductReview.aspx?Item=9SIA3912D67427&nm_mc=OTC-RSS

I 2015/07/21 11:44:28 HostQueue forcing crawl-delay of 245 milliseconds for www.amazon.com: minimumDelta = 500, flux = 0, host.average = 2171, robots.delay = 0, ((waitig = 1085) - (timeSinceLastAccess = 840)) = 245

I 2015/07/21 11:44:28 REJECTED http://www.amazon.ca/ - url does not match must-match filter (\.(jpg|jpeg|gif|giff|png|tif|tiff)$)|(.*\bamazon.com(/.*)?)|(.*\bbestbuy.com(/.*)?)|(.*\bfutureshop.ca(/.*)?)|(.*\bnewegg.com(/.*)?)|(.*\btigerdirect.com(/.*)?)

I 2015/07/21 11:44:28 REJECTED http://ad.doubleclick.net/jump/tigerdirect.com/ROS_160x600;sz=160x600;page=powerprotectionabr=!ie4;abr=!ie5;abr=!ie6;ord=273030400276? - denied by robots.txt

I 2015/07/21 11:44:28 HostBalancer (re-)initialized the round-robin queue; 4 hosts.

I 2015/07/21 11:44:28 REJECTED http://www.tigerdirect.com/applications/category/guidedSearch.asp?CatId=20&sel=Detail;364_1816_88166_88166 - denied by document-attached noindexing rule

I 2015/07/21 11:44:28 SWITCHBOARD Not Condensed Resource 'http://www.tigerdirect.com/applications/category/guidedSearch.asp?CatId=20&sel=Detail;364_1816_88166_88166': denied by document-attached noindexing rule

I 2015/07/21 11:44:28 SWITCHBOARD CRAWL: ADDED 725 LINKS FROM http://www.tigerdirect.com/applications/category/guidedSearch.asp?CatId=20&sel=Detail;364_1816_88166_88166, STACKING TIME = 33, PARSING TIME = 58

I 2015/07/21 11:44:28 REJECTED https://get.adobe.com/flashplayer - url does not match must-match filter (\.(jpg|jpeg|gif|giff|png|tif|tiff)$)|(.*\bamazon.com(/.*)?)|(.*\bbestbuy.com(/.*)?)|(.*\bfutureshop.ca(/.*)?)|(.*\bnewegg.com(/.*)?)|(.*\btigerdirect.com(/.*)?)

I 2015/07/21 11:44:28 REJECTED http://www.amazon.com.br/ - url does not match must-match filter (\.(jpg|jpeg|gif|giff|png|tif|tiff)$)|(.*\bamazon.com(/.*)?)|(.*\bbestbuy.com(/.*)?)|(.*\bfutureshop.ca(/.*)?)|(.*\bnewegg.com(/.*)?)|(.*\btigerdirect.com(/.*)?)

I 2015/07/21 11:44:28 REJECTED http://www.amazon.co.uk/ - url does not match must-match filter (\.(jpg|jpeg|gif|giff|png|tif|tiff)$)|(.*\bamazon.com(/.*)?)|(.*\bbestbuy.com(/.*)?)|(.*\bfutureshop.ca(/.*)?)|(.*\bnewegg.com(/.*)?)|(.*\btigerdirect.com(/.*)?)

I 2015/07/21 11:44:28 HostQueue forcing crawl-delay of 357 milliseconds for www.tigerdirect.com: minimumDelta = 500, flux = 0, host.average = 2756, robots.delay = 0, ((waitig = 1378) - (timeSinceLastAccess = 1021)) = 357

I 2015/07/21 11:44:28 REJECTED http://ad.doubleclick.net/adi/tigerdirect.com/ROS_160x600;sz=160x600;page=powerprotectionord=273030400276? - denied by robots.txt

I 2015/07/21 11:44:28 HostBalancer (re-)initialized the round-robin queue; 4 hosts.

I 2015/07/21 11:44:28 SWITCHBOARD Excluded 32 words in URL http://www.amazon.com/Pivotal-Living-Tracker-Generation-Black/dp/B00VMPVQDC

I 2015/07/21 11:44:28 SWITCHBOARD CRAWL: ADDED 263 LINKS FROM http://www.amazon.com/Pivotal-Living-Tracker-Generation-Black/dp/B00VMPVQDC, STACKING TIME = 289, PARSING TIME = 59

I 2015/07/21 11:44:28 REJECTED http://www.amazon.es/ - url does not match must-match filter (\.(jpg|jpeg|gif|giff|png|tif|tiff)$)|(.*\bamazon.com(/.*)?)|(.*\bbestbuy.com(/.*)?)|(.*\bfutureshop.ca(/.*)?)|(.*\bnewegg.com(/.*)?)|(.*\btigerdirect.com(/.*)?)

I 2015/07/21 11:44:28 REJECTED http://www.amazon.com/s?ie=UTF8&bbn=6358552011&page=1&rh=n:6358551011,n:7141123011,n:7147443011,n:6358552011,p_n_size_three_browse-vebin:2205707011 - cannot load: load error - java.io.IOException: CRAWLER Redirect of URL=http://www.amazon.com/s?ie=UTF8&bbn=6358552011&page=1&rh=n:7141123011,n:7147443011,n:6358551011,n:6358552011,p_n_size_three_browse-vebin:2205707011 to http://www.amazon.com/s?ie=UTF8&bbn=6358552011&page=1&rh=n:6358551011,n:7141123011,n:7147443011,n:6358552011,p_n_size_three_browse-vebin:2205707011 placed on crawler queue for double-check

I 2015/07/21 11:44:28 LOADER CRAWLER ..Redirecting request to: http://www.amazon.com/s?ie=UTF8&bbn=6358552011&page=1&rh=n:6358551011,n:7141123011,n:7147443011,n:6358552011,p_n_size_three_browse-vebin:2205707011

I 2015/07/21 11:44:28 LOADER CRAWLER Redirection detected ('HTTP/1.1 301 Moved Permanently') for URL http://www.amazon.com/s?ie=UTF8&bbn=6358552011&page=1&rh=n:7141123011,n:7147443011,n:6358551011,n:6358552011,p_n_size_three_browse-vebin:2205707011

I 2015/07/21 11:44:27 HostQueue forcing crawl-delay of 943 milliseconds for www.newegg.com: minimumDelta = 500, flux = 0, host.average = 5728, robots.delay = 0, ((waitig = 2864) - (timeSinceLastAccess = 1921)) = 943

I 2015/07/21 11:44:27 REJECTED http://ad.doubleclick.net/jump/tigerdirect.com/CAT_300x250;sz=300x250;page=powerprotection;ord=273030400276? - denied by robots.txt

I 2015/07/21 11:44:27 HostBalancer (re-)initialized the round-robin queue; 4 hosts.

I 2015/07/21 11:44:27 REJECTED http://ad.doubleclick.net/ad/tigerdirect.com/CAT_300x250;sz=300x250;abr=!ie4;abr=!ie5;abr=!ie6;page=powerprotection;ord=273030400276? - denied by robots.txt

I 2015/07/21 11:44:27 HostBalancer (re-)initialized the round-robin queue; 4 hosts.

I 2015/07/21 11:44:27 REJECTED http://www.amazon.com/gp/offer-listing/B00GXXJTAK - denied by robots.txt

I 2015/07/21 11:44:27 HostQueue forcing crawl-delay of 5 milliseconds for www.amazon.com: minimumDelta = 500, flux = 0, host.average = 2183, robots.delay = 0, ((waitig = 1091) - (timeSinceLastAccess = 1086)) = 5

I 2015/07/21 11:44:26 SWITCHBOARD *Indexed 39 words in URL http://ecx.images-amazon.com/images/I/41xh7RDhfQL._AA160_.jpg [2WNqyq0ARL5a] Description: 41xh7RDhfQL._AA160_.jpg MimeType: image/jpeg | Charset: UTF-8 | Size: 493 bytes | LinkStorageTime: 0 ms | indexStorageTime: 0 ms

I 2015/07/21 11:44:26 Fulltext indexing: 2WNqyq0ARL5a http://ecx.images-amazon.com/images/I/41xh7RDhfQL._AA160_.jpg

I 2015/07/21 11:44:26 SWITCHBOARD Excluded 1 words in URL http://ecx.images-amazon.com/images/I/41xh7RDhfQL._AA160_.jpg

I 2015/07/21 11:44:26 SWITCHBOARD CRAWL: ADDED 1 LINKS FROM http://ecx.images-amazon.com/images/I/41xh7RDhfQL._AA160_.jpg, STACKING TIME = 0, PARSING TIME = 2

I 2015/07/21 11:44:26 HostQueue forcing crawl-delay of 1054 milliseconds for www.tigerdirect.com: minimumDelta = 500, flux = 0, host.average = 2756, robots.delay = 0, ((waitig = 1378) - (timeSinceLastAccess = 324)) = 1054

I 2015/07/21 11:44:26 HostBalancer (re-)initialized the round-robin queue; 4 hosts.

I 2015/07/21 11:44:26 REJECTED http://www.tigerdirect.com/cgi-bin/order.asp?EdpNo=8565377&QTY=1&ClickSource=SLC - denied by robots.txt

I 2015/07/21 11:44:26 REJECTED http://images.highspeedbackbone.net/td/masthead_v2/masthead-nav-vert_v3a.png);height: - url does not match must-match filter (\.(jpg|jpeg|gif|giff|png|tif|tiff)$)|(.*\bamazon.com(/.*)?)|(.*\bbestbuy.com(/.*)?)|(.*\bfutureshop.ca(/.*)?)|(.*\bnewegg.com(/.*)?)|(.*\btigerdirect.com(/.*)?)

I 2015/07/21 11:44:26 REJECTED http://images.highspeedbackbone.net/rightnav/liveHelpIcon160_off_v2.jpg - url does not match must-match filter (\.(jpg|jpeg|gif|giff|png|tif|tiff)$)|(.*\bamazon.com(/.*)?)|(.*\bbestbuy.com(/.*)?)|(.*\bfutureshop.ca(/.*)?)|(.*\bnewegg.com(/.*)?)|(.*\btigerdirect.com(/.*)?)

I 2015/07/21 11:44:26 REJECTED http://www.misco.se/ - url does not match must-match filter (\.(jpg|jpeg|gif|giff|png|tif|tiff)$)|(.*\bamazon.com(/.*)?)|(.*\bbestbuy.com(/.*)?)|(.*\bfutureshop.ca(/.*)?)|(.*\bnewegg.com(/.*)?)|(.*\btigerdirect.com(/.*)?)

I 2015/07/21 11:44:26 REJECTED http://images.highspeedbackbone.net/campaigns/homeautomation/HomeNav_ad3.jpg - url does not match must-match filter (\.(jpg|jpeg|gif|giff|png|tif|tiff)$)|(.*\bamazon.com(/.*)?)|(.*\bbestbuy.com(/.*)?)|(.*\bfutureshop.ca(/.*)?)|(.*\bnewegg.com(/.*)?)|(.*\btigerdirect.com(/.*)?)

I 2015/07/21 11:44:26 REJECTED http://www.misco.be/ - url does not match must-match filter (\.(jpg|jpeg|gif|giff|png|tif|tiff)$)|(.*\bamazon.com(/.*)?)|(.*\bbestbuy.com(/.*)?)|(.*\bfutureshop.ca(/.*)?)|(.*\bnewegg.com(/.*)?)|(.*\btigerdirect.com(/.*)?)

I 2015/07/21 11:44:26 REJECTED http://www.twitter.com/tigerdirect - url does not match must-match filter (\.(jpg|jpeg|gif|giff|png|tif|tiff)$)|(.*\bamazon.com(/.*)?)|(.*\bbestbuy.com(/.*)?)|(.*\bfutureshop.ca(/.*)?)|(.*\bnewegg.com(/.*)?)|(.*\btigerdirect.com(/.*)?)

I 2015/07/21 11:44:26 REJECTED http://www.youtube.com/tigerdirectblog - url does not match must-match filter (\.(jpg|jpeg|gif|giff|png|tif|tiff)$)|(.*\bamazon.com(/.*)?)|(.*\bbestbuy.com(/.*)?)|(.*\bfutureshop.ca(/.*)?)|(.*\bnewegg.com(/.*)?)|(.*\btigerdirect.com(/.*)?)

I 2015/07/21 11:44:26 REJECTED http://images.highspeedbackbone.net/td/masthead_v2/promo/Luggage.jpg - url does not match must-match filter (\.(jpg|jpeg|gif|giff|png|tif|tiff)$)|(.*\bamazon.com(/.*)?)|(.*\bbestbuy.com(/.*)?)|(.*\bfutureshop.ca(/.*)?)|(.*\bnewegg.com(/.*)?)|(.*\btigerdirect.com(/.*)?)

I 2015/07/21 11:44:26 REJECTED http://images.highspeedbackbone.net/td/footer/logos-mc.png - url does not match must-match filter (\.(jpg|jpeg|gif|giff|png|tif|tiff)$)|(.*\bamazon.com(/.*)?)|(.*\bbestbuy.com(/.*)?)|(.*\bfutureshop.ca(/.*)?)|(.*\bnewegg.com(/.*)?)|(.*\btigerdirect.com(/.*)?)

I 2015/07/21 11:44:26 REJECTED http://images.highspeedbackbone.net/td/footer/logos-visa.png - url does not match must-match filter (\.(jpg|jpeg|gif|giff|png|tif|tiff)$)|(.*\bamazon.com(/.*)?)|(.*\bbestbuy.com(/.*)?)|(.*\bfutureshop.ca(/.*)?)|(.*\bnewegg.com(/.*)?)|(.*\btigerdirect.com(/.*)?)

I 2015/07/21 11:44:26 REJECTED http://www.misco.pt/ - url does not match must-match filter (\.(jpg|jpeg|gif|giff|png|tif|tiff)$)|(.*\bamazon.com(/.*)?)|(.*\bbestbuy.com(/.*)?)|(.*\bfutureshop.ca(/.*)?)|(.*\bnewegg.com(/.*)?)|(.*\btigerdirect.com(/.*)?)

I 2015/07/21 11:44:26 REJECTED http://www.tigerdirect.com/applications/category/guidedSearch.asp?CatId=20&sel=Detail;364_1354_88332_88332 - denied by document-attached noindexing rule

I 2015/07/21 11:44:26 REJECTED http://www.misco.co.uk/ - url does not match must-match filter (\.(jpg|jpeg|gif|giff|png|tif|tiff)$)|(.*\bamazon.com(/.*)?)|(.*\bbestbuy.com(/.*)?)|(.*\bfutureshop.ca(/.*)?)|(.*\bnewegg.com(/.*)?)|(.*\btigerdirect.com(/.*)?)

I 2015/07/21 11:44:26 SWITCHBOARD Not Condensed Resource 'http://www.tigerdirect.com/applications/category/guidedSearch.asp?CatId=20&sel=Detail;364_1354_88332_88332': denied by document-attached noindexing rule

I 2015/07/21 11:44:26 SWITCHBOARD CRAWL: ADDED 533 LINKS FROM http://www.tigerdirect.com/applications/category/guidedSearch.asp?CatId=20&sel=Detail;364_1354_88332_88332, STACKING TIME = 19, PARSING TIME = 37

I 2015/07/21 11:44:26 REJECTED http://www.misco.ie/ - url does not match must-match filter (\.(jpg|jpeg|gif|giff|png|tif|tiff)$)|(.*\bamazon.com(/.*)?)|(.*\bbestbuy.com(/.*)?)|(.*\bfutureshop.ca(/.*)?)|(.*\bnewegg.com(/.*)?)|(.*\btigerdirect.com(/.*)?)

I 2015/07/21 11:44:26 REJECTED http://images.highspeedbackbone.net/td/button-slc-addtocart.gif - url does not match must-match filter (\.(jpg|jpeg|gif|giff|png|tif|tiff)$)|(.*\bamazon.com(/.*)?)|(.*\bbestbuy.com(/.*)?)|(.*\bfutureshop.ca(/.*)?)|(.*\bnewegg.com(/.*)?)|(.*\btigerdirect.com(/.*)?)

I 2015/07/21 11:44:26 REJECTED http://images.highspeedbackbone.net/td/footer/trustwave_logo.gif - url does not match must-match filter (\.(jpg|jpeg|gif|giff|png|tif|tiff)$)|(.*\bamazon.com(/.*)?)|(.*\bbestbuy.com(/.*)?)|(.*\bfutureshop.ca(/.*)?)|(.*\bnewegg.com(/.*)?)|(.*\btigerdirect.com(/.*)?)

I 2015/07/21 11:44:26 REJECTED http://images.highspeedbackbone.net/favicon.ico - url does not match must-match filter (\.(jpg|jpeg|gif|giff|png|tif|tiff)$)|(.*\bamazon.com(/.*)?)|(.*\bbestbuy.com(/.*)?)|(.*\bfutureshop.ca(/.*)?)|(.*\bnewegg.com(/.*)?)|(.*\btigerdirect.com(/.*)?)

I 2015/07/21 11:44:26 REJECTED http://images.highspeedbackbone.net/td/masthead_v2/mastNav-icon-new.png) - url does not match must-match filter (\.(jpg|jpeg|gif|giff|png|tif|tiff)$)|(.*\bamazon.com(/.*)?)|(.*\bbestbuy.com(/.*)?)|(.*\bfutureshop.ca(/.*)?)|(.*\bnewegg.com(/.*)?)|(.*\btigerdirect.com(/.*)?)

I 2015/07/21 11:44:26 REJECTED http://images.highspeedbackbone.net/td/masthead_v2/promo/deals-gifts-dealslasher.jpg - url does not match must-match filter (\.(jpg|jpeg|gif|giff|png|tif|tiff)$)|(.*\bamazon.com(/.*)?)|(.*\bbestbuy.com(/.*)?)|(.*\bfutureshop.ca(/.*)?)|(.*\bnewegg.com(/.*)?)|(.*\btigerdirect.com(/.*)?)

I 2015/07/21 11:44:26 REJECTED https://sealserver.trustkeeper.net/compliance/cert.php?code=ea97a8b6d8d755f41b78d04aa242d7f1&style=normal&size=105x54&language=en - url does not match must-match filter (\.(jpg|jpeg|gif|giff|png|tif|tiff)$)|(.*\bamazon.com(/.*)?)|(.*\bbestbuy.com(/.*)?)|(.*\bfutureshop.ca(/.*)?)|(.*\bnewegg.com(/.*)?)|(.*\btigerdirect.com(/.*)?)

I 2015/07/21 11:44:26 REJECTED http://images.highspeedbackbone.net/tlc/BLUnavBanner.jpg - url does not match must-match filter (\.(jpg|jpeg|gif|giff|png|tif|tiff)$)|(.*\bamazon.com(/.*)?)|(.*\bbestbuy.com(/.*)?)|(.*\bfutureshop.ca(/.*)?)|(.*\bnewegg.com(/.*)?)|(.*\btigerdirect.com(/.*)?)

I 2015/07/21 11:44:26 REJECTED http://www.misco.it/ - url does not match must-match filter (\.(jpg|jpeg|gif|giff|png|tif|tiff)$)|(.*\bamazon.com(/.*)?)|(.*\bbestbuy.com(/.*)?)|(.*\bfutureshop.ca(/.*)?)|(.*\bnewegg.com(/.*)?)|(.*\btigerdirect.com(/.*)?)

I 2015/07/21 11:44:26 REJECTED http://www.misco.nl/ - url does not match must-match filter (\.(jpg|jpeg|gif|giff|png|tif|tiff)$)|(.*\bamazon.com(/.*)?)|(.*\bbestbuy.com(/.*)?)|(.*\bfutureshop.ca(/.*)?)|(.*\bnewegg.com(/.*)?)|(.*\btigerdirect.com(/.*)?)

I 2015/07/21 11:44:26 REJECTED http://images.highspeedbackbone.net/td/masthead_v2/promo/gamingReloadedAccessories.jpg - url does not match must-match filter (\.(jpg|jpeg|gif|giff|png|tif|tiff)$)|(.*\bamazon.com(/.*)?)|(.*\bbestbuy.com(/.*)?)|(.*\bfutureshop.ca(/.*)?)|(.*\bnewegg.com(/.*)?)|(.*\btigerdirect.com(/.*)?)

I 2015/07/21 11:44:26 REJECTED http://images.highspeedbackbone.net/td/masthead_v2/promo/gamingReloaded.jpg - url does not match must-match filter (\.(jpg|jpeg|gif|giff|png|tif|tiff)$)|(.*\bamazon.com(/.*)?)|(.*\bbestbuy.com(/.*)?)|(.*\bfutureshop.ca(/.*)?)|(.*\bnewegg.com(/.*)?)|(.*\btigerdirect.com(/.*)?)

I 2015/07/21 11:44:26 REJECTED http://images.highspeedbackbone.net/td/masthead_v2/promo/Jewelry.jpg - url does not match must-match filter (\.(jpg|jpeg|gif|giff|png|tif|tiff)$)|(.*\bamazon.com(/.*)?)|(.*\bbestbuy.com(/.*)?)|(.*\bfutureshop.ca(/.*)?)|(.*\bnewegg.com(/.*)?)|(.*\btigerdirect.com(/.*)?)

I 2015/07/21 11:44:26 REJECTED http://images.highspeedbackbone.net/td/footer/systemax.jpg - url does not match must-match filter (\.(jpg|jpeg|gif|giff|png|tif|tiff)$)|(.*\bamazon.com(/.*)?)|(.*\bbestbuy.com(/.*)?)|(.*\bfutureshop.ca(/.*)?)|(.*\bnewegg.com(/.*)?)|(.*\btigerdirect.com(/.*)?)

I 2015/07/21 11:44:26 REJECTED http://images.highspeedbackbone.net/td/masthead_v2/mastNav-link-arrow.png) - url does not match must-match filter (\.(jpg|jpeg|gif|giff|png|tif|tiff)$)|(.*\bamazon.com(/.*)?)|(.*\bbestbuy.com(/.*)?)|(.*\bfutureshop.ca(/.*)?)|(.*\bnewegg.com(/.*)?)|(.*\btigerdirect.com(/.*)?)

I 2015/07/21 11:44:26 REJECTED http://images.highspeedbackbone.net/td/masthead_v2/promo/shopGPS_nav.jpg - url does not match must-match filter (\.(jpg|jpeg|gif|giff|png|tif|tiff)$)|(.*\bamazon.com(/.*)?)|(.*\bbestbuy.com(/.*)?)|(.*\bfutureshop.ca(/.*)?)|(.*\bnewegg.com(/.*)?)|(.*\btigerdirect.com(/.*)?)

I 2015/07/21 11:44:26 REJECTED https://plus.google.com/114822625291786269495 - url does not match must-match filter (\.(jpg|jpeg|gif|giff|png|tif|tiff)$)|(.*\bamazon.com(/.*)?)|(.*\bbestbuy.com(/.*)?)|(.*\bfutureshop.ca(/.*)?)|(.*\bnewegg.com(/.*)?)|(.*\btigerdirect.com(/.*)?)

I 2015/07/21 11:44:26 REJECTED http://images.highspeedbackbone.net/campaigns/misc/pcComponentBundles.jpg - url does not match must-match filter (\.(jpg|jpeg|gif|giff|png|tif|tiff)$)|(.*\bamazon.com(/.*)?)|(.*\bbestbuy.com(/.*)?)|(.*\bfutureshop.ca(/.*)?)|(.*\bnewegg.com(/.*)?)|(.*\btigerdirect.com(/.*)?)

I 2015/07/21 11:44:26 REJECTED http://www.misco.es/ - url does not match must-match filter (\.(jpg|jpeg|gif|giff|png|tif|tiff)$)|(.*\bamazon.com(/.*)?)|(.*\bbestbuy.com(/.*)?)|(.*\bfutureshop.ca(/.*)?)|(.*\bnewegg.com(/.*)?)|(.*\btigerdirect.com(/.*)?)

I 2015/07/21 11:44:26 REJECTED http://www.systemax.com/ - url does not match must-match filter (\.(jpg|jpeg|gif|giff|png|tif|tiff)$)|(.*\bamazon.com(/.*)?)|(.*\bbestbuy.com(/.*)?)|(.*\bfutureshop.ca(/.*)?)|(.*\bnewegg.com(/.*)?)|(.*\btigerdirect.com(/.*)?)

I 2015/07/21 11:44:26 REJECTED http://tigerdirect.applicantpro.com/pages/careershome/ - url does not match must-match filter (\.(jpg|jpeg|gif|giff|png|tif|tiff)$)|(.*\bamazon.com(/.*)?)|(.*\bbestbuy.com(/.*)?)|(.*\bfutureshop.ca(/.*)?)|(.*\bnewegg.com(/.*)?)|(.*\btigerdirect.com(/.*)?)

I 2015/07/21 11:44:26 REJECTED http://direct.digitallanding.com/?PromoID=5009008 - url does not match must-match filter (\.(jpg|jpeg|gif|giff|png|tif|tiff)$)|(.*\bamazon.com(/.*)?)|(.*\bbestbuy.com(/.*)?)|(.*\bfutureshop.ca(/.*)?)|(.*\bnewegg.com(/.*)?)|(.*\btigerdirect.com(/.*)?)

I 2015/07/21 11:44:26 REJECTED http://images.highspeedbackbone.net/skuimages/medium/CNET-H24-A393129.jpg - url does not match must-match filter (\.(jpg|jpeg|gif|giff|png|tif|tiff)$)|(.*\bamazon.com(/.*)?)|(.*\bbestbuy.com(/.*)?)|(.*\bfutureshop.ca(/.*)?)|(.*\bnewegg.com(/.*)?)|(.*\btigerdirect.com(/.*)?)

I 2015/07/21 11:44:26 REJECTED http://images.highspeedbackbone.net/td/masthead_v2/masthead-innercircle.png - url does not match must-match filter (\.(jpg|jpeg|gif|giff|png|tif|tiff)$)|(.*\bamazon.com(/.*)?)|(.*\bbestbuy.com(/.*)?)|(.*\bfutureshop.ca(/.*)?)|(.*\bnewegg.com(/.*)?)|(.*\btigerdirect.com(/.*)?)

I 2015/07/21 11:44:26 REJECTED http://images.highspeedbackbone.net/footer/hp-supplies-medallion.gif - url does not match must-match filter (\.(jpg|jpeg|gif|giff|png|tif|tiff)$)|(.*\bamazon.com(/.*)?)|(.*\bbestbuy.com(/.*)?)|(.*\bfutureshop.ca(/.*)?)|(.*\bnewegg.com(/.*)?)|(.*\btigerdirect.com(/.*)?)

I 2015/07/21 11:44:26 REJECTED http://images.highspeedbackbone.net/td/footer/logos-bbb-new.png - url does not match must-match filter (\.(jpg|jpeg|gif|giff|png|tif|tiff)$)|(.*\bamazon.com(/.*)?)|(.*\bbestbuy.com(/.*)?)|(.*\bfutureshop.ca(/.*)?)|(.*\bnewegg.com(/.*)?)|(.*\btigerdirect.com(/.*)?)

I 2015/07/21 11:44:26 REJECTED http://images.highspeedbackbone.net/masthead/txtMobile.gif - url does not match must-match filter (\.(jpg|jpeg|gif|giff|png|tif|tiff)$)|(.*\bamazon.com(/.*)?)|(.*\bbestbuy.com(/.*)?)|(.*\bfutureshop.ca(/.*)?)|(.*\bnewegg.com(/.*)?)|(.*\btigerdirect.com(/.*)?)

I 2015/07/21 11:44:26 REJECTED http://images.highspeedbackbone.net/td/masthead_v2/promo/efitness.jpg - url does not match must-match filter (\.(jpg|jpeg|gif|giff|png|tif|tiff)$)|(.*\bamazon.com(/.*)?)|(.*\bbestbuy.com(/.*)?)|(.*\bfutureshop.ca(/.*)?)|(.*\bnewegg.com(/.*)?)|(.*\btigerdirect.com(/.*)?)

I 2015/07/21 11:44:26 REJECTED https://trustsealinfo.websecurity.norton.com/splash?form_file=fdf/splash.fdf&dn=www.tigerdirect.com&lang=en - url does not match must-match filter (\.(jpg|jpeg|gif|giff|png|tif|tiff)$)|(.*\bamazon.com(/.*)?)|(.*\bbestbuy.com(/.*)?)|(.*\bfutureshop.ca(/.*)?)|(.*\bnewegg.com(/.*)?)|(.*\btigerdirect.com(/.*)?)

I 2015/07/21 11:44:26 REJECTED http://www.bbb.org/south-east-florida/business-reviews/general-merchandise-retail-by-internet/tigerdirect-in-miami-fl-27000500 - url does not match must-match filter (\.(jpg|jpeg|gif|giff|png|tif|tiff)$)|(.*\bamazon.com(/.*)?)|(.*\bbestbuy.com(/.*)?)|(.*\bfutureshop.ca(/.*)?)|(.*\bnewegg.com(/.*)?)|(.*\btigerdirect.com(/.*)?)

I 2015/07/21 11:44:26 REJECTED http://www.misco.at/ - url does not match must-match filter (\.(jpg|jpeg|gif|giff|png|tif|tiff)$)|(.*\bamazon.com(/.*)?)|(.*\bbestbuy.com(/.*)?)|(.*\bfutureshop.ca(/.*)?)|(.*\bnewegg.com(/.*)?)|(.*\btigerdirect.com(/.*)?)

I 2015/07/21 11:44:26 REJECTED http://images.highspeedbackbone.net/td/masthead/masthead-bg_HPElite.jpg - url does not match must-match filter (\.(jpg|jpeg|gif|giff|png|tif|tiff)$)|(.*\bamazon.com(/.*)?)|(.*\bbestbuy.com(/.*)?)|(.*\bfutureshop.ca(/.*)?)|(.*\bnewegg.com(/.*)?)|(.*\btigerdirect.com(/.*)?)

I 2015/07/21 11:44:26 REJECTED http://www.misco.de/ - url does not match must-match filter (\.(jpg|jpeg|gif|giff|png|tif|tiff)$)|(.*\bamazon.com(/.*)?)|(.*\bbestbuy.com(/.*)?)|(.*\bfutureshop.ca(/.*)?)|(.*\bnewegg.com(/.*)?)|(.*\btigerdirect.com(/.*)?)

I 2015/07/21 11:44:26 REJECTED http://www.facebook.com/TigerDirect - url does not match must-match filter (\.(jpg|jpeg|gif|giff|png|tif|tiff)$)|(.*\bamazon.com(/.*)?)|(.*\bbestbuy.com(/.*)?)|(.*\bfutureshop.ca(/.*)?)|(.*\bnewegg.com(/.*)?)|(.*\btigerdirect.com(/.*)?)

I 2015/07/21 11:44:26 REJECTED http://syx.client.shareholder.com/releases.cfm - url does not match must-match filter (\.(jpg|jpeg|gif|giff|png|tif|tiff)$)|(.*\bamazon.com(/.*)?)|(.*\bbestbuy.com(/.*)?)|(.*\bfutureshop.ca(/.*)?)|(.*\bnewegg.com(/.*)?)|(.*\btigerdirect.com(/.*)?)

I 2015/07/21 11:44:26 REJECTED http://www.misco.fr/ - url does not match must-match filter (\.(jpg|jpeg|gif|giff|png|tif|tiff)$)|(.*\bamazon.com(/.*)?)|(.*\bbestbuy.com(/.*)?)|(.*\bfutureshop.ca(/.*)?)|(.*\bnewegg.com(/.*)?)|(.*\btigerdirect.com(/.*)?)

I 2015/07/21 11:44:26 REJECTED http://images.highspeedbackbone.net/td/footer/seal.gif - url does not match must-match filter (\.(jpg|jpeg|gif|giff|png|tif|tiff)$)|(.*\bamazon.com(/.*)?)|(.*\bbestbuy.com(/.*)?)|(.*\bfutureshop.ca(/.*)?)|(.*\bnewegg.com(/.*)?)|(.*\btigerdirect.com(/.*)?)

I 2015/07/21 11:44:26 REJECTED http://images.highspeedbackbone.net/td/masthead_v2/promo/nav-email-group.jpg - url does not match must-match filter (\.(jpg|jpeg|gif|giff|png|tif|tiff)$)|(.*\bamazon.com(/.*)?)|(.*\bbestbuy.com(/.*)?)|(.*\bfutureshop.ca(/.*)?)|(.*\bnewegg.com(/.*)?)|(.*\btigerdirect.com(/.*)?)

I 2015/07/21 11:44:26 REJECTED http://images.highspeedbackbone.net/td/masthead_v2/promo/shopToys2_nav.jpg - url does not match must-match filter (\.(jpg|jpeg|gif|giff|png|tif|tiff)$)|(.*\bamazon.com(/.*)?)|(.*\bbestbuy.com(/.*)?)|(.*\bfutureshop.ca(/.*)?)|(.*\bnewegg.com(/.*)?)|(.*\btigerdirect.com(/.*)?)

I 2015/07/21 11:44:26 REJECTED http://www.misco.ch/ - url does not match must-match filter (\.(jpg|jpeg|gif|giff|png|tif|tiff)$)|(.*\bamazon.com(/.*)?)|(.*\bbestbuy.com(/.*)?)|(.*\bfutureshop.ca(/.*)?)|(.*\bnewegg.com(/.*)?)|(.*\btigerdirect.com(/.*)?)

I 2015/07/21 11:44:26 REJECTED http://images.highspeedbackbone.net/td/masthead_v2/promo/office-supplies.jpg - url does not match must-match filter (\.(jpg|jpeg|gif|giff|png|tif|tiff)$)|(.*\bamazon.com(/.*)?)|(.*\bbestbuy.com(/.*)?)|(.*\bfutureshop.ca(/.*)?)|(.*\bnewegg.com(/.*)?)|(.*\btigerdirect.com(/.*)?)

I 2015/07/21 11:44:26 REJECTED http://images.highspeedbackbone.net/pixel-clear.gif - url does not match must-match filter (\.(jpg|jpeg|gif|giff|png|tif|tiff)$)|(.*\bamazon.com(/.*)?)|(.*\bbestbuy.com(/.*)?)|(.*\bfutureshop.ca(/.*)?)|(.*\bnewegg.com(/.*)?)|(.*\btigerdirect.com(/.*)?)

I 2015/07/21 11:44:26 REJECTED http://images.highspeedbackbone.net/td/masthead_v2/shoplinks_v3.png);} - url does not match must-match filter (\.(jpg|jpeg|gif|giff|png|tif|tiff)$)|(.*\bamazon.com(/.*)?)|(.*\bbestbuy.com(/.*)?)|(.*\bfutureshop.ca(/.*)?)|(.*\bnewegg.com(/.*)?)|(.*\btigerdirect.com(/.*)?)

I 2015/07/21 11:44:26 REJECTED http://images.highspeedbackbone.net/td/loading.gif - url does not match must-match filter (\.(jpg|jpeg|gif|giff|png|tif|tiff)$)|(.*\bamazon.com(/.*)?)|(.*\bbestbuy.com(/.*)?)|(.*\bfutureshop.ca(/.*)?)|(.*\bnewegg.com(/.*)?)|(.*\btigerdirect.com(/.*)?)

I 2015/07/21 11:44:26 REJECTED http://images.highspeedbackbone.net/td/masthead_v2/mastNav-sub-bg-left.png) - url does not match must-match filter (\.(jpg|jpeg|gif|giff|png|tif|tiff)$)|(.*\bamazon.com(/.*)?)|(.*\bbestbuy.com(/.*)?)|(.*\bfutureshop.ca(/.*)?)|(.*\bnewegg.com(/.*)?)|(.*\btigerdirect.com(/.*)?)

I 2015/07/21 11:44:26 REJECTED http://www.tigerdirect.ca/ - url does not match must-match filter (\.(jpg|jpeg|gif|giff|png|tif|tiff)$)|(.*\bamazon.com(/.*)?)|(.*\bbestbuy.com(/.*)?)|(.*\bfutureshop.ca(/.*)?)|(.*\bnewegg.com(/.*)?)|(.*\btigerdirect.com(/.*)?)

I 2015/07/21 11:44:26 REJECTED http://images.highspeedbackbone.net/td/masthead_v2/promo/wholesale_products.jpg - url does not match must-match filter (\.(jpg|jpeg|gif|giff|png|tif|tiff)$)|(.*\bamazon.com(/.*)?)|(.*\bbestbuy.com(/.*)?)|(.*\bfutureshop.ca(/.*)?)|(.*\bnewegg.com(/.*)?)|(.*\btigerdirect.com(/.*)?)

I 2015/07/21 11:44:26 REJECTED http://www.newegg.com/Common/CommonReCaptchaValidate.aspx?referer=http://www.newegg.com/Product/Product.aspx?Item=15-124-116&nm_mc=OTC-RSS&cm_sp=OTC-RSS-_-Add-On%20Cards-_-Syba-_-15-124-116 - cannot load: load error - java.io.IOException: CRAWLER Redirect of URL=http://www.newegg.com/Product/Product.aspx?Item=15-124-116&nm_mc=OTC-RSS&cm_sp=OTC-RSS-_-Add-On Cards-_-Syba-_-15-124-116 to http://www.newegg.com/Common/CommonReCaptchaValidate.aspx?referer=http://www.newegg.com/Product/Product.aspx?Item=15-124-116&nm_mc=OTC-RSS&cm_sp=OTC-RSS-_-Add-On%20Cards-_-Syba-_-15-124-116 placed on crawler queue for double-check

I 2015/07/21 11:44:26 LOADER CRAWLER ..Redirecting request to: http://www.newegg.com/Common/CommonReCaptchaValidate.aspx?referer=http://www.newegg.com/Product/Product.aspx?Item=15-124-116&nm_mc=OTC-RSS&cm_sp=OTC-RSS-_-Add-On%20Cards-_-Syba-_-15-124-116

I 2015/07/21 11:44:26 LOADER CRAWLER Redirection detected ('HTTP/1.1 302 Found') for URL http://www.newegg.com/Product/Product.aspx?Item=15-124-116&nm_mc=OTC-RSS&cm_sp=OTC-RSS-_-Add-On Cards-_-Syba-_-15-124-116

I 2015/07/21 11:44:26 REJECTED http://www.amazon.com/gp/pdp/profile/A38NEDIGZZ2ZFT - no response body (http return code = 403)

I 2015/07/21 11:44:25 HostQueue forcing crawl-delay of 752 milliseconds for www.tigerdirect.com: minimumDelta = 500, flux = 0, host.average = 2791, robots.delay = 0, ((waitig = 1395) - (timeSinceLastAccess = 643)) = 752

I 2015/07/21 11:44:25 HostBalancer (re-)initialized the round-robin queue; 5 hosts.

I 2015/07/21 11:44:25 SWITCHBOARD *Indexed 40 words in URL http://ecx.images-amazon.com/images/I/41TaBjzH0lL._AA160_.jpg [eFp32q0ARL5a] Description: 41TaBjzH0lL._AA160_.jpg MimeType: image/jpeg | Charset: UTF-8 | Size: 493 bytes | LinkStorageTime: 0 ms | indexStorageTime: 0 ms

I 2015/07/21 11:44:25 Fulltext indexing: eFp32q0ARL5a http://ecx.images-amazon.com/images/I/41TaBjzH0lL._AA160_.jpg

I 2015/07/21 11:44:25 SWITCHBOARD Excluded 1 words in URL http://ecx.images-amazon.com/images/I/41TaBjzH0lL._AA160_.jpg

I 2015/07/21 11:44:25 SWITCHBOARD CRAWL: ADDED 1 LINKS FROM http://ecx.images-amazon.com/images/I/41TaBjzH0lL._AA160_.jpg, STACKING TIME = 0, PARSING TIME = 2

I 2015/07/21 11:44:25 HostQueue forcing crawl-delay of 286 milliseconds for www.newegg.com: minimumDelta = 500, flux = 0, host.average = 5728, robots.delay = 0, ((waitig = 2864) - (timeSinceLastAccess = 2578)) = 286

I 2015/07/21 11:44:25 REJECTED http://ad.doubleclick.net/ad/tigerdirect.com/CAT_300x250;sz=300x250;page=powerprotection;ord=5337880253791? - denied by robots.txt

I 2015/07/21 11:44:25 HostBalancer (re-)initialized the round-robin queue; 5 hosts.

I 2015/07/21 11:44:25 HostQueue forcing crawl-delay of 278 milliseconds for ecx.images-amazon.com: minimumDelta = 500, flux = 0, host.average = 1259, robots.delay = 0, ((waitig = 629) - (timeSinceLastAccess = 351)) = 278

I 2015/07/21 11:44:25 REJECTED http://www.amazon.com/gp/voting/cast/Reviews/2115/R29SZMPGJI5S48/Helpful/1?ie=UTF8&target=aHR0cDovL3d3dy5hbWF6b24uY29tL2dwL3Byb2R1Y3QvQjAwTTU1QzBOUw&token=3F8618568B7E7E8870C435B0E16257BD90C0B89A&voteAnchorName=R29SZMPGJI5S48.2115.Helpful.Reviews&voteSessi= - denied by robots.txt

I 2015/07/21 11:44:24 REJECTED http://images.highspeedbackbone.net/td/masthead_v2/masthead-nav-vert_v3a.png);height: - url does not match must-match filter (\.(jpg|jpeg|gif|giff|png|tif|tiff)$)|(.*\bamazon.com(/.*)?)|(.*\bbestbuy.com(/.*)?)|(.*\bfutureshop.ca(/.*)?)|(.*\bnewegg.com(/.*)?)|(.*\btigerdirect.com(/.*)?)

I 2015/07/21 11:44:24 REJECTED http://images.highspeedbackbone.net/rightnav/liveHelpIcon160_off_v2.jpg - url does not match must-match filter (\.(jpg|jpeg|gif|giff|png|tif|tiff)$)|(.*\bamazon.com(/.*)?)|(.*\bbestbuy.com(/.*)?)|(.*\bfutureshop.ca(/.*)?)|(.*\bnewegg.com(/.*)?)|(.*\btigerdirect.com(/.*)?)

I 2015/07/21 11:44:24 REJECTED http://www.misco.se/ - url does not match must-match filter (\.(jpg|jpeg|gif|giff|png|tif|tiff)$)|(.*\bamazon.com(/.*)?)|(.*\bbestbuy.com(/.*)?)|(.*\bfutureshop.ca(/.*)?)|(.*\bnewegg.com(/.*)?)|(.*\btigerdirect.com(/.*)?)

I 2015/07/21 11:44:24 REJECTED http://images.highspeedbackbone.net/campaigns/homeautomation/HomeNav_ad3.jpg - url does not match must-match filter (\.(jpg|jpeg|gif|giff|png|tif|tiff)$)|(.*\bamazon.com(/.*)?)|(.*\bbestbuy.com(/.*)?)|(.*\bfutureshop.ca(/.*)?)|(.*\bnewegg.com(/.*)?)|(.*\btigerdirect.com(/.*)?)

I 2015/07/21 11:44:24 REJECTED http://www.misco.be/ - url does not match must-match filter (\.(jpg|jpeg|gif|giff|png|tif|tiff)$)|(.*\bamazon.com(/.*)?)|(.*\bbestbuy.com(/.*)?)|(.*\bfutureshop.ca(/.*)?)|(.*\bnewegg.com(/.*)?)|(.*\btigerdirect.com(/.*)?)

I 2015/07/21 11:44:24 REJECTED http://www.twitter.com/tigerdirect - url does not match must-match filter (\.(jpg|jpeg|gif|giff|png|tif|tiff)$)|(.*\bamazon.com(/.*)?)|(.*\bbestbuy.com(/.*)?)|(.*\bfutureshop.ca(/.*)?)|(.*\bnewegg.com(/.*)?)|(.*\btigerdirect.com(/.*)?)

I 2015/07/21 11:44:24 REJECTED http://images.highspeedbackbone.net/skuimages/medium/Etilize-H24-A893447.jpg - url does not match must-match filter (\.(jpg|jpeg|gif|giff|png|tif|tiff)$)|(.*\bamazon.com(/.*)?)|(.*\bbestbuy.com(/.*)?)|(.*\bfutureshop.ca(/.*)?)|(.*\bnewegg.com(/.*)?)|(.*\btigerdirect.com(/.*)?)

I 2015/07/21 11:44:24 REJECTED http://www.youtube.com/tigerdirectblog - url does not match must-match filter (\.(jpg|jpeg|gif|giff|png|tif|tiff)$)|(.*\bamazon.com(/.*)?)|(.*\bbestbuy.com(/.*)?)|(.*\bfutureshop.ca(/.*)?)|(.*\bnewegg.com(/.*)?)|(.*\btigerdirect.com(/.*)?)

I 2015/07/21 11:44:24 REJECTED http://images.highspeedbackbone.net/td/masthead_v2/promo/Luggage.jpg - url does not match must-match filter (\.(jpg|jpeg|gif|giff|png|tif|tiff)$)|(.*\bamazon.com(/.*)?)|(.*\bbestbuy.com(/.*)?)|(.*\bfutureshop.ca(/.*)?)|(.*\bnewegg.com(/.*)?)|(.*\btigerdirect.com(/.*)?)

I 2015/07/21 11:44:24 REJECTED http://images.highspeedbackbone.net/td/footer/logos-mc.png - url does not match must-match filter (\.(jpg|jpeg|gif|giff|png|tif|tiff)$)|(.*\bamazon.com(/.*)?)|(.*\bbestbuy.com(/.*)?)|(.*\bfutureshop.ca(/.*)?)|(.*\bnewegg.com(/.*)?)|(.*\btigerdirect.com(/.*)?)

I 2015/07/21 11:44:24 REJECTED http://images.highspeedbackbone.net/td/footer/logos-visa.png - url does not match must-match filter (\.(jpg|jpeg|gif|giff|png|tif|tiff)$)|(.*\bamazon.com(/.*)?)|(.*\bbestbuy.com(/.*)?)|(.*\bfutureshop.ca(/.*)?)|(.*\bnewegg.com(/.*)?)|(.*\btigerdirect.com(/.*)?)

I 2015/07/21 11:44:24 REJECTED http://www.misco.pt/ - url does not match must-match filter (\.(jpg|jpeg|gif|giff|png|tif|tiff)$)|(.*\bamazon.com(/.*)?)|(.*\bbestbuy.com(/.*)?)|(.*\bfutureshop.ca(/.*)?)|(.*\bnewegg.com(/.*)?)|(.*\btigerdirect.com(/.*)?)

I 2015/07/21 11:44:24 REJECTED http://www.misco.co.uk/ - url does not match must-match filter (\.(jpg|jpeg|gif|giff|png|tif|tiff)$)|(.*\bamazon.com(/.*)?)|(.*\bbestbuy.com(/.*)?)|(.*\bfutureshop.ca(/.*)?)|(.*\bnewegg.com(/.*)?)|(.*\btigerdirect.com(/.*)?)

I 2015/07/21 11:44:24 REJECTED http://www.misco.ie/ - url does not match must-match filter (\.(jpg|jpeg|gif|giff|png|tif|tiff)$)|(.*\bamazon.com(/.*)?)|(.*\bbestbuy.com(/.*)?)|(.*\bfutureshop.ca(/.*)?)|(.*\bnewegg.com(/.*)?)|(.*\btigerdirect.com(/.*)?)

I 2015/07/21 11:44:24 REJECTED http://images.highspeedbackbone.net/td/button-slc-addtocart.gif - url does not match must-match filter (\.(jpg|jpeg|gif|giff|png|tif|tiff)$)|(.*\bamazon.com(/.*)?)|(.*\bbestbuy.com(/.*)?)|(.*\bfutureshop.ca(/.*)?)|(.*\bnewegg.com(/.*)?)|(.*\btigerdirect.com(/.*)?)

I 2015/07/21 11:44:24 REJECTED http://images.highspeedbackbone.net/td/footer/trustwave_logo.gif - url does not match must-match filter (\.(jpg|jpeg|gif|giff|png|tif|tiff)$)|(.*\bamazon.com(/.*)?)|(.*\bbestbuy.com(/.*)?)|(.*\bfutureshop.ca(/.*)?)|(.*\bnewegg.com(/.*)?)|(.*\btigerdirect.com(/.*)?)

I 2015/07/21 11:44:24 REJECTED http://images.highspeedbackbone.net/favicon.ico - url does not match must-match filter (\.(jpg|jpeg|gif|giff|png|tif|tiff)$)|(.*\bamazon.com(/.*)?)|(.*\bbestbuy.com(/.*)?)|(.*\bfutureshop.ca(/.*)?)|(.*\bnewegg.com(/.*)?)|(.*\btigerdirect.com(/.*)?)

I 2015/07/21 11:44:24 REJECTED http://images.highspeedbackbone.net/td/masthead_v2/mastNav-icon-new.png) - url does not match must-match filter (\.(jpg|jpeg|gif|giff|png|tif|tiff)$)|(.*\bamazon.com(/.*)?)|(.*\bbestbuy.com(/.*)?)|(.*\bfutureshop.ca(/.*)?)|(.*\bnewegg.com(/.*)?)|(.*\btigerdirect.com(/.*)?)

I 2015/07/21 11:44:24 REJECTED http://images.highspeedbackbone.net/skuimages/medium/CNET-LBQ-103006876.jpg - url does not match must-match filter (\.(jpg|jpeg|gif|giff|png|tif|tiff)$)|(.*\bamazon.com(/.*)?)|(.*\bbestbuy.com(/.*)?)|(.*\bfutureshop.ca(/.*)?)|(.*\bnewegg.com(/.*)?)|(.*\btigerdirect.com(/.*)?)

I 2015/07/21 11:44:24 REJECTED http://images.highspeedbackbone.net/td/masthead_v2/promo/deals-gifts-dealslasher.jpg - url does not match must-match filter (\.(jpg|jpeg|gif|giff|png|tif|tiff)$)|(.*\bamazon.com(/.*)?)|(.*\bbestbuy.com(/.*)?)|(.*\bfutureshop.ca(/.*)?)|(.*\bnewegg.com(/.*)?)|(.*\btigerdirect.com(/.*)?)

I 2015/07/21 11:44:24 REJECTED http://images.highspeedbackbone.net/skuimages/medium/CNET-LBQ-103002576.jpg - url does not match must-match filter (\.(jpg|jpeg|gif|giff|png|tif|tiff)$)|(.*\bamazon.com(/.*)?)|(.*\bbestbuy.com(/.*)?)|(.*\bfutureshop.ca(/.*)?)|(.*\bnewegg.com(/.*)?)|(.*\btigerdirect.com(/.*)?)

I 2015/07/21 11:44:24 REJECTED https://sealserver.trustkeeper.net/compliance/cert.php?code=ea97a8b6d8d755f41b78d04aa242d7f1&style=normal&size=105x54&language=en - url does not match must-match filter (\.(jpg|jpeg|gif|giff|png|tif|tiff)$)|(.*\bamazon.com(/.*)?)|(.*\bbestbuy.com(/.*)?)|(.*\bfutureshop.ca(/.*)?)|(.*\bnewegg.com(/.*)?)|(.*\btigerdirect.com(/.*)?)

I 2015/07/21 11:44:24 REJECTED http://images.highspeedbackbone.net/tlc/BLUnavBanner.jpg - url does not match must-match filter (\.(jpg|jpeg|gif|giff|png|tif|tiff)$)|(.*\bamazon.com(/.*)?)|(.*\bbestbuy.com(/.*)?)|(.*\bfutureshop.ca(/.*)?)|(.*\bnewegg.com(/.*)?)|(.*\btigerdirect.com(/.*)?)

I 2015/07/21 11:44:24 REJECTED http://www.misco.it/ - url does not match must-match filter (\.(jpg|jpeg|gif|giff|png|tif|tiff)$)|(.*\bamazon.com(/.*)?)|(.*\bbestbuy.com(/.*)?)|(.*\bfutureshop.ca(/.*)?)|(.*\bnewegg.com(/.*)?)|(.*\btigerdirect.com(/.*)?)

I 2015/07/21 11:44:24 REJECTED http://www.misco.nl/ - url does not match must-match filter (\.(jpg|jpeg|gif|giff|png|tif|tiff)$)|(.*\bamazon.com(/.*)?)|(.*\bbestbuy.com(/.*)?)|(.*\bfutureshop.ca(/.*)?)|(.*\bnewegg.com(/.*)?)|(.*\btigerdirect.com(/.*)?)

I 2015/07/21 11:44:24 REJECTED http://images.highspeedbackbone.net/td/masthead_v2/promo/gamingReloadedAccessories.jpg - url does not match must-match filter (\.(jpg|jpeg|gif|giff|png|tif|tiff)$)|(.*\bamazon.com(/.*)?)|(.*\bbestbuy.com(/.*)?)|(.*\bfutureshop.ca(/.*)?)|(.*\bnewegg.com(/.*)?)|(.*\btigerdirect.com(/.*)?)

I 2015/07/21 11:44:24 REJECTED http://images.highspeedbackbone.net/td/masthead_v2/promo/gamingReloaded.jpg - url does not match must-match filter (\.(jpg|jpeg|gif|giff|png|tif|tiff)$)|(.*\bamazon.com(/.*)?)|(.*\bbestbuy.com(/.*)?)|(.*\bfutureshop.ca(/.*)?)|(.*\bnewegg.com(/.*)?)|(.*\btigerdirect.com(/.*)?)

I 2015/07/21 11:44:24 REJECTED http://images.highspeedbackbone.net/td/masthead_v2/promo/Jewelry.jpg - url does not match must-match filter (\.(jpg|jpeg|gif|giff|png|tif|tiff)$)|(.*\bamazon.com(/.*)?)|(.*\bbestbuy.com(/.*)?)|(.*\bfutureshop.ca(/.*)?)|(.*\bnewegg.com(/.*)?)|(.*\btigerdirect.com(/.*)?)

I 2015/07/21 11:44:24 REJECTED http://images.highspeedbackbone.net/td/footer/systemax.jpg - url does not match must-match filter (\.(jpg|jpeg|gif|giff|png|tif|tiff)$)|(.*\bamazon.com(/.*)?)|(.*\bbestbuy.com(/.*)?)|(.*\bfutureshop.ca(/.*)?)|(.*\bnewegg.com(/.*)?)|(.*\btigerdirect.com(/.*)?)

I 2015/07/21 11:44:24 REJECTED http://images.highspeedbackbone.net/td/masthead_v2/mastNav-link-arrow.png) - url does not match must-match filter (\.(jpg|jpeg|gif|giff|png|tif|tiff)$)|(.*\bamazon.com(/.*)?)|(.*\bbestbuy.com(/.*)?)|(.*\bfutureshop.ca(/.*)?)|(.*\bnewegg.com(/.*)?)|(.*\btigerdirect.com(/.*)?)

I 2015/07/21 11:44:24 REJECTED http://images.highspeedbackbone.net/td/masthead_v2/promo/shopGPS_nav.jpg - url does not match must-match filter (\.(jpg|jpeg|gif|giff|png|tif|tiff)$)|(.*\bamazon.com(/.*)?)|(.*\bbestbuy.com(/.*)?)|(.*\bfutureshop.ca(/.*)?)|(.*\bnewegg.com(/.*)?)|(.*\btigerdirect.com(/.*)?)

I 2015/07/21 11:44:24 REJECTED https://plus.google.com/114822625291786269495 - url does not match must-match filter (\.(jpg|jpeg|gif|giff|png|tif|tiff)$)|(.*\bamazon.com(/.*)?)|(.*\bbestbuy.com(/.*)?)|(.*\bfutureshop.ca(/.*)?)|(.*\bnewegg.com(/.*)?)|(.*\btigerdirect.com(/.*)?)

I 2015/07/21 11:44:24 REJECTED http://images.highspeedbackbone.net/skuimages/medium/CNET-YYI1-BU3743.jpg - url does not match must-match filter (\.(jpg|jpeg|gif|giff|png|tif|tiff)$)|(.*\bamazon.com(/.*)?)|(.*\bbestbuy.com(/.*)?)|(.*\bfutureshop.ca(/.*)?)|(.*\bnewegg.com(/.*)?)|(.*\btigerdirect.com(/.*)?)

I 2015/07/21 11:44:24 REJECTED http://images.highspeedbackbone.net/campaigns/misc/pcComponentBundles.jpg - url does not match must-match filter (\.(jpg|jpeg|gif|giff|png|tif|tiff)$)|(.*\bamazon.com(/.*)?)|(.*\bbestbuy.com(/.*)?)|(.*\bfutureshop.ca(/.*)?)|(.*\bnewegg.com(/.*)?)|(.*\btigerdirect.com(/.*)?)

I 2015/07/21 11:44:24 REJECTED http://www.misco.es/ - url does not match must-match filter (\.(jpg|jpeg|gif|giff|png|tif|tiff)$)|(.*\bamazon.com(/.*)?)|(.*\bbestbuy.com(/.*)?)|(.*\bfutureshop.ca(/.*)?)|(.*\bnewegg.com(/.*)?)|(.*\btigerdirect.com(/.*)?)

I 2015/07/21 11:44:24 REJECTED http://www.systemax.com/ - url does not match must-match filter (\.(jpg|jpeg|gif|giff|png|tif|tiff)$)|(.*\bamazon.com(/.*)?)|(.*\bbestbuy.com(/.*)?)|(.*\bfutureshop.ca(/.*)?)|(.*\bnewegg.com(/.*)?)|(.*\btigerdirect.com(/.*)?)

I 2015/07/21 11:44:24 REJECTED http://tigerdirect.applicantpro.com/pages/careershome/ - url does not match must-match filter (\.(jpg|jpeg|gif|giff|png|tif|tiff)$)|(.*\bamazon.com(/.*)?)|(.*\bbestbuy.com(/.*)?)|(.*\bfutureshop.ca(/.*)?)|(.*\bnewegg.com(/.*)?)|(.*\btigerdirect.com(/.*)?)

I 2015/07/21 11:44:24 REJECTED http://direct.digitallanding.com/?PromoID=5009008 - url does not match must-match filter (\.(jpg|jpeg|gif|giff|png|tif|tiff)$)|(.*\bamazon.com(/.*)?)|(.*\bbestbuy.com(/.*)?)|(.*\bfutureshop.ca(/.*)?)|(.*\bnewegg.com(/.*)?)|(.*\btigerdirect.com(/.*)?)

I 2015/07/21 11:44:24 REJECTED http://images.highspeedbackbone.net/td/masthead_v2/masthead-innercircle.png - url does not match must-match filter (\.(jpg|jpeg|gif|giff|png|tif|tiff)$)|(.*\bamazon.com(/.*)?)|(.*\bbestbuy.com(/.*)?)|(.*\bfutureshop.ca(/.*)?)|(.*\bnewegg.com(/.*)?)|(.*\btigerdirect.com(/.*)?)

I 2015/07/21 11:44:24 REJECTED http://images.highspeedbackbone.net/footer/hp-supplies-medallion.gif - url does not match must-match filter (\.(jpg|jpeg|gif|giff|png|tif|tiff)$)|(.*\bamazon.com(/.*)?)|(.*\bbestbuy.com(/.*)?)|(.*\bfutureshop.ca(/.*)?)|(.*\bnewegg.com(/.*)?)|(.*\btigerdirect.com(/.*)?)

I 2015/07/21 11:44:24 REJECTED http://images.highspeedbackbone.net/td/footer/logos-bbb-new.png - url does not match must-match filter (\.(jpg|jpeg|gif|giff|png|tif|tiff)$)|(.*\bamazon.com(/.*)?)|(.*\bbestbuy.com(/.*)?)|(.*\bfutureshop.ca(/.*)?)|(.*\bnewegg.com(/.*)?)|(.*\btigerdirect.com(/.*)?)

I 2015/07/21 11:44:24 REJECTED http://www.tigerdirect.com/applications/category/guidedSearch.asp?CatId=20&sel=Detail;364_1812_88548_88548 - denied by document-attached noindexing rule

I 2015/07/21 11:44:24 SWITCHBOARD Not Condensed Resource 'http://www.tigerdirect.com/applications/category/guidedSearch.asp?CatId=20&sel=Detail;364_1812_88548_88548': denied by document-attached noindexing rule

I 2015/07/21 11:44:24 SWITCHBOARD CRAWL: ADDED 562 LINKS FROM http://www.tigerdirect.com/applications/category/guidedSearch.asp?CatId=20&sel=Detail;364_1812_88548_88548, STACKING TIME = 20, PARSING TIME = 44

I 2015/07/21 11:44:24 REJECTED http://images.highspeedbackbone.net/masthead/txtMobile.gif - url does not match must-match filter (\.(jpg|jpeg|gif|giff|png|tif|tiff)$)|(.*\bamazon.com(/.*)?)|(.*\bbestbuy.com(/.*)?)|(.*\bfutureshop.ca(/.*)?)|(.*\bnewegg.com(/.*)?)|(.*\btigerdirect.com(/.*)?)

I 2015/07/21 11:44:24 REJECTED http://images.highspeedbackbone.net/td/masthead_v2/promo/efitness.jpg - url does not match must-match filter (\.(jpg|jpeg|gif|giff|png|tif|tiff)$)|(.*\bamazon.com(/.*)?)|(.*\bbestbuy.com(/.*)?)|(.*\bfutureshop.ca(/.*)?)|(.*\bnewegg.com(/.*)?)|(.*\btigerdirect.com(/.*)?)

I 2015/07/21 11:44:24 REJECTED https://trustsealinfo.websecurity.norton.com/splash?form_file=fdf/splash.fdf&dn=www.tigerdirect.com&lang=en - url does not match must-match filter (\.(jpg|jpeg|gif|giff|png|tif|tiff)$)|(.*\bamazon.com(/.*)?)|(.*\bbestbuy.com(/.*)?)|(.*\bfutureshop.ca(/.*)?)|(.*\bnewegg.com(/.*)?)|(.*\btigerdirect.com(/.*)?)

I 2015/07/21 11:44:24 REJECTED http://www.bbb.org/south-east-florida/business-reviews/general-merchandise-retail-by-internet/tigerdirect-in-miami-fl-27000500 - url does not match must-match filter (\.(jpg|jpeg|gif|giff|png|tif|tiff)$)|(.*\bamazon.com(/.*)?)|(.*\bbestbuy.com(/.*)?)|(.*\bfutureshop.ca(/.*)?)|(.*\bnewegg.com(/.*)?)|(.*\btigerdirect.com(/.*)?)

I 2015/07/21 11:44:24 REJECTED http://www.misco.at/ - url does not match must-match filter (\.(jpg|jpeg|gif|giff|png|tif|tiff)$)|(.*\bamazon.com(/.*)?)|(.*\bbestbuy.com(/.*)?)|(.*\bfutureshop.ca(/.*)?)|(.*\bnewegg.com(/.*)?)|(.*\btigerdirect.com(/.*)?)

I 2015/07/21 11:44:24 REJECTED http://www.misco.de/ - url does not match must-match filter (\.(jpg|jpeg|gif|giff|png|tif|tiff)$)|(.*\bamazon.com(/.*)?)|(.*\bbestbuy.com(/.*)?)|(.*\bfutureshop.ca(/.*)?)|(.*\bnewegg.com(/.*)?)|(.*\btigerdirect.com(/.*)?)

I 2015/07/21 11:44:24 REJECTED http://www.facebook.com/TigerDirect - url does not match must-match filter (\.(jpg|jpeg|gif|giff|png|tif|tiff)$)|(.*\bamazon.com(/.*)?)|(.*\bbestbuy.com(/.*)?)|(.*\bfutureshop.ca(/.*)?)|(.*\bnewegg.com(/.*)?)|(.*\btigerdirect.com(/.*)?)

I 2015/07/21 11:44:24 REJECTED http://images.highspeedbackbone.net/skuimages/medium/CNET-LBQ-102972503.jpg - url does not match must-match filter (\.(jpg|jpeg|gif|giff|png|tif|tiff)$)|(.*\bamazon.com(/.*)?)|(.*\bbestbuy.com(/.*)?)|(.*\bfutureshop.ca(/.*)?)|(.*\bnewegg.com(/.*)?)|(.*\btigerdirect.com(/.*)?)

I 2015/07/21 11:44:24 REJECTED http://syx.client.shareholder.com/releases.cfm - url does not match must-match filter (\.(jpg|jpeg|gif|giff|png|tif|tiff)$)|(.*\bamazon.com(/.*)?)|(.*\bbestbuy.com(/.*)?)|(.*\bfutureshop.ca(/.*)?)|(.*\bnewegg.com(/.*)?)|(.*\btigerdirect.com(/.*)?)

I 2015/07/21 11:44:24 REJECTED http://www.misco.fr/ - url does not match must-match filter (\.(jpg|jpeg|gif|giff|png|tif|tiff)$)|(.*\bamazon.com(/.*)?)|(.*\bbestbuy.com(/.*)?)|(.*\bfutureshop.ca(/.*)?)|(.*\bnewegg.com(/.*)?)|(.*\btigerdirect.com(/.*)?)

I 2015/07/21 11:44:24 REJECTED http://images.highspeedbackbone.net/td/masthead_v2/promo/nav-email-group.jpg - url does not match must-match filter (\.(jpg|jpeg|gif|giff|png|tif|tiff)$)|(.*\bamazon.com(/.*)?)|(.*\bbestbuy.com(/.*)?)|(.*\bfutureshop.ca(/.*)?)|(.*\bnewegg.com(/.*)?)|(.*\btigerdirect.com(/.*)?)

I 2015/07/21 11:44:24 REJECTED http://images.highspeedbackbone.net/td/footer/seal.gif - url does not match must-match filter (\.(jpg|jpeg|gif|giff|png|tif|tiff)$)|(.*\bamazon.com(/.*)?)|(.*\bbestbuy.com(/.*)?)|(.*\bfutureshop.ca(/.*)?)|(.*\bnewegg.com(/.*)?)|(.*\btigerdirect.com(/.*)?)

I 2015/07/21 11:44:24 REJECTED http://images.highspeedbackbone.net/td/masthead_v2/promo/shopToys2_nav.jpg - url does not match must-match filter (\.(jpg|jpeg|gif|giff|png|tif|tiff)$)|(.*\bamazon.com(/.*)?)|(.*\bbestbuy.com(/.*)?)|(.*\bfutureshop.ca(/.*)?)|(.*\bnewegg.com(/.*)?)|(.*\btigerdirect.com(/.*)?)

I 2015/07/21 11:44:24 REJECTED http://images.highspeedbackbone.net/td/masthead/masthead-bg_HPElite.jpg - url does not match must-match filter (\.(jpg|jpeg|gif|giff|png|tif|tiff)$)|(.*\bamazon.com(/.*)?)|(.*\bbestbuy.com(/.*)?)|(.*\bfutureshop.ca(/.*)?)|(.*\bnewegg.com(/.*)?)|(.*\btigerdirect.com(/.*)?)

I 2015/07/21 11:44:24 REJECTED http://www.misco.ch/ - url does not match must-match filter (\.(jpg|jpeg|gif|giff|png|tif|tiff)$)|(.*\bamazon.com(/.*)?)|(.*\bbestbuy.com(/.*)?)|(.*\bfutureshop.ca(/.*)?)|(.*\bnewegg.com(/.*)?)|(.*\btigerdirect.com(/.*)?)

I 2015/07/21 11:44:24 REJECTED http://images.highspeedbackbone.net/td/masthead_v2/promo/office-supplies.jpg - url does not match must-match filter (\.(jpg|jpeg|gif|giff|png|tif|tiff)$)|(.*\bamazon.com(/.*)?)|(.*\bbestbuy.com(/.*)?)|(.*\bfutureshop.ca(/.*)?)|(.*\bnewegg.com(/.*)?)|(.*\btigerdirect.com(/.*)?)

I 2015/07/21 11:44:24 REJECTED http://images.highspeedbackbone.net/pixel-clear.gif - url does not match must-match filter (\.(jpg|jpeg|gif|giff|png|tif|tiff)$)|(.*\bamazon.com(/.*)?)|(.*\bbestbuy.com(/.*)?)|(.*\bfutureshop.ca(/.*)?)|(.*\bnewegg.com(/.*)?)|(.*\btigerdirect.com(/.*)?)

I 2015/07/21 11:44:24 REJECTED http://images.highspeedbackbone.net/td/masthead_v2/shoplinks_v3.png);} - url does not match must-match filter (\.(jpg|jpeg|gif|giff|png|tif|tiff)$)|(.*\bamazon.com(/.*)?)|(.*\bbestbuy.com(/.*)?)|(.*\bfutureshop.ca(/.*)?)|(.*\bnewegg.com(/.*)?)|(.*\btigerdirect.com(/.*)?)

I 2015/07/21 11:44:24 REJECTED http://images.highspeedbackbone.net/td/loading.gif - url does not match must-match filter (\.(jpg|jpeg|gif|giff|png|tif|tiff)$)|(.*\bamazon.com(/.*)?)|(.*\bbestbuy.com(/.*)?)|(.*\bfutureshop.ca(/.*)?)|(.*\bnewegg.com(/.*)?)|(.*\btigerdirect.com(/.*)?)

I 2015/07/21 11:44:24 REJECTED http://images.highspeedbackbone.net/td/masthead_v2/mastNav-sub-bg-left.png) - url does not match must-match filter (\.(jpg|jpeg|gif|giff|png|tif|tiff)$)|(.*\bamazon.com(/.*)?)|(.*\bbestbuy.com(/.*)?)|(.*\bfutureshop.ca(/.*)?)|(.*\bnewegg.com(/.*)?)|(.*\btigerdirect.com(/.*)?)

I 2015/07/21 11:44:24 REJECTED http://www.tigerdirect.ca/ - url does not match must-match filter (\.(jpg|jpeg|gif|giff|png|tif|tiff)$)|(.*\bamazon.com(/.*)?)|(.*\bbestbuy.com(/.*)?)|(.*\bfutureshop.ca(/.*)?)|(.*\bnewegg.com(/.*)?)|(.*\btigerdirect.com(/.*)?)

I 2015/07/21 11:44:24 REJECTED http://images.highspeedbackbone.net/td/masthead_v2/promo/wholesale_products.jpg - url does not match must-match filter (\.(jpg|jpeg|gif|giff|png|tif|tiff)$)|(.*\bamazon.com(/.*)?)|(.*\bbestbuy.com(/.*)?)|(.*\bfutureshop.ca(/.*)?)|(.*\bnewegg.com(/.*)?)|(.*\btigerdirect.com(/.*)?)
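
A side note on the REJECTED lines above: image URLs ending in .jpg, .gif, or .png are rejected even though the first alternative of the must-match filter names those extensions. That is consistent with the filter being applied to the whole URL (a full match) rather than searched for inside it, though this is an inference from the log and not something confirmed here. The following minimal Python sketch reproduces the behaviour under that assumption. The pattern alternatives are copied from the log; `passes_filter` and the `MUST_MATCH_VARIANT` pattern are hypothetical names introduced only for illustration and are not part of YaCy.

```python
import re

# Alternatives copied verbatim from the must-match filter shown in the log.
FILTER_ALTERNATIVES = [
    r"(\.(jpg|jpeg|gif|giff|png|tif|tiff)$)",
    r"(.*\bamazon.com(/.*)?)",
    r"(.*\bbestbuy.com(/.*)?)",
    r"(.*\bfutureshop.ca(/.*)?)",
    r"(.*\bnewegg.com(/.*)?)",
    r"(.*\btigerdirect.com(/.*)?)",
]
MUST_MATCH = "|".join(FILTER_ALTERNATIVES)

# Hypothetical variant (NOT from the log): the image alternative gets a
# leading ".*" so that it can cover a complete URL under full-match semantics.
IMAGE_ALTERNATIVE_FULL = r"(.*\.(jpg|jpeg|gif|giff|png|tif|tiff))"
MUST_MATCH_VARIANT = "|".join([IMAGE_ALTERNATIVE_FULL] + FILTER_ALTERNATIVES[1:])


def passes_filter(url, pattern=MUST_MATCH):
    """Return True if the entire URL matches the pattern.

    Full-match semantics are an assumption inferred from the log output.
    """
    return re.fullmatch(pattern, url) is not None


if __name__ == "__main__":
    image_url = "http://images.highspeedbackbone.net/td/loading.gif"
    shop_url = (
        "http://www.tigerdirect.com/applications/category/"
        "guidedSearch.asp?CatId=20&sel=Detail;364_1812_88548_88548"
    )
    print("image, original filter:", passes_filter(image_url))
    print("shop page, original filter:", passes_filter(shop_url))
    print("image, substring search:", re.search(MUST_MATCH, image_url) is not None)
    print("image, variant filter:", passes_filter(image_url, MUST_MATCH_VARIANT))
```

Under a full match, the first alternative can only succeed for a URL that begins with the dot of the extension, so no real image URL passes it. Whether images were meant to pass the filter at all is a separate question; the variant only shows what the pattern would need to look like if they were.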



Attachment: overview.jpg (94.2 KiB), Crawler overview

Attachment: graph.jpg (32.32 KiB), Crawler graphic

Attachment: atop.gif (240.78 KiB), GNU atop


Can you explain what I am facing? Reading the forum, I see that other members have experienced some kind of index corruption in the past. Is YaCy itself responsible for this data loss?
I can safely exclude hardware corruption: ECC RAM, no UDMA CRC errors or other SMART errors on the disks, and overall proven hardware stability.
davide
 
Posts: 84
Joined: Fri Feb 15, 2013 8:03 am
