FTP crawl depth

Post by luc » Sat Nov 26, 2016 9:52 am

Hi everyone,
Currently, when an FTP URL is in a crawl start list, YaCy adds the file list of the entire FTP repository to the crawl stack, even if the crawl depth parameter is set to zero. Isn't that a bit excessive?
At least when the crawl depth is zero, couldn't we consider adding only the files at the specified path level rather than the whole FTP site?
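
To make the idea concrete, here is a rough sketch of a depth-limited listing. It is not YaCy code: the class `FtpDepthSketch`, the in-memory `SITE` map standing in for a real FTP server, and the method `listFiles` are all hypothetical names made up for illustration. Depth zero returns only the files directly under the start path, and each extra level of depth allows one more level of subdirectories.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class FtpDepthSketch {

    // Hypothetical stand-in for an FTP site: directory path -> entry names.
    // A trailing "/" marks a subdirectory. A real crawler would issue FTP
    // listing commands here instead of reading a map.
    static final Map<String, List<String>> SITE = new TreeMap<>();
    static {
        SITE.put("/", List.of("readme.txt", "pub/"));
        SITE.put("/pub/", List.of("a.iso", "docs/"));
        SITE.put("/pub/docs/", List.of("manual.pdf"));
    }

    /** Returns the files reachable from startDir without descending more than depth levels. */
    public static List<String> listFiles(String startDir, int depth) {
        List<String> out = new ArrayList<>();
        collect(startDir, depth, out);
        return out;
    }

    private static void collect(String dir, int remaining, List<String> out) {
        for (String entry : SITE.getOrDefault(dir, List.of())) {
            if (entry.endsWith("/")) {
                // Only descend while some depth budget is left.
                if (remaining > 0) {
                    collect(dir + entry, remaining - 1, out);
                }
            } else {
                out.add(dir + entry);
            }
        }
    }

    public static void main(String[] args) {
        System.out.println(listFiles("/pub/", 0)); // files directly in /pub/ only
        System.out.println(listFiles("/", 1));     // root plus one level of subdirectories
    }
}
```

With this rule, a start URL pointing at `/pub/` with depth zero would stack only `/pub/a.iso` and leave `/pub/docs/` alone, which is the behavior I am suggesting.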

Best regards
luc
 

Re: FTP crawl depth

Post by reger » Sun Dec 04, 2016 12:49 am

Hi,

luc wrote: couldn't we consider to only add files in the specified path level, and not the whole FTP site?


I haven't tested it, but I agree:
if there is a crawl depth limit, IMHO it should apply to FTP too.
reger
 

Re: FTP crawl depth

Post by luc » Mon Jan 02, 2017 10:30 am

OK, I pushed some changes (commits part 1 and part 2) so that crawls started from FTP URLs behave, I hope, as closely as possible to crawls started from HTTP URLs.
luc
 