API to change Process Scheduler

Post by DNcrawler » Thu Jan 05, 2017 9:24 pm

Hi,

I'm crawling an intranet of thousands of websites. My process scheduler currently lists 433 entries. I need to change them to daily execution. I can't find a way to change all existing jobs to 1 day, from the default of 7. Any ideas?

I haven't wanted to go directly to the table to mess with the data yet.

In the web interface, this page is the list: http://localhost:8090/Table_API_p.html

Thank you.
DNcrawler
 

Re: API to change Process Scheduler

Post by luc » Fri Jan 06, 2017 2:36 pm

Hi DNcrawler,
Unfortunately, as far as I understand the current code behind Table_API_p.html and Tables_p.html, only one line can be edited at a time, and the parameters need to be passed as HTTP POST data.

You could possibly edit the api.bheap file manually, but you might easily corrupt it...

To my mind, your best option would be either to modify the Table_API_p.java code or to write a script that requests each desired table line edit through the HTTP API in a loop...
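The scripted loop could look roughly like the sketch below. It is only an illustration under assumptions: the form field names (`repeat_time_<pk>`, `repeat_unit_<pk>`) are placeholders that have not been checked against Table_API_p.java, the row keys have to be collected from the table page first, and a real instance will probably also require admin authentication. By default the function only builds the requests and sends nothing.

```python
import urllib.parse
import urllib.request

# Page named in this thread; adjust host and port to your instance.
BASE_URL = "http://localhost:8090/Table_API_p.html"

# ASSUMPTION: placeholder field names, NOT confirmed against YaCy.
# Inspect the form on Table_API_p.html (or Table_API_p.java) and adjust.
FIELD_TEMPLATES = {
    "repeat_time": "repeat_time_{pk}",
    "repeat_unit": "repeat_unit_{pk}",
}


def build_payload(pk, count=1, unit="days"):
    """Return the form fields for one scheduler row (hypothetical names)."""
    return {
        FIELD_TEMPLATES["repeat_time"].format(pk=pk): str(count),
        FIELD_TEMPLATES["repeat_unit"].format(pk=pk): unit,
    }


def build_request(pk, count=1, unit="days", url=BASE_URL):
    """Build one HTTP POST request that edits a single table line."""
    data = urllib.parse.urlencode(build_payload(pk, count, unit)).encode("ascii")
    return urllib.request.Request(url, data=data, method="POST")


def update_all(pks, count=1, unit="days", dry_run=True):
    """Build one POST per row key; send them only when dry_run is False."""
    requests = [build_request(pk, count, unit) for pk in pks]
    if not dry_run:
        for req in requests:
            with urllib.request.urlopen(req, timeout=30) as resp:
                resp.read()  # response body is ignored in this sketch
    return requests
```

Calling `update_all(["key1", "key2"])` with made-up row keys returns the prepared requests, so the encoded POST data can be compared with what the browser sends for a manual edit before anything is submitted with `dry_run=False`.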

Have a nice day
luc
 

Re: API to change Process Scheduler

Post by luc » Fri Jan 06, 2017 6:23 pm

Alternatively, if all your websites are crawled with the same crawl settings, you could proceed differently: write the website URLs to a file and use that file as the crawl start point ("From file" option in /CrawlStartExpert.html), or paste the list directly into the "list of URLs" text area of /CrawlStartExpert.html. You would then be able to control the scheduled frequency for this crawl in one step.
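Preparing such a start file can be scripted. The sketch below assumes a plain text file with one URL per line is acceptable for the "From file" option, which is worth verifying on your instance; the function name and the default file name are made up for illustration.

```python
from pathlib import Path
from urllib.parse import urlparse


def write_start_file(urls, path="crawl_start_urls.txt"):
    """Write unique http(s) URLs, one per line; return how many were written."""
    seen = []
    for raw in urls:
        url = raw.strip()
        # Skip non-web schemes and duplicates, keep the original order.
        if urlparse(url).scheme in ("http", "https") and url not in seen:
            seen.append(url)
    Path(path).write_text("\n".join(seen) + "\n", encoding="utf-8")
    return len(seen)
```

For example, `write_start_file(["http://wiki.intranet.example/", "http://docs.intranet.example/"])` (hypothetical host names) writes both URLs to `crawl_start_urls.txt`. The content of that file can also be pasted into the "list of URLs" text area instead.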
luc
 

Re: API to change Process Scheduler

Post by DNcrawler » Fri Jan 06, 2017 8:35 pm

Thank you for the responses. This confirms what I have found on my own. I'm not looking forward to editing a bheap file, so I decided against it. The 433 entries are a mix of single sites and multiple sites. I'll just update them all manually for now.

Thanks.
DNcrawler
 

Re: API to change Process Scheduler

Post by DNcrawler » Sun Apr 09, 2017 12:59 am

Hi,

I can't find documentation on the autocrawl settings in yacy.init, https://github.com/yacy/yacy_search_ser ... .init#L547

If it does what its name implies, all submitted sites would be crawled every day. Does that seem correct?

Thank you.
DNcrawler
 

