How do I know if a scheduled crawl has finished

Post by oneaty » Mon Feb 04, 2013 12:50 pm

Hi,
How do I know if a scheduled crawl has finished?
oneaty
Posts: 66
Registered: Mon Feb 04, 2013 12:47 pm
Location: Rio de Janeiro

Re: How do I know if a scheduled crawl has finished

Post by Orbiter » Mon Feb 04, 2013 1:35 pm

- While a crawl is running, it appears as an entry in /Crawler_p.html; when the crawl finishes, the entry disappears from that page.
- That the scheduler started the crawl can be seen in /Table_API_p.html, where the call count goes up by one.
- Finally, /CrawlProfileEditor_p.html shows an entry with the status "Finished".
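
The three pages above can be collected into a small helper so they are easy to open one after another. This is only a minimal sketch and not part of YaCy: the function name `status_pages` and the base address `http://localhost:8090` are assumptions (use the address of your own peer's web interface); only the three page paths come from the list above.

```python
from urllib.parse import urljoin

# Page paths named in the list above, with what each one tells you.
STATUS_PAGES = {
    "/Crawler_p.html": "lists the crawl while it is running",
    "/Table_API_p.html": "call count rises when the scheduler starts the crawl",
    "/CrawlProfileEditor_p.html": 'shows the profile status, e.g. "Finished"',
}


def status_pages(base_url="http://localhost:8090"):
    """Return (url, meaning) pairs for the pages worth checking.

    base_url is an assumption; replace it with your peer's address.
    """
    return [(urljoin(base_url, path), meaning)
            for path, meaning in STATUS_PAGES.items()]
```

Calling `status_pages()` and printing each pair gives a short checklist of addresses to visit in the browser.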
Orbiter
Posts: 5787
Registered: Tue Jun 26, 2007 10:58 pm
Location: Frankfurt am Main

Re: How do I know if a scheduled crawl has finished

Post by oneaty » Mon Feb 04, 2013 2:22 pm

Thanks, Orbiter.

Well, something may be wrong with my settings or with something else.
Today, I scheduled a crawl which, soon after, showed up in /Table_API_p.html (I'm not sure if this is the page titled "Recorded Actions"; is it?). It showed today's date as Last Exec Date.
What do you mean by "call count" in the /Crawler_p.html page? There are many counts there (please be patient, I'm a dummy).
Also, in /CrawlProfileEditor_p.html there is no entry for that new crawl; in fact, there are no new entries for the other crawls I had previously scheduled, which should have run in the last few days.
Thanks for the prompt reply.

Re: How do I know if a scheduled crawl has finished

Post by Orbiter » Mon Feb 04, 2013 4:51 pm

oneaty wrote:Today, I scheduled a crawl which, soon after, showed up in /Table_API_p.html (I'm not sure if this is the page titled "Recorded Actions"; is it?). It showed today's date as Last Exec Date.

That's correct; this interface is just a record of the request. It does not mean that the request has been fully processed.
oneaty wrote:What do you mean by "call count" in the /Crawler_p.html page? There are many counts there (please be patient, I'm a dummy).

That refers to the /Table_API_p.html page. There is just one count there; I mean the column 'call count'.
oneaty wrote:Also, in /CrawlProfileEditor_p.html there is no entry for that new crawl; in fact, there are no new entries for the other crawls I had previously scheduled, which should have run in the last few days.

That should be in the table "Crawl Profile List" at the bottom. If an entry there has 'Running' in the "Status" column, it also shows up in the /Crawler_p.html page. Once the crawl is finished, the entry's status simply changes to 'finished' and the entry remains there.
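
The rule described here can be written down as a tiny decision function. This is a sketch for illustration only: the name `crawl_state` and its two parameters are made up, and the values are meant to be read off the two pages by hand, not fetched from YaCy. Treating a missing profile entry as "not started" is an assumption about what that situation most likely means.

```python
def crawl_state(listed_in_crawler, profile_status):
    """Interpret what the two pages show for one crawl.

    listed_in_crawler: True if the crawl has an entry in /Crawler_p.html.
    profile_status: text of the "Status" column in the "Crawl Profile List"
        table, or None if the crawl has no entry in that table.
    """
    if listed_in_crawler:
        # Running crawls are listed on the crawler page.
        return "running"
    if profile_status is None:
        # Assumption: no profile entry means the crawl never started.
        return "not started"
    status = profile_status.strip().lower()
    if status in ("running", "finished"):
        return status
    return "unknown"
```

For example, a crawl that is no longer listed in /Crawler_p.html but whose profile entry says 'finished' yields `"finished"`, while a crawl with no profile entry at all yields `"not started"`.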

Re: How do I know if a scheduled crawl has finished

Post by oneaty » Mon Feb 04, 2013 8:26 pm

Orbiter wrote:That refers to the /Table_API_p.html page. There is just one count there; I mean the column 'call count'.

ok

Orbiter wrote:That should be in the table "Crawl Profile List" at the bottom. If an entry there has 'Running' in the "Status" column, it also shows up in the /Crawler_p.html page. Once the crawl is finished, the entry's status simply changes to 'finished' and the entry remains there.

As you can see in this picture, the new crawl I scheduled today is highlighted in red:
[Image]

But I didn't see any entry in the "Crawl Profile List" for that new crawl task, as you can see in the following picture:
[Image]

So my concern is whether this crawl finished successfully or not, and why (and how to monitor its progress).

Re: How do I know if a scheduled crawl has finished

Post by Orbiter » Tue Feb 05, 2013 2:28 pm

Well, it looks like something went wrong. There should be an error in the log; it can be seen either in /ViewLog_p.html directly after the crawl start or in DATA/LOG/yacy00.log
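
If the log file is large, a few lines of Python can narrow it down. This is a rough sketch under assumptions: the helper names `matching_log_lines` and `scan_log` are made up, the search term is whatever you expect to appear near the error (for example the host name of the crawl start URL), and nothing here relies on a particular log line format.

```python
def matching_log_lines(lines, needle):
    """Return the lines that contain needle, ignoring upper/lower case."""
    needle = needle.lower()
    return [line.rstrip("\n") for line in lines if needle in line.lower()]


def scan_log(path, needle):
    """Read the log file at path and return the lines mentioning needle."""
    # errors="replace" keeps the scan going if the file has odd bytes.
    with open(path, encoding="utf-8", errors="replace") as log:
        return matching_log_lines(log, needle)
```

Run from the YaCy installation directory, `scan_log("DATA/LOG/yacy00.log", "example.org")` would return every log line that mentions the placeholder host `example.org`.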

A different approach (if the crawl had actually been started, which seems not to be the case) is to look at the rejected URLs at /IndexCreateParserErrors_p.html

But anyway: from the auto-generated Must-match entry I can see that you must be running a non-updated 1.3 version, right? If so, please try an update using the auto-updater at /ConfigUpdate_p.html

To reproduce the problem I started the same crawl at blogdosakamoto . blogosfera . uol . com . br, and it was successful! Please check whether updating solves the problem.

Re: How do I know if a scheduled crawl has finished

Post by oneaty » Tue Feb 05, 2013 3:44 pm

I'll do that, thanks a lot for your help.
Unfortunately, I've run out of disk space, partly because of YaCy (I believe it produces a growing local index, or DHT, or both, or something else that I'm not aware of, right? That would contribute to running out of disk space).
As soon as I finish a disk cleanup, which hopefully will take only a couple of days, I'll follow your advice, upgrade YaCy and see what happens.
I'll let you know.
See you soon; please don't close this topic yet.

Re: How do I know if a scheduled crawl has finished

Post by oneaty » Fri Feb 08, 2013 1:02 pm

Ok, I'm back.
I have followed your advice on updating YaCy, but I chose the option Automatic Install in the Manual Update section.
Now, my question is:
Does this install process take a significant amount of time?
It has now been some 10 minutes since YaCy started the install process and opened a CMD window (I'm running Windows Vista), which seems to be filling the screen with endless dots, and nothing seems to happen.
Is this normal?

Re: How do I know if a scheduled crawl has finished

Post by oneaty » Fri Feb 08, 2013 2:03 pm

I aborted the install process (closed the cmd window and killed the javaw process).
Then I shut down my firewall (Comodo).
I restarted YaCy and confirmed that the upgrade didn't work, since the system version number hadn't changed.
Then I selected Install Release in the Manual System Update section and what I got were these two messages:
[Image]

and

[Image]

Again, it seems that the install procedure was in an infinite loop.
Again, I closed the cmd window and killed the javaw process.
Then I restarted YaCy and this time I chose Automated System Update, like you said before.
This time, nothing seemed to happen and, in the end, I'm still running version 1.04/9000.
As far as my Linux knowledge goes, tar.gz seems to be a Linux file. Does this have something to do with my inability to upgrade YaCy on Windows Vista?

Re: How do I know if a scheduled crawl has finished

Post by Orbiter » Sat Feb 23, 2013 2:54 pm

The tar.gz file is the generic release and is used by all system-specific YaCy versions for updating. The system core is the same everywhere; only the start wrapper differs.
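
As a side note on the format itself: tar.gz is a plain archive format and not tied to Linux, so it can be read on Windows as well. The following sketch shows this with Python's standard `tarfile` module. The helper names and the tiny in-memory archive (with the made-up member `demo/readme.txt`) are purely illustrative and say nothing about the contents of a real YaCy release file.

```python
import io
import tarfile


def list_archive(data):
    """Return the member names of a gzip-compressed tar archive (bytes)."""
    with tarfile.open(fileobj=io.BytesIO(data), mode="r:gz") as archive:
        return archive.getnames()


def make_demo_archive():
    """Build a tiny tar.gz in memory as a stand-in for a real archive."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w:gz") as archive:
        payload = b"demo"
        info = tarfile.TarInfo(name="demo/readme.txt")  # made-up member name
        info.size = len(payload)
        archive.addfile(info, io.BytesIO(payload))
    return buffer.getvalue()
```

`list_archive(make_demo_archive())` returns the member names of the demo archive, and the same code runs unchanged on Windows, Linux and macOS.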