Settings & previous crawl data lost after power outage

Post by oneaty » Thu May 15, 2014 1:24 pm

I'm setting my Yacy server to automatically restart after a power outage.

After changing some BIOS settings, I simulated a power outage by cutting power to the room while the server (and Yacy) was running. After a few minutes, I turned the room's power back on and the server booted automatically.

However, Yacy didn't start.

After some research, I found out that the file /DATA/SETTINGS/yacy.conf was empty, which caused startYACY.sh to build a malformed command line, so Yacy could not start.

The command line was showing this:

/usr/bin/java - - -server -Djava.net.preferIPv4Stack=true -Djava.awt.headless=true -Dfile.encoding=UTF-8 -Dsolr.directoryFactory=solr.MMapDirectoryFactory -classpath .:htroot:lib/J7Zip-modified.jar:lib/activation.jar:lib/apache-mime4j-0.6.jar:lib/bcmail-jdk15-145.jar:lib/bcprov-jdk15-145.jar:lib/commons-codec-1.7.jar:lib/commons-compress-1.4.1.jar:lib/commons-fileupload-1.2.2.jar:lib/commons-io-2.1.jar:lib/commons-jxpath-1.3.jar:lib/commons-lang-2.6.jar:lib/commons-logging-1.1.3.jar:lib/fontbox-1.8.4.jar:lib/geronimo-stax-api_1.0_spec-1.0.1.jar:lib/guava-16.0.1.jar:lib/htmllexer.jar:lib/httpclient-4.3.3.jar:lib/httpcore-4.3.2.jar:lib/httpmime-4.3.3.jar:lib/icu4j-core.jar:lib/jakarta-oro-2.0.8.jar:lib/jaudiotagger-2.0.4-20111207.115108-15.jar:lib/jcifs-1.3.17.jar:lib/jcl-over-slf4j-1.7.2.jar:lib/jempbox-1.8.4.jar:lib/jetty-client-8.1.14.v20131031.jar:lib/jetty-continuation-8.1.14.v20131031.jar:lib/jetty-http-8.1.14.v20131031.jar:lib/jetty-io-8.1.14.v20131031.jar:lib/jetty-security-8.1.14.v20131031.jar:lib/jetty-server-8.1.14.v20131031.jar:lib/jetty-servlet-8.1.14.v20131031.jar:lib/jetty-servlets-8.1.14.v20131031.jar:lib/jetty-util-8.1.14.v20131031.jar:lib/jetty-webapp-8.1.14.v20131031.jar:lib/jetty-xml-8.1.14.v20131031.jar:lib/jsch-0.1.50.jar:lib/json-simple-1.1.1.jar:lib/jsoup-1.6.3.jar:lib/log4j-over-slf4j-1.7.2.jar:lib/lucene-analyzers-common-4.6.1.jar:lib/lucene-analyzers-phonetic-4.6.1.jar:lib/lucene-classification-4.6.1.jar:lib/lucene-codecs-4.6.1.jar:lib/lucene-core-4.6.1.jar:lib/lucene-facet-4.6.1.jar:lib/lucene-grouping-4.6.1.jar:lib/lucene-highlighter-4.6.1.jar:lib/lucene-join-4.6.1.jar:lib/lucene-memory-4.6.1.jar:lib/lucene-misc-4.6.1.jar:lib/lucene-queries-4.6.1.jar:lib/lucene-queryparser-4.6.1.jar:lib/lucene-spatial-4.6.1.jar:lib/lucene-suggest-4.6.1.jar:lib/metadata-extractor-2.6.2.jar:lib/noggit-0.5.jar:lib/pdfbox-1.8.4.jar:lib/poi-3.9-20121203.jar:lib/poi-scratchpad-3.9-20121203.jar:lib/servlet-api-3.0.jar:lib/slf4j-api-1.7.2.jar:lib/slf4j-jdk14-1.7.2.jar:lib/solr-core-4.6.1.jar:lib/solr-solrj-4.6.1.jar:lib/spatial4j-0.3.jar:lib/webcat-0.1-swf.jar:lib/wstx-asl-3.2.9.jar:lib/xercesImpl.jar:lib/xml-apis.jar:lib/yacycore.jar:lib/zookeeper-3.4.5.jar: net.yacy.yacy


(Note that the first two arguments after /usr/bin/java are bare dashes with no option name attached, which prevents java from executing.)

I then reinstalled Yacy in a temporary directory, just to produce a new yacy.conf file.

I copied this file back into /DATA/SETTINGS and could finally get Yacy running again.

But then I realized that all data regarding previous crawls were missing, as if I were running Yacy for the first time.

My questions are:

1 - Is there a way to recover previous crawl data?

2 - What files/directories should I back up so that I'm able to restore Yacy to its prior state?

Note

The piece of startYACY.sh that failed due to the empty yacy.conf was

if [ -f $CONFIGFILE ]
then
    # startup memory
    for i in Xmx Xms; do
        j="`grep javastart_$i $CONFIGFILE | sed 's/^[^=]*=//'`";
        if [ -n $j ]; then JAVA_ARGS="-$j $JAVA_ARGS"; fi;
    done

    # Priority
    j="`grep javastart_priority $CONFIGFILE | sed 's/^[^=]*=//'`";

    if [ ! -z "$j" ];then
        if [ -n $j ]; then JAVA="nice -n $j $JAVA"; fi;
    fi

    PORT="`grep ^port= $CONFIGFILE | sed 's/^[^=]*=//'`";

    # for i in `grep javastart $CONFIGFILE`;do
    # i="${i#javastart_*=}";
    # JAVA_ARGS="-$i $JAVA_ARGS";
    # done
else
    JAVA_ARGS="-Xmx600m -Xms180m $JAVA_ARGS";
    PORT="8090"
fi
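
The failure can be explained from the snippet itself: with an empty yacy.conf, grep prints nothing, so j is empty. The unquoted test [ -n $j ] then collapses to [ -n ], which the shell treats as true, and "-$j" becomes a bare "-". Below is a minimal sketch of a more defensive variant. The function name build_java_args is hypothetical, the fallback values are the ones from the else branch above, and this is not presented as the fix used by the YaCy project.

```shell
#!/bin/sh
# Sketch only: build_java_args is a hypothetical helper, not part of startYACY.sh.
# Prints the memory arguments derived from the given config file.
build_java_args() {
    conf="$1"
    args=""
    # -s is true only if the file exists AND is non-empty
    if [ -s "$conf" ]; then
        for i in Xmx Xms; do
            j=$(grep "javastart_$i" "$conf" | sed 's/^[^=]*=//')
            # quoting "$j" makes -n false when the value is empty
            if [ -n "$j" ]; then args="-$j${args:+ $args}"; fi
        done
    fi
    # nothing usable found: fall back to the defaults of the original else branch
    if [ -z "$args" ]; then args="-Xmx600m -Xms180m"; fi
    printf '%s\n' "$args"
}
```

Called as build_java_args "$CONFIGFILE", it never emits a lone "-": an empty or missing file is handled the same way as a file without javastart entries.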
oneaty
Posts: 66
Joined: Mon Feb 04, 2013 12:47 pm
Location: Rio de Janeiro

Re: Settings & previous crawl data lost after power outage

Post by oneaty » Thu May 15, 2014 2:24 pm

Regarding the "missing crawls", I have some new facts:

1 - What led me to think they were missing was the message that shows up whenever I hover the mouse over System Status in the left vertical bar: "You did not yet start a web crawl! You do not see all monitoring options here, because some belong to crawl results monitoring. Start a web crawl to see that."

2 - However, in /Crawler_p.html, all the "missing crawls" are showing and running.

So the sudden power-off seems to have created an inconsistency.

My previous questions still stand:

1 - Is there a way to recover previous crawl data? (I would now rephrase this as "... to restore consistency between the crawl data and the System Status information".)

2 - What files/directories should I back up so that I'm able to restore Yacy to its prior state?
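
As a starting point for question 2, here is a minimal sketch of a backup step. It rests on an assumption that this thread does not confirm: that everything needed to restore the installation lives under the DATA directory (the same directory that holds SETTINGS/yacy.conf). The helper name backup_yacy_data and the archive naming are hypothetical, and the archive should be taken while Yacy is stopped so that files are not changing underneath tar.

```shell
#!/bin/sh
# Sketch only: backup_yacy_data is a hypothetical helper.
# Assumption: all state worth keeping lives under the DATA directory.
# Usage: backup_yacy_data /path/to/yacy/DATA /path/to/backups
backup_yacy_data() {
    data_dir="$1"
    dest_dir="$2"
    [ -d "$data_dir" ] || { echo "no such directory: $data_dir" >&2; return 1; }
    mkdir -p "$dest_dir" || return 1
    archive="$dest_dir/yacy-data-$(date +%Y%m%d-%H%M%S).tar.gz"
    # -C keeps the paths inside the archive relative to the parent of DATA
    tar -czf "$archive" -C "$(dirname "$data_dir")" "$(basename "$data_dir")" || return 1
    # print the archive path so the caller can log or verify it
    printf '%s\n' "$archive"
}
```

Listing the result with tar -tzf is a cheap way to check that SETTINGS/yacy.conf is really in the archive before relying on it.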

Re: Settings & previous crawl data lost after power outage

Post by davide » Wed Jun 17, 2015 12:40 pm

Were you using a journaled filesystem?

If so, that would be a bad sign. Otherwise, the filesystem would be the culprit.
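
One way to check is sketched below, assuming GNU df (the -T option prints the filesystem type). The helper names fs_type and usually_journaled are hypothetical, and the list of types is only a rough heuristic: it is not exhaustive, and ext4, for example, can be created without a journal.

```shell
#!/bin/sh
# Sketch only: fs_type and usually_journaled are hypothetical helpers.
# Prints the type of the filesystem that holds the given path (GNU df -T).
fs_type() {
    df -PT "$1" | awk 'NR==2 { print $2 }'
}

# Rough heuristic: these types normally run with a journal, but the
# list is not exhaustive and a journal can be disabled on some of them.
usually_journaled() {
    case "$1" in
        ext3|ext4|xfs|jfs|reiserfs) return 0 ;;
        *) return 1 ;;
    esac
}
```

For example, usually_journaled "$(fs_type /path/to/yacy/DATA)" && echo "journal expected" reports on the filesystem holding the data directory.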
davide
Posts: 84
Joined: Fri Feb 15, 2013 8:03 am

Re: Settings & previous crawl data lost after power outage

Post by oneaty » Wed Jun 17, 2015 2:51 pm

I'm running Yacy on Ubuntu 14.04 with an ext4 file system.

Is that a journaled one?

