Yacy grabbing all the RAM - too many JVM processes

Post by layst » Sun May 10, 2015 9:58 pm

Hi there,

I used to run Yacy on a small dedicated A20 Olinuxino board. Now I am running a more powerful server and still intend to use some of its resources to contribute to the index. However, when Yacy is running, all the RAM (4 GB) is used, even though I left the JVM limit at the standard 600 MB. "pstree" and "htop" show that quite a number of java processes (~190, for instance) are running, which I guess explains why all the RAM is used. I had the same problem before, though I assumed it was related to the limited resources of the board.

Besides, the "free" command shows that a fair part of the RAM (~1 GB) is used as cache. When I stop Yacy and clear the cache, I get 3.4 GB of free RAM, whereas when Yacy runs all the memory is eaten up (Yacy reports that it is disabling DHT because less than 50 MB of memory is free).
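To tell page cache apart from memory that processes actually hold, the figures that "free" prints can be recomputed from /proc/meminfo. The following is a minimal Python sketch, assuming a Linux kernel recent enough to expose `MemAvailable`; the function names and the sample values are made up for illustration, and the "used" figure is only a rough approximation of what "free" reports.

```python
import re

# Hypothetical sample in the format of /proc/meminfo (values in kB).
SAMPLE_MEMINFO = """\
MemTotal:        4045084 kB
MemFree:           51200 kB
MemAvailable:    1126400 kB
Buffers:           20480 kB
Cached:          1048576 kB
"""


def parse_meminfo(text):
    """Return a dict mapping each simple field name to its value in kB."""
    fields = {}
    for line in text.splitlines():
        match = re.match(r"(\w+):\s+(\d+)", line)
        if match:
            fields[match.group(1)] = int(match.group(2))
    return fields


def summarize(fields):
    """Split memory into free, cache, and the remainder (all in MB)."""
    cache_kb = fields["Buffers"] + fields["Cached"]
    used_kb = fields["MemTotal"] - fields["MemFree"] - cache_kb
    return {
        "free_mb": fields["MemFree"] // 1024,
        "cache_mb": cache_kb // 1024,
        "used_mb": used_kb // 1024,
        "available_mb": fields["MemAvailable"] // 1024,
    }


if __name__ == "__main__":
    # On a real system, replace SAMPLE_MEMINFO with open("/proc/meminfo").read().
    print(summarize(parse_meminfo(SAMPLE_MEMINFO)))
```

A small "free" figure next to a large "available" figure would suggest that most of the memory is reclaimable cache rather than memory held by the JVM.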

I think the closest post on the topic is the following http://forum.yacy-websuche.de/viewtopic.php?f=23&t=5266&p=30258&hilit=memory#p30258; it did not receive an answer at the time.

Has anyone encountered this kind of problem, and what would be the solution? I would like to limit Yacy's memory use to, say, 1 GB.

Thanks,
layst
 

Re: Yacy grabbing all the RAM - too many JVM processes

Post by davide » Mon May 18, 2015 6:10 pm

My 2¢, in case you don't receive an authoritative answer.

  • the number of crawler threads: have you increased the number of crawler threads? Hypothesis: perhaps these threads are actually separate processes.
  • start script: do you start yacy via a custom script? Hypothesis: perhaps you are starting multiple instances.
davide
 

Re: Yacy grabbing all the RAM - too many JVM processes

Post by layst » Mon May 18, 2015 8:54 pm

Thank you for your suggestions.

I do not use any custom script to start yacy, and it does not look like yacy starts several times (I assume something strange would show up on my ports and in the log, or the new instances would crash).

Your suggestion about the config surprised me at first, since I cannot remember tinkering with it. However, when I try a fresh Yacy install, only 60 processes show up in "pstree" (up to 85 if I display the web portal), for about 300 MB. Hence I guess you've got a point there: something must be wrong with my config.

So thanks again for your helpful reaction; I will report back once I have investigated further.
layst
 

Re: Yacy grabbing all the RAM - too many JVM processes

Post by davide » Mon May 18, 2015 9:16 pm

After a fresh install, my pstree for the whole VM looks like:

Code:
init-+
     |-init-+-console-kit-dae---3*[{console-kit-dae}]
     |      |-cron
     |      |-dbus-daemon
     |      |-ddclient - slee
     |      |-exim4
     |      |-java---81*[{java}]
     |      |-polkitd---{polkitd}
     |      |-postgres---5*[postgres]
     |      |-rsyslogd---3*[{rsyslogd}]
     |      |-saslauthd---saslauthd
     |      |-sshd
     |      |-udevd---2*[udevd]
     |      |-upstart-socket-
     |      |-upstart-udev-br
     |      `-xinetd


The single java process is YaCy during crawling. The nesting of init is simply due to virtualization.
I'd assume there are multiple instances of YaCy running concurrently on your computer. Maybe a cron job?
davide
 

Re: Yacy grabbing all the RAM - too many JVM processes

Post by layst » Mon May 18, 2015 9:22 pm

My pstree looks much the same; when I talked about multiple processes, I was thinking of the '81' number you show. My mistake, it seems. Is this number the number of threads?

EDIT:
'man pstree' says it all :
"Child threads of a process are found under the parent process and are shown with the process name in curly braces, e.g.
icecast2---13*[{icecast2}]
"

Sorry for my lack of precision in describing the issue.
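The notation quoted from the man page can also be read mechanically. Below is a minimal Python sketch that counts threads on a single pstree line; the regular expression only covers the compact forms that appear in this thread (`N*[{name}]` and `{name}`), and `count_threads` is a made-up helper name, not part of any tool.

```python
import re

# Matches pstree's compact thread notation, e.g. "java---81*[{java}]"
# (81 threads) or "polkitd---{polkitd}" (a single thread).
# Entries without curly braces, such as "5*[postgres]", are child
# processes rather than threads and are deliberately not matched.
THREAD_PATTERN = re.compile(r"(?:(\d+)\*\[)?\{([^}]+)\}\]?")


def count_threads(pstree_line):
    """Return {thread_name: count} for the thread groups on one pstree line."""
    counts = {}
    for number, name in THREAD_PATTERN.findall(pstree_line):
        counts[name] = counts.get(name, 0) + (int(number) if number else 1)
    return counts


if __name__ == "__main__":
    print(count_threads("     |      |-java---81*[{java}]"))
```

Applied to the listing above, this reads the java entry as one process with 81 threads, not 81 separate JVMs.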
layst
 

Re: Yacy grabbing all the RAM - too many JVM processes

Post by davide » Mon May 18, 2015 9:35 pm

That's the number of spawned threads, see: http://man.cx/pstree .
However, I'm not sure how to reduce YaCy's RAM footprint.
BTW, how many documents are in your index?
davide
 

Re: Yacy grabbing all the RAM - too many JVM processes

Post by layst » Mon May 18, 2015 9:39 pm

I think my index was at something like 10 million documents, for ~22 GB. I am trying to restart it right now, but the log shows there were a number of... well, I do not know what to call them... unfinished jobs? Here are two typical lines from the end of the log right now:
I 2015/05/18 22:35:30 MEMORY performed explicit GC, freed 1 KB (requested/available/average: 102400 / 95820 / 46 KB)
E 2015/05/18 22:35:31 TABLE 0003.stack: not enough RAM (93MB) left for index, deleting allocated table space to enable index space allocation (needed: 100MB)
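
Read together, the two log lines are consistent: the MEMORY line reports 102400 KB requested (100 MB) against 95820 KB available (about 93 MB), which matches the "93MB" and "needed: 100MB" in the TABLE line. A minimal Python sketch of that arithmetic follows. The regular expression is derived from this single quoted line only, so the log format is an assumption, the meaning of the "average" field is not explained in this thread, and `memory_shortfall_kb` is a made-up helper name.

```python
import re

# The MEMORY line exactly as quoted in the post above.
MEMORY_LINE = (
    "I 2015/05/18 22:35:30 MEMORY performed explicit GC, freed 1 KB "
    "(requested/available/average: 102400 / 95820 / 46 KB)"
)

PATTERN = re.compile(
    r"requested/available/average:\s*(\d+)\s*/\s*(\d+)\s*/\s*(\d+)\s*KB"
)


def memory_shortfall_kb(line):
    """Return (requested_kb, available_kb, shortfall_kb), or None if no match."""
    match = PATTERN.search(line)
    if match is None:
        return None
    requested, available, _average = (int(group) for group in match.groups())
    return requested, available, max(0, requested - available)


if __name__ == "__main__":
    requested, available, shortfall = memory_shortfall_kb(MEMORY_LINE)
    print(requested // 1024, "MB requested,", available // 1024, "MB available,",
          shortfall, "KB short")
```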


So I will let yacy work tonight and see if I can do anything to limit the number of threads once it has started for good.

EDIT:
It has started, showing 110 threads and a lot of memory used for cache.
layst
 

Re: Yacy grabbing all the RAM - too many JVM processes

Post by davide » Sun May 24, 2015 5:59 pm

I guess you are not receiving an "authoritative" answer because the problem is simply a lack of RAM.
10M records commonly seem to take 2 to 4 GB of RAM. I guess most of that goes to the reverse word index (RWI), which won't shrink when you reduce the number of threads.

I'm facing the same problem right now, and I'm buying new hardware as a solution.
Alternatively, Linux has a (stable?) module to compress RAM and/or swap pages.

Besides, have you checked your JVM memory settings? Maybe the 600 MB limit you mentioned is just the initial heap size (Xms) rather than the maximum (Xmx):
Code:
egrep 'Xmx|Xms' ./DATA/SETTINGS/yacy.conf
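
Once the command has printed the matching lines, the two values can be compared directly. Here is a minimal Python sketch of that check. The sample excerpt is hypothetical: the key names and the `Xmx600m` value format are assumptions, so verify them against the real output of the command above. `heap_settings_mb` is a made-up helper name.

```python
import re

# Hypothetical excerpt; the key names and value format are assumptions
# and should be checked against the real output of the egrep command.
SAMPLE_CONF = """\
javastart_Xmx=Xmx600m
javastart_Xms=Xms90m
port=8090
"""

# Conversion factors from the JVM size suffixes to megabytes.
UNITS_MB = {"k": 1 / 1024, "m": 1, "g": 1024}


def heap_settings_mb(conf_text):
    """Return the heap sizes found, e.g. {'Xms': initial_mb, 'Xmx': max_mb}."""
    settings = {}
    for flag, amount, unit in re.findall(r"\b(Xm[sx])(\d+)([kmgKMG])", conf_text):
        settings[flag] = int(amount) * UNITS_MB[unit.lower()]
    return settings


if __name__ == "__main__":
    print(heap_settings_mb(SAMPLE_CONF))
```

Xms is the initial heap size and Xmx the maximum, so it is the Xmx value that caps the Java heap. Thread stacks and the operating system's file cache sit outside that limit.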
davide
 

Re: Yacy grabbing all the RAM - too many JVM processes

Post by layst » Mon May 25, 2015 10:19 pm

I agree with you that the reverse word index is the likely cause; I came across some hints pointing that way when looking at the logs and trying to find settings that would reduce the RAM footprint.

About the initial memory, I did set it to different values, but it does not have any impact on the RAM used for caching.

As for changing the hardware, I have just upgraded my install. I intended to dedicate some of its resources to Yacy, but if I have to commit nearly all the RAM to this single process, it might affect my other uses of the server. Hence, at the moment I do not really know whether I will keep going with Yacy.

I wish there were a way to free all this cached RAM, but I do not know the mechanics of the software and do not feel I have enough time to learn them. Still, maybe storing the RWI in a file (or several) could offer a solution, even if it would be slower than caching? I guess the developers have already thought about this problem and this type of solution, and have made a decision about it.

So I suspect there is indeed no easy solution within my reach besides upgrading the hardware, which I won't do in the near future.

In any case, thank you for your concern.
layst
 

