Post by davidk » Wed Jan 08, 2014 2:54 pm


I wonder if anyone has got YaCy to run stably? By stable I mean days / weeks without problems, with YaCy down only when it is deliberately shut down. If so, what kind of environment does it run in? Linux, Mac or Windows? What version of Java (e.g. OpenJDK / Java SE)? Is it a dedicated machine? What are the memory specs of the machine? And what memory changes from the default setup have you made to Java (changes to javastart_Xmx and other tuning in the GUI)?


Re: Stable YaCy

Post by David » Thu Jan 09, 2014 8:22 pm

davidk wrote: I wonder if anyone has got YaCy to run stably? By stable I mean days / weeks without problems, with YaCy down only when it is shut down.

If you have YaCy up and running, you can browse to Administration / Yacy Network / Active peers and look at the uptime column to see how long the active peers have been running. One seems to have been online for 163 days. I think there was once a peer that was up for over a year without downtime.

davidk wrote: If so, what kind of environment does it run in? Linux, Mac or Windows?

In my opinion, it runs very well under Linux. I'm using the Mint distro for my own YaCy.

davidk wrote: What version of Java (e.g. OpenJDK / Java SE)?

I don't know. Some people say Java SE is faster, but I'm using OpenJDK, because it's open source and it's preinstalled in Linux Mint.

davidk wrote: What are the memory specs of the machine? And what memory changes from the default setup have you made to Java?

I have 16 GB of RAM installed in my computer, and in the performance settings of YaCy I have set 8 GB. (The bigger your index gets, the more RAM it needs. As far as I know, with 15 GB of RAM you should be able to maintain an index of around 50 million links, but it also depends on other factors. However, if YaCy runs out of RAM, it will crash and you probably won't be able to start it up anymore.)

Re: Stable YaCy

Post by davidk » Sun Jan 19, 2014 1:10 pm

I am starting to get a bit fed up, to put it nicely. I have been working professionally with Linux for 10 years, and I assess YaCy to be very unstable. Maybe you could get it stable somehow with a developer’s expert insight into the code, but it doesn’t work right out of the box as stated on the web page. The result will be that users download the software to try it out, don’t get it to work and just delete it. That in turn will put an effective stop to the recruitment of YaCy users.

Since mid-December I have tried to set up a stable YaCy installation, and my setup is currently as follows: a dedicated server with a Core 2 Duo E6750 CPU, 8 GB RAM and 240 GB 10k RPM disks in RAID 1, running Debian Wheezy (kernel 2.6.32) with a Java SE 1.7.0 environment. I am running YaCy 1.66 with an external Solr 4.6.0 instance running under Tomcat 6.0.35. Both Tomcat and YaCy are running with the Java argument Xmx2g, and the system has a script running every hour as a cron job that frees up system memory. Besides Apache with mod_proxy allowing access through port 80, the system is running the YaCy installation exclusively. The Internet connection is a 50/50 Mbit/s dedicated fiber line.

Since mid-December I have encountered the following problems:

• Unstable YaCy, not configured (or stable enough) to run right out of the box
• Insufficient documentation. English documentation is lacking, and the German documentation, which I have translated with Google Translate, lacks the following:
------o Detailed information about how the system actually works. Which files do what? How is it all logically built up? It is far too superficial; most of it describes only how the GUI works and seems aimed at newbie users (which, in my opinion, this system is not stable enough to target as a user base)
------o What do the configuration alternatives do? How do I customize the YaCy log files to output the logging I need (e.g. what’s the difference between PROXY.level and PLASMA.level)? What do all the alternatives in the yacy.conf files mean? How can I “downtune” it so it doesn’t crash (if that is even possible)?
------o Detailed info about how to configure my node to weight the results the way I would like to present them. As a Norwegian node I would weight content from .no domains very high, and I would like to create an index of relevant sites.
• Language. The forums are mainly in German. That makes it hard for non-German-speaking people to ask for help. The first thing that met me on the English forum was loads of spam. Most of the admins were inactive and the owner was long gone, but I got in touch with one of the maintainers and gave him some advice on running forums, and much of the spam is now sorted out.

I am fed up, and the only reason I am not deleting the software right now is that I would like to contribute to creating a counterweight to foreign states’ surveillance regimes.

So. The first thing I would like to comment on is the project’s aim. Stop aiming the software at home users. The software is not for home users, it really is not. You need good hardware to be able to run it, and you can’t use memory-intensive programs (e.g. games) while you are running it. The software should really be aimed at people who run a dedicated server and want to contribute.

Be concrete about the environment requirements: which Java version is recommended, and what the minimum memory is (it is a lot more than 4 GB, that’s for sure).

The next thing you should do is to optimize the package to run in such an environment. I really start to wonder about your competence when you distribute the package with 600 MB of start memory. It will run out of memory in hours. That makes it impossible to use out of the box on a dedicated system as well as on a home user system, and novice server owners will not be able to tune the memory and will give up, effectively turning away a user base you could actually have gained. The same goes for the settings in yacy.conf. Optimize it to work on a dedicated system with enough resources, and signal to the community that you need those hardware specs.

This was a long post, but I am angry. I feel tricked into spending a month on a system that doesn’t work as promised. I am not angry at you guys for making YaCy, because YaCy is free and the idea is good. But I am really angry about how you are communicating this software as a “search engine that anyone can use to build a search portal for their intranet or to help search the public internet” with “installation takes only three minutes. Just download the release, decompress the package and run the start script.”

Lastly, I would just say that I did some statistics on the active peers in the YaCy network. Yes, some have over 100 days of uptime. But the average uptime was 4 days. That is really bad. Worse, the median uptime is 1 day. ONE DAY! Only 3.2% of the members have an uptime over 20 days and only 7.4% have an uptime over 10 days. That means that 92.6% of the peers have 9 days of uptime or less! It is not good at all; to be honest, it is really awful.
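
A minimal sketch of how figures like these (mean, median, share of peers above a threshold) can be computed, assuming the uptime column of the Active peers page has been copied by hand into a list of days. The function name `uptime_summary` and the sample values are made up for illustration; they are not the real network data.

```python
from statistics import mean, median


def uptime_summary(uptimes_days, thresholds=(10, 20)):
    """Return mean, median and the percentage of peers above each threshold."""
    n = len(uptimes_days)
    summary = {
        "mean": mean(uptimes_days),
        "median": median(uptimes_days),
    }
    for t in thresholds:
        # Count peers whose uptime is strictly greater than t days.
        above = sum(1 for u in uptimes_days if u > t)
        summary[f"over_{t}_days_pct"] = 100.0 * above / n
    return summary


# Hypothetical sample, NOT the real peer list.
sample = [0, 1, 1, 1, 2, 3, 5, 12, 25, 163]
print(uptime_summary(sample))
```

A few long-running peers pull the mean well above the median, which is why both numbers are worth reporting side by side.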

So to conclude: I would like some comments from the developers on this. What is the strategy? I would really like to contribute to this project. But things really need to change regarding strategy, distribution and documentation if you want this project to succeed.

Re: Stable YaCy

Post by Yududi » Sun Jan 19, 2014 5:35 pm

Hi David,

what I read from your lines is not a problem with YaCy itself but a lack of supporters.
As you can see from the commits here: ... 0220733f55 there are just 2-3 main developers working on YaCy.
They may have regular jobs and can't work on this project the whole day.
What they can do, they do, as you can see if you have a look at the forum and the wiki.
To change this situation there are two or more alternatives:
1. Help YaCy through coding.
(Download the source, import the project into Eclipse, identify code which you can optimize, optimize it and push it to the repository)
2. Help by donating money.
(So the core developers do not need to work so much elsewhere and can spend that time on YaCy or pay developers)

Regarding the English section:
YaCy is a decentralized search engine, so it would be good if the rest worked that way too.
Have a look at: ... 8249160704
I think there was one guy who created the English forum you mentioned, but he stopped working on it.
It's no problem to set up a forum; running websites in several languages is mainly a legal problem, and it costs a lot of time and money, as you can see in the last post of viewtopic.php?f=12&t=4872.
So the easiest way would be for everyone who wants a forum in their language to just set one up and have it linked here -> problem solved.
The main developers could also visit these forums and answer questions as if they were posted here.

Regarding the uptime:
I don't think the uptime can be used for any statistics about stability.
I have one local peer running which goes offline when I shut down my computer.
I also have one remote peer running, and I used it with 2 GB for one month with the crawler. After an update (which caused my peer's uptime to start from 0 again) I tried 600 MB for YaCy without a crawler, just DHT, and it has been working for 3 1/2 days now and keeps running.

My conclusion:
Continue to support the community, whether by recommending it to friends, by contributing code (optimization), by setting up a forum or ...
I, for example, recently found a bug in the RSS feed. I had never worked with Gitorious, and the last time I programmed in Java was years ago ... but I downloaded Eclipse, corrected the code and submitted it to the repository to try it out. After some days it was checked and brought into the current release. I really would like to contribute more to YaCy, but as long as I do this in my spare time it's not possible.
If you run into trouble with your YaCy peer, I suggest you file a bug report or maybe start a thread in this forum.
YaCy is the only open source search engine I know of at the moment which would also work if you had just one computer available and lived in a war zone like Syria, where the Internet may be shut down at the borders.
When I look at your post, you really want the best for YaCy, and this is good because YaCy needs more people like you, so thank you for your support so far.

PS: I really wonder why no state supports the idea behind YaCy.
Imagine what could be done with just 1 million Euro/USD.
Europe has no alternative to Google.
Why is such a project not supported?
We spend millions on bridges over highways where just 2 people or frogs cross the street in a whole year.
But nevertheless the YaCy developers will continue their work on YaCy and do what they can for it.

Re: Stable YaCy

Post by krzyszp » Mon Jan 20, 2014 11:49 am

I'm afraid that davidk is right - stability IS a problem.
I have set up YaCy on a VPS machine, with a dedicated domain for it. It works fine - for a couple of days. Then the site becomes inaccessible ("Service Temporarily Unavailable" error). I can still see in "top" that all YaCy processes are running, memory usage is at 50%, HDD at 56%, system load 1%... I have no idea why YaCy doesn't show its page. YaCy is set up on an Ubuntu server (50 GB SSD drive, 4 cores, 4 GB RAM).

Also, I have set up YaCy on a second virtual server (on top of a dedicated machine, with only this one VPS on it) with 8 GB RAM and a 120 GB HDD, running Ubuntu 12.04. Same situation.

YaCy is most stable on my Windows desktop machine, but this is not a solution for me...

Re: Stable YaCy

Post by davidk » Sun Jan 26, 2014 6:39 pm

I have done some more tests. Of these, I find this one very interesting:

The server is doing one crawl at 30 PPM. The network is set to peer-to-peer mode. Inbound traffic on port 8090 was firewalled during the test period. The installation worked fine and got remote results from other peers just fine for a whole week (Sunday to Sunday).

Then I opened port 8090. Two hours later the software was unresponsive. The process was running and port 8090 was open, but when telnetting to the port the server accepted the connection without responding to any HTTP commands. The process could be killed without force (using a normal kill PID, not kill -9 PID).
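
The manual telnet check described above can be scripted. The following is only a sketch under assumptions: the function name `probe_http`, the three result labels and the 5 second timeout are made up for illustration, and nothing here is part of YaCy. It separates a port that refuses connections from one that accepts the TCP connection but never sends an HTTP status line.

```python
import socket


def probe_http(host, port, timeout=5.0):
    """Classify a port as 'ok', 'hung' or 'down'.

    'hung' means the TCP connection was accepted but no HTTP status
    line came back within `timeout` seconds.
    """
    try:
        sock = socket.create_connection((host, port), timeout=timeout)
    except OSError:
        return "down"  # refused, unreachable or connect timed out
    with sock:
        try:
            request = f"HEAD / HTTP/1.0\r\nHost: {host}\r\n\r\n"
            sock.sendall(request.encode("ascii"))
            data = sock.recv(64)  # waits at most `timeout` seconds
        except socket.timeout:
            return "hung"
        except OSError:
            return "down"
    return "ok" if data.startswith(b"HTTP/") else "hung"


# Example call (assumes a peer listening locally on port 8090):
# probe_http("localhost", 8090)
```

Run from cron every few minutes, a check like this would record when the peer stops answering, which helps to correlate the hang with the inbound request volume.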

This shows that the server is not responding well to inbound peer-to-peer requests. My investigation shows that during the roughly 2 hours the port was open, the server got about 8,000 requests on port 8090. I will do some more tests, trying to analyze the traffic further using some network tools.

What do you think? Is this just YaCy being overwhelmed with traffic from its peers, or could it be some kind of DoS attack on the network? Why does the software just stop responding (rather than crashing/exiting)?

Re: Stable YaCy

Post by CaptainPsycho » Mon Jan 27, 2014 12:42 pm

Hi davidk,

I think you are right on many of your points.

I think it would be good to provide different configurations.
- just searching
- just DHT
- just crawling
- combinations of the upper ones

and each one for different memory sizes

The normal home user might just want to search.

My conclusions on stability:
- never use the dev versions
- Oracle Java seems to run more stably than IcedTea
- when crawling you have to reduce the crawling speed to a point where the I/O is less than 100% of what the disks can manage /Performance_p.html
- when crawling, my YaCy was crashing within 5 days with a chance of 100%

My current strategy:
- just DHT distribution
- set timeout for DHT to 1000ms /PerformanceQueues_p.html
- just 10000 words in wordcache /PerformanceQueues_p.html
- runs stably for over two weeks with just 1G RAM and currently 11 million documents

No-gos which should get fixed:
- YaCy eating up RAM and stalling with 100% CPU utilization
-- perhaps some kind of watchdog which tries to restart YaCy and sends mail if the problem occurs more than x times in x days / hours
- YaCy not starting again with the same amount of RAM it ran with before, because this normally means you are losing your index :(
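
The "more than x times in x days / hours" part of the watchdog idea could look roughly like the sketch below. It covers only the counting logic: the class name `FailureWindow` and its parameters are hypothetical, and the actual restart command and the mail sending are left out because they depend on the installation.

```python
from collections import deque


class FailureWindow:
    """Count failed health checks inside a sliding time window."""

    def __init__(self, max_failures, window_seconds):
        self.max_failures = max_failures
        self.window_seconds = window_seconds
        self._failures = deque()  # timestamps of recent failures

    def record_failure(self, now):
        """Record a failure at time `now` (in seconds).

        Returns True when more than `max_failures` failures fall inside
        the window, i.e. when the admin should be notified in addition
        to the restart attempt.
        """
        self._failures.append(now)
        # Forget failures that are older than the window.
        while self._failures and now - self._failures[0] > self.window_seconds:
            self._failures.popleft()
        return len(self._failures) > self.max_failures


# Example: notify on the 4th failure within 24 hours.
window = FailureWindow(max_failures=3, window_seconds=24 * 3600)
```

A cron job or small loop would call `record_failure(time.time())` whenever a health check fails, try the restart each time, and send mail only when the method returns True.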

Communication in the forum should be in English in general. I think most people write in German just because it's easier.

To summarize: It would be nice to develop YaCy more like a product. Perhaps it would be an idea to provide an out-of-the-box version like OpenELEC does for XBMC, so you can run YaCy in a specially configured VM.
