Poor search performance / separate the frontend?

Post by Choey » Sun Oct 09, 2016 8:47 am

Hello everyone,
After years I decided to set up YaCy again, and everything went smoothly.
However, performance gets worse the longer it runs, and I'm not entirely sure why. It runs in a VM; RAM and CPU are fine. I suspect the hard disk could be the bottleneck (IOWait is fairly high)?
It is still good enough for the admin interface, but not for the search interface. I've read somewhere that it's possible to separate the search page from the system. Would that help? If I put the search on a different server, it probably won't gain much (it would really just be a YaCy that doesn't crawl)? And on the same server it would be just as slow if the disk is the cause?
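One way to put a number on the IOWait suspicion is to sample the aggregate `cpu` line of `/proc/stat` twice and look at the share of time spent in iowait. This is only a minimal sketch, assuming a Linux guest; the function names and the 5-second default interval are illustrative choices, not anything YaCy provides.

```python
import time


def parse_cpu_line(line):
    """Return the numeric fields of the aggregate 'cpu' line from /proc/stat."""
    parts = line.split()
    if not parts or parts[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    return [int(value) for value in parts[1:]]


def iowait_percent(before, after):
    """Share of CPU time spent in iowait between two samples, in percent.

    Field order: user, nice, system, idle, iowait, irq, softirq, steal, ...
    Only the first eight fields are summed, because guest time is already
    included in user/nice.
    """
    deltas = [new - old for old, new in zip(before[:8], after[:8])]
    total = sum(deltas)
    if total <= 0:
        return 0.0
    return 100.0 * deltas[4] / total


def measure_iowait(interval_seconds=5.0, path="/proc/stat"):
    """Take two samples interval_seconds apart (Linux only)."""
    with open(path) as stat:
        first = parse_cpu_line(stat.readline())
    time.sleep(interval_seconds)
    with open(path) as stat:
        second = parse_cpu_line(stat.readline())
    return iowait_percent(first, second)
```

Calling `measure_iowait()` once while a crawl is running and once while the peer is idle would show whether the disk wait really follows the crawl activity.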
Choey
 
Posts: 30
Joined: Tue Mar 24, 2009 8:58 pm

Re: Poor search performance / separate the frontend?

Post by luc » Sun Oct 09, 2016 7:38 pm

Hello Choey,
Can you give us an idea of how many documents you have in your local Solr index?
Do you run your own crawls, and if so, do the performance problems appear while crawls are running?
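If reading the number off the status page is inconvenient, the count can probably also be fetched over HTTP. The sketch below assumes the peer listens on port 8090, exposes a Solr-compatible select endpoint at `/solr/select`, and accepts `wt=json`; all three are assumptions to check against your own installation, and the function names are made up for illustration.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def count_url(base_url):
    """Build a select URL that matches every document but returns no rows."""
    params = {"q": "*:*", "rows": 0, "wt": "json"}
    return f"{base_url.rstrip('/')}/solr/select?{urlencode(params)}"


def parse_num_found(body):
    """Extract the total hit count from a standard Solr JSON response."""
    return int(json.loads(body)["response"]["numFound"])


def local_document_count(base_url="http://localhost:8090", timeout=10):
    """Ask the (assumed) endpoint for the number of indexed documents."""
    with urlopen(count_url(base_url), timeout=timeout) as response:
        return parse_num_found(response.read().decode("utf-8"))
```

Asking for zero rows keeps the request cheap: only the `numFound` counter in the response is of interest here.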

Please also note that when searching in p2p mode, other YaCy peers are queried in real time, so poor network bandwidth will affect performance...
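To see how much of the delay comes from the other peers, the same query can be timed once against the local index only and once with the network included. This is a rough sketch, assuming the peer answers on `/yacysearch.json` and understands the `query`, `resource` and `maximumRecords` parameters; please verify these against your YaCy version. The helper names are hypothetical.

```python
import time
from urllib.parse import urlencode
from urllib.request import urlopen


def search_url(base_url, query, resource):
    """Build a search URL; resource is 'local' (own index) or 'global' (p2p)."""
    if resource not in ("local", "global"):
        raise ValueError("resource must be 'local' or 'global'")
    params = {"query": query, "resource": resource, "maximumRecords": 10}
    return f"{base_url.rstrip('/')}/yacysearch.json?{urlencode(params)}"


def timed_fetch(url, timeout=30):
    """Return the seconds needed to download the full response."""
    start = time.perf_counter()
    with urlopen(url, timeout=timeout) as response:
        response.read()
    return time.perf_counter() - start


def compare_modes(base_url, query):
    """Time the same query in local and global mode."""
    return {
        resource: timed_fetch(search_url(base_url, query, resource))
        for resource in ("local", "global")
    }
```

If the local timing is already slow, the disk or the index is the more likely culprit; if only the global timing is slow, the remote peers or the network are.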
luc
 
Posts: 294
Joined: Wed Aug 26, 2015 1:04 am

Re: Poor search performance / separate the frontend?

Post by Choey » Mon Oct 10, 2016 5:06 am

Does the document count in the Solr index just mean the documents? Then it's 722,000. Are multiple crawls a problem?
Performance is not consistently bad. Image search sometimes shows almost no images and sometimes works just fine (the same goes for text search, but it's more visible with images).
Network bandwidth should be OK; it's a rented root server on a 100 Mbit/s symmetrical connection. (It's a game server which currently isn't running any games...)
Choey
 

Re: Poor search performance / separate the frontend?

Post by luc » Tue Oct 11, 2016 7:53 am

Multiple crawls running at the same time are not a problem. I was rather wondering whether you were always running search queries while crawls were in progress... That should not be a problem either, but crawling obviously consumes resources on your YaCy peer, so when you crawl and search at the same time, not all resources are available for the search.

In web portal mode, you are likely to see consistent performance across search queries (only your own YaCy server is queried, on its own data), but in peer-to-peer mode several factors can have an impact:
- the network is made up of potentially very different kinds of nodes: even if your CPU, memory and network bandwidth are good, that is not necessarily true of the other peers at the moment you search
- the size of your own peer's index (a mix of a Solr index and parts of the globally distributed index (RWI)): with a large local index, your peer may quickly find what you are looking for in its own index without needing to query the others. On the other hand, when your own index becomes huge it can hurt performance (722000 indexed documents should not be a problem, but with several million I wouldn't bet on it)
- the popularity of your search terms: YaCy's global index is designed to distribute the index as evenly as possible between peers, but each peer may still choose what is in its own index (by choosing what to crawl, blacklisting, index cleaning operations...). So if you search for something that is not already on your peer and is indexed on only a few peers, it may take some time to get an answer...
- images: image URLs are in the index like any other text resource, but the images need to be loaded to display a preview. Your YaCy peer has a cache which may contain some images you previously crawled or searched for. When an image is not in the cache, the website originally hosting it has to be queried to retrieve it (these requests run concurrently), and some sites may not answer as quickly as you expect.
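The effect of slow peers and slow image hosts can be illustrated with a small toy model. It is not YaCy's actual implementation: it simply assumes that all sources are asked at the same time and that the searcher waits at most a fixed deadline, and the latencies and the one-second deadline are invented values.

```python
def gather_within_deadline(latencies, deadline):
    """Model concurrent requests that are cut off after `deadline` seconds.

    latencies maps a source name to its (simulated) answer time in seconds.
    Returns (answered, missed, elapsed): the sources that answered in time,
    the ones that did not, and how long the searcher had to wait.
    """
    answered = sorted(name for name, secs in latencies.items() if secs <= deadline)
    missed = sorted(name for name, secs in latencies.items() if secs > deadline)
    # With a straggler we wait the full deadline; otherwise the slowest
    # source that answered decides the total time.
    elapsed = deadline if missed else max(latencies.values(), default=0.0)
    return answered, missed, elapsed


# Hypothetical numbers, for illustration only.
example = {"own index": 0.05, "fast peer": 0.3, "slow peer": 2.5, "image host": 4.0}
answered, missed, elapsed = gather_within_deadline(example, deadline=1.0)
print(answered, missed, elapsed)
```

In this model a single slow source either stretches the wait up to the deadline or drops out of the results, which matches the symptom of a search that sometimes shows almost no images and sometimes works fine.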

I do not claim to know every implementation detail of the current YaCy, but I hope these points help you better understand YaCy's behavior.

Best regards
luc
 

Re: Poor search performance / separate the frontend?

Post by Choey » Tue Oct 11, 2016 10:42 am

This indeed helps a lot in understanding how this works and what happens during a search. Thanks.
Choey
 

