Reducing the SOLR index

Posted by otter » Sun Jan 15, 2017 8:32 pm

Hi all,

as I need some hard drive space, I was thinking about ways to reduce the size of the index.
So I removed some fields from the schema and started re-indexing.
To my surprise, the size of the index was quickly INcreasing, not decreasing.
What can I do?

Andreas
otter

Posts: 15
Registered: Mon Feb 10, 2014 9:33 pm

Re: Reducing the SOLR index

Posted by luc » Mon Jan 16, 2017 8:00 am

Hi otter,
an easy way to quickly gain some disk space is to delete older documents on the Index Administration page (/IndexDeletion_p.html), in the "Delete by Age" section.
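
For anyone who prefers to script this: in plain Solr terms, deleting by age amounts to a delete-by-query over a date range. The sketch below only builds the JSON body for such a request. It assumes the documents carry a date field named `load_date_dt` (an assumption about the YaCy schema), and it makes no claim that the "Delete by Age" form works this way internally or that YaCy exposes Solr's update handler. `delete_by_age_payload` is a made-up helper name.

```python
import json


def delete_by_age_payload(field, max_age_days):
    """Build a Solr JSON delete-by-query body for documents older than max_age_days."""
    if max_age_days < 0:
        raise ValueError("max_age_days must be non-negative")
    # Solr date math: NOW-<n>DAYS. The open-ended range [* TO x] matches
    # every value of the field up to and including x.
    query = f"{field}:[* TO NOW-{max_age_days}DAYS]"
    return json.dumps({"delete": {"query": query}})


# "load_date_dt" is an assumed field name, see the note above.
print(delete_by_age_payload("load_date_dt", 90))
# prints {"delete": {"query": "load_date_dt:[* TO NOW-90DAYS]"}}
```

Deleting this way only marks documents as deleted, so the disk space does not come back right away.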

Best regards
luc

Posts: 283
Registered: Wed Aug 26, 2015 1:04 am

Re: Reducing the SOLR index

Posted by sixcooler » Mon Jan 16, 2017 8:07 pm

Hi,

to actually see the lower disk usage, try merging the index segments into one at /IndexControlURLs_p.html.
This forces Solr to write the content into a new index file (which may be huge!) that leaves out the documents and fields that were only marked as deleted.
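
A toy model of why re-indexing first makes the index grow and why a merge then shrinks it. This is not YaCy or Solr code. `Segment`, `reindex` and `merge` are invented for illustration, and the sizes are arbitrary numbers. The only facts it relies on are that segments are not rewritten in place and that an updated document leaves its old copy behind, marked as deleted.

```python
from dataclasses import dataclass, field


@dataclass
class Segment:
    docs: dict                                  # doc id -> size on disk
    deleted: set = field(default_factory=set)   # ids only *marked* as deleted

    def disk_size(self):
        # Deleted documents still occupy space until the segment is rewritten.
        return sum(self.docs.values())


def reindex(segments, doc_id, new_size):
    """Mark every old copy as deleted and write the new copy to a fresh segment."""
    for seg in segments:
        if doc_id in seg.docs:
            seg.deleted.add(doc_id)
    segments.append(Segment({doc_id: new_size}))


def merge(segments):
    """Write one new segment that contains only the live documents."""
    merged = {}
    for seg in segments:
        for doc_id, size in seg.docs.items():
            if doc_id not in seg.deleted:
                merged[doc_id] = size
    return [Segment(merged)]


def total(segments):
    return sum(seg.disk_size() for seg in segments)


index = [Segment({"a": 100, "b": 100}), Segment({"c": 100})]
before = total(index)
for doc_id in ("a", "b", "c"):
    reindex(index, doc_id, 80)      # smaller documents after dropping fields
after_reindex = total(index)
index = merge(index)
after_merge = total(index)

print(before, after_reindex, after_merge)  # prints 300 540 240
```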

Cu, sixcooler.
sixcooler

Posts: 490
Registered: Thu Aug 14, 2008 5:22 pm

Re: Reducing the SOLR index

Posted by otter » Wed Jan 18, 2017 9:18 pm

Thanks, sixcooler!
I reduced the number of segments step-by-step (from 14 to 7) and already gained 40GB!!
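
In case someone wants to watch the effect between merge steps, here is a rough sketch of how the size and segment count of an index directory could be checked. It assumes a Lucene index directory in which every segment has exactly one `.si` (segment info) file. The path to the directory is not hard-coded because it depends on the installation, and `index_stats` is just a name picked for this example. Files of old segments that have not been cleaned up yet are counted too, so the numbers are approximate.

```python
import os
import sys


def index_stats(index_dir):
    """Return (total size in bytes, number of segments) for a Lucene index directory."""
    total_bytes = 0
    segment_count = 0
    for entry in os.scandir(index_dir):
        if not entry.is_file():
            continue
        total_bytes += entry.stat().st_size
        # Assumption: one ".si" file per segment.
        if entry.name.endswith(".si"):
            segment_count += 1
    return total_bytes, segment_count


# Only runs when a directory is passed on the command line.
if __name__ == "__main__" and len(sys.argv) > 1:
    size, count = index_stats(sys.argv[1])
    print(f"{size / 1e9:.1f} GB in {count} segments")
```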

Two follow-up questions:
a) Does the number of segments have any impact on performance?
b) Wouldn't it be useful to have a feature that rewrites index segments without merging them?

Thanks and have fun, Andreas
otter

Posts: 15
Registered: Mon Feb 10, 2014 9:33 pm

Re: Reducing the SOLR index

Posted by sixcooler » Wed Jan 18, 2017 11:40 pm

Hi Andreas,

a) according to the docs and my experience, the fewer segments there are, the better the performance, but I haven't really benchmarked that
b) there was such a feature, but we decided to remove it, because Solr does a better job during its regular merges than an optimize run after each crawl
A manual optimize is useful only after a change to the index, such as removing fields or deleting a lot of documents.

Cu, sixcooler.
sixcooler

Posts: 490
Registered: Thu Aug 14, 2008 5:22 pm

Re: Reducing the SOLR index

Posted by otter » Sat Jan 21, 2017 12:16 pm

Thanks, sixcooler!
After I reduced to four segments, all old segments were replaced by new ones. So I stopped there and gained 50GB in total.
Take care!
otter

Posts: 15
Registered: Mon Feb 10, 2014 9:33 pm

