I have learned German in the past, but I will continue in English, sorry for that!
First, let me say that I discovered Yacy only a short time ago, and it looks very powerful. Furthermore, the Windows installation is very easy, and it is available in several languages, which is quite impressive! Some pages of the website are translated into French and some video tutorials exist in English, which is good too. Sadly, I can't find any French Yacy community. I have found some old articles from 2011, but it seems that Yacy has improved a lot since then.
I have read a bit of the documentation and watched the tutorial videos, but I have not fully understood the default behaviour of a fresh install (after the basic configuration is done):
Do Yacy nodes crawl the whole World Wide Web permanently and automatically, or should I manually define the websites to be crawled from my computer?
As I understand it, by default Yacy doesn't index anything until you configure some websites or sources to be crawled.
If that is true, I think it could be very interesting to develop a feature that lets all nodes crawl the whole web automatically, following some basic rules about which pages should be crawled first (frequently updated pages, banned or priority topics defined by default or by node owners, etc.) and maybe introducing some coordination between nodes (don't crawl a page again if it has just been crawled by another node).
I understand that this feature would require some new development, but I imagine the power of this kind of system: very quickly, many more pages would be indexed by Yacy, and we could expect to no longer need any proprietary search engine and to invite our non-geek friends and family to install and use Yacy (indexing the web themselves without configuring anything)!
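To make the idea a bit more concrete, here is a rough Python sketch of the rules I have in mind. It is only an illustration of the suggestion, not how Yacy works: every name in it (`Page`, `CrawlPlanner`, `RECRAWL_DELAY_S`), the weights, and the idea of a "last crawled" table shared between nodes are my own assumptions.

```python
import heapq
import time
from dataclasses import dataclass, field

# All names and weights below are hypothetical; none of this is Yacy's API.
RECRAWL_DELAY_S = 24 * 3600  # assumed: skip pages crawled less than a day ago


@dataclass
class Page:
    url: str
    topic: str
    updates_per_day: float  # assumed estimate of how often the page changes


@dataclass
class CrawlPlanner:
    banned_topics: set = field(default_factory=set)
    priority_topics: set = field(default_factory=set)
    # url -> time of last crawl by any node (assumed to be shared between nodes)
    last_crawled: dict = field(default_factory=dict)

    def score(self, page):
        """Higher score means crawl sooner; None means do not crawl."""
        if page.topic in self.banned_topics:
            return None
        bonus = 10.0 if page.topic in self.priority_topics else 0.0
        return page.updates_per_day + bonus

    def plan(self, pages, now=None):
        """Return URLs ordered by priority, skipping recently crawled pages."""
        now = time.time() if now is None else now
        heap = []
        for page in pages:
            seen = self.last_crawled.get(page.url)
            if seen is not None and now - seen < RECRAWL_DELAY_S:
                continue  # another node crawled it recently
            s = self.score(page)
            if s is not None:
                # heapq is a min-heap, so push the negated score
                heapq.heappush(heap, (-s, page.url))
        return [heapq.heappop(heap)[1] for _ in range(len(heap))]
```

A node would call `plan()` with the pages it knows about and crawl the returned URLs in order. The hard part, of course, is how nodes would share the `last_crawled` information, which this sketch simply assumes.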
Thank you if you can answer me, and sorry if I have not understood well how Yacy works and how it should be used.
Let's discuss this feature in this topic if you are interested.