I came across Yacy recently and I must say what the project has achieved sofar is quite impressive for me. I'm trying to evaluate Yacy as a local file / local web indexer in an intranet scenario, because I see a very valid use case for such a project where you can mix both index data from internal web-based resources like websites, wiki pages, problem trackers, etc, and unstructured data from fileservers and present the results in one consistent UI.
Firstly, I understand that this use case is probably not the first focus of this project, so I can understand that some features, like data access rights for specific content and the subsequent omission from the results for certain users, will be unavailable from this search engine. So I'd like to limit the scope of the indexed data to information that should be available for all users of the implementation.
But even with this in mind, Yacy doesn't seem to be able to do what I want it to do, for two simple reasons :
-> When it comes to indexing the content of file servers in the LAN, I suppose I should use an smb:// link. However, no sane person ever makes file servers available without using a password, and I haven't found any way yet to make yacy authenticate to a remote file server with a login and password combination.
-> When it comes to internal wiki pages, hardly anyone ever setting up an internal information system in a multi-user environment will set this up without using some kind of authrorization to the content. Again, I've yet to find a way to make handle this kind of authorization.
Maybe this functionality can be implemented in a future version? It's probably not very hard to do since all other indexers that have this kind of use case do it and would mean the difference between Yacy being usable or not in an intranet setup. Again, if there's another way to do this, I'd like to learn about it, the Wiki wasn't any help in this regard.