I'm using Yacy to add search functionality to a static web site. I've added a search form to the site's home page, which triggers a jQuery ajax request to my local Yacy server. The static site has been indexed by this Yacy server.
I have a problem with the crawler. Indexing starts from my root URL, let's say http://my.domain.com/. This page links to other pages that are not indexed. For example, my home page has a link to http://my.domain.com/documentation, and this page is not indexed.
In fact, when I access this page, I'm redirected to http://my.domain.com/documentation/ (with a slash at the end). The crawler doesn't seem to handle this case. In the log, I found this:
I 2016/09/13 10:40:24 REJECTED http://my.domain.com/documentation/ - cannot load: load error - java.io.IOException: CRAWLER Redirect of URL=http://my.domain.com/documentation to http://my.domain.com/documentation/ placed on crawler queue for double-check
I 2016/09/13 10:40:24 LOADER CRAWLER ..Redirecting request to: http://my.domain.com/documentation/
I 2016/09/13 10:40:24 LOADER CRAWLER Redirection detected ('HTTP/1.1 301 Moved Permanently') for URL http://my.domain.com/documentation
I 2016/09/13 10:40:24 LoaderDispatcher waited 5002 ms for http://my.domain.com/documentation
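To check the redirect independently of Yacy, a single request that does not follow redirects should show the same 301 and the target with the trailing slash. Below is a minimal Python sketch of such a check; the function name probe_redirect is only an illustration (it is not part of Yacy), and it assumes the server answers HEAD requests the same way it answers GET.

```python
import http.client
from urllib.parse import urlsplit


def probe_redirect(url, timeout=10.0):
    """Send one HEAD request without following redirects.

    Returns (status_code, location), where location is the value of the
    Location header, or None when the response carries no such header.
    Hypothetical helper for diagnosis only; not part of Yacy.
    """
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=timeout)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=timeout)
    # Rebuild the request target: path plus optional query string.
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query
    try:
        conn.request("HEAD", target)
        resp = conn.getresponse()
        return resp.status, resp.getheader("Location")
    finally:
        conn.close()


# Example call (requires network access to the site being checked):
# status, location = probe_redirect("http://my.domain.com/documentation")
```

If the first URL reports a 301 with a Location ending in a slash, and the slash-terminated URL reports a 200, then the redirect seen in the Yacy log comes from the web server itself and not from the crawler.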
Is there a way to index this kind of page? A crawler parameter, for example?
Thanks in advance for your help.