1.4 ranking problem


Post by yugongtian » Sun Mar 24, 2013 7:42 am

Hello

I have used the latest 1.4 version to crawl some Drupal sites.
As the attached screenshot shows, the default Drupal home page is not the top result.

I have tried many things. How can I get the right result?
In RankingRWI_p.html I set "Authority of Domain" to 15: no effect.
In RankingRWI_p.html I set "URL Length" to 15: no effect.

How can I make the home page the top result? :|
Attachment: 11.PNG (50.32 KiB)

Re: 1.4 ranking problem

Post by Low012 » Mon Mar 25, 2013 12:07 pm

http://localhost:8090/RankingSolr_p.html should be the place to tweak the ranking of content contained in the "new" integrated Solr instance.

Unfortunately I have not played around with it myself yet and I don't see any field for URL-length.

Re: 1.4 ranking problem

Post by yugongtian » Mon Mar 25, 2013 3:20 pm

Thanks.
I do not know how to control it; it is hard for me. But thanks.
What is the difference between RankingSolr_p.html and RankingRWI_p.html?
In the Solr boosts, the default value for title is 100. I changed it to 10 and changed url_paths_sxt to 100, but the result is the same.
The home page still does not show up as the top result. :roll:
Have a good day.

Re: 1.4 ranking problem

Post by Orbiter » Mon Mar 25, 2013 7:08 pm

The field for the URL length is named in the Solr schema. To use it, it must be put into a function query. This is actually the default for queries with the site operator, where URLs with greater length are ranked higher. You may want the opposite for general search. Please wait two weeks until I am back from holiday; then I will have the tools to explain this in detail. Please remind me.


Re: 1.4 ranking problem

Post by yugongtian » Tue Apr 02, 2013 4:25 pm

:D
Thanks, I will wait for you.

Re: 1.4 ranking problem

Post by yugongtian » Thu Apr 11, 2013 1:18 pm

:?: Are you back?

Re: 1.4 ranking problem

Post by Orbiter » Thu Apr 11, 2013 1:43 pm

Yes, I am just fixing some problems with the clickdepth and references counters, which are important for the ranking...

Re: 1.4 ranking problem

Post by Orbiter » Fri Apr 12, 2013 3:36 pm

Now, after some checks of the ranking, bug fixes and some extensions, I will start to write a wiki article about ranking.
There are now three new attributes that hold reference counters:
references_internal_i, references_external_i, references_exthosts_i

With these, I am currently testing the following formula for a ranking function:
Code:
div(add(references_internal_i,product(references_external_i,references_exthosts_i)),add(clickdepth_i,1))
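
To make the nesting easier to read, here is a minimal Python sketch of the same arithmetic outside Solr. The function name and the sample numbers are made up for illustration; only the field names and the structure of the formula come from the post.

```python
def ranking_score(references_internal_i: int,
                  references_external_i: int,
                  references_exthosts_i: int,
                  clickdepth_i: int) -> float:
    """Mirror of div(add(internal, product(external, exthosts)), add(clickdepth, 1))."""
    # Internal references count once; external references are weighted
    # by the number of distinct external hosts.
    numerator = references_internal_i + references_external_i * references_exthosts_i
    # Adding 1 keeps the denominator non-zero at click depth 0.
    denominator = clickdepth_i + 1
    return numerator / denominator


# Hypothetical values: the same reference counts at click depth 0 and at click depth 3.
home_page = ranking_score(10, 4, 2, 0)
deep_page = ranking_score(10, 4, 2, 3)
print(home_page, deep_page)  # prints: 18.0 4.5
```

With identical reference counts, the page at click depth 0 scores four times higher than the page at click depth 3, which is the effect the formula is meant to have on home pages.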

Re: 1.4 ranking problem

Post by yugongtian » Sat Apr 13, 2013 1:55 pm

Good news. Thanks a lot. ;)

Re: 1.4 ranking problem

Post by Orbiter » Sat Apr 13, 2013 4:57 pm

OK, this needs a bit of explanation: the new fields must be filled by a web crawl before they are functional, and the formula given above is purely experimental. It treats the number of external links to a web page and the number of different external domains as important, and it increases the ranking further if the web page has a low click depth. All values that appear in the formula are computed in a two-pass process:

- First, the documents are indexed and a web structure index is generated in parallel. The references and clickdepth values are filled with dummy values, and the document also gets a 'ready for postprocessing' flag.
- When all crawls are finished, a postprocessing step is performed: all documents with the postprocessing flag are filled with the actual values after a clickdepth computation and a reference count. This can only be done after all crawls, because only then is the information present.

That means the ranking formula using these values will not work right after the crawl is finished; you must additionally wait until the postprocessing is finished. This can currently only be monitored in the log, not in the web interface. However, this process is pretty fast.
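
The two passes can be pictured with a small Python sketch. This is not YaCy's code: the function names, the dummy value of -1, the `needs_postprocessing` flag name and the choice to measure click depth from a list of start URLs are all assumptions made for illustration. Only the field names and the overall order of the steps come from the description above.

```python
from collections import deque
from urllib.parse import urlparse

DUMMY = -1  # hypothetical placeholder until postprocessing has run


def index_pass(pages):
    """Pass 1: index each document with dummy values and record the link structure."""
    index, links = {}, {}
    for url, outlinks in pages.items():
        index[url] = {"clickdepth_i": DUMMY, "references_internal_i": DUMMY,
                      "references_external_i": DUMMY, "references_exthosts_i": DUMMY,
                      "needs_postprocessing": True}
        links[url] = list(outlinks)
    return index, links


def postprocess_pass(index, links, start_urls):
    """Pass 2: once all crawls are done, fill in click depth and reference counts."""
    # Click depth: breadth-first search over the link structure from the start URLs.
    depth = {u: 0 for u in start_urls}
    queue = deque(start_urls)
    while queue:
        url = queue.popleft()
        for target in links.get(url, []):
            if target in index and target not in depth:
                depth[target] = depth[url] + 1
                queue.append(target)
    # Reference counts: collect the distinct documents linking to each document.
    referrers = {u: set() for u in index}
    for source, targets in links.items():
        for target in targets:
            if target in referrers and target != source:
                referrers[target].add(source)
    for url, doc in index.items():
        if not doc["needs_postprocessing"]:
            continue
        host = urlparse(url).hostname
        ref_hosts = [urlparse(r).hostname for r in referrers[url]]
        internal = sum(1 for h in ref_hosts if h == host)
        doc["clickdepth_i"] = depth.get(url, DUMMY)
        doc["references_internal_i"] = internal
        doc["references_external_i"] = len(ref_hosts) - internal
        doc["references_exthosts_i"] = len(set(ref_hosts) - {host})
        doc["needs_postprocessing"] = False
    return index


# Hypothetical three-page crawl across two hosts.
pages = {
    "http://a.example/": ["http://a.example/x", "http://b.example/"],
    "http://a.example/x": ["http://a.example/"],
    "http://b.example/": ["http://a.example/", "http://a.example/x"],
}
index, links = index_pass(pages)
after_pass_one = index["http://a.example/x"]["clickdepth_i"]  # still the dummy value
postprocess_pass(index, links, ["http://a.example/"])
print(after_pass_one, index["http://a.example/x"]["clickdepth_i"])
```

Between the two calls every counter still holds the dummy value, which is why a ranking formula that reads these fields cannot work until the second pass has completed.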

The counting of external references and the clickdepth can be considered something like a 'poor man's citation rank', which can be the basis for a page-rank-like second postprocessing step. Before development of this can start, we need more experience with the current formula.

Please do your own experiments with the formula and give feedback for enhancements here!

Re: 1.4 ranking problem

Post by Orbiter » Mon Apr 15, 2013 12:27 pm

The wiki document about the new ranking rules is here (unfinished at this time):
http://www.yacy-websearch.net/wiki/index.php/En:Ranking

Re: 1.4 ranking problem

Post by yugongtian » Wed Apr 17, 2013 3:50 pm

Thank you very much for the helpful document.
I am learning Solr sorting and YaCy ranking. It is somewhat difficult, so thank you very much for the enthusiastic replies.
:)

Re: 1.4 ranking problem

Post by yugongtian » Sat Apr 20, 2013 1:20 pm

Thanks, you are right. This makes most home pages show up as the top result.
But there is a small problem: in some cases a subdomain ranks above the home page.

Like this:

ad.xxx.com
http://www.xxx.com
shop.xxx.com

Any help?
Attachment: ero-fix.JPG (74.12 KiB)