rayhab — 2011-04-26T11:29:10-04:00 — #1
I am trying to understand why my fulltext queries are returning these scores, they don't make sense to me.
First off, here is my fulltext create
ALTER TABLE main ADD FULLTEXT(col1, col2, col3);
In my table i have
main('Hello world / even more stuff', '', '')
main('Hello world / even more stuff', 'testing', 'tester')
main('Hello world / even more stuff', 'Hello world', 'Hello world')
Now here is my query that gets the scores
MATCH(col1, col2, col3) AGAINST('+"hello world"' IN BOOLEAN MODE) as blnScore,
MATCH(col1, col2, col3) AGAINST('hello world') as intScore
MATCH(col1, col2, col3) AGAINST('hello world')
ORDER BY blnScore DESC, intScore DESC
You would think that the row with 'Hello world' in every column would have the highest score, since the index is on every column, but that is not the case.
main('Hello world / even more stuff', '', '') //Score - 6.48660516738892
main('Hello world / even more stuff', 'testing', 'tester') //Score - 6.34852838516235
main('Hello world / even more stuff', 'Hello world', 'Hello world') //Score - 3.5588550567627
Any ideas of what the problem could be?
r937 — 2011-04-26T13:54:42-04:00 — #2
how many total rows in the table?
rayhab — 2011-04-26T14:36:10-04:00 — #3
Those are the only three rows.
r937 — 2011-04-26T14:53:23-04:00 — #4
load your table with 3000 rows, and try again
the mysql manual specifically states that "For very small tables, word distribution does not adequately reflect their semantic value, and this model may sometimes produce bizarre results."
rayhab — 2011-04-26T14:56:06-04:00 — #5
Sorry, i meant in my test scenario(hello world) there was only three.
In my actually database is is over 1m
rayhab — 2011-05-09T10:36:52-04:00 — #6
r937, do you have any more recommendations?
r937 — 2011-05-09T12:36:53-04:00 — #7
other than you shouldn't put any stock in the actual number, no
the "relevance" score is only supposed to return the most relevant rows first -- the number itself doesn't mean anything
starlion — 2011-05-09T12:52:48-04:00 — #8
Also, the words "Hello" and "world" are in more than 50% of your rows (100%), and are therefore excluded from your search as per the Natural Language search rules... which makes your search somewhat irrelevant.
spacephoenix — 2011-05-09T14:00:15-04:00 — #9
What storage engine is the table using?
scallioxtx — 2011-05-09T17:38:14-04:00 — #10
Since MyISAM is the only engine supporting full text, I'm guessing MyISAM