bostboy — 2010-12-02T01:58:27-05:00 — #1
What is the best way to find results that are equal to or sound like a string without having to use exact character matches?
I have tried soundex but it seems a bit inconsistent. It returns cheese if you put in chez but won't return waffle if you put in wafl.
LIKE is too restrictive because it requires exact character matches.
Anything else I can use?
wwb_99 — 2010-12-02T11:33:07-05:00 — #2
SOUNDEX or something of that sort is your best bet. There are a few other analysis and tokenization methods that might work better -- Soundex was defined for spoken English names for the US Census, so it is slanted towards that sort of thing.
I'd generally bet that you won't find them in MySQL; you'll need beefier stuff.
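For reference, here is a minimal sketch of the commonly described American Soundex rules, written in Python purely for illustration. The function name `soundex` and the 4-character padding are assumptions of this sketch; MySQL's built-in SOUNDEX() may differ in details such as code length. It shows why the scheme is coarse: only the first letter and up to three consonant classes survive.

```python
# Illustrative sketch of classic American Soundex (not MySQL's exact implementation).
CODES = {}
for letters, digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                       ("L", "4"), ("MN", "5"), ("R", "6")):
    for ch in letters:
        CODES[ch] = digit


def soundex(word):
    """Return a 4-character Soundex code, or "" if the word has no A-Z letters."""
    letters = [c for c in word.upper() if "A" <= c <= "Z"]
    if not letters:
        return ""
    result = letters[0]                 # the first letter is kept as-is
    prev = CODES.get(letters[0], "")    # code of the previous significant letter
    for ch in letters[1:]:
        if ch in "HW":
            continue                    # H and W do not separate equal codes
        code = CODES.get(ch, "")        # vowels (and Y) have no code
        if code and code != prev:
            result += code
        prev = code                     # a vowel resets prev, so repeats count again
    return (result + "000")[:4]         # pad or truncate to letter + 3 digits


print(soundex("chez"), soundex("cheese"))
```

Because vowels are dropped and similar consonants share a digit, quite different words can collapse to the same code, while a changed first letter always produces a different one.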
bostboy — 2010-12-02T11:47:44-05:00 — #3
Thanks, I'm looking at metaphone in PHP now and want to do some testing with that. My app is relatively straightforward in that I can build a table of single words that I want to search. It will come from a couple of different places, including some hierarchical category information that is static in the system in other tables and some words from a title that the user will put in for their full text entry. I am not really interested in searching the full text at this point. This is more of a keyword search to find the text, and there is plenty of information in the categories and title to do that.
I'm thinking I can parse the user title to get the non-common words and explode that out and then enter them in a table as single entries, and also enter my static words in the same lookup table associated to the id for their text entry.
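A rough sketch of that idea, in Python for illustration only (the same steps map onto explode/preg_split in PHP). The stop-word list, the function names `title_keywords` and `lookup_rows`, and the (entry_id, word) row shape are all assumptions of this sketch, not an existing schema.

```python
import re

# Hypothetical, deliberately tiny stop-word list; a real one would be longer.
STOP_WORDS = frozenset({"a", "an", "and", "the", "of", "for", "in", "on", "to", "with"})


def title_keywords(title):
    """Split a title into lowercase words, dropping common words and duplicates."""
    words = re.findall(r"[a-z0-9']+", title.lower())
    kept = [w for w in words if w not in STOP_WORDS and len(w) > 1]
    return list(dict.fromkeys(kept))    # de-duplicate while keeping first-seen order


def lookup_rows(entry_id, title, category_words):
    """Build (entry_id, word) rows from the title plus the static category words."""
    words = title_keywords(title) + [w.lower() for w in category_words]
    return [(entry_id, w) for w in dict.fromkeys(words)]


rows = lookup_rows(7, "The Best Waffle Recipe", ["Breakfast", "Waffle"])
print(rows)
```

Each row would then become one insert into the keyword lookup table, keyed by the id of the text entry.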
So the only real problem is being flexible in the search for misspellings, pluralization, etc.
Do you have experience with metaphone or know of something on the php side that would be better?
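One way to test tolerance for misspellings and plurals without a phonetic code is plain string similarity. The sketch below uses Python's standard difflib module as a stand-in; it is not metaphone, and the function name `suggest_keywords` and the 0.7 cutoff are assumptions chosen for illustration.

```python
import difflib


def suggest_keywords(term, vocabulary, cutoff=0.7, limit=5):
    """Return stored keywords that look similar to the search term, best match first."""
    # Map lowercase forms back to the stored spelling so matching ignores case.
    lookup = {word.lower(): word for word in vocabulary}
    matches = difflib.get_close_matches(term.lower(), list(lookup), n=limit, cutoff=cutoff)
    return [lookup[m] for m in matches]


vocabulary = ["waffle", "cheese", "pancake"]
print(suggest_keywords("wafl", vocabulary))
print(suggest_keywords("waffles", vocabulary))
```

The matched keywords could then be used for an exact lookup in the keyword table. Raising the cutoff makes matching stricter, and a short term like "chez" sits near the default threshold against "cheese", so the cutoff needs tuning against real data.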
bostboy — 2010-12-02T11:51:24-05:00 — #4
One more question. If I do a full text index on a column and then search for a term that has more than one word, will it return results if it matches only one word out of the two?
For example, with MATCH (field) AGAINST ('word1 word2'), will it return entries that contain only word2? And does it then have to be an exact match to word2?