Multiple keyword search across multiple tables

I have been searching for DAYS for a solution to my problem and so far I’ve come up empty-handed. I’ve seen FULLTEXT MATCH AGAINST statements with OR, different JOIN combos, LIKE conditions, hash tables, and other suggestions that don’t quite solve my problem. I hope someone here has some mysql Kung Fu to show me!

I have normalized tables containing categorical data for a coupon site. Occupation table with id, occupation fields, topcategory table with id, topcategory fields, coupon table with title, description fields (among others, but I have a FULLTEXT index on those fields).

My client wants a search box where a user could type “teacher restaurant” and receive relevant results. We have discussed and advanced search option (which I could easily do) but he wants this quick search option. Since FULLTEXT searches only work on single tables, that has t been a viable option. The only thing I could think of is to use php to prescreen the search query and preg_match any results from the occupation and category tables and then create a query based on those results. But that seems like I’m making it more complicated than it has to be.

Am I missing an easy solution? I hope I’ve described my problem adequately enough. Any advice?

please do a SHOW CREATE TABLE for all tables involved

please also show some sample data

please explain how “teacher restaurant” is supposed to be interpreted against the sample data you provided – as a phrase or as separate keywords, and whether both keywords have to be present, and whether they have to be in that order

then please indicate which of your sample rows should be returned by the query

My apologies. This is my first post to this specific board, and I posted from my iPhone in haste. Here are some tables I want to search across:

occupations
occupationid TINYINT PK,
occupation VARCHAR(50)
(ex: 3, “Teacher”)

topcategories
topcategoryid INT PK,
topcategory VARCHAR(50)
(ex: 6, “Restaurant”)

perks
perkid BIGINT PK,
occupationid TINYINT FK,
topcategoryid INT FK,
title VARCHAR(150),
description TEXT(500)
(ex: 13, 3, 6, “50% off burgers and fries!”, “Teachers, join us every Wednesday from 3:00pm to 7:00pm and you’ll receive 50% off any burger/fry combination!”)

If a user searches “teacher”, the above perk should show along with any other perk with an occupationid of teacher. If a user searches “restaurant”, a similar thing should occur. If a person searches “teacher restaurant” then this perk would be at the top of the list because it is categorized as both AND has the word “teachers” in the description. However, perks categorized as teacher with a different topcategory should also be in the result set, as well as any perk categorized as a restaurant with a different occupation.

Wouldn’t “views” be an option here?
I’m not sure if you can have fulltext indexes on them, so see the docs.

try this –


SELECT occupations.occupation
, topcategories.topcategory
, perks.title
, perks.description
FROM perks
INNER
JOIN occupations
ON occupations.occupationid = perks.occupationid
INNER
JOIN topcategories
ON topcategories.topcategoryid = perks.topcategoryid
WHERE occupations.occupation = ‘Teacher’
OR occupations.occupation = ‘Restaurant’
OR topcategories.topcategory = ‘Teacher’
OR topcategories.topcategory = ‘Restaurant’
OR perks.title LIKE ‘%Teacher%’
OR perks title LIKE ‘%Restaurant%’
OR perks.description LIKE ‘%Teacher%’
OR perks description LIKE ‘%Restaurant%’
ORDER
BY CASE WHEN occupations.occupation = ‘Teacher’
THEN 3 ELSE 0 END
+ CASE WHEN occupations.occupation = ‘Restaurant’
THEN 3 ELSE 0 END
+ CASE WHEN topcategories.topcategory = ‘Teacher’
THEN 3 ELSE 0 END
+ CASE WHEN topcategories.topcategory = ‘Restaurant’
THEN 3 ELSE 0 END
+ CASE WHEN perks.title LIKE ‘%Teacher%’
THEN 2 ELSE 0 END
+ CASE WHEN perks.title LIKE ‘%Restaurant%’
THEN 2 ELSE 0 END
+ CASE WHEN perks.description LIKE ‘%Teacher%’
THEN 1 ELSE 0 END
+ CASE WHEN perks.description LIKE ‘%Restaurant%’
THEN 1 ELSE 0 END
DESC


notice how different weights are assigned depending on wher a search term was found

Thanks for the reply r397, I appreciate your time.

After a couple syntax fixes, the query ran successfully but only returned 1 result, and that was when the occupation is ‘Teacher’ and the topcategory is ‘Restaurant’ (see image below: “Free Salads on Saturdays!”). However, there are 2 more perks in the database with topcategory as ‘Restaurant’ that were not returned.

Then I tried omitting all but the first two WHERE clauses looking for perks that are classified as either ‘Teacher’ or ‘Restaurant’ or both. Again, only that single perk classified as both came up.

So I omitted all the WHERE clauses to bring up all perks I currently have in my testing DB to see if the ordering worked using the ranking system and it appeared not to. I have attached an image of the result set. You can see all the perks I currently have in the DB.

At first glance it would appear to be a good query to do what I need it to do, but I have this nagging suspicion about the OR conditions. I have a hunch that the reason the first record comes up may be that it’s the first record mysql comes across and it satisfied the occupations.occupation = ‘Teacher’ condition and so maybe mysql stopped looking for any other records? I’m probably wrong but thats the only explanation I could come up with. What do you think?

G.Schuster: I am not familiar with VIEWS. I will research that a bit and see if I can come up with anything.

what syntax fixes?

could you dump the tables for me please (all three of them) and i’ll test this myself

the mysqldump command will do it, as can every mysql gui program (e.g. phpmyadmin, heidisql) using friendly options

just the CREATE TABLE and INSERT statements please

i hate getting a query wrong :slight_smile:

Haha sure, the syntax fixes were in the WHERE clause, you had ‘perks description’ as opposed to ‘perks.description’. Same issue with the perks title.

I’ve attached the 3 dumps as a .zip archive. The dumps are in .sql format. I hate gettings stuff wrong as well. I’m happy to have a kindred spirit helping me out!

one of the reasons you were getting poor results is because you said one of the top categories was ‘Restaurant’ so i wrote my query with that

however, in the data you dumped, there is no top category called ‘Restaurant’, however, there is one called ‘Restaurants’

that would make a big difference, yeah?

I did a bit of research based on previous suggestions…

G.Schuster: I was able to successfully create a view will columns I want to search across but I can’t create a FULLTEXT index on them (or at least the error telling me I couldn’t because my view wasn’t a “base table”) so I would have to resort to breaking up the search terms like r937 did anyway. I figure creating a view, searching it similarly, and then deleting it or replacing it would not be as efficient.

r937: Ahh yes my mistake. However, if a user searched “restaurant” those results set to equal “Restaurants” wouldn’t show right? In any case, I modified your code slightly and got it to do what I needed it to do! I changed some of the operators to LIKEs instead and it worked out! Here is the final query:


SELECT o.occupation, t.topcategory, p.title, p.description
  FROM perks p
INNER JOIN
  occupations o
  ON o.occupationid = p.occupationid
INNER JOIN
  topcategories t
  ON t.topcategoryid = p.topcategoryid 
WHERE o.occupation LIKE 'teacher%'
    OR o.occupation LIKE 'restaurant%' 
    OR t.topcategory LIKE 'teacher%'
    OR t.topcategory LIKE 'restaurant%' 
    OR p.title LIKE '%teacher%'
    OR p.title LIKE '%restaurant%'
    OR p.description LIKE '%teacher%'
    OR p.description LIKE '%restaurant%'
ORDER
    BY CASE WHEN o.occupation LIKE 'teacher%' 
            THEN 3 ELSE 0 END
     + CASE WHEN o.occupation LIKE 'restaurant%' 
            THEN 3 ELSE 0 END
     + CASE WHEN t.topcategory LIKE 'teacher%' 
            THEN 3 ELSE 0 END
     + CASE WHEN t.topcategory LIKE 'restaurant%' 
            THEN 3 ELSE 0 END
     + CASE WHEN p.title LIKE '%teacher%'
            THEN 2 ELSE 0 END
     + CASE WHEN p.title LIKE '%restaurant%'
            THEN 2 ELSE 0 END
     + CASE WHEN p.description LIKE '%teacher%'
            THEN 1 ELSE 0 END
     + CASE WHEN p.description LIKE '%restaurant%'
            THEN 1 ELSE 0 END, o.occupation
       DESC

Thanks again for all your help!!

Hey r397, I have run into another issue. Not sure if you’re still subscribed to this thread or not, but here goes.

I’m trying to develop some pagination but I have been unable to get the total number of rows for my queries. The way they are grouped must have something to do with it. Here is my query:


SELECT COUNT( * ) AS total, o.occupation, t.topcategory, s.subcategory, p.perkid, p.clientid, p.businessid, p.enddate, b.businessname, p.title, p.description, p.hits, a.zip, p.status
FROM perks p
INNER JOIN occupations o ON o.occupationid = p.occupationid
INNER JOIN topcategories t ON t.topcategoryid = p.topcategoryid
INNER JOIN subcategories s ON s.subcategoryid = p.subcategoryid
INNER JOIN businesses b ON b.businessid = p.businessid
INNER JOIN addresses a ON a.businessid = p.businessid
WHERE a.zip
IN ( 46220, 46240, 46226, 46205, 46250, 46218, 46208 )
GROUP BY p.perkid

I get three rows (as I expect for this particular query), but the total in each one is the weird part. The first row has a count of 2, the second a count of 1, the third a count of 2. The reason for this is that each business can have multiple addresses associated with it so each address is creating a repeated row if it falls in the zipcode list. So one business can be at two addresses that both fall in that zip list. Sorting by p.perkid allows the page results to display just fine, but I can’t figure out how to just get the total number of rows (3 in this case). Any ideas? Anyone?

look up the SQL_CALC_FOUND_ROWS option in the manual, perhaps that is what you are asking?

I saw that and a couple of pages I researched said that there’s a significant performance hit once the database gets larger. Is that true?

Also, I tried one query like “SELECT SQL_CALC_FOUND_ROWS p.perkid FROM table; SELECT FOUND_ROWS() AS total” and i got a mysql_error about syntax. I HAVE to run two separate queries through php? What if multiple people are performing searches at the same time? Will that pose a problem?

sorry, i don’t do php, but from other threads i’ve seen, i think there is a restriction on only passing one SQL statement at a time over the interface

The mysql “interface” can only be passed on query at a time whilst the mysqli “interface” can be passed multiple queries.

Alright thanks for your help! It’s currently working just fine! I really wish Google would lend a few pointers on this issue…haha.