I’m building a search on one of my sites and I have the following MySQL query. It’s currently taking 9.5 sec in phpMyadmin to execute the code, but actually takes even longer when run on the actual site. Do I have the right database structure of tables etc.
SELECT users . * , ui.industry, jt.job_type
FROM users
LEFT OUTER JOIN (
SELECT user_id, job_type
FROM users_job_types
LEFT JOIN job_types ON job_types.id = users_job_types.job_type_id
GROUP BY user_id
) AS jt ON jt.user_id = users.id
LEFT OUTER JOIN (
SELECT user_id, industry
FROM users_industries
LEFT JOIN industries ON industries.id = users_industries.industry_id
GROUP BY user_id
) AS ui ON ui.user_id = users.id
WHERE users.status = '1'
AND users.cv_hide = '0'
ORDER BY cv_date DESC
Below are the create statements:
CREATE TABLE IF NOT EXISTS `users` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`first_name` varchar(100) CHARACTER SET utf8 NOT NULL,
`cv_date` datetime DEFAULT NULL,
`cv_hide` enum('0','1') CHARACTER SET utf8 NOT NULL DEFAULT '0',
`status` enum('0','1') CHARACTER SET utf8 DEFAULT '1',
PRIMARY KEY (`id`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1 AUTO_INCREMENT=1292692 ;
CREATE TABLE IF NOT EXISTS `job_types` (
`id` smallint(6) NOT NULL AUTO_INCREMENT,
`job_type` varchar(100) NOT NULL,
`job_type_description` text,
`job_type_url` varchar(255) NOT NULL,
PRIMARY KEY (`id`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1 AUTO_INCREMENT=1006 ;
CREATE TABLE IF NOT EXISTS `users_job_types` (
`user_id` int(11) NOT NULL,
`job_type_id` int(11) NOT NULL,
KEY `user_id` (`user_id`),
KEY `job_type_id` (`job_type_id`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1;
CREATE TABLE IF NOT EXISTS `users_industries` (
`user_id` int(11) NOT NULL,
`industry_id` int(11) NOT NULL,
KEY `user_id` (`user_id`),
KEY `industry_id` (`industry_id`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1;
CREATE TABLE IF NOT EXISTS `industries` (
`id` smallint(6) NOT NULL AUTO_INCREMENT,
`industry` varchar(100) NOT NULL,
`industry_description` text,
`industry_url` varchar(255) NOT NULL,
PRIMARY KEY (`id`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1 AUTO_INCREMENT=1052 ;
Please let me know if you know any way this can be speeded up.
your junction tables should be redesigned to enable the use of covering indexes – the primary key plus another index with the same two columns in the reverse order
CREATE TABLE users_job_types
( user_id INTEGER NOT NULL
, job_type_id INTEGER NOT NULL
, PRIMARY KEY ( user_id , job_type_id )
, INDEX job_type_user ( job_type_id , user_id )
);
CREATE TABLE users_industries
( user_id INTEGER NOT NULL
, industry_id INTEGER NOT NULL
, PRIMARY KEY ( user_id , industry_id )
, INDEX industry_user ( industry_id , user_id )
);
this way, the actual rows of these two tables are never accessed, just the indexes
as for your query, how come all you want is a single arbitrary job type and a single arbitrary industry for each user?
Firstly, thank you for your post. Your comments make absolute complete sense.
I’ve added the new indexes / primary keys to the table. Now with the query itself, the aim is to return users who match a certain job type or industry. I’ve stripped that part of the sql out for now, so this in essence is the query that will be run when a user does a blanket search (i.e. they have no search criteria). Ideally, this query should return all users. I’m joining the tables as I need to display the industry and job type on the search results page. Does this make sense?
[COLOR=#000000][COLOR=#007700] [/COLOR][COLOR=#0000BB]SELECT user_id[/COLOR][COLOR=#007700], [/COLOR][COLOR=#0000BB]job_type
FROM users_job_types
LEFT JOIN job_types ON job_types[/COLOR][COLOR=#007700].[/COLOR][COLOR=#0000BB]id [/COLOR][COLOR=#007700]= [/COLOR][COLOR=#0000BB]users_job_types[/COLOR][COLOR=#007700].[/COLOR][COLOR=#0000BB]job_type_id
GROUP BY user_id
This part of your query alone will return a single job type per user_id and it is arbitrary which one it picks. If there are five rows in the table for you and your id=3 and two rows in the table for me and my id=4 it will return any one of your rows and any one of my rows.
[/COLOR][/COLOR]
Yes, I used GROUP_CONCAT on a similar query - not this one.
I guess I could do this then, but it still works slow;
SELECT users . * , ui.industry, jt.job_type
FROM users
INNER JOIN (
SELECT user_id, group_concat( job_type
SEPARATOR ' ' ) AS job_type
FROM users_job_types
LEFT JOIN job_types ON job_types.id = users_job_types.job_type_id
GROUP BY user_id
) AS jt ON jt.user_id = users.id
INNER JOIN (
SELECT user_id, group_concat( industry
SEPARATOR ' ' ) AS industry
FROM users_industries
LEFT JOIN industries ON industries.id = users_industries.industry_id
GROUP BY user_id
) AS ui ON ui.user_id = users.id
WHERE users.status = '1'
AND users.cv_hide = '0'
ORDER BY cv_date DESC
How do I speed it up using my new indexes that I added?
How can I return all users even if they don’t have any industries or job types?
i’m not a performance expert but perhaps the ORDER BY in the outer query is contributing to the slowness
this is just a guess, but the query might speed up really nicely if you pull in all the job type and industries separately ahead of time, and do the GROUP_CONCAT on the ids only (i.e. use just the junction tables in the subqueries), expanding the ids to names in your application language – but this would be a last resort
SELECT users.*
, jt.job_type_ids
, ui.industry_ids
FROM users
LEFT OUTER
JOIN ( SELECT user_id
, GROUP_CONCAT(job_type_id) AS job_type_ids
FROM users_job_types
GROUP
BY user_id ) AS jt
ON jt.user_id = users.id
LEFT OUTER
JOIN ( SELECT user_id
, GROUP_CONCAT(industry_id) AS industry_ids
FROM users_industries
GROUP
BY user_id ) AS ui
ON ui.user_id = users.id
WHERE users.status = '1'
AND users.cv_hide = '0'
ORDER
BY cv_date DESC
hopefully that runs faster, and you would then swap out the ids using the pre-fetched job types and industries using your application language
We use PHP to build an SQL query depending on what the users search criteria is. For example, if they select a job type, then the query will narrow by job type. The query I posted just happens to be a blanket search (i.e. the user didn’t have any search criteria and therefore all results should be returned).
Because we need to see the Job Type and Industry next to the user on the search results page.
It will run every time someone does a search, so probably quite a lot.
SELECT
u.id
,GROUP_CONCAT(DISTINCT jt.job_type SEPARATOR ' ')
,GROUP_CONCAT(DISTINCT i.industry SEPARATOR ' ')
FROM
users u
LEFT OUTER
JOIN
users_job_types ujt
ON
u.id = ujt.user_id
LEFT OUTER
JOIN
job_types jt
ON
ujt.job_type_id = jt.id
LEFT OUTER
JOIN
users_industries ui
ON
u.id = ui.user_id
LEFT OUTER
JOIN
industries i
ON
ui.industry_id = i.id
WHERE
u.`status` = 1
AND
u.cv_hide = 0
GROUP
BY
u.id
ORDER
BY
u.cv_date DESC
The explain for that should be using all the indexes.