How to optemise the query having group by on two colomun

i have one table say A

±-----------±-------------±---------------+
| value1 | value 2 | value 3 |
±-----------±-------------±---------------+
| 5 | 4adb83c1d71b | 1 |
| 5 | 44fcb93589fe | 2 |
| 5 | 4182b6e2100c | 1 |

| 6 | 4bfd8db46ab1 | 1 |

| 7 | 4b0f87c68808 | 1 |
| 7 | 413fa4646dfe | 2 |
| 7 | 4585b5671b68 | 3 |
| 7 | 4486abc0ddce | 2 |
| 7 | 4b01b048f07c | 5 |
| 7 | 4cebb754ff21 | 2 |
±-----------±-------------±---------------+

Now i need an optemise query that will give number of duplicate in value 3 colomun for each value 1 colomun

mean i need a result as

±-----------±-------------±---------------+
| value1 | value 3 | count |
±-----------±-------------±---------------+
| 5 | 1 | 2 |
| 6 | 1 | 1 |
| 7 | 2 | 3 |
±-----------±-------------±---------------+

mean
value 1(5) have value 3(1) repeate 2 time
value 1(6) have value 3(1) repeate 1 time
value 1(7) have value 3(2) repeate 3 time

for this i write a query but that query is not optemise so if have any better solution for query.

select value1,value3,count(value1) as ct from A where value1>0 and value1 != ‘’ group by value1,value2 having ct >1 limit 3;

where value 1 and value 3 are indexed

ALTER TABLE a ADD INDEX(value1,value2)

select value1,value3,count(value1) as ct from A where value1>0 and value1 != ‘’ group by value1,value3 having ct >1 limit 3;

value 1 and value 3 are already index then also it scan the whole row

ALTER TABLE a ADD INDEX(value1,value3)

:slight_smile:

value1 and value 3 are already index in table so this does not work

sure it does

i am not suggesting separate indexes on those columns individually, i am suggesting a single composite index on both columns

±—±------------±--------------------±------±----------------------------------------------------±-------------±--------±-----±--------±----------------------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
±—±------------±--------------------±------±----------------------------------------------------±-------------±--------±-----±--------±----------------------------------------------------------+
| 1 | SIMPLE | A | range | value1_3 | value1_3 | 8 | NULL | 3124145 | Using where; Using index; Using temporary; Using filesort |
±—±------------±--------------------±------±----------------------------------------------------±-------------±--------±-----±--------±----------------------------------------------------------+

so as we see from here value1_3 is index of (value1 and value3)

Still expensive

compared to what? you’re reading three million rows!!!

3 million rows entries are in table A, so when we explain this query it give us this result
±—±------------±--------------------±------±----------------------------------------------------±-------------±--------±-----±--------±----------------------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
±—±------------±--------------------±------±----------------------------------------------------±-------------±--------±-----±--------±----------------------------------------------------------+
| 1 | SIMPLE | A | range | value1_3 | value1_3 | 8 | NULL | 3124145 | Using where; Using index; Using temporary; Using filesort |
±—±------------±--------------------±------±----------------------------------------------------±-------------±--------±-----±--------±----------------------------------------------------------+

So its scanning the whole table

actually, it’s not accessing the table at all, just the index :slight_smile:

you’re using an ORDER BY clause, right?

its group by clause

so there is no ORDER BY clause?

Dear Just go through this query,we are not using order by we are using group by

select value1,value3,count(value1) as ct from A where value1>0 and value1 != ‘’ group by value1,value2 having ct >1 limit 3;

there’s your problem… you’re grouping by something, but selecting something else

sorry its typo error
select value1,value3,count(value1) as ct from A where value1>0 and value1 != ‘’ group by value1,value3 having ct >1 limit 3;

okay, i don’t think i can keep dancing around the real issue here with fake table and column names

is there any chance you can show your actual table and actual query?

do a SHOW CREATE TABLE for me please

Table Structure is as follows

| gallery_event_image | CREATE TABLE gallery_event_image (
gallery_id bigint(20) unsigned NOT NULL,
image_id varchar(20) NOT NULL,
image_position smallint(5) unsigned DEFAULT NULL,
UNIQUE KEY gallery_id_2 (gallery_id,image_id),
KEY gallery_id (gallery_id),
KEY image_id (image_id),
KEY image_position (image_position),
KEY gallery_id_3 (gallery_id,image_position)
) ENGINE=MyISAM DEFAULT CHARSET=utf8 |

and real query

select gallery_id,image_position,count(gallery_id) as ct from gallery_event_image WHERE image_position < 255 and gallery_id!=0 and gallery_id != ‘’ group by gallery_id,image_position having ct >1 order by ct asc limit 500;

well, whaddya know… there ~was~ an ORDER BY clause after all

:slight_smile:

ok, so how can we solve it

you can’t solve it

if you want the lowest 500 counts, you’re going to have to compute all the counts, right?

that’s what using temporary and using filesort mean – each gallery_id/image_position combination is counted, the results are placed into a temporary file, and then this file is sorted into ascending sequence by the count

that’s what you asked for, and your query is as optimized as it’s going to get