MSSQL: best indexing techniques

bewise · July 15, 2011, 9:53pm

I have two database table, one with of around 4 million rows and the other with a thousand rows.


SELECT TOP 2000 c.id, o.*
FROM customers c
INNER JOIN orders o ON o.customer_id = c.id
ORDER BY o.cost DESC, c.id ASC

When looking at the execution plan for the query, the majority of the execution time (around 85%) is taken up with the SORT operation.

I have relevant indexes in place to deal with sorting attributes from one table only (“ORDER BY o.cost DESC” or “ORDER BY c.id ASC”). However unsure what indexes to create in order to speed up the above SORT operation

Any ideas / suggested reading material?

Regards

r937 · July 16, 2011, 1:40am

i’m going to make an assumption here, that since you’re using an INNER JOIN, you’re not going to be interested in any orders that have a customer_id that ~isn’t~ in the customers table

otherwise you’d be using a RIGHT OUTER JOIN instead of INNER JOIN

but now watch this…

since we know o.customer_id = c.id, and since you’re not actually retrieving any columns from the customers table (other than the id, just to make sure this is an order for a valid customer), therefore you can simplify your query as follows:

SELECT TOP 2000 o.*
FROM orders o 
ORDER BY o.cost DESC[COLOR="Red"], o.customer_id ASC[/COLOR]

in fact, if it were up to me, i would remove the second column in the ORDER BY

why? because you’re interested in the 2000 biggest orders, and if several orders actually have the same cost, how critical is it that they be sequenced by ascending customer_id? customer_id is not supposed to have any meaning whatsoever, and so the sequence shouldn’t matter

try your query with these changes and see how much the execution time improves

bewise · July 18, 2011, 7:42pm

Hello r937,

FIrst of all thank you for your details reply.

The example that I provided was an example of the type of queries that I want to run. The query is created based on what the users decides they want to see. There are a number of other (which include various attributes being selected and sorted).

For example:


SELECT TOP 10000 o.*, c.*
FROM orders o 
INNER JOIN customers c ON c.id = o.customer_id
ORDER BY o.cost DESC, c.name ASC

If the inner join and order by must include two tables. Is there any potential indexes that could be used? For example, if there is a non-clustered index of “o.cost DESC” and “c.name ASC” respectivley, will this help speed up the sort section of the execution plan?

Hope this all makes sense.

Regards

r937 · July 18, 2011, 7:57pm

i don’t think any relational database will allow you to create an index that spans more than one table