cpradio — 2013-04-19T09:51:30-04:00 — #1
I recently found a table where I work and they have a cluster index that contains a date/time column. This column contains date (year, month, day), and time (hour, minute, second, millisecond).
I can't see how that would ever be useful or not hinder the performance of the table instead of helping it. Most lookups to that table involve its primary key (which is not a date/time, it is an identity integer), second most used lookup involves another integer column which is a foreign key reference.
Thoughts? Can any good come from having a date/time in a clustered index?
cpradio — 2013-04-19T10:07:53-04:00 — #2
Just to add a bit more. This column is not guaranteed unique (the date/time one), we do multiple inserts all within the same millisecond. It also has 9 more columns stored with it in its' clustered index.
cpradio — 2013-04-19T12:22:14-04:00 — #3
Sigh, nevermind, I think I figured out what the original intent was for this index (they wanted to help some of the queries that do date ranges). Back to the drawing board, I guess.
r937 — 2013-04-19T12:37:48-04:00 — #4
that situation seriously argues against a clustered index on datetime
a clustered index stores new rows in clustering sequence, so high frequency inserts is likely to cause contention on the disk
r937 — 2013-04-19T12:38:32-04:00 — #5
that's fine, but it still shouldn't be a clustered index
cpradio — 2013-04-19T12:56:47-04:00 — #6
Yay! So my original thought was still spot on. I'm not the most "competent" when it comes to indexes or Database Administration but I like to think I have "some" know how (granted I've been told I have a lot more and should stop doubting myself -- I should really work on that).
My original thought was to remove the clustered index, replace it with one on the ID column, and then create non-clustered indexes for the top used columns in our queries (the ones used to filter down data). We do a LOT of inserts into this table daily and although the Timeouts were actually part of a patch gone bad on our Prod SQL Server, it still quickly identified there are issues with this one table.
Does using the ID make sense for a clustered index or should I be looking for other things? Almost all of the other columns will definitely be non-unique.
r937 — 2013-04-19T14:06:02-04:00 — #7
not sure if that's going to be any better, assuming that the id is an auto_increment
a clustered index is best when inserts are spread across the entire dataset (which is often why people choose a UUID as a clustered index)
cpradio — 2013-04-19T14:13:39-04:00 — #8
So in theory, it might be good to not have a clustered index on this table, and review it again via execution plans and monitoring to see if there is a scenario that needs it. This table is primarily ONLY inserted into, it is a reporting table (most reports are run at night), some reports are run sporadically during the day, but most of the time, it is only used for inserting records.
cpradio — 2013-04-19T14:17:59-04:00 — #9
Out of curiosity, does your book cover Clustered Indexes in depth? I've found that most of my research online, still has left me with more questions than answers (some day I might get enough time to sit and ponder them first so I can then ask them here)
r937 — 2013-04-19T14:35:00-04:00 — #10
no, not at all