SQL and RDBMS Database Design DO's and DON'Ts

r937 · March 23, 2007, 11:51pm

well, i think you missed the question

“what query would guarantee that all data in all columns would be returned?”

nobody asked that question

except perhaps you

system · March 24, 2007, 12:01am

O.K. let’s go back to this then:

LIKE ‘%anything’ gives a table scan

Yes, this is true. ‘%anything’ is not sargable.

LIKE ‘anything%’ doesn’t

Absolutely not true. It depends upon whether the optimizer decides a table scan is more efficient depending upon the distribution of data (statistics that the optimizer keeps), and the number of rows that are maintained in a block. Often the optimizer will choose a table scan or default to a table scan if insufficient distribution data being collected.

the dreaded, evil “select star” should never be used, full stop

A matter of opinion. Clearly there is a place for ‘*’ if an user wants to ensure that they are returning all columns from a table. It is used all the time, and has an obvious place in the SQL syntax. However, I would agree that unless there is a requirement to “always return all columns”, then one should use enumerated column names.

Hope this clears things up. You posed all sorts of questions, such as performance issues, which have been all muddled up. But suffice to say, there are no performance issues when using '*".

r937 · March 24, 2007, 12:15am

you know rich, i was really tempted to leave it at that, so that you have the last word, but i wanted to add one more thing, so please do reply again, so that indeed you will have had the last word

here’s what i wanted to add:

richrf says: using SELECT * will have “no performance issues… full stop”

system · March 24, 2007, 12:31am

Yes, when compared to Select a, b, c, … , as you suggested in your post. Both will read the whole table.

Rich

Practicality · December 17, 2007, 10:29pm

I have to agree with the really old post from Asterix.

From a standpoint of someone who often creates custom interfaces for various database tables, I would say the practice of always using one auto-increment PK is a massive time saver.

One of the many reasons is that functions only return one value. When you want to query for an index, using a function (in PHP, for instance), what do you return if you have a compound key?

I understand that this is very easy to get around, I understand that there are many options for dealing with this situation, but the fact remains that having one single number that always references a specific field is simply easier to work with.

Not to mention that if you always use a single primary key, you can make your functions more “generic” and it is easier to reuse code.

While some might call my databases “bloated” or “slow” because they have this “extra” key, this design has saved me massive development time, and the performance hit is negligible, not even worth mentioning.

Along the same lines, SELECT * is a great tool when you are writing queries that work on tables where you don’t know how many fields there are. The time it would take to query the database for the names and/or number of fields is certainly longer than just a single SELECT *.

While my tables might be faster if I used the above suggestions, the development time (and therefore cost) is more valuable to me, as well as my customers (who wants to pay for all this extra development time, to save .0001 seconds on your queries?).

The caveat is often used queries (which, if you are honest, are truly the outliers). In often used queries you really should take the time to optimize as much as possible.

For most of us the vast majority of the queries you write will only be executed occasionally and have no significant effect. It is those tables and queries that are critical to load time that actually have significant enough return on investment that it is worth optimizing.

I really think many developers undervalue their own time.