This is a document I’ve been slowly adding to as I come across things ‘in the wild’. Maybe it can be of some use?
SQL and RDBMS Database Design DO’s and DON’Ts
Generic DOs:
- Develop a backup strategy before you encounter a catastrophic database failure, and TEST it regularly so you will know what to do when disaster strikes.
- Always sanitize your input from the user to strip out quotes for strings and non-numeric data for number types. (http://www.sitepointforums.com/showthread.php?s=&threadid=60643)
- Perform table normalization in your DESIGN phase. It is much easier to change your tables ‘on paper’ than when it is in production use. (http://www.sitepointforums.com/showthread.php?s=&threadid=63582)
- Choose datatypes which are logical and fit your model. Designating a US Social Security number as a CHAR( 255 ) is very wasteful since they will not exceed 11 characters (XXX-XX-XXXX). CHAR( 11 ) is a perfect match.
- Run every query through your RDBMS’ query tuning tool and ensure that correct indexes are being used and that a table scan or Cartesian product (in a join) is not occurring when you do not want it to. MS SQL, <snip/> MySQL, [url=“http://dbforums.com/showthread.php?s=&postid=212692#post212692”]Sybase
Generic DON’Ts:
- Do not fall into the trap of what I like to call ‘Auto_Increment Induced Insanity’. Not every table needs a sequential integer as the primary key. Design your primary key logically based upon your data, and if need be then consider a sequential integer if traditional primary key design fails.
- Do not try and join mismatched datatypes. For instance, if you have an INT as a primary key in one table and a CHAR( 10 ) as the foreign key in another, problems will occur when you try and join the two columns, usually manifesting as a table-scan on one of the mis-matched tables. Or to put it another way – ensure datatypes match across tables.
- Remember queries which use wild cards such as SELECT * WHERE myCharColumn LIKE ‘%anything’ will always result in a table scan, so only use leading wild cards when absolutely necessary.
- Don’t create redundant indexes. In most, if not all, RDBMS’, designating something as a PRIMARY KEY will also create a unique index on the column(s). Creating an additional index on the leading column will be redundant and a waste of space. For example, creating an index on Col1, Col2 will be used on queries of type: WHERE Col1 = 34 AND Col2 = ‘Something’. It will also be used on queries such as WHERE Col1 = 123.
- Don’t use the $dbms_seek() functions in PHP to simulate pagination of a result set. Use LIMIT in MySQL, TOP n in MS SQL, etc. to achieve true pagination.
MySQL Specific DON’Ts:
- Do not enclose numeric values in quotes. This is very non-standard and ONLY works on MySQL. For example, WHERE someIntegerColumn = ‘1’;. This also pertains to integer values in CREATE TABLE statements such as CREATE TABLE bob ( bobID INT DEFAULT ‘0’ ) <– bad. DEFAULT 0 is the correct method.
- <snip/> Do not use the INT( M ) syntax unless you are using the ZEROFILL MySQL proprietary SQL extension.
edit: added another Don’t