How to efficiently design this database?

acidbox · March 18, 2010, 7:56pm

I’m building a “car part” inventory system… The system contains thousands of parts. However, each part has completely different criteria… I.e. a Tire might have max pressure, expiration, size, tread, etc… whereas a car battery might have voltage, expiration, etc…

Many fields are different, but many are the same. So for example, if I wanted to search my inventory database for all parts that are expiring soon, I’d want
both the battery and the tire to return in the results.

My question is, what is the best way to organize this in a database? Here are the three possibilities that I’ve come up with so far:

Have all possible fields in one table regardless if only some are used

Pros: Only 1 table is needed, and only 1 row is required for each part entry.
Cons: Could be up to like 100+ fields in the parts table. Only some are used for each part, based on the type.

Have a table for each type of part.

Pros: Only needs the specific columns that are required for this type of part.
Cons: Will have a lot of tables, one table for each type of part.

Have a table every field be custom, and store the needed fields in a global database for each entry.

Pros: Only need 1 record in the master table for each part.
Cons: The many-to-many table that holds the part, field id, and value, would be large. Each part entry could have up to 50 fields that are entered.

So my question is, which idea is the best? Maybe something completely different? Keep in mind that this will house like 100,000 parts, so speed is very important. Any suggestions on how to structure this database so its the most efficient for storage, and most importantly, the most efficient for searching across all parts in the database for the matches.

Thanks!

rustybuddy · March 18, 2010, 8:10pm

Just throwing this out as an idea. I do not know what solution would work best for this but am curious to see what everybody has to say. What about a table that is more of a spec table.

id(key) | parent(item number) | name | value

1 | 2342 | pressure | 30lbs

In a previous project I opted with your first option with some of the fields related to other databases.

For example an item may be a car. The car has a certain set of colors red, blue, white…etc. So in my db I had a table called lists_car_colors. Then in my product table I had a col for color but it was just an integer that related to the lists_car_colors table.

Vali · March 18, 2010, 8:17pm

You want a system that can grow without code modifications. So, adding columns / tables all the time is a nono. (tables are OK for logging/performance/archive).

This will work for you:

[part] # ex: car
part_id
part_name
part_description

[component] # ex: rims
component_id
component_name
component_description

[part_component] # ex: a car has 4
[part_id]
[component_id]
[component_value]

[component_property] # ex: rim size
[property_id]
[property_name]
[property_description]

[component_properties] # ex: 18" for that rim
[component_id]
[property_id]
[property_value]

You might want to fine tune it to your exact requirements, but that is the basic idea.

acidbox · March 18, 2010, 9:33pm

Vali:

You want a system that can grow without code modifications. So, adding columns / tables all the time is a nono. (tables are OK for logging/performance/archive).

This will work for you:
[part] # ex: car
part_id
part_name
part_description

[component] # ex: rims
component_id
component_name
component_description

[part_component] # ex: a car has 4
[part_id]
[component_id]
[component_value]

[component_property] # ex: rim size
[property_id]
[property_name]
[property_description]

[component_properties] # ex: 18" for that rim
[component_id]
[property_id]
[property_value]
You might want to fine tune it to your exact requirements, but that is the basic idea.

It seems like both of you guys opted for a variation of option #3.

Vali, question for you: Assuming the number of fields would never change (we’d never add more parts to the database), but there would be a lot of fields to start off with. Would you still opt to do it this way?

IBazz · March 19, 2010, 1:26am

Just as a tip… it’s the relationship that should determine what tables you create and what goes into each.

as acidbox showed, each main table has data that is connected on a 1-to-1 basis with the PK. and some tables have two foreign keys eg ‘component_properties’ which has cols that relate to the PK of two different tables. That’s an example of a many to many relationship where many rims can have many sizes.

Once you get the hang of the relationships, the db structure sort or makes you work it out correctly, I think.

it may help to read up on database normalisation.

bazz

r937 · March 19, 2010, 3:38am

you can probably forego the normalization stuff, most of it is written in relational mumbo-jumbo anyway

more germane to this situation is to do some research on supertype/subtype

Vali · March 19, 2010, 2:51pm

The never add more parts to the database is wrong, you need to be able to add parts to your system at will, without having to modify your scripts / database structure.

Ex: if you add a simplified “door” (component) to your simplified “car” (part), you would:

INSERT door INTO component;
INSERT door_id, car_id, 4 (we have 4 on each car) INTO part_component
see if you have a “color” property in your component_property, else add it.
INSERT door_id, color_id, ‘red’ INTO component_properties

So, you just added a new component without changing any code / db structure. (you can make a script add this for you).

Also, your tables will get quite a few rows in them, since “component_property” will have all the possible properties in your system (color, size, etc). But, all the selects are done on primary keys / indexes, so even if you have a billion records, this will be pretty fast (worked with 4.2bill records, and selects on keys are fast).