Breaking up big tables; Good Idea?

system · July 28, 2010, 8:56pm

If you want to keep a tables contents in order as new rows are added, then when you add another row you need to add an empty row at the end of the table and copy each rows contents to the next row starting from the row that is the place the new data will be placed. Then you put the new data in the row and all the rows are in order.

The problem with this is that you have to shift all the rows down every time you add new rows. If you had a very big table, say 300,000,000 rows in size then I thought it might be better to split the table into smaller tables. For example if the table lists peoples names, you could have a table for names that start with A, one for names that start with B and so on. That way when you add new rows, not as many rows need to be shifted.

Is this a good idea? Or are there drawbacks?

Thanks
Ro0bear

r937 · July 28, 2010, 11:03pm

you’re interested in database internals? that’s great, but very few people are (i know i certainly am not)

if you don’t mind my saying so, you should really master more basic concepts first, like optimizing queries through indexes

r937 · July 29, 2010, 12:10am

that’s it in a nutshell, yes !!!

no, you can’t make that conclusion – of course it would be a benefit, but it isn’t necessary

building something that’s efficient does not require an understanding of internals, just a good understanding of indexes

oh, and by the way, databases are not efficient or inefficient in and of themselves – efficiency really has no meaning unless you’re talking about the SQL statements, not the database, despite the fact that you must do certain things in the database (like declare indexes) to make specific queries efficient

did you notice how many times in this thread i talked about indexes?

Dr_John · July 28, 2010, 9:14pm

If you want to keep a tables contents in order as new rows are added, then when you add another row you need to add an empty row at the end of the table and copy each rows contents to the next row starting from the row that is the place the new data will be placed. Then you put the new data in the row and all the rows are in order.

No you don’t.
You order the data using sql when it is retrieved.

What IS this right order? by age, shoe size, no. of children, years married/divorced, engine size, wing span??? id???

And then you try a join using a foreign key…
WHICH of the tables do you think has rows with that foreign key…
HOW do you right the query to cover all possible options…

system · July 28, 2010, 10:51pm

I see so how does a database work behind the SQL? What data structure is used? I would like to find out more about it, it hasn’t been covered in my lectures so far

felgall · July 28, 2010, 9:11pm

Databases don’t store the data in any particular order so there is no need to move anything. Just specify the appropriate order you want the data in when retrieving it from the table and the database will present it in that order.

Splitting a table by rows except where you are splitting it into active and inactive tables will just decrease the efficiency of the database by a significant amount.

Anyway few places require any of their tables to contain more thaqn a small number of rows anyway (rarely more than a few million at most).

system · July 28, 2010, 11:46pm

So really all most people know is that SQL deals with the data in the database, and that the actual data is stored in no particular order using mysterious methods detailed in database internals?

I presume therefore learning about database internals would provide no benefit in building efficient databases?

ro0bear

system · July 28, 2010, 10:43pm

Do you mean that databases store data in the order that it is added or litrally they store it in no particular order? If they litrally store it in no particular order, do you know where I can find out more about how the data is actually stored?

Thanks

r937 · July 28, 2010, 10:43pm

now hoooooold on thar, baba looey

you ~cannot~ store data in a particular order

all you can do, for searching purposes, is declare indexes which potentially can be used to access the data quickly

all you can do, for sorting purposes, is use an ORDER BY clause (which, if indexes are declared, can potentially be very efficient)

system · July 29, 2010, 12:16am

Ok, thanks for clearing that up

system · July 28, 2010, 10:17pm

If the data is stored in order then you can retrieve it using a binary search which is far more efficient than a linear one. I know you can order it when you retrieve it but surely sorting the data into the correct order every time you do a query is going to be less efficient than if you keep the data stored in order?