MySQL - Mass deleting bad?

I have a table in a MySQL database that holds information that is updated at least once a day, maybe as much as every 6 hours. The information being put in the table mostly stays the same, but there are instances where information that’s in the table is no longer valid and needs to be removed and there are instances where new information that doesn’t exist in the table needs to be added.

Currently I’m deleting all of the data in the table and then adding the information back into the table from the updated list. This way, information that needs to be removed is removed and new information is added. Information that was already in the table is also included in the updated list, so it is added back as well.

Is this a bad form? I’m not talking about a lot of data, but there are quite a few records, about 6000 to 9000 per update.

I guess my thinking is that deleting and re-adding is a bit wasteful, but I’m not sure if it’s any more inefficient than any other method.

Is it harmful to do a lot of deletes and a lot of inserts in this fashion?

[quote=“sparek, post:1, topic:113286, full:true”]Currently I’m deleting all of the data in the table…

Is it harmful to do a lot of deletes and a lot of inserts in this fashion?
[/quote]harmful? never

however, if you are using DELETE to delete all the rows in the table, it will be a lot faster if you use TRUNCATE TABLE

Thanks

I guess I misspoke, I’m not really deleting all of the data in the table, but MOST of it. A certain subset of it changes once per blue moon. That subset is not deleted en masse. But as long as deleting and readding data isn’t harmful, I’ll keep doing that.

Deleting frequently can cause a lot of performance issues. MyISAM will lock the entire table and InnoDB will dedicate more resources to it and slow down the entire server, it also has to check for FK constraints even if there isn’t any (I think).

I’m not a DB guy, so that’s about the limit of my knowledge about this. :smile:

Select = Easy
Insert = Moderate
Update/Delete = Hard

Well, I’m not sure what qualifies as “frequent”. I need to do it once per day (overnight) and I might consider doing it every 6 hours, which would be 4 times a day. I don’t think that qualifies as “frequent” but maybe I’m wrong. And it’s not just a handful of records, it’s probably about 8000, which I don’t know if that qualifies as a lot or not either.

In file system structures you don’t want to delete and reuse over and over again. I wasn’t sure if the same philosophy applied to MySQL structures as well.

Perhaps I’m being dense here, but why are you deleting, then adding the data back in. Why not just UPDATE the rows instead? It’s a lot less overhead, and a much smaller chance of your tables getting FUBARed.

Seems to me UPDATE should help reduce the work involved re-indexing the table.

Updating doesn’t remove the records that are no longer applicable.

If the table has the data:

1 | resource | cold
2 | resource | warm
3 | resource | chill

And when I get updated information to put into the table, and it turns out that #2 is removed, how am I suppose to know to remove #2? The updated information contains:

1 | resource | cold
3 | resource | chill
4 | resource | new

I need to put that information exactly into the table. The simpliest way is to remove everything and then add in the above information. That way #2 is removed and #4 is added.

How else could this be accomplished? The only other way I can think of, would be to add a removal flag field

1 | resource | cold | 0
2 | resource | warm | 0
3 | resource | chill | 0

run an update to set the removal flag:

1 | resource | cold | 1
2 | resource | warm | 1
3 | resource | chill | 1

INSERT or UPDATE all new information, setting the removal flag to 0 on affected row:

1 | resource | cold | 0
2 | resource | warm | 1
3 | resource | chill | 0
4 | resource | new | 0

then deleting records that have the removal flag set

1 | resource | cold | 0
3 | resource | chill | 0
4 | resource | new | 0

I’m not sure if that’s any better than deleting and readding. This seems to consist of more computational overhead, but it may be better. I’m no expert on this.

Maybe a DELETE for only those that need to be deleted and an
INSERT IGNORE ?

That would do away with deleting rows only to insert them again.

Maybe I’m the odd one here but I think it’s ok. For example, say you got 6000 records and need to delete 5900 under a single transaction. If that’s the case then I think it’s ok.

What I would do is add 1 more column ‘Archived’ w/ boolean value. Any record you want to delete, you would update that column value to be ‘true’. Then you would execute daily/weekly db clean up that deletes all records that have Archived=true. Sometimes you may not want to delete…incase you’ve done it accidentally. This approach will also give some reliability to rollback the data.

1 Like

You’re right. It really depends on the needs.

I would do updating and insert but then… 9000 records is nothing. The time to do the changes is important so the database doesn’t interrupt any important job. But apart from that…

In this particular case, and with such a low number of records, it doesn’t matter that much.

This topic was automatically closed 91 days after the last reply. New replies are no longer allowed.