Help with this simple? query

chemicaluser · October 2, 2012, 5:23pm

Ok, I am hoping someone could help me out here. I have a query which is not working properly and is driving me crazy.
I have two tables
data_table = my main working table
daily_removed = same structure as main table, but less records

what I would like to do is
update the column (Status) in the data_table, for all records where the PiNumber is listed in the daily_removed table

This is where it gets weird for me… when I do

select count(data_table.PiNumber) from data_table
inner join daily_removed ON
data_table.PiNumber=daily_removed.PiNumber

I get a total of 1528 (which is the correct number) of matching records in the two tables
But when I go to update the Status column using

UPDATE data_table
INNER JOIN daily_removed
ON data_table.PiNumber = daily_removed.PiNumber
SET data_table.Status='removed';

only 1214 Records get updated. ’

Also, as a side note I added exactly 1214 records to the daily_removed table today …

What could possibly be happening??
Let me know if I need to post additional info, but any advice would be greatly appretiated.

KyleWolfe · October 2, 2012, 5:45pm

This is most likely because you have duplicates in the daily removed table.

rcashell · October 2, 2012, 6:18pm

If you look at the UPDATE query how many were already on the status ‘removed’? For example:


mysql> select * from users;
+-----+----------+------+
| uid | username | name |
+-----+----------+------+
|   1 | Tom      | Ron1 | 
+-----+----------+------+
1 rows in set (0.00 sec)

mysql> update users set name = 'Ron1' where username = 'Tom';
Query OK, 0 rows affected (0.00 sec)
Rows matched: 1  Changed: 0  Warnings: 0

You can see in this example, there was a matching row but nothing was changed.

chemicaluser · October 2, 2012, 6:47pm

Hi,

Thank you for taking the time to reply.
I also thought about how many records currently have the Status ‘removed’ - but none do … i even changed the value of all of the records (in both tables) to have status ‘active’ … and tried the update query then, but still got the same result.

In fact, when I run my update query above … and then do a seperate query to count the number of records in my data_table with Status removed… I get 1215 which is the same number that was updated …(but should have been 1528)

The duplicate suggestion may have something to it, I definetely have duplicates in my daily_removed (and data table) … and I have tried joining on other fields as well but with same result … I guess I will have to try and join on all of the fields (perhaps I joined on ones that have same values)

This is driving me crazy!

KyleWolfe · October 2, 2012, 7:05pm

Try this to see how many records should be updated:



[[COLOR=#990099][B]select[/B][/COLOR]](http://search.mysql.com/search?site=refman-%35%31&q=SELECT) 
[[COLOR=#000099]count[/COLOR]](http://dev.mysql.com/doc/refman/%35.%31/en/group-by-functions-and-modifiers.html)[COLOR=#FF00FF]([/COLOR]data_table.PiNumber[COLOR=#FF00FF])[/COLOR] 
[[COLOR=#990099][B]from[/B][/COLOR]](http://search.mysql.com/search?site=refman-%35%31&q=FROM) data_table
WHERE
data_table.PiNumber IN (
SELECT
data_table.PiNumber
FROM
daily_removed
GROUP BY 
data_table.PiNumber
)

I’m betting this will return your 1214. once confirmed, you should throw in some criteria to make sure status does not already equal removed.

chemicaluser · October 2, 2012, 10:21pm

Hi,

Thanks … you were right, your query returned 1214 rows
But I am still stuck, lol

When I use this query, I still get the same number of rows 1214
What am I doing wrong?

BTW, before I run this query RecordStatus for all records is set to ‘active’ in both data_table and daily_removed

UPDATE data_table
INNER JOIN daily_removed
ON data_table.PiNumber = daily_removed.PiNumber &&
data_table.DDate = daily_removed.DDate
SET data_table.Status='removed'
WHERE data_table.Status = 'active'

guelphdad · October 2, 2012, 11:04pm

Where are you running this query? in the mysql client? in phpmyadmin or heidisql or some other GUI or in a front end application you haven’t mentioned?

Secondly, why the redundant table? why not simply update the main table and forget about the second table altogether? Unless you are talking 10s or 100s of millions of rows of data performance won’t be affected simply by having a few thousand or hundred thousand rows extra.

chemicaluser · October 3, 2012, 1:31am

Hi, I am running this query in phpmyadmin
I have 4 working tables

data_table = main working table
data_table_temp = daily data comes in
daily_removed = every day, all records in data_table that are not in data_table_temp
daily_added = every day, all records in data_table_temp that are not in data_table

Basically, every day I get a file which i load into the data_table_temp … I then take all new records and store in table daily_added, I also add a copy of the new records to my main table (data_table)

For the removed records, I keep a copy of all the records removed in the daily_removed table, but I also want to update my main working table in a column to say removed.

The reason I am keeping a copy of the removed/added records in separate tables = no particular reason, I guess I could keep everything in the same table, just seemed more organized.

But I still don’t understand why the query above did not update all 1528 rows

KyleWolfe · October 3, 2012, 2:12pm

chemicaluser:

Hi,

Thanks … you were right, your query returned 1214 rows
But I am still stuck, lol

When I use this query, I still get the same number of rows 1214
What am I doing wrong?

BTW, before I run this query RecordStatus for all records is set to ‘active’ in both data_table and daily_removed
UPDATE data_table
INNER JOIN daily_removed
ON data_table.PiNumber = daily_removed.PiNumber &&
data_table.DDate = daily_removed.DDate
SET data_table.Status='removed'
WHERE data_table.Status = 'active'

You’re not doing anything wrong. 1214 IS the amount of rows you should be updating. Just because you have dupes in the daily removed table doesnt mean your going to make an update multiple times in the data table.

If there are no duplicates in the data table, but there are duplicates in the daily removed, then an inner join between the two will always inflate your number greater than whats actually in your data table (WHICH IS WHAT YOUR UPDATING!) That’s why I gave you this code:

[[COLOR=#990099][B]select[/B][/COLOR]  [URL="http://dev.mysql.com/doc/refman/%35.%31/en/group-by-functions-and-modifiers.html"][COLOR=#000099]count[/COLOR][COLOR=#FF00FF]([/COLOR]data_table.PiNumber[COLOR=#FF00FF])[/COLOR]  [URL="http://search.mysql.com/search?site=refman-%35%31&q=FROM"][COLOR=#990099][B]from[/B][/COLOR]](http://search.mysql.com/search?site=refman-%35%31&q=SELECT) data_table WHERE data_table.PiNumber IN ( SELECT data_table.PiNumber FROM daily_removed GROUP BY  data_table.PiNumber )

Notice the group by clause? This eliminates dupes from daily removed. This query that I just gave you will return the CORRECT number of records that should be affected in your update query.

chemicaluser · October 10, 2012, 6:12pm

Hi,

Thanks very much, makes sense
I had to fix the code which adds data to the daily_removed table to prevent it from storing duplicates.