PHP Mysql - Merge Tables Except Duplicates

I have a two sports db tables. I would like to merge both without duplicate entries as defined by one of the columns – $row2[2]', – as shown below. I think I have my code correct, but when I run the script, it just stalls … the program appears to be processing, but the browser timer icon just spins. When I refresh the database, no rows between the two tables have been updated or copied.

I figured that as the script ran, it would more or less update the table automatically, and these updates would be visible upon refresh.

Any help would be greatly appreciated.



<?php
require_once('config.php');

// Give the script unlimited time to run
set_time_limit(0);

// Report all errors except E_NOTICE   
error_reporting(E_ALL ^ E_NOTICE);

// Set Error Handler treat Errors as Exceptions
set_error_handler("exception_error_handler");

// Check if Form is Submitted or not
if(isset($_POST["submit"])) {
    
        // Read all rows from second table (Source / appended table containing many duplicate rows)
        $sql2 = "SELECT * FROM `".$DB_TABLE_NAME_2."`";

        // Execute query and get result set
        $result2=mysql_query($sql2);
                    
        // Check for result
        if($result2) {
            // Get number of rows found
            $num_rows2=mysql_num_rows($result2);
                        
            // Check number of rows
            if($num_rows2>0) {
                        
                // Loop each row from the result set
                while($row2=mysql_fetch_array($result2)) {
                    
                    // Find duplicate data in first table (Destination / Original table containing unique rows).
                    $sql1 = "SELECT `".$DB_TABLE_VOTER_BASE_COLUMN_NAME          // Column from Destination File
                            ."` FROM `".$DB_TABLE_NAME_1                                    // Destination File
                            ."` WHERE `".$DB_TABLE_VOTER_BASE_COLUMN_NAME            // Column from Destination File
                            ."` = '".$row2[$DB_TABLE_VOTER_BASE_COLUMN_NAME]."'";       // Column from Source File

                    // Execute query and get result set
                    $result1=mysql_query($sql1);
                    
                    // Check for result
                    if($result1) {
                        // Get number of rows found
                        $num_rows1=mysql_num_rows($result1);
                            
                        // Check number of rows
                        if($num_rows1==0) {
                            
                            // Add new record into first table from second table 
                            $sql = 
                            "

                            INSERT INTO `($DB_TABLE_NAME_1)` 

                            (
                            `id`,
                            `ranker`,
                            `referee`,
                            `position`,
                            `refereeStyle`,
                            `baseLoad`,
                            `baseCount`,
                            `baseAverage`,
                            `eraClass`,
                            `refereeTraining`                
                            ) 
                            
                            VALUES 
                            (
                            '$row2[0]',
                            '$row2[1]',
                            '$row2[2]',
                            '$row2[3]',
                            '$row2[4]',
                            '$row2[5]',
                            '$row2[6]',
                            '$row2[7]',
                            '$row2[8]',
                            '$row2[9]'
                            )
                            "
                            ;
                            
                            // Execute insert query
                            mysql_query($sql);
                        }
                    }
                
                    // wait for 500 milliseconds (0.5 Sec)
                    waitFor(500);
                
                }
            }
        
        }
    
    echo "<br>Completed.<br><a href='index.php'>More.</a>";
} else {
    header('Location: index.php');        //Redirect to index.php if Form not submitted
}

// Function to Delays program execution for the given number of milliseconds.
function waitFor($s) {
    usleep($s*1000);    // Value in microseconds
}

// Function to handle Errors
function exception_error_handler($errno, $errstr, $errfile, $errline ) {

    // Throw new Exception on Error
    throw new ErrorException($errstr, $errno, 0, $errfile, $errline);
}

?>

How many rows does this query produce? You seem to be sleeping 0.5 seconds for each of them.

Also, I think you should be able to achieve what you’re looking for with a single query using a left join - worth reading up on :wink:

Also also - running queries inside a loop is always painful!

1 Like

The query will result in about 100K rows. Too many?

I decreased the time from 500 to 10 and from 1000 to 100. Still no luck.

IF I’ve understood your problem correctly (that you only want rows from table2 where there isn’t a match on that $DB_TABLE_VOTER_BASE_COLUMN_NAME column in table1), then this query should find only the rows from table2 that you’re interested in.

Note also, that I’m assuming the columns are identically named but maybe that’s wrong - by all means modify the select fields as required! :wink:

SELECT 
t2.`id`,
t2.`ranker`,
t2.`referee`,
t2.`position`,
t2.`refereeStyle`,
t2.`baseLoad`,
t2.`baseCount`,
t2.`baseAverage`,
t2.`eraClass`,
t2.`refereeTraining`
FROM " . $DB_TABLE_NAME_2 . " AS t2
LEFT JOIN " . $DB_TABLE_NAME_1 . " AS t1
ON (t1.`".$DB_TABLE_VOTER_BASE_COLUMN_NAME` = t2.`".$DB_TABLE_VOTER_BASE_COLUMN_NAME."`)
WHERE t1.`".$DB_TABLE_VOTER_BASE_COLUMN_NAME."` IS NULL

The translation of that query (if I’ve written it correctly!) is "Find all of the rows from table 2 and optionally find matching rows from table 1 where the $DB_TABLE_VOTER_BASE_COLUMN_NAME values match. And then, use the where clause to invert it - i.e. ignore the rows where there is a match and just show me the rows where no match exists.

Something else to note that is that ‘id’ column - is it a primary key? You’ll most likely end up with collisions if you try to insert an id value from table2 that’s already present in table1.

1 Like

Er, yes - at 0.5 secs per row, that’s a total sleep time of nearly 14 hours :wink:

You can do all that in a single SQL call. @RavenVelvet has given you the select part and all you need to do is to wrap an insert around that to update the other table with the results from that select. With only 100000 records it shouldn’t take more than a few seconds to run as the database will be able to optimise the process far better without any PHP to slow it down.

If you’re not already aware you should be aware that the mysql_* extension is deprecated as of version 5.5 of PHP and is being removed in version 7 of PHP

2 Likes

Yes

“(that you only want rows from table2 where there isn’t a match on that $DB_TABLE_VOTER_BASE_COLUMN_NAME column in table1)”

When I run this with IS NULL I get rows returned = 0.

Table 2 has 800K rows with one row I want in Table 1. Table 1 has 100K rows of complete information. So it’s not as if the rows = NULL. They simply do not exist. Will try WHERE NOT EXISTS and see what happens. I’m really new at this.

My apologies, but how do I wrap it an insert around the select query?

Just prefix the SELECT with the first half of a regular insert statement… Something like this

"INSERT INTO `" . $DB_TABLE_NAME_1 . "` (col1, col2, col3.....)  <-- specify col names correctly
SELECT ....  <--- This is where the select query goes

Effectively what you’re doing is replacing the VALUES (...) part of an ordinary INSERT statement with a SELECT statement that will provide the values instead.

2 Likes

This topic was automatically closed 91 days after the last reply. New replies are no longer allowed.