Matching in records

Hi,

I am trying to something like below and hope some guru can advise how to accomplish this :

In my db, I have a column with all random numbers : e.g : 3213, 1234, 1235, 3132, 8231 and etc, with more than 300 records

Now I want to do something like match all the numbers like 3213 and permuted numbers of 3213 with all other numbers and count +1 if the other numbers matched other rows in the table.

e.g :
I have these rows in the table :
1)3231
2)3211
3)1213
4)1231
5)1234

I am thinking to do something like this to get what I want : We take 1)3231 as an example :
First, I will get all permuted numbers for the 1) 3231
*3231

  • 3213
  • 1323
    and etc

then assign each of the number into an array

from the array, I will do another round of matching with all other records from other rows for each number stored in the array.
e.g : take *3231 to match with other rows in the table
2)3211
3)1213
4)1231
5)1234

if not found, then count is 0

then move on to the second permuted number *3213 to match with other records
2)3211
3)1213
4)1231
5)1234

Same goes to the others.

After all the permuted number for 1) 3231 have been matched with other rows, we will move on to the other row number - 2) 3211 and do the same process all over again… get all permuted numbers for 3211, assign to array and do matching with all other rows.

I am wondering is there any more simple way to accomplish this function?

Hope gurus can get what I am trying to do and share with me your ideas. Thanks

How often do you need to run this process?

It’s much easier (and faster) to implement an algorithm that complex in a regular programming language than a query, and 300 rows is very few.

quite frequent… sometimes we will have more results.

If I’m following your logic here, I would return all rows to the application and use a trivial regular expression (or if your DBMS supports them, such as Postgres does).

Using “3231” from your example, you would just construct the following regex (“[1]+$”) which would match any value that only contains 3’s, 2’s and 1’s. If the value is of a fixed length (in this case 4), change the above to “[2]{4}$”.

Cheers,
D.


  1. 3231 ↩︎

  2. 3231 ↩︎

thanks.

tried this one :

if (preg_match(“^12+$”, “1234”)) {
echo “<br><br>yes”;
} else {
echo “no”;
}

but not working… ;please advise. thanksl

Not working how?

^12+$ means a string beginning with the number 1, followed by one or more instances of the number 2, then ending

Matching strings would be in the form of 12, 122, 1222, 12222, 122222, 122222…

^12+ would match 1234
^12[0-9]+$ would match 1234
[1]{4}$ would match 1234 and also 1111, 2222, 3333, 4444, 1222…

You might find this online regular expression tester useful


  1. 4321 ↩︎

I tried this :
if (preg_match(“[1]{4}$”, “1234”)) {
echo “<br><br>yes”;
} else {
echo “no”;
}
?>

then receive this error :
Warning: preg_match() [function.preg-match]: No ending delimiter ‘^’ found in C:\wamp\www\preg\index.php on line 2


  1. 4321 ↩︎

The entire expression needs to be surrounded by delimiters, which can be any character you won’t use in the expression. Forward slashes and pound signs are common as other languages use them by default.

if (preg_match("#^[4321]{4}$#", "1234")) {

thanks. it is working fine now.

just need some explanation from you if u have time… may I know where I can find more information when to use the following symbols in the function :

^
[
$

thanks

has no meaning in the example, it is just used to mark the beginning and end of the pattern

^ matches the start of a string, so ^1234 does not match the string 51234

encloses a group of characters, + means match one or more instances of anything in the group

$ matches the end of a string, so 1234$ does not match the string 12345

thank you very much.

so this one :

+ is used to do matching like permutation? so basically the + here is the {4} that I used right ?

4 kind of number positioning ?

thanks in advance .

[1234]{4} means match any substring exactly 4 characters long consisting of characters in the group

note that 1234 is the set of characters, not one number

1111 is a string of exactly 4 characters that matches the group in the bracket

This is not a permutation. Permutations in regular expressions are extremely difficult – the length of the expression increase exponentially with the length of the string to permute. You can’t write it succinctly.

That’s why I originally suggested you write a program to do this rather than do it in a query. You can’t easily write a query that finds permutations.