Handling telephone numbers

I’m trying to work out the best way to do this. I’ll be dealing with UK telephone numbers which will all be stored in the format: 0xxxxxxx… so no country code.

On searching about this it seem the consensus is to store telephone numbers as strings but I can’t help but think storing as a number may solve some problems while also speeding up searches and reducing storage size!?

The only thing I’ll be doing is storing, searching and displaying those numbers.

One of the problems is spaces. For example 01234567890 could be entered as 01234 567890 or 01234 567 890. So I’d need to handles that when doing a search.

If I stored the numbers as an INT then I just need to handle the leading zero.

What do you think?

No expert here, but I’d probably just store them as a string, and strip the spaces (and parentheses/dashes/dots, at least for our numbers), storing just the characters. Figure out how you want to enforce the minimum and/or maximum string length to protect against someone entering the wrong number of digits. Not being that familiar with your phone number formatting, our equivalent would be if some entered less than seven digits (incomplete even for a local number), more than 10 (area code plus local number), or more than seven but less than ten (incomplete area code and/or some portion of the local number).

I believe your subsequent search queries would need to also strip the spaces, dashes, dots, etc. from the user-inputted string so that it would match the value stored in the table.

Alternately you could program things to where the value is stored as a formatted string, with spaces, etc. in the appropriate places…

store both the stripped number and the formatted number

as strings, of course

searching an index on a 10- or 15-char string is not going to be noticeably slower than searching on an integer index

Ok thanks. So why would storing as a number be a bad idea in this case?

Storing as a number doesn’t allow for any leading zeros on the front of the phone number.

Using VARCHAR column ensures that it can accept several possibilities of phone number formats as well. Like those with:

  • Leading zeros
  • Spaces
  • Hyphens and braces
  • Even alphabets

However, you can always filter the user input if you strictly want to follow a format. Otherwise, you can respect the user input (by not saying your phone format is wrong, retype the correct one). Just assume user input is valid, unless it is filled by a spammer.

That’s the thing, if I’m ONLY using numbers I won’t have any of those issues except the leading zero which can be added for display purposes.

The problem would be that there is no way when a phone number is stored in a number field to tell whether or not it has a leading zero and how many leading zeros it should have - eg. the phone number 000 would be stored simply as zero and you’d have no way to tell that it is supposed to be three zeros.

All UK telephone numbers have one leading zero.

Better imv, to store the leading 0, regardless. So when you display it in your webpage, you can process the dropping of it, if you are to show +44, for example.

If there was no 0, then would you not have to hard code that process of adding/dropping the 0, in your script?

bazz

Yes baz, but that seems quite trivial and straight forward considering that whichever method I use I will have to manipulate the data for display and search purposes anyway.

no you wont :slight_smile:

see post #3 – store bofadem :wink:

r937, I’m not sure storing bofadem! is a good idea, because all of a sudden we have added an extra complexity to data integrity, where any insert and future updates will have to make sure that both fields are holding the correct value.

Also data will need to be manipulated either way because for example:

record_field = 0123456789
formatted_field = 0123 456789
search_field = 0123 456 789

The search_field will need to return a match, so there’s no getting away from it.

seems you have three columns, not just the two that i suggested

You can deal with single column as well.
format your searched numbers according to what way you save them in the database.
Whle showing / printing, format the phone data according to second field. That is just for a display and not stored.

I think, only one field is enough to hold.

a telephone number is NOT a number - there is no meaning to the concept of adding, subtracting multiplying or dividing them. A phone number IS a string of characters, from the domain of characters, which just by coincidence all happen to be in the range 0123456789. But they are not numbers. So don’t use a numerical field.

storing as a string gives you options on the way the data is entered that at the present moment you may not have thought about or think you don’t need. for example we all split up phone numbers into groups such as 01324 897 465 because this makes reading them and remembering them easier. While at present you think that 01324897465 is easier, at some point in the future you or another user may choose to enter 01324-897-465 or use spaces, which will not be accepted. Your idea is universally considered wrong, so why choose something that everyone says is wrong.

Now the phone number example you have forgotten about:
01324 897 465 ext 2305

think about that one as a number…

I would do it in 1 column - at least at the beginning. Either store the unformatted number (digits only) or formatted (digits + spaces and other extra characters) depending on your needs.

When using the latter approach use a function while searching that will strip out special characters. In the future, if the database grows large and you need to index the numbers for searches then use another column as r937 suggests. Or do it immediately if you want to be prepared for that already.

I have a similar situation in a client table but with tax id numbers - they are stored as entered by users (sometimes with dashes and/or spaces). The table now has 12000 rows and doing a search for the number using REPLACE functions is lightning fast - so I’m not even considering the need to add another column just to be able to use index. But it may depend on how much data you have in your rows - for example, if each row stores large amount of data in TEXT or BLOB columns then doing a full table scan over 12000 rows can be time-consuming. My client table has only 12 varchar and 2 numeric columns, and occupies 1.5MB so that is not an issue and I expect it won’t be still for quite a long time.

If you read this wikipedia article on UK phone numbers you’ll see that there is a variety of formats so your probably best of having two fields (or one field if it’s just going to be UK numbers only) 1 field would be the 2 digit country code and the other would be a varchar field with any space and non-nuemric characters stripped out.

On the server-side language part using php as an example, you’d strip any non-numeric characters from the phone number and send that to the database after having first sanitized the number. The database would then be doing a simple comparison of strings.

Any formatting of the phone number could be done in php basically using a switch-break block.

I should have been a bit clearer in the example - the search_field is a user input field not a database field.

I agree, store one field and format it for storage and display purposes.