Updating lookup tables using PHP / MySQL - best practice?

Hi,

I am in the process of creating a web site where each user has a number of roles (can be any number) and these roles are stored in a lookup table similar to the “Build your own database driven website…” sitepoint books which describes a “many-to-many lookup table”.

Each user has an ID (users table)

A lookup table contains their ID and a “role ID” - some users will have multiple rows if they have several roles:
mysql> select * from rolelookup where userid=1;
±-------±-------+
| userid | roleid |
±-------±-------+
| 1 | 2 |
| 1 | 3 |
±-------±-------+

Role category table contains roleid and roleCategory - just describes those roles - used to create HTML form which is generated dynamically.

The Problem.

I can can select the roles and display these as check boxes, if the role check boxes are updated / changed I can collect the new roleids in a PHP array. How would I go about changing this in the lookup table? would I delete all userid=1 records and then reinsert the changed roles? Also I might have a number of users with the same roles and these would then be duplicate data in this table but assigned to each users. Could you create instead of userid and rolegroupid that relates to a particular role grouping?

Any ideas, what is the best way to do this? I thought deleting and reinserting which I could do was a bit crude - maybe it is the best way? Any ideas about changing this to a rolegroupid?

Sorry if this is not explained that well - I am learning the MySQL and PHP as I go.

I look forward to some comments / advice

Thanks,

James

deleting and reinserting is both the simplest and easiest way to do it

sorry, don’t know what you were doing with the rolegroupids

If you care a bit about performance, you will only DELETE those records that need to be deleted, instead of all of them.

It would be a bit more code, but will run faster (especially if you have some indexes on those fields).

But depending on how many roles you have/if you ever search by role, you might want to just make a bit field, and do one update. It takes way less db space, and it’s faster/simpler to work with.

a bit field???

oh NOOOOoooo…

:eek:

I could have 2 or more users with the same role combination - I thought if a rolegroupid was used instead of userid I only need one record for that particular combination linked to a number of users records (table: users) instead of lots of users with the same combination - does this make sense?

Thanks

James

I have both fields in the lookuptable indexed / primary keys. I am keen to keep performance good to reduce the load as a lot of the content in driven from MySQL. Both fields are set as INT(6) and INT(2) respectively - have not come accross a bit field so will check this in MySQL manual.

Thanks,

James

not yet… perhaps you could do a SHOW CREATE TABLE for all the tables invovled

I am at work so I will post table details when I get home
Thanks,
James

Please see tables below, Regards, James.

| users | CREATE TABLE users (
userid int(25) NOT NULL auto_increment,
username varchar(40) NOT NULL,
password varchar(40) NOT NULL,
labid varchar(6) NOT NULL,
email varchar(80) default NULL,
fname varchar(40) default NULL,
sname varchar(40) default NULL,
dtele varchar(14) default NULL,
address1 varchar(40) default NULL,
address2 varchar(40) default NULL,
county varchar(40) default NULL,
postcode varchar(9) default NULL,
country varchar(40) default NULL,
disciplineid int(2) default NULL,
registered datetime default NULL,
lastsess datetime default NULL,
sessnum int(4) default NULL,
lognum int(4) default NULL,
priv varchar(1) default NULL,
position varchar(2) default NULL,
PRIMARY KEY (userid),
UNIQUE KEY username (username),
UNIQUE KEY email (email)
) ENGINE=MyISAM AUTO_INCREMENT=2 DEFAULT CHARSET=latin1 |

rolelookup | CREATE TABLE rolelookup (

userid int(6) NOT NULL,
roleid int(2) NOT NULL,
PRIMARY KEY (userid,roleid)
) ENGINE=MyISAM DEFAULT CHARSET=latin1 |

±-------±-------+
| userid | roleid |
±-------±-------+
| 1 | 2 |
| 1 | 3 |
±-------±-------+
2 rows in set (0.00 sec)


| rolecat | CREATE TABLE rolecat (

id int(2) NOT NULL auto_increment,
role varchar(30) default NULL,
PRIMARY KEY (id)
) ENGINE=MyISAM AUTO_INCREMENT=5 DEFAULT CHARSET=latin1 |

±—±-----------------+
| id | role |
±—±-----------------+
| 1 | CPD Officer |
| 2 | Safety Officer |
| 3 | Training Officer |
| 4 | Quality Officer |
±—±-----------------+
4 rows in set (0.00 sec)

i was hoping you’d show the rolegroup stuff, which was the part i didn’t understand

:slight_smile:

by the way, INT(2), INT(6), and INT(25) all hold exactly the same range of numbers

If both your fields are indexed, then you want to do as little changes as possible on them, since every change will rebuild your index.

For the bit-masks, you can see a simple example here: http://blog.fragfrog.nl/archive/122/Bitmask%20access%20levels/
The idea is that you only have the user_id indexed, and the role is all just one value, that holds all that user’s roles.
So, when you change a role, all you do is update one INT field, and there is no need to update the indexes and so on, plus less traffic on the db.
On the bad side, you will be limited to a maximum of 64 different roles, and searching based on roles will be slower.

On another note:
INT(6) and INT(2) is the same thing… still takes the place of an INT in your DB.
The 6 and 2 is there so it pads your smaller numbers with spaces to pretty up your results (basically useless for what you need).

See here: http://dev.mysql.com/doc/refman/5.0/en/numeric-types.html for what you actually want to do (probably MEDIUMINT unsigned and TINYINT unsigned)

[QUOTE=r937;4560403]i was hoping you’d show the rolegroup stuff, which was the part i didn’t understand

The rolegroup was an idea that in addition to userid I would have a rolegroupid that reference the lookup table for a particular combination of roles - 4 roles would have 4 rows in the lookup table relating to the rolegroupid. Therefore users with the same rolegroup id would reference the same roles instead of essentially their own copy of the same roles.

That way if the rolegroup was changed and the new group exists the reference would change if it was a new combination then that combination would be inserted into the lookup table - but it could be a number of inserts.

Do you think it would be better just to store the roles as an array or bit-mask in the users table - have I been overcomplicating things?

Thanks for pointing our the integer type I think I will need TINYINT.

Thanks for pointing me at the reference docs for numeric-types.

So is a bit-mask similar to storing these in an array but the mask uses less resources? I am keen not to overwork the database so this sounds like a good solution - I will never reach 64 roles.

with the operative part being “sounds like”

there is really no reason to resort to complex storage strategies to save a few bytes

not when you trade off simplicity

why don’t you mock up a few queries that would use bit operators to find those users with a certain combination of roles

i guarantee you it is not as easy as just storing foreign keys

not to mention that if you plan to do your bit string manipulation with an application language like php, you’re pretty much resigning yourself to table scans, i.e. your app won’t scale

all to save a few bytes

this is not the 1970’s, where a megabyte of storage cost you six figures

you can buy gigabytes for pennies, nowadays

I feel more comfortable with the queries than storage strategies and working though it will help to increase my database knowledge.

Do you think a rolegroupid would be a better approach (rather than having multiple users with the same role records) and will having 2 indexed fields in the lookup table slow things down significantly?

Regards,

James

that depends on the number of different combinations, doesn’t it :slight_smile:

no

:slight_smile:

It all depends on how your application is codded, and how those roles are used.

If you need to search on those roles (show me all users who can do X), then it’s better to have another table with records for each role set on each account.

If your only using roles to allow logged in users to do X or Y on your site, then bit-fields are better. (you load them at login, then use them for the rest of your site).

And the idea of bit-fields is not the size on disk, since as pointed out, 1TB HD goes for 100$ now. It’s the performance gained when your system gets bigger.
You gain this performance by not having to rebuild indexes on every update (see how long it takes to remake an index on a 35mill records table), and network traffic on select / update (master/slave setup or DB server separated from your web server).

It all depends on how you use the system, how big it is and how you plan to expand it.

Thanks all for your help, I will try with the lookup tables and a rolegroupid. I will look at the bit mask idea for my user login as currently I have a basic system that could be improved.

Lots of work and reading to do but I have certainly leant a lot from this post.

Thanks,

James

Hi, I have sorted this out now and retrieve the values from the lookup table (a user may have any number of rows depending on the number of roles) which are then stored in an array used to populate check boxes (using PHP). When the user submits the form the new roles are stored in another array. Both arrays are then compared to decide which roles are added or removed from the lookup table.

My actual question is regarding bitmaps - I am looking to store a set of standards for each record (a professional development record) - there are four standards and each record may contain any combination of these standards. To me this looks like bitmaps would be useful, I have also discovered the MySQL “set” type which as I read it is literally the same thing as a bitmap - is this correct?

Regards,

James

Not true. Indices are stored in a sparse tree, meaning that not each every row in the table will be in the index, so not each and every insert/update has to update the index.
Secondly, the index is only rebuilt if the tree gets too unbalanced. This will certainly not occur on each insert/update that does get applied to the index.