Is there a better way to relate a web page to a row in a database?

We currently use a CMS we developed in-house. It’s not a full CMS, as it just contains things like the contact name for a given web page, page title, and some other details, but not the content.

As it is now, each page is “linked” to its row in the database by the page URL (minus the doman name). Since this is a text column and not a numeric column, I know it’s not ideal, but I can’t think of any other way to do it. If I use an ID number, then it seems like I would have to have an ID number on each page (and there are around 60,000 of them) so that would be hard to maintain.

So far, the current method works, BUT our main table is starting to get pretty big, and the server people have asked us to try to reduce the sizes of our databases (we use Access, so each database is an MDB file). I was thinking that it would save space (and improve performance) to somehow make this a numeric field, but I can’t think of a way to do it, especially when each page is still its own individual file.

Anyone have any advice?

I’d use slug_name as your primary key. For example, take the url “http://localhost/blog/my_page_title”, I would store the page in the table as “my_page_title”. This would allow for some simple url hierarchy such as “http://localhost/blog/my_page_title/page2”. Just keep slug_named indexed and you’ll be fine (when you migrate to a real db service :D)

The problem with using just the last bit of the page path is that with 60k pages we’d have duplicates.

For instance, we could easily have these thee (and probably do!)

/planning/research/index.cfm
/environment/air_quality/research/index.cfm
/environment/climate_change/research/index.cfm

I don’t know how to make those unique without using the entire path as the identifier that links it to the page.

I do have that column indexed though.

Those 3 look unique to me… Either way I think its time for a little overhaul of your system

Well they are unique, but I thought you were saying to get rid of everything but the last bit. I guess I didn’t understand what you meant.

…and Google will drastically penalise your site.

Maybe consider using the current page title as a canonical link:

  1. introduce a new table->f_canonical column into your current enormous main table
  2. create a function->fn_canonical(table->title)
  3. function returns with no quotations and spaces replace with with a SEO friendly hyphens
  4. iterate through the table
  5. update table->f_canonical column with results from function->fn_canonical(table->title)
  6. on eof
  7. check for table->f_canonical duplicates
  8. if duplicates modify table->title and return to item 4.

When complete you will have a unique numeric record for each page->title and be able to use the canonical link:


  <link href="<?=$table->$canonical;?>" rel="canonical" />
  <title>Your Category, etc  | <?=$table->title;?></title>

…Google will be delighted and make you number one in all their related searches :slight_smile: