Multilingual CMS - how to design the database

Hi

I have the following question. I have my bunch of libraries - modules (I can’t name that a CMS, even it is look like one, since customer can edit almost all parts of his website), which I use in my everyday projects which are - customer websites.

I have the following modules: products, news, faq, services, projects and articles (which are actually pages like contact, about us, our history, facilities, privacy policy… you name it)

I live and work in Greece, hence standard webiste here is considered 2 languages, Greek and English. It happens however customer to request 3,4 or more languages. My libraries - modules and database tables are created for 2 languages, and if customer request more than 2 languages, I have to alter all the modules that have to be installed on the customer website.

For example, here is how FAQ Categories table look like:


CREATE TABLE IF NOT EXISTS `faq_categories` (
  `catid` int(11) NOT NULL AUTO_INCREMENT,
  `parentid` int(11) DEFAULT NULL,
  `categoryname` varchar(255) NOT NULL,
  `categoryname_en` varchar(255) DEFAULT NULL,
  `description` text,
  `description_en` text,
  `metatags` text,
  `metatags_en` text,
  `sorder` int(11) NOT NULL,
  `visible` tinyint(4) NOT NULL,
  `categoryphoto` varchar(255) DEFAULT '',
  PRIMARY KEY (`catid`),
  KEY `parentid_fk` (`parentid`)
) ENGINE=MyISAM  DEFAULT CHARSET=utf8 AUTO_INCREMENT=195 ;


Now, as you can imagine, if customer request one more language, I have to alter not only the database tables, but the php code as well, which on large project may be quite cumbersome.

Can anyone propose better database design than this? It would be nice to have somewhere langid field, and the record to be retrieved on base on that, but I have no clear picture of how that may work. Any idea or link for further reading will be deeply appreciated.

I post this question here on database forum because If I create the design of the database right, I will find my way in CodeIgniter (php).

Regards,Zoran

continue to use the existing structure for table relationships, but split off the language-sensitive columns into a one-to-many table…


CREATE TABLE faq_categories 
( catid         INTEGER NOT NULL AUTO_INCREMENT PRIMARY KEY
, parentid      INTEGER NULL
, sorder        INTEGER NOT NULL
, visible       TINYINT NOT NULL
, categoryphoto VARCHAR(255) 
, KEY parentid_fk (parentid)
);
CREATE TABLE faq_categories_langs
( catid         INTEGER NOT NULL 
, lang          CHAR(2) NOT NULL
, PRIMARY KEY ( catid , lang )
, categoryname  VARCHAR(255) NOT NULL
, description   TEXT
, metatags      TEXT
);

thus each faq_categories row will have multiple faq_categories_langs rows related to it, one for each language

but the main structural attributes (parentid, sort order, etc.) stay with the faq_categories table

The method proposed by r937 is the more ‘proper’ one where your data is normalized but in practice I find it easier to work with your original model where you have separate columns for each language - however, I’d end them all with language suffixes (_gr, _en) for better consistency when accessing them. The advantages are less complex queries (no need for joins), easier database viewing, faster queries (if that matters).

The disadvantage is more difficulty adding more languages, so if you want to leave room for expansion then use separate tables from the start. Another disadvantage might be you will need to put more logic in your client language (e.g. php) to query generation like adding language suffixes to column names - but depending on your setup this might not be a problem.

Whenever I am 100% sure a web site will never require more than 2 languages I go with single tables/multiple columns. I manage a fairly complex bilingual online shop with single-table approach and I find it convenient not to have all those separate language tables since there can be many of them in a large database - this results in faster management of the whole system.

Hi

Thank to both of you for your answers. I appreciate when someone spent his time to help me to solve my problem.

I will have difficult decision to make i guess. I was hoping for something better but…buh.

Anyone know how joomla has solved this problem? Any other open source idea that may be usefull?

Regards, Zoreli

When you search for “multilingual database design” you will find plently of ideas and also other approaches (but I think it’s enough if you choose from the two presented in this thread). Each one has its weaknesses and strengths so it’s not a simple yes/no decision. I don’t know about joomla but I would expect any ready-made content management system to use one of the normalized solutions like the one presented by r937.

I was searching on Google for multilingual database and bumped into this. I’m going to create the database as r937 suggested, so my database will look like this:


table Posts
---------------------
post_id (pk)
date_added
date_published
is_visible
trash
etc etc

table Posts_lang
--------------------------
id (pk)
lang_id
post_id
title
post_body

table Languages
---------------------------
lang_id
language
lang_code

So in order to fetch the data from the database, I will have to do JOIN queries, for example:


SELECT `posts`.`post_id`, `posts`.`date_added`, `posts`.`date_published`, `posts`.`is_visible`, 
`posts_lang`.`title`, `posts_lang`.`post_body`
FROM `posts` LEFT JOIN  `posts_lang` 
ON `posts`.`post_id` = `posts_lang`.`post_id`
WHERE `posts_lang`.`lang_id` = 1

This looks quite easy for me, but I have some blank spots that I’d hope to make them clear.

  1. how would the INSERT or the UPDATE be in this case and how should I create the page with the form ???
  2. Suppose that in case of cms-pages, faqs etc , like Zoreli said, we have a specific page translated in 2 languages, which is what we want.
    What If I have a post or an article that is only in one language? Do I have to use a dig=ferrent database schema for items like posts/articles in this case or keep the same?
    What I’m trying to say is that pages, faqs etc may be in more than one language, but posts/articles maybe only in one language.

easiest way to resolve this blank spot…

store posts and articles and pages and faqs all in the same table (obviously, you’d want a different name than either “posts” or “articles” or “pages” or “faqs”)

No, I’m afraid it wasn’t this my point.

  1. Let’s take for example the Posts. How should the form structure will look like? Do I have to create 3 different forms (one for the general stuff, one for the English content and one for the Greek content), or use just a single one to hold all my fields?

  2. All items will be storered on different tables, so pages on pages table, posts on posts table etc etc. All pages will have content in two (or even more) languages, but there maybe a post that will be only in one language. Do I have to use a different database structure in this case or keep the same one?

Hope that make sence now what I’m trying to say.

  1. forms design would be a different forum, sorry

  2. keep the same structure – and please seriously consider using one table for all posts, pages, faqs, etc.

I don’t get it. Why should I use one table for all posts, pages, faqs ??? they may not have the same structure , except title and body_content maybe.

i’ll bet they have plenty more in common

but hey, it’s your database, you make the call

:slight_smile:

I 've found on an article another option for the database structure that I’d like to discuss.

Example:


table: Pages
------------------------
page_id
title
content
etc etc

table: Page_translations
---------------------------------------
page_trans_id
page_trans_title
page_trans_content
lang_id

So what you can basicly do, is to store default language content inside pages table , as you would normally do in a no-multilingual scenario, and all translated contents store them inside the page_translations.
However I’m not quite sure how the queries should be build then in order to fetch the data.

What do you think of this option???

six of one, a half dozen of the other

What do you mean ??? Could you be more thorough please??

i mean they are approximately the same in pros/cons