Storing Hierarchical Data in a Database

Hi there.
I’ve been working on this problem for a few days now…
If anyone can answer this, it would be a huge help.
In the midst of creating a photo gallery with unlimmited sub categories…how can I find a parent’s ID provided that I have a subcategory ID.

For example,


function get_parent_id ( $sub_cat_id )
{
// process, process...
return $parent_id;
}

my table looks like this


ID | Name | Lft | Rgt | ParentID

Anyone? I’m even willing to give a small donation or whatever…

Thanks a bunch,
Dustin

Me again.
Brain Fart.

I played around a bit more and came up with this.


<?php
class find_parent
{
var $id;
var $left;
var $right;
	function find_parent ( $id )
	{
	$this->id = $id;
	$q = mysql_query("select m_c_left,m_c_right from multimedia_category where m_c_id='$this->id'");
		while ( $r = mysql_fetch_array($q) ) 
		{
		$this->left = $r['m_c_left'];
		$this->right = $r['m_c_right'];
		}
	}
	function get_values ()
	{
	$qq = mysql_query("select m_c_left,m_c_right,m_c_id from multimedia_category where m_c_right > '$this->right'
	and m_c_left < '$this->left'
	order by m_c_left desc limit 1");
		while ( $rr = mysql_fetch_array($qq) )
		{
		return $rr['m_c_id'];
		}
	}
}
?>

Using PostgreSQL, you can store the hierarchical path for each node in an array of smallints:

create table treedata (
id integer not null primary key,
node smallint not null,
name text not null
);

insert into treedata values (1, ‘{0}’, ‘Food’);

insert into treedata values (2, ‘{0,0}’, ‘Fruit’);

insert into treedata values (3, ‘{0,0,0}’, ‘Red’);

insert into treedata values (4, ‘{0,0,0,0}’, ‘Cherry’);

insert into treedata values (5, ‘{0,0,0,1}’, ‘Strawberry’);

insert into treedata values (6, ‘{0,0,1}’, ‘Yellow’);

insert into treedata values (7, ‘{0,0,1,0}’, ‘Banana’);

insert into treedata values (8, ‘{0,1}’, ‘Meat’);

insert into treedata values (9, ‘{0,1,0}’, ‘Beef’);

insert into treedata values (10, ‘{0,1,1}’, ‘Pork’);

To retrieve, just SELECT FROM treedata ORDER BY node ASC;

To retrieve a branch: SELECT FROM treedata WHERE node[1:3] = ‘{0,0,0}’;

This retrieves ‘Red’ and its children.

Updating is simply a matter of saying: UPDATE datatree set node[3] = 2 WHERE node[1:3] = ‘{0,0,0}’;

This puts ‘Red’ after ‘Yellow’.

Hope this helps.

Just further to my post about PostgreSQL arrays, you can easily find all the immediate children of a node too:

SELECT * FROM treedata WHERE node[1:2] = ‘{0,0}’ AND node[4] IS NULL;

This retrieves ‘Red’ and ‘Yellow’, but not their children.

Just further to my post about PostgreSQL arrays, you can easily find all the immediate children of a node too:

SELECT * FROM treedata WHERE node[1:2] = ‘{0,0}’ AND node[4] IS NULL;

This retrieves ‘Red’ and ‘Yellow’, but not their children.

Can you easily retrieve just the immediate children of a given node via modified pre-order tree traversal?

Suppose you want a series of web pages to walk down a tree, so that each page displays just a node and its immediate children (each child linking to a page that displays it and its children)?

One dynamic web page could handle this recursive algorithm with my array-based tree quite easily and efficiently.

Sigh!

Sorry, make all those queries with SELECT FROM… into SELECT * FROM…

It is hard writing comments in such tiny text fields!

I was just browsing around and found this article. Great theories but why would you ever want to store hierarchical data in a flat database? Unless of course it was the only choice you had it seems like a bad idea to me. The first way takes too much system resources and the second has to have the tree rebuilt everytime you add a row to the table. Icky. I’d say this is the perfect situation for an XML database. An article that explores the XML database options for heirarchical data and php would be a great companion for this article. Anyone know of a good one?

XML does have it’s uses, though in your point you could query the XML with XPATH though you’d be limited to that given data within the XML file, whereas you couldn’t (easily) query the data in one XML file against another one, suchas product catelog and products.

Two seperate though interwinded entities that is where the database has the upperhand… relational data in other words.

On your last question, XML is a great technology… For distribution but not for performing queries. There are a number of XML databases out there, but I cannot see any benifit of using one over a relational database if all your going to do is to perform lookups/queries :slight_smile:

I had a problem like this awhile ago.

Here’s my solution:
http://geniegate.com/art/pdf/tree-structured-relational-database.pdf

It was kind of nice to be able to do this:


SELECT 
  * 
FROM 
  directory 
WHERE 
  lookup_directory_ancestor_path(id) LIKE 'Blah/%';

And not clutter the program up with a bunch of tree handling stuff.

I was going to go with flat files at first, but decided I wanted relational after thinking about some of the other aspects (integrity checks).

You can also create database-side indexes, but in practice, I haven’t needed them. (I just created an index, then dropped it, to see if it’d actually work.)

The Modified Preorder Tree Traversal method solves a lot of the problems of the Adjacency list model but it also has a few of it’s own.

When using a tree as the navigation of a website (or folders in Windows Explorer) you rarely see all branches of the tree at once. Rather, just the current node, child nodes, parent node, parent-siblings and so on back to root. So using food as an example, if the current node is Red you would see Food > Meat|Fruit > Yellow|Red > Cheery.

This isn’t easy to do using the Adjacency list model either. But with the Adjacency list model you can easily query the children of a node (without getting grand children) and the siblings of a node. I can’t see how to query the same with the traversal method so I will probably use a combination of the two methods unless someone can enlighten me.

@Courtney Miles: Explorer-Style? That can be done quite simply using the same technique.

First of all, you have to create Node-Groups. In our example case there would be the Groups (Food) (Fruit, Meat) (Red, Yellow) (Cherry) (Banana) (Beef, Pork).
Now add two more variables (e.g. “GroupLft” and “GroupRgt”) to the database and use the Tree Traversal on the groups.
In our example, this would result in
1(Food)12 2(Fruit, Meat)11 3(Red, Yellow)8 4(Cherry)5 6(Banana)7 9(Beef, Pork)10
Your DB should now look like this
Name,Lft,Rgt,GroupLft,GroupRgt

Food,1,18,1,12
Fruit,2,11,2,11
Meat,12,17,2,11
Red,3,6,3,8
Yellow,7,10,3,8
Cherry,4,5,4,5
Banana,8,9,6,7
and so on…

When you now Query “Cherry” by its Lft+Rgt-Values, you’ll get the path. Querying by GroupLft+GroupRgt will get you a nice basement to build an “Explorer-Style”-menu.

About only getting children and not grandchildren: Simply introduce a “depth” field. E.g.
Food=depth 0
Fruit,Meat=1
Red,Yellow,Beef,Pork=2

In combination with Lft and Rgt you are now able to choose how many “generations” of children you want to fetch.
(e.g. finding Children of Fruit, but not Grandchildren: SELECT * WHERE Lft>2 AND Rgt<11 AND Depth=(Depth_of_Fruit+1))

In response to Courtney’s comment above. If you add an extra field called level, then you can find children by searching within the left/right values that a level of level + 1.

To find siblings you can just find all nodes with the same parent ID

There is an article about this here:

http://www.phpriot.com/d/articles/php/application-design/nested-trees-1/index.html

Using postgresql for a hierarchy, and I came up with this to find the immediate children of a node. Assuming that the parent node has:
nodeleft = $L
noderght = $R

<pre>
SELECT * FROM node AS a 
    INNER JOIN node AS b ON a.nodeid=b.nodeid 
    WHERE a.nodeleft>$L AND a.noderght<$R 
        AND (SELECT COUNT(nodeid) FROM node WHERE nodeleft<=b.nodeleft AND noderght>=b.noderght)=(SELECT COUNT(nodeid) FROM node WHERE nodeleft<=$L AND noderght>=$R)+1
    ORDER BY a.nodeleft ASC;
</pre>

First we have a table of all the children. In a separate table, we finds all nodes that are one level deeper than the parent node. We do this by first retrieving the path to the root, then counting the records in that path. Then it returns the common nodes of the two tables.

I don’t know enough about PostgreSQL’s internals to know how optimal this operation is. For a small number of nodes, it’s trivial, but it would take a significant amount of time on a large set if PostgreSQL isn’t smart enough to satisfy the first table first, which would significantly reduce the result set that it has to look at.

I haven’t tried Weirdan’s method (mentioned above) yet, because I don’t understand it. If anyone has better, please post it.

This code simply doesn’t work for me …

Very nice article. Thanks fo even just making me think again today. :slight_smile:

Unfortunately, the method described in this article is incredibly inefficient.

can someone list a practival use for this?
and explain why its inneficient

Hi ! … I would like to know how could I make a node that would have many parents …

Let’s say I have this:

fruits

Apples

Oranges

Strawberry

Colored food

Red

[INDENT]Apples ( I want this to link to the other Apples )[/INDENT]

Orange

[INDENT]Oranges (Same thing)[/INDENT]

Black

How can I do that ? I’m really having a hard time trying to figure this out ! :smiley: … It must be me !

annweb,
By your description I understand you need to have two trees?
Or… you may need to add an extra property for every node (color property).

You can contact me on Y!M by sebastianvasile, I’ll try to give you a hand.

Regards

Thanks w31rd0 ! :smiley:

But I found a way, about a week ago on how to do it :smiley: with one table … I just create a “ghost” category that will link to “apple” (for example) and that ghost category will be empty … Then with a good MySQL headscratching command I call the actual apple category instead of the ghost ! :smiley:

Neet huh ! ? :smiley:

Well I think it’s neet ! :aparty: