ORM & Lazy Initialization vs SQL Joins

I’m building my own ORM that has three kinds of relations, but for the many-to-one relation I have a question.

Imagine that we have two tables, articles and categories. Each article can belong to a single category, therefore this relation is a many-to-one relation.

How relations work in use:


$category = $this->Model( 'Categories' )->get( array( 'pk' => 1 ) );
echo $category->title;

$article = $this->Model( 'Articles' )->get( array( 'pk' => 3 ) );
echo $article->category->title;

The example above loads and fills two model instances, one instance of a row from the categories table, one from the articles table.

The category node on $article contains an instance of a row from the categories table, the same one as $category. The first and second echo statements output the same data.

Now here’s where it gets tricky!

I’ve thought up the following: What if instance of a row from a table is cached the first time it’s initialized and stored in a ModelContainer static class, and each time a new row instance is requested it tests if that specific row has been cached and if it is, returns a reference to the instance in cache (when it’s not it will instantialize + cache the newly created instance)

If you don’t understand the above, let me put it simpler. Imagine there are 500 articles in your articles table, each belonging to the 12 categories in the categories table. If we use cached lazy initialization the way I described, there could only ever be 12 instances of rows from the categories table, the rest would be references. Let me illustrate it with code:


$article = $this->Model( 'Articles' )->get( array( 'pk' => 3 ) );

// At this point, the category instance gets instantialized and
// cached
echo $article->category->title; 

$article2 = $this->Model( 'Articles' )->get( array( 'pk' => 4 ) );

// When the article belongs to the same category, 
// this time we reference the already instantialized row
// of the same category
echo $article2->category->title;

The alternative to this is to use SQL JOIN statements to pre-load the many-to-one relation’s data into each row. The downside of doing that is that it loads the data even when you don’t need or use it, where with lazy initialization it does extra queries, but only when you ARE using it (have accessed the relation field’s property), and it’s limited to a specific amount (due to cached instances)

I need opinions on whether what I had in mind is a good way to do what I need it to do, or compelling arguments against it/for using JOIN statements instead.

Hope to get some thoughts on this!

What you are describing is an Identity Map, and is a well-known and well-used design pattern. I say go for it. The one thing you may have to watch out for is when people make changes to loaded objects - they then differ from what would have been loaded from the data source.

Wouldn’t making the relation field’s members immutable resolve that issue? Would relation field members be modified in certain situations?

Anyone else have thoughts on this?

My personal thoughts on ORM’s is that they will never be able to cover all the functionality of raw SQL, so why use it?

Not everything should be OOP. The whole point of OOP is to model objects fom real world things. Now is giving some one a command an object? NO. SQL is like a command. SELECT * FROM table.

This is kind of besides the point of this thread but I’ll debate this just for you :slight_smile:

What you’re saying is entirely true, however my ORM is much more than simply a database ‘abstraction’. Sure, it can do easy fetching, easy saving/inserting and three simple relations but it doesn’t stop there.

In my ORM, each database field is mapped to a certain ‘modelfield’ class, classes which manipulate the end-data into it’s representational form (while making the raw value still usable for internal calculations/mutations/etc)

Let me show you an example:


ArticlesModel:
    score:
        type: FloatField
        round: 2

The articles model has a field named ‘score’, of type float. If you fetch a row and output this field, it will output it’s representational form:


$article = $this->Model( 'articles' )->get( array( 'pk' => 1 ) );
echo $article->score; // 3.14

But if you were to do calculations with it or comparisons, it will use it’s raw (full) value:


$article = $this->Model( 'articles' )->get( array( 'pk' => 1 ) );
if( $article->score < 5 ){
    // $article->score's value for this test is  3.14159265 here
}

It works the other way around too, imagine the articles model has a field ‘icon’:


ArticlesModel:
    icon:
        type: FileField
        uploadpath: /path/to/upload/
        pathurl: /upload/
        allowed: .jpg, .zip
        maxsize: 2M

When saving an article, you can directly pass the $_FILES key of the icon upload, and it will upload the file (after validating other criteria) to the uploadpath value


$article = $this->Model( 'articles' )->new();
$article->icon = $_FILES['icon_upload'];
if( !$article->save() ){
    // Do error handling here
}

And when working with the data from the database, representational data will show the URL to the file:


echo "<img src=\\"{$article->icon}\\" alt=\\"\\" />"; // <img src="/upload/icon2.jpg" alt="" />

When using it in raw data context it will prepend the full path to the file:


$filename = $article->icon; // /path/to/upload/icon2.jpg

So as you can see, it’s more than ‘just’ a database abstraction, hell you can even perform your own queries and have it fill the model objects with the results (to the best of the system’s abilities):


$results = $this->Model( 'articles' )->raw( "SELECT * FROM articles" );

Hope that I’ve explained it a bit for you :wink:

@Ruben K.
Really inspired the way it works.
I have some questions:
1> do the form is automatically generated?
2> how did you manage the model attributes like you have mentioned:
ArticlesModel:
icon:
type: FileField
uploadpath: /path/to/upload/
pathurl: /upload/
allowed: .jpg, .zip
maxsize: 2M
3> Have you released your ORM as open source?

Thanks

Thank you for your response, I really appreciate it :slight_smile: To answer your questions:

1: Still working on that, it’s something where I want to go after the first version since you can already use it for form validating. Not sure on how it will work yet but I’ll come up with something!

2: Imagine that all form fields are an instance of a formfield class. So FileField is an instance of FileField, and all the other options are handled by that class purely. For example the ‘uploadpath’ property is stored in the class as a private member, while ‘allowed’ or ‘maxsize’ are validation classes loaded by the FileField’s validation() callback (called on save())

3: Originally I started building it for my own PHP MVC framework, but there will be an opensource release as soon as it’s stable :slight_smile:

Hope that answers your questions :wink:

Yes. definitely.
But regarding #2>
2> how did you manage the model attributes like you have mentioned:
ArticlesModel:
icon:
type: FileField
uploadpath: /path/to/upload/
pathurl: /upload/
allowed: .jpg, .zip
maxsize: 2M

I was mean to ask is such config is array bases or xml bases or else

Rest is clear.

Hope to hear more from you about your ORM & Framework in near future.
(y)

You mean what language the model files are? YAML :slight_smile:

Yes i got it finally.

The command isn’t what’s being modelled, it’s still the object. All the ORM is doing is automating the process of describing the fancy object to the SQL server.