New PHP Data Mapper Library

Lachlan · July 4, 2010, 12:19pm

Been playing around recently with a PHP 5.3 data mapper library, would be interested to hear any feedback.

Still very early days, relationship support is still in proof-of-concept phase, but I’d be interested to hear what you guys think of the usage examples.

I’m well aware that there are two other excellent PHP 5.3 ORMs, both very similar to their Ruby counterparts (and both excellently engineered).

Basically where mine differs is in that I’ve tried to keep simple the declarations (which are in PHP), the database dependency (which is MySQL and InnoDB specific) and to build for a very light memory footprint.

sunwukung · July 14, 2010, 12:41pm

I’m not sure, that’s why I’m asking you lot…

In that respect, is the chaining method TomB described earlier a step towards this?

lastcraft · July 14, 2010, 10:50am

Hi…

How is that a bug? That’s exactly what you would expect from a reference counting collector.

The “fix” of manually calling __destruct() is bogus. It means anyone else holding the reference will get the destructor run even if you haven’t finished with it. Very mysterious behaviour, especially if it gets run twice.
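A small sketch of why the manual call is misleading. The `Tracked` class and its counter are hypothetical, made up only to count destructor runs: calling `__destruct()` is an ordinary method call, so the object stays alive for anyone else holding it, and PHP still runs the destructor again once the last reference goes away.

```php
<?php
// Hypothetical class, used only to count destructor calls.
class Tracked
{
    public static $destructCalls = 0;

    public function __destruct()
    {
        self::$destructCalls++;
    }
}

$first = new Tracked();
$second = $first;        // a second handle to the same object

$first->__destruct();    // the "fix": runs the destructor body, frees nothing
$stillAlive = $second instanceof Tracked;  // true, $second still holds the object

unset($first);           // reference count 2 -> 1, nothing happens
unset($second);          // reference count 1 -> 0, PHP runs __destruct() again

echo Tracked::$destructCalls, "\n";  // prints 2
```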

A better solution is to not hang on to such references if you are creating a lot of objects. Pass by value, or pass the reference into the method so you only have linkage for the life of the method.

The fewer invisible threads connecting things the better. Linking everything to everything else may save a few function parameters and may look superficially neat. Longer term the code is much harder to reason about. It’s not just the garbage collector that will get it wrong.

yours, Marcus

sunwukung · July 14, 2010, 8:55am

I figured as much - I’ve been rolling my own as well - mostly out of academic interest. Writing a Data Mapper is an interesting project, so many issues to resolve: magic or concrete accessors, convention vs configuration, Active Record or Datamapper.

My wife finds me completely unintelligible these days - although sometimes she does say “Is that from Fowler?”.

Something’s bothering me about passing the Mappers to Entities. Is there a potential for a memory leak? Is this a potential issue for DI too?

http://paul-m-jones.com/archives/262
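To make the question concrete, here is a rough sketch of the kind of cycle being asked about. `LeakyMapper` and `LeakyEntity` are hypothetical names, and the identity map is an assumption about how a mapper might hold on to its entities. Reference counting alone never frees the pair, because each keeps the other's count above zero; on PHP 5.3 and later the cycle collector can reclaim them, which earlier versions could not do.

```php
<?php
// Hypothetical classes: the mapper keeps every entity it created,
// and each entity keeps a reference back to its mapper.
class LeakyMapper
{
    public $identityMap = array();

    public function create()
    {
        $entity = new LeakyEntity($this);
        $this->identityMap[] = $entity;
        return $entity;
    }
}

class LeakyEntity
{
    public static $destroyed = 0;
    private $mapper;

    public function __construct(LeakyMapper $mapper)
    {
        $this->mapper = $mapper;
    }

    public function __destruct()
    {
        self::$destroyed++;
    }
}

gc_enable();

$mapper = new LeakyMapper();
$entity = $mapper->create();
unset($entity, $mapper);   // no variables left, but mapper and entity still point at each other

$afterUnset = LeakyEntity::$destroyed;    // reference counting alone has freed nothing
gc_collect_cycles();                      // PHP 5.3+ cycle collector
$afterCollect = LeakyEntity::$destroyed;  // the entity's destructor has now run

echo "$afterUnset then $afterCollect\n";  // prints "0 then 1"
```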

Lachlan · July 14, 2010, 12:03am

Pheasant uses __get to dynamically traverse relationships. I assume TomB is doing something similar.

sunwukung · July 12, 2010, 9:38pm

DoH! sorry - post #12 - I forgot to scroll. thanks for humouring me!

Lachlan · July 8, 2010, 8:48am

Marcus,

Point taken on the fact that using Pheasant with existing objects is difficult. It’s an interesting problem, one to which I am not sure I can find a lot of good solutions without introducing a lot more complexity. After some consideration, I think you and kyberfabrikken are right: it’s probably not strictly a data mapper. However, I am ok with that. Looking forward to what Traits bring to the table in PHP 5.4.

You’re right, the Pheasant class is a singleton. As a general rule I hate Singletons with a fiery passion, mainly because of what they do to testability. In Pheasant, I was trying to leave the domain object constructors alone (even if it’s a bit hacky), which precluded DI for the pheasant object into the domain object constructors. At the minute the static pheasant class is basically a Locator for a PheasantInstance, which can be replaced, mocked, etc as needed. Point taken though that it’s got a lot of downsides. Will ponder this.
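For anyone following along, the locator idea looks roughly like this. Every name here (`MapperLocator`, `MapperInstance`, `FakeMapperInstance`) is hypothetical and not Pheasant's actual API; it is only a sketch of a static access point whose instance can be swapped out in tests.

```php
<?php
// Hypothetical stand-in for the object the static class hands out.
class MapperInstance
{
    public function describe()
    {
        return 'real';
    }
}

// Test double that can be swapped in.
class FakeMapperInstance extends MapperInstance
{
    public function describe()
    {
        return 'fake';
    }
}

// Static locator: global access, but the instance behind it is replaceable.
class MapperLocator
{
    private static $instance = null;

    public static function instance()
    {
        if (self::$instance === null) {
            self::$instance = new MapperInstance();
        }
        return self::$instance;
    }

    public static function replace(MapperInstance $instance)
    {
        self::$instance = $instance;
    }
}

$default = MapperLocator::instance()->describe();   // 'real'
MapperLocator::replace(new FakeMapperInstance());
$swapped = MapperLocator::instance()->describe();   // 'fake'
```

The downside stays the same as with any global: the dependency is invisible in constructor signatures, so tests have to remember to reset it.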

I haven’t had a lot to do with DI in PHP, I tend to get by with registries of pluggable factories. Something I should look into.

Lox

TomB · July 7, 2010, 9:09pm

You’re right, but those are minor imho, and the benefits of being able to chain anything without reconfiguring the base mapper, adding functions to the domain model and/or mapper and unlimited chaining are superior (YMMV). As with most things, it’s a trade-off but one I believe is worthwhile.

allspiritseve · July 7, 2010, 8:33pm

I guess I was thinking based on this:

echo $user->orders[0]->items[0]->product->manufacturer->name

That you were building relations based on table names. Now that I go back and look at your implementation, our examples aren’t all that different except I’m using methods where you’re using property overloading.

It’s a dependency on your mapper interface. Not a big one, it’s more syntactic sugar than anything, but if you were testing your domain object you’d need a mock mapper. If you want to persist the same object in different places, your other mappers would have to implement the same interface. That’s not a big deal at all-- passing an object to a mapper is just a little simpler and a little more explicit than having each object pass itself to the mapper. That’s all.

TomB · July 7, 2010, 7:58pm

Well, does it? It has a dependency on the mapper. Which can be referencing anything, the file system, a web server, etc. The domain object knows it contains data and that that data comes from somewhere (its non-descript mapper)

And in my example, the domain objects don’t need to have knowledge of any mappers. Maybe that’s not important to you, but it has its uses.

I’m probably missing something here but I can’t see how the domain object knowing it contains data that comes from somewhere is an issue.

allspiritseve · July 7, 2010, 7:13pm

I could see the example you gave trimmed down to this:

$user = $userMapper->findById(1);
$orders = $orderMapper->findByUser($user);
echo $orders[0]->getItem(0)->getProductManufacturerName();

Behind the scenes, I’d have the order mapper batch lazy load items, products, and manufacturers (possibly eager loading them depending on how often I used them when working with orders).

In doing so, I keep knowledge of the database relationships in the mapper layer where they belong, and each object can remain blissfully unaware of the db.
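A minimal sketch of the batching half of that idea, with the lazy trigger left out. The in-memory `FakeItemTable`, the `BatchingOrderMapper` class and the column names are all assumptions standing in for a real database: the mapper collects the ids of every order it has, fetches their items in one query, and hands them out, while the `Order` objects stay unaware of where the data came from.

```php
<?php
// Hypothetical in-memory "table" with a query counter, standing in for a database.
class FakeItemTable
{
    public $queries = 0;
    private $rows = array(
        array('id' => 1, 'orderId' => 10, 'name' => 'Widget'),
        array('id' => 2, 'orderId' => 10, 'name' => 'Gadget'),
        array('id' => 3, 'orderId' => 11, 'name' => 'Gizmo'),
    );

    // Equivalent of: SELECT * FROM items WHERE orderId IN (...)
    public function findByOrderIds(array $orderIds)
    {
        $this->queries++;
        $found = array();
        foreach ($this->rows as $row) {
            if (in_array($row['orderId'], $orderIds, true)) {
                $found[] = $row;
            }
        }
        return $found;
    }
}

// Plain domain object: no mapper, no database knowledge.
class Order
{
    public $id;
    public $items = array();

    public function __construct($id)
    {
        $this->id = $id;
    }
}

class BatchingOrderMapper
{
    private $itemTable;

    public function __construct(FakeItemTable $itemTable)
    {
        $this->itemTable = $itemTable;
    }

    // Loads the items for every given order in a single query.
    public function loadItems(array $orders)
    {
        $byId = array();
        foreach ($orders as $order) {
            $byId[$order->id] = $order;
        }
        foreach ($this->itemTable->findByOrderIds(array_keys($byId)) as $row) {
            $byId[$row['orderId']]->items[] = $row['name'];
        }
    }
}

$table = new FakeItemTable();
$orders = array(new Order(10), new Order(11));
$mapper = new BatchingOrderMapper($table);
$mapper->loadItems($orders);

echo $table->queries, "\n";  // prints 1, however many orders were passed in
```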

allspiritseve · July 7, 2010, 7:40pm

You’re right. I amended my post.

Maybe… but I think it depends on the complexity of the query because you’re exposing implementation details.

You shouldn’t have to write a new function for each related object. You just need some way to specify immediate object relations. It’s probably not much different from how you’re currently specifying relations, you are just working with relationships between objects instead of between tables.

Yeah I understand that. It’s convenience with the cost of your domain having a dependency on the database. That may or may not be an issue, depending on the complexity of your domain.

And in my example, the domain objects don’t need to have knowledge of any mappers. Maybe that’s not important to you, but it has its uses.

TomB · July 7, 2010, 7:24pm

Not that simple, because you need to get the correct product from the order item, from the order.

Either way, imho, coding specific get*() functions in the mapper seems like overkill for getting something which could be any depth down the chain; your mappers could get very large.

I wanted a way that supports unlimited chaining without having to write new functions in specific mappers every time I want a different object which may be any depth down the chain from the current mapper. It just allows for much faster/easier/consistent development.

In my example, only the mappers have knowledge of the relationships. That’s where they’re stored. The domain objects essentially only have knowledge of the mapper they were created by.

lastcraft · July 7, 2010, 3:25pm

Hi…

Funny. We both know you are anything but :).

The top level Pheasant class. Can I not just instantiate this once myself?

Just make it a normal class - job done.

Say I’m getting my objects from some kind of repository/facade thingy…?


class PheasantPoweredRepository implements Repository {
    function __construct(Pheasant $pheasant) { ... }
    function findStuff() { ... }
}

When I use a DI tool I’ll go…


$repository = $injector->create('Repository');

I don’t need Pheasant to tell me it’s a Singleton, as I can get the DI tool to do that. I can save a bit of complexity by not bothering if I’m only going to have one Repository instance anyway.

Oh, please don’t misunderstand me here. I’m definitely not saying it’s wrong. It looks like your ORM is exceptionally well designed. I even like your opening statement:

“Basically where mine differs is in that I’ve tried to keep simple the declarations (which are in PHP), the database dependency (which is MySQL and InnoDB specific) and to build for a very light memory footprint.”

Stating the features left out is just the coolest advert for a tool ever.

I was being pedantic :).

More pedantry…

To be a DataMapper the object being mapped must know nothing. The mapper can see the domain object and some kind of schema (so it knows how to handle collections, etc). The domain object sees neither. The mapping of fields and relationships is in the schema which could be code or metadata.
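A rough sketch of that strict definition, with hypothetical names (`Customer`, `CustomerMapper`, the column names): the domain class extends nothing and references no mapper, while the mapper holds the schema and fills the object from outside using reflection.

```php
<?php
// Plain domain object: no base class, no reference to any mapper.
class Customer
{
    private $name;
    private $email;

    public function getName()
    {
        return $this->name;
    }

    public function getEmail()
    {
        return $this->email;
    }
}

// Hypothetical mapper: the schema (column => property) lives here, not in Customer.
class CustomerMapper
{
    private $schema = array(
        'full_name'     => 'name',
        'email_address' => 'email',
    );

    // Builds a Customer from a database row by setting its private properties.
    public function hydrate(array $row)
    {
        $class = new ReflectionClass('Customer');
        $customer = $class->newInstance();
        foreach ($this->schema as $column => $property) {
            $prop = $class->getProperty($property);
            if (PHP_VERSION_ID < 80100) {
                $prop->setAccessible(true);  // only needed before PHP 8.1
            }
            $prop->setValue($customer, $row[$column]);
        }
        return $customer;
    }
}

$mapper = new CustomerMapper();
$customer = $mapper->hydrate(array(
    'full_name'     => 'Ada Lovelace',
    'email_address' => 'ada@example.com',
));

echo $customer->getName(), "\n";  // prints "Ada Lovelace"
```

Because `Customer` needs no `extends`, an existing domain class could be mapped this way without editing it.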

On a scale of ActiveRecord to DataMapper you are probably 80% DataMapper. Which is fine. Active record is fine. I’m just quibbling with naming.

Suppose I’ve already written the domain objects and look around for a DataMapper (fat chance in PHP land, but bear with me), I’m going to be mildly disappointed to find I have to go back and edit my code to add these “extends” keywords.

Just call it an ORM and I’ve nothing to complain about. Well, except the Singleton ;).

Your solution is a good one. When I said complex, I meant Java scale complex. I’ve only used a DataMapper once (for a recruitment site, sadly abandoned) and that wasn’t even in PHP.

The place you would get immediate friction is if the domain model itself involves inheritance. You’ve already played the inheritance card.

Yes!

yours, Marcus

Lachlan · July 7, 2010, 1:18pm

Yup, very fair point and a good simplification.

Have you run into the N+1 problem, where you are iterating over relationships? E.g:



foreach($user->groups as $group)
{
   foreach($group->members as $member)
   {
     echo "member name: {$member->fullname}\n";
   }
}

Any clever techniques for optimizing this pattern of access?

TomB · July 7, 2010, 12:53pm

Short answer: I don’t.

Why would you?

$user->orders is just as valid and meaningful as $user->firstname. They just store different data types. One is a string, the other is an array. Imho, there does not need to be any difference in the way these are handled by the code using the object. How it works behind the scenes is less important than how it is used by client code, in my opinion.

If you didn’t want the magic you could rename __get to getRelated and call $user->getRelated('orders');

Lachlan · July 7, 2010, 12:37pm

Tom, interesting approach. I’m still not sure about the fact that in Pheasant properties are lowercase and relationships are uppercase (Doctrine does this too). Perhaps slightly too magic.

How do you distinguish between them in your code?

TomB · July 7, 2010, 11:35am

Sorry this example got a bit big, it shows how to get from $user to $user->orders


abstract class DataMapper {
	public $relations = array();
	
	public abstract function findWhere($criteria);
	public abstract function findById($id);
	
	public function getMapper($name) {
		//Initiate a new mapper or get an existing one; should probably really go through a factory.
		return new $name;
	}

	public function getRelated($relationName,  DataObject $object) {
		$relation = $this->relations[$relationName];
		$mapper = $this->getMapper($relation['mapper'] . 'Mapper');
		return $mapper->findWhere($relation['foreignField'] . ' = ' . $object->{$relation['localField']});
	}
	
	public function hasRelation($relationName) {
		return isset($this->relations[$relationName]);
	}

}

class UserMapper extends DataMapper {
	public $relations = array(
		'orders' => array('mapper' => 'order', 'foreignField' => 'userId', 'localField' => 'id')
	);

	public function findWhere($criteria) {
		//Query the database and return a set of User objects		
	}
	
	public function findById($id) {
		//Query the database and return a User object
	}
}

class OrderMapper extends DataMapper {
	public function findWhere($criteria) {
		//Query the database and return a set of Order objects		
	}
	
	public function findById($id) {
		//Query the database and return a Order object
	}	
}

class DataObject {
 	protected $mapper;
    
    public function __construct(DataMapper $mapper) {
        $this->mapper = $mapper;    
     }
     
    public function __get($name) {
    	if ($this->mapper->hasRelation($name)) return $this->mapper->getRelated($name, $this);
    }

}

class User extends DataObject {
	
	
}

It can’t be added to your example as you need some way of getting back from the domain object to the mapper.

To get to manufacturer you’d create all the extra mappers, setting up the relations as you go.

AnthonySterling · July 7, 2010, 11:10am

Cheers Tom.

Few questions, in your first example you actually show how you end up at the manufacturer’s name, could you show how you would do this in your latter please?

You state in your earlier post that the mapper would be aware of the relations, could this ‘magic’ just not be added to my example too?

Thanks!

Anthony.

AnthonySterling · July 7, 2010, 10:53am

Maybe I’m a bit well, thick - but wouldn’t something like the following be a little more obvious in declaring dependencies and focusing responsibilities?

I’d love to know why I’m wrong here.


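class Mapper
{
    public function find(MapperCriteria $criteria)
    {
        #look up the objects matching the criteria
    }
    
    public function save(MapperSavable $object)
    {
        #persist the wrapped object
    }
}

class UserMapper extends Mapper
{
    #named saveUser() because an override of save() cannot narrow the parameter type to User
    public function saveUser(User $user){
        try{
            $object = new MapperSavable($user);
            parent::save($object);
            return true;
        }catch(InvalidArgumentException $ex){
            #assumes MapperSavable throws this if it cannot save this type
            return false;
        }
    }
}

class User
{
    
}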

It’s rough, and knocked up during my Biscuit break so please forgive the roughness.

The way I see it:-

MapperSavable checks to see if the object is compatible and maybe formats it for persistence, the latter should probably be left up to the Mapper though.
User no longer knows or cares if it can be saved, thus left to do User stuff.
UserMapper allows some business logic to creep in and decide what to do if User cannot be saved etc…

Maybe Mapper should be passed into UserMapper to allow different persistence methods?
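One way that could look, purely as a sketch: the `Persistence` interface, `InMemoryPersistence`, `Member` and `MemberMapper` are hypothetical names invented for the illustration. The mapper receives its storage through the constructor instead of inheriting it, so swapping the backend means passing in a different object.

```php
<?php
// Hypothetical storage interface; any backend (database, file, cache) could implement it.
interface Persistence
{
    public function store($type, array $data);
}

// Simplest possible backend, handy for tests.
class InMemoryPersistence implements Persistence
{
    public $stored = array();

    public function store($type, array $data)
    {
        $this->stored[$type][] = $data;
        return true;
    }
}

// Plain domain object, unaware of how or whether it gets saved.
class Member
{
    public $name;

    public function __construct($name)
    {
        $this->name = $name;
    }
}

// The mapper is handed its persistence method rather than extending it.
class MemberMapper
{
    private $persistence;

    public function __construct(Persistence $persistence)
    {
        $this->persistence = $persistence;
    }

    public function save(Member $member)
    {
        return $this->persistence->store('member', array('name' => $member->name));
    }
}

$storage = new InMemoryPersistence();
$mapper = new MemberMapper($storage);
$saved = $mapper->save(new Member('Anthony'));
```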

Sure this means *Mapper objects, but at least each is actually doing something obvious, and all these objects are on the same level of abstraction (application pov).