DataMapper problem, should domain model be aware of the mapper object?

This is a question about the data mapper pattern, in which the mapper is responsible for CRUD operations on domain models in a flexible manner. The data mapper pattern is usually used together with the UnitOfWork pattern. The UnitOfWork keeps track of the states of domain objects and manages which objects will be inserted, updated or deleted. But there is a problem when a UnitOfWork is used in conjunction with a DataMapper object.

I’ve read in many articles that, in a good data mapper design, the domain models are supposed to be unaware of the mapper’s existence. For instance, if I create a User object, it should not keep a mapper object as a property. This does not seem to work at all with UnitOfWork, since committing a UnitOfWork object carries out the insert/update/delete operations on domain models. Now we get to the main issue: these are operations that the data mapper handles, so how is a UnitOfWork object supposed to know about the data mappers? In other words, how does a UnitOfWork, which only stores domain models, get hold of the right mappers?

To me it seems that there are only two possible ways: make the domain model aware of the data mapper, or use static methods on a mapper registry. I definitely hate the latter, but the former also comes at a price, as the domain model will now have knowledge of its mapper object.

What do you think about this issue? How can it be resolved? Is it not a problem even if the domain model is indeed aware of the existence of mapper? Or are there other approaches to solve this dilemma? Please lemme know if you can help, thx.

Regular injection? Maybe I’m not understanding your problem. The UnitOfWork isn’t a domain model. It’s part of your mapper, so there’s nothing wrong with the UnitOfWork knowing about the mapper.

Well, the question is how does the unit of work know about the data mappers? Do I store a list of mapper objects in the unit of work? If not, how does the unit of work know which data mapper to use when it’s inserting/updating/deleting a domain model?


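// Map each domain model class to the mapper class that persists it.
$mappers = array(
    'User' => 'UserDataMapper',
    'Product' => 'ProductDataMapper',
);
$unitOfWork = new UnitOfWork($mappers);

// The User object is registered with the unit of work, not with a mapper.
$user = new User();
$unitOfWork->persist($user);
// flush() looks up the mapper for each tracked object and writes the changes.
$unitOfWork->flush();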

So based on the metadata you provided, the unitOfWork will be able to create mappers based on the type of domain model object. The user object should of course remain blissfully unaware of how it is being persisted or hydrated.
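To make that concrete, here is a minimal sketch of how such a UnitOfWork might resolve mappers from the class of the object it is given. Everything here is an assumption for illustration: the `getMapperFor()` method, the `insert()` method on the mapper, and the in-memory `$rows` array standing in for a database table. A real mapper would take a database connection in its constructor, and a real unit of work would also track updates and deletes.

```php
<?php
// Hypothetical domain model: it knows nothing about persistence.
class User
{
    public $id;
    public $name;

    public function __construct($name)
    {
        $this->name = $name;
    }
}

// Hypothetical mapper; an in-memory array stands in for the database table.
class UserDataMapper
{
    public $rows = array();

    public function insert(User $user)
    {
        $user->id = count($this->rows) + 1;
        $this->rows[$user->id] = array('name' => $user->name);
    }
}

class UnitOfWork
{
    private $mapperClasses;      // domain class name => mapper class name
    private $mappers = array();  // mapper instances, created on demand
    private $new = array();      // objects scheduled for insertion

    public function __construct(array $mapperClasses)
    {
        $this->mapperClasses = $mapperClasses;
    }

    public function persist($object)
    {
        $this->new[] = $object;
    }

    // Resolve the mapper from the object's class, creating it only once.
    public function getMapperFor($object)
    {
        $class = get_class($object);
        if (!isset($this->mapperClasses[$class])) {
            throw new InvalidArgumentException("No mapper registered for $class");
        }
        if (!isset($this->mappers[$class])) {
            $mapperClass = $this->mapperClasses[$class];
            $this->mappers[$class] = new $mapperClass();
        }
        return $this->mappers[$class];
    }

    public function flush()
    {
        foreach ($this->new as $object) {
            $this->getMapperFor($object)->insert($object);
        }
        $this->new = array();
    }
}

$unitOfWork = new UnitOfWork(array('User' => 'UserDataMapper'));
$user = new User('alice');
$unitOfWork->persist($user);
$unitOfWork->flush();
```

The lookup lives entirely in the unit of work, so the User class never has to reference a mapper.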

I see, so you are essentially storing a Map of mappers in UnitOfWork, and in this way the UnitOfWork is handling mapping a domain model to its corresponding mapper, not the domain models themselves. Sounds like a plan, I like it so far.

I have another issue regarding the data mapper though, which comes up when mapping related objects. Let’s say a domain model user needs to load a list of items owned by this user; to keep things simple, we are talking about privately owned items only. If I have both a UserMapper and an ItemMapper, how is this supposed to be handled? Do I store an ItemMapper instance in the UserMapper object and then have the UserMapper use its ItemMapper object to acquire a list of items owned by this user? But in this case, the UserMapper is aware of the existence of ItemMapper. Is this a problem?

Now you are heading into Object Relational Mapping (ORM) territory, in which you not only map objects to tables but also map the relationships between objects. Fun stuff.

This is worth a read: http://www.redbeanphp.com/ . It basically shows how to relate objects without configuration tables and such. Easy to experiment with and quite useful in simple production systems.

More advanced solutions will have mappings like this:


Cerad\Bundle\GameBundle\Entity\Game:
    type:  entity
    table: games

    oneToMany:
        officials:
            targetEntity: GameOfficial
            mappedBy:     game
            indexBy:      slot
            cascade:      ['all']

This shows a one to many link between Games and Officials. When the GameRepository queries for games it’s smart enough to load the game officials as well if requested. Some systems also support lazy loading of related entities.
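For a hand-rolled mapper, lazy loading of a one-to-many link could be sketched roughly like this. It is only an illustration under assumptions: `LazyCollection`, `GameMapper`, `GameOfficialMapper`, `findByGameId()` and the hard-coded rows are all made up here and are not how any particular ORM does it. It also shows one mapper receiving another through its constructor, which is one way to let a parent mapper load related objects.

```php
<?php
// Hypothetical lazy collection: runs the loader only on first access.
class LazyCollection implements IteratorAggregate, Countable
{
    private $loader;
    private $items = null;

    public function __construct(callable $loader)
    {
        $this->loader = $loader;
    }

    private function load()
    {
        if ($this->items === null) {
            $this->items = call_user_func($this->loader);
        }
        return $this->items;
    }

    public function getIterator(): Traversable
    {
        return new ArrayIterator($this->load());
    }

    public function count(): int
    {
        return count($this->load());
    }
}

class Game
{
    public $id;
    public $officials;  // filled in by the mapper

    public function __construct($id)
    {
        $this->id = $id;
    }
}

// Hypothetical mapper; hard-coded rows stand in for the officials table.
class GameOfficialMapper
{
    public $queries = 0;  // counts simulated database hits
    private $rows = array(
        array('game_id' => 1, 'slot' => 'referee'),
        array('game_id' => 1, 'slot' => 'ar1'),
        array('game_id' => 2, 'slot' => 'referee'),
    );

    public function findByGameId($gameId)
    {
        $this->queries++;
        $found = array();
        foreach ($this->rows as $row) {
            if ($row['game_id'] === $gameId) {
                $found[$row['slot']] = $row;  // keyed by slot, like indexBy
            }
        }
        return $found;
    }
}

class GameMapper
{
    private $officialMapper;

    // The related mapper is injected, so no statics or registry are needed.
    public function __construct(GameOfficialMapper $officialMapper)
    {
        $this->officialMapper = $officialMapper;
    }

    public function find($id)
    {
        $game = new Game($id);
        $officialMapper = $this->officialMapper;
        $game->officials = new LazyCollection(function () use ($officialMapper, $id) {
            return $officialMapper->findByGameId($id);
        });
        return $game;
    }
}

$officialMapper = new GameOfficialMapper();
$gameMapper = new GameMapper($officialMapper);
$game = $gameMapper->find(1);
$queriesBefore = $officialMapper->queries;  // nothing has been loaded yet
$officialCount = count($game->officials);   // first access triggers the load
```

The Game object still has no idea a mapper exists; it just holds something it can count and iterate.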

Not real easy to implement but a great time saver.

I see, thanks for the advice. I have another question regarding dependent mapping though. Let’s say I have a table called “user” that stores basic user information (id, username, password, email, session etc.), while additional information such as the user profile is stored in dependent mapping tables such as “users_profile”. I can easily manage the selection, insertion and deletion of these dependent domain objects, but the problem is with updates.

As you can see, I do not want to update the table “users” if the only data that has been modified is in the table “users_profile”, but dependent mapping does not seem to work well with UnitOfWork. In Martin Fowler’s book, the Dependent Mapping pattern seems to update the table and all of its dependent tables at the same time, which looks like quite an expensive database operation. This gets worse with many-to-many mapping, as modifying every record associated with one domain model will take too much time. What I’d like to have is something similar to lazy loading for domain models, but focused on updating records rather than loading them. That way, only the tables whose data was modified get updated, not all of their associated or dependent tables.

So I wonder, how are you supposed to manage the state of dependent domain models? I know it may be a bit hard or tricky when it comes down to these tables, I just want to figure out an efficient way of updating/modifying tables with dependency on other tables in ORM. Thanks.

It’s up to the UnitOfWork to track whether an update is required. Various strategies:

  1. Pessimistic - Always update regardless. Yes, it takes a database hit, but on the other hand, how often are entities updated?

  2. Local cache - When the UnitOfWork queries the database, it can store the results in a cache somewhere. When an update is requested, the UnitOfWork will compare the objects with the cached data.

  3. Requery - Prior to an update, fetch the data again and then compare. Thanks to database caching there is very little overhead.

  4. Change Notification - Have your entities implement a listener interface and let the entities themselves decide if an update is needed. Kind of a pain for the entities to do these sorts of checks but it’s actually quite useful if you have other services (like loggers) who are also interested in changes.
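As an illustration of the local cache strategy (option 2), here is a rough sketch of a tracker that snapshots each object when it is loaded and later reports only the properties that changed. The `ChangeTracker`, `UserAccount` and `UserProfile` classes are hypothetical, and the sketch assumes the tracked objects expose their data as public properties, since `get_object_vars()` only sees properties visible from the calling scope.

```php
<?php
// Hypothetical tracker implementing the "local cache" strategy.
class ChangeTracker
{
    private $snapshots;

    public function __construct()
    {
        // SplObjectStorage lets the object itself act as the lookup key.
        $this->snapshots = new SplObjectStorage();
    }

    // Call right after an object has been loaded from the database.
    public function register($object)
    {
        $this->snapshots[$object] = get_object_vars($object);
    }

    // Returns only the properties whose values differ from the snapshot.
    public function getChanges($object)
    {
        $original = $this->snapshots[$object];
        $changes = array();
        foreach (get_object_vars($object) as $name => $value) {
            if (!array_key_exists($name, $original) || $original[$name] !== $value) {
                $changes[$name] = $value;
            }
        }
        return $changes;
    }
}

// Hypothetical owner and dependent objects, each mapped to its own table.
class UserAccount
{
    public $username = 'alice';
    public $email = 'alice@example.com';
}

class UserProfile
{
    public $bio = 'old bio';
    public $website = '';
}

$tracker = new ChangeTracker();
$user = new UserAccount();
$profile = new UserProfile();
$tracker->register($user);
$tracker->register($profile);

$profile->bio = 'new bio';

$userChanges = $tracker->getChanges($user);        // nothing to write for the owner
$profileChanges = $tracker->getChanges($profile);  // only the modified column
```

Because each object is compared against its own snapshot, an unchanged owner produces no UPDATE even when its dependent object was modified.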

I see, none of these seem to be good options to me. In fact, I think I’ll probably forget about the idea of dependent mapping. Martin Fowler actually said that Unit of Work cannot work at all with dependent mapping, so the latter is not recommended if a Unit of Work exists. With some minor modification, I can give the dependent objects their own primary ID so they can be managed by the Unit of Work. They are still linked in a one-to-one relationship with their owning domain object; the only difference is that now they can exist without the owning domain object. Come to think of it, this actually works well when you have an admin control panel system in action. Thanks for your advice though, it’s a great lesson to learn and it feels nice to see all those options available.

spl_object_hash($object) might help with the id problem. Be careful, you may end up with a Doctrine 2 clone.
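A small sketch of what that could look like, assuming a hypothetical `ObjectRegistry` class with made-up `mark()` and `stateOf()` methods. One thing to watch: PHP may reuse a hash once the original object has been destroyed, so the registry below keeps a reference to every object it tracks.

```php
<?php
// Hypothetical registry that tracks object state by spl_object_hash().
class ObjectRegistry
{
    private $objects = array();  // hash => object; the reference keeps the hash unique
    private $states = array();   // hash => 'new' | 'dirty' | 'removed'

    public function mark($object, $state)
    {
        $hash = spl_object_hash($object);
        $this->objects[$hash] = $object;
        $this->states[$hash] = $state;
    }

    public function stateOf($object)
    {
        $hash = spl_object_hash($object);
        return isset($this->states[$hash]) ? $this->states[$hash] : null;
    }
}

$registry = new ObjectRegistry();
$profileA = new stdClass();
$profileB = new stdClass();  // same contents, but a different object

$registry->mark($profileA, 'new');
```

This works even for objects that have no database ID yet, which is exactly the case for newly created dependent objects.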

Oh yeah, the object hash may be a good idea; it’s good enough to distinguish all objects. I don’t think my ORM will turn into anything like Doctrine 2, since it does not use annotations. I follow a combination of Martin Fowler’s and Matt Zandstra’s data mapper patterns, although mine uses constructor dependency injection rather than statics or a registry.

Doctrine 2 can use annotations but they are certainly not required. I don’t like annotations myself.

Instead, I use YAML mapping files. XML is also supported. Heck, you can even use a PHP array if you are really hardcore.