Rationale behind Dependency Injection Container libraries

I am starting to experiment with Dependency Injection Containers, weighing the pros and cons of using one of the existing DIC libraries and whether it is really worth the effort. The idea of a dependency container that instantiates objects and injects all required dependencies into them is nice, but it has one major drawback for me: it makes my code magical. And while I may understand the simple workings of a DIC, my IDE won’t. So far I have tried Pimple (which lacks some features of a full-blown DIC but follows the same basic approach) and looked at other libraries, and the issue is the same:

// Pimple
$myObject = $pimple['my_object'];

// Symfony DIC
$myObject = $container->get('my_object');

// DICE
$myObject = $container->create('my_object');

In all these cases the object is created at runtime, so my IDE won’t know what type of object the container returns. I get no auto-completion, nor can I verify while writing code that I didn’t misspell an object name. I’d rather use this:

$myObject = $container->getMyObject();

This makes the usage much clearer and non-magical. The downside is that the container cannot create objects automatically in a way an IDE can follow, so we would need to write the getters by hand.

This led me to a reddit post by an anonymous person presenting a solution: use a simple class as a container and update it by hand. The reasoning appears pretty sound to me; I will quote a few parts of the post:

I use a very simple container, it comes built-in with PHP: a simple class with methods.

Sarcasm aside, I’m willing to demonstrate how you can use a simple
class to achieve the same effect as you would with a container, often
with more understandable, explicit (less magical) behavior, with IDE
autocomplete and static type checking (no “stringly typed” services and
labels) and at times even shorter than the writing needed to configure a
container.

Oh and you can’t beat the performance of a simple class when compared to a container library.
[…]

All you need is a class with a set of one-liner methods returning an instance:

function userRepo() { return new SpecificUserRepo(); }

If that instance requires dependencies, call container’s own methods to resolve them:

new UserRepo($this->userSqlDb());

Every “new” call gets its own one-liner method (except for transient
objects of your choice), so there’s one “new” per dependency. This keeps
your code “DRY”.

Make methods protected or private, except for the 2-3 dependencies you want to fetch at your application root.

To create a single instance you have a wide range of options, but I’ll give you the most banal one:

protected $sql; function sql() { return $this->sql ?: $this->sql = new Sql(...); }

When you’re done (5 minutes later) you’ll have one of the shortest
and most-easily written classes in your project and it’s your app
container.

It’s easy to write - you’re simply calling constructors, almost
declaratively. Just like container binds! Except… without the
container libs.

Let’s see what we avoided: third party deps, string configs, array
configs, XML, YAML, annotations, parsers, runtime or build-step code
gen, reflection. Not bad.

When you are ready to start your app, make an instance of your
container, grab an instance of your “starting” dependency, and away you
go:

 $container = new AppContainer();

 $router = $container->router();
 $ctrl = $router->route($url);

 $dispatcher = $container->dispatcher();
 $dispatcher->dispatch($ctrl, $inputs);

[…]
When any dependency changes in a way that alters its constructor (which will be rare, by the way), typically it’s taken us more than a minute to understand the need for change, come up with an idea, implement it, test it.

The thing you need to change in a container as a result of that… is one-line of code calling “new”, in one place. Hardly even a minute. And you don’t have many containers per app, either. In many cases, you just have one for the whole app.

Even with an “automatic” container, chances are when you declare a
dependency in your component, you’re already keeping in mind what that
“automatic” container would do, and you still might have to edit rules, binds and settings if it doesn’t match your intent.

It’s good to keep things in perspective. Your container won’t be where you spend most of your time, so it doesn’t make sense to focus so intensely on trying to “optimize” a few new calls into an almost equivalent set of bind calls that rely on expensive runtime magic. This only obscures the flow of control in your application and ties you to a library that does very little for you, if you think about it.
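
Putting the quoted pieces together, the hand-written container might look roughly like this (a minimal sketch; class names such as Sql, UserRepo and Router are illustrative placeholders, not part of the quoted post):

```php
<?php
// Assembled sketch of the hand-written container described in the
// quoted post. Sql, UserRepo and Router are illustrative placeholders.
class Sql {}
class UserRepo {
    public $sql;
    public function __construct(Sql $sql) { $this->sql = $sql; }
}
class Router {
    public $repo;
    public function __construct(UserRepo $repo) { $this->repo = $repo; }
}

class AppContainer
{
    protected $sql;

    // Shared instance: created once, memoized, then reused.
    protected function sql() {
        return $this->sql ?: $this->sql = new Sql();
    }

    // Transient: a fresh instance per call; dependencies are resolved
    // by calling the container's own methods.
    protected function userRepo() {
        return new UserRepo($this->sql());
    }

    // Only the application-root dependencies are public.
    public function router() {
        return new Router($this->userRepo());
    }
}

$container = new AppContainer();
$router = $container->router();
```

Everything the IDE needs is right there in the method signatures: each getter is a plain method call, and misspelling one is a fatal error rather than a silently wrong string key.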

Since I don’t have experience with DI containers in bigger projects, I’m wondering whether I would be missing something if I used such simple getter classes as containers instead of full-fledged DI containers. Maintaining a single container class doesn’t seem like a lot of work, and I’d much prefer completely non-magical code with proper IDE autocompletion over all the fancy automatic dependency injection based on argument hints. To me, the small amount of manual work needed to write a simple container would ultimately be less time-consuming than constantly dealing with array-like or ‘stringly typed’ getter syntax that returns an unknown entity that needs to be manually annotated every time for the IDE to recognize what it is.

Therefore, I’m open to opinions on this!

  1. Quite a few IDEs (such as PhpStorm) allow you to use annotations to set the type of a variable and thus provide auto-completion.

    /** @var callable $accessTokenMiddleware */
    $accessTokenMiddleware = $dic->get('access_token_middleware');

  2. In general, one tries to avoid using the container directly as it is basically a huge global. Try to limit the use of the container to your startup and initial dispatch code.

  3. I have added specific getters to my containers and it works. However, consider what happens when you try to break your code up into modules or use 3rd party modules. You end up having to define all of their services in your master container class, with no easy way to just load their configuration and go. So now you have to be careful to keep your container in sync with even more code, and you might end up with quite a big container class.

This is just what I wanted to avoid! To me this is ugly code and I want to do this only in unusual cases, like using 3rd party code that I have no control over…

This sounds like a good idea, indeed the container is a big global. But if the usage of the container is intended to be minimal then to me this is another reason not to use any full-fledged DIC because there is no need for magic for something that will be used sparingly.

BTW, would you consider passing the container to controllers as using it too much?

Why would this be problematic? If I use a 3rd party module then I imagine it would need to be shipped with all its required dependencies, so if it depends on a specific DIC then I would simply use that DIC for that module.

BTW, the anonymous post author mentioned the problem of multiple modules - each module can have its own container working independently from all the others so there is no need to maintain one big container for everything.

But then you need to deal with multiple containers which can get a bit messy especially if you want to use some of the services from the 3rd party but maybe adjust a few others. It can work. Just depends on your applications.

An excellent question whose answer depends on what you mean by controller. I try to inject all the dependencies a specific controller needs.

However, the Symfony framework base controller has a number of useful generic methods such as getUser, generateUrl, render etc. Individually injecting every service needed for all these helper functions can be done but is painful.

So for my Symfony controllers the container gets injected using a setContainer method and is only used directly for these helper functions. Everything else a specific controller needs for its individual work gets injected into the constructor.

This of course is only one approach. Plenty of people just go ahead and access the container directly which in turn avoids having to define the controller as a service.

Going back to your original question, a hybrid approach might be useful. Start with a container, add some common getter methods to cover the standard use cases then fallback to the generic get command for the more unusual services.
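
A sketch of what that hybrid might look like (all class and method names here are hypothetical, not a real library’s API):

```php
<?php
// Hybrid sketch (names hypothetical): a generic string-keyed container
// with typed getter shortcuts covering the common services.
class Router {}

class AppContainer
{
    private $factories = [];
    private $instances = [];

    public function set($id, callable $factory) {
        $this->factories[$id] = $factory;
    }

    // Generic fallback for the more unusual services.
    public function get($id) {
        if (!isset($this->instances[$id])) {
            $this->instances[$id] = call_user_func($this->factories[$id], $this);
        }
        return $this->instances[$id];
    }

    // Typed getter for a standard use case - IDE-friendly, no string
    // key at the call site.
    public function router() {
        return $this->get('router');
    }
}

$container = new AppContainer();
$container->set('router', function ($c) { return new Router(); });
```

The common path gets autocompletion via router(), while anything unusual can still be fetched with get('…').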

This really is the crux of the matter. Generally you’d avoid using the container directly, and work with a class that has its dependencies injected:

class A {
	private $b;

	public function __construct(B $b) {
		$this->b = $b;
	}
}

Your IDE (if smart enough) should be able to see that $this->b is an instance of B (whether that’s an interface or a class name) and give you a list of available methods.

If a controller accepts its list of specific dependencies instead of a container then the calling code, which is the router, needs to pass these dependencies. Do you then configure the router to pass needed dependencies to each controller, or do you pass the container to the router, which is used by the router to construct the controllers?

Yes, I do this kind of stuff but I get the impression that it’s a roundabout way - if I create a container getter then I might as well instantiate the needed object right there in the getter and the rest of the container is not needed at all. I lose a bit of automation but at the same time gain clarity by getting rid of magic.

Correct, but what about this:

$a = $container->get('A');
$a->doSomething();

the IDE will not be smart enough to know what $a is. Or do you mean that we are not supposed to work directly on objects pulled from the container and only pass them to other classes and methods right away?

You shouldn’t be using the container directly in more than a couple of places. Admittedly, you lose IDE hints in those few instances, but it’s a small sacrifice for the architectural advantages/flexibility. For example, you said:

Which is the approach I favour. Using this approach (and a DIC which uses type hinting) I can add/remove dependencies from the controller on a whim. I add a dependency to the construct and with zero other changes, it’s available :slight_smile:

I define each controller as a service in the dependency injection container which in turn handles creating and injecting the dependencies. All my router does is to match a given route against a service name. The application runner then pulls the service from the container and invokes the desired method.

The App code ends up looking like this:

$router = $this->dic->get('router');
$path   = $request->getUri()->getPath();
$route  = $router->dispatch($request->getMethod(),$path);

/** @var callable $action */
$action = $this->dic->get($route['name']);

return $action($request,$response);

The goal is that only the main App instance needs to directly access the container once configuration is done.

The same approach works in Symfony by defining all controllers as services:

I actually never made attempts to make controllers decoupled and reusable so I rarely injected objects into them (apart from the obvious ones like request, etc.). But I can see it makes sense and getting the global container out of the controllers results in more independent code. However, I don’t think the benefits are huge by doing this since controllers are supposed to be slim and often they need framework specific objects like request or session - and even if they are injected they remain framework dependent.

BTW, I tried using Dice in Silex, which uses Pimple, and it doesn’t seem to coexist well with it. While Dice was able to load my own classes with my own dependencies it failed to instantiate objects that were internally registered in Pimple like Symfony\Component\HttpFoundation\Session\Session - “Fatal error: Cannot instantiate interface Symfony\Component\HttpFoundation\Session\Storage\SessionStorageInterface in S:\www\silex\lib\Dice\Dice.php on line 48”. This also resulted in failure to instantiate my own objects with any dependencies on Symfony components. I don’t know if it would be easy to get these two to work together since if a class is registered in two DI containers then even if it worked I might be getting different instances in cases where only one is desirable.

It’s certainly not worth trying to change an existing framework to decouple the controllers yourself, but in my opinion this should be the direction the frameworks themselves go down.

Personally I’d avoid using two containers on a single project (or at least not in any way that has them working on the same classes) - not for any technical reason, but because it’s needlessly confusing for someone looking at the code: to reconfigure a dependency they first need to work out which DIC manages it, and then configure it using one of two possible syntaxes.

Having said that, your problem here is likely just that you need to add some rules to Dice so it creates the right implementation for each interface. This is possible in two ways, but the simplest is:

$dice->addRule('Symfony\\Component\\HttpFoundation\\Session\\Storage\\SessionStorageInterface', ['instanceOf' => 'ConcreteClassNameToUseInstead']);

edit: If you do have more than one container, as a rule of thumb I’d make sure that one encapsulates the other (e.g. the application is coupled only to Pimple, but Pimple may call methods on Dice or vice versa) rather than having the application itself have direct knowledge of both containers.

[quote=“TomB, post:10, topic:202052”]
Personally I’d avoid using two containers on a single project (or at least not in any way that they’re working on the same classes)[/quote]

Yes, that’s the same conclusion I came to. Actually, what I was more interested in was replacing Pimple with Dice in Silex, but knowing I can’t really get rid of Pimple (because Silex extends Pimple), I thought I could extend Pimple to also accept the Dice interface on top of the Pimple interface - so the internal framework components could use the Pimple style without any change while my code could use the Dice style. But as I see it, it’s not worth the effort, especially since the DI container is not supposed to be used much. I’ve simply extended Pimple with my own getters, and this simplest of solutions works fine.

Yeah, I also think this could be resolved with proper configuration, since by default Dice apparently tried to instantiate an interface :slight_smile: Configuration would work, but considering I strive for simplicity, integrating Dice as a second DIC would complicate things too much.

BTW, there seems to be a bug in your latest Dice in php 5.4-5.5 branch - a piece of php 5.6 code slipped in :smile:

Do you also automatically inject dependencies into controller methods other than the constructor? I think that would be nice since some dependencies may only be required by one or two methods. Or maybe this is not a DI container’s job any more but the router’s?

Calling the method and automatically injecting the dependencies is easy (containers already do this for __construct). The problem is: how do you tell the container when to call the method and inject the dependencies? How does the container know when they’re needed?

Some containers provide lazy-loaded objects, which are essentially placeholder objects for the real thing. However, what is the advantage here? Your constructor shouldn’t be doing anything computationally expensive (see http://misko.hevery.com/code-reviewers-guide/flaw-constructor-does-real-work/), so constructing the actual object is not going to be much slower than constructing the placeholder (other than that the real object might have dependencies that also need constructing). So it’s creating half a dozen or fewer objects vs. creating a placeholder object via reflection and eval… then, when the placeholder is used, creating the original object anyway and forwarding all method calls using __call()… I know what my money is on speed-wise!
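
To make the comparison concrete, a bare-bones hand-rolled placeholder might look like this (a sketch without the reflection/eval machinery real container libraries use; Db is an illustrative stand-in for an object with an expensive constructor):

```php
<?php
// Bare-bones lazy placeholder: the real object is only constructed
// when a method is first called on the proxy. All names illustrative.
class Db
{
    public static $constructed = 0;
    public function __construct() { self::$constructed++; }
    public function query($sql) { return "result of $sql"; }
}

class LazyProxy
{
    private $factory;
    private $instance;

    public function __construct(callable $factory) {
        $this->factory = $factory;
    }

    // Create the real object on first use, then forward every call.
    public function __call($method, $args) {
        if ($this->instance === null) {
            $this->instance = call_user_func($this->factory);
        }
        return call_user_func_array([$this->instance, $method], $args);
    }
}

$db = new LazyProxy(function () { return new Db(); });
// Nothing has been constructed yet; that happens on the first call.
$result = $db->query('SELECT 1');
```

Every call through the proxy now pays the __call forwarding cost, which is the overhead the argument above is weighing against the constructor cost.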

The calling code knows when - in this case the router. The router calls the controller so it could tell the container to do it while injecting dependencies.

Not necessarily computationally expensive, but nevertheless taking time - database objects and remote services are examples of objects that connect to some service on creation (in the constructor). A controller method might, for example, need a connection to a second database, so the advantage here would be not creating that connection for the other methods that don’t need it. A placeholder is not needed.

Is there some kind of anti-pattern in my reasoning? I’m wondering why you don’t see the benefit while it seems like a simple thing to me :). Symfony does something similar, but to a limited degree (I think) - the router can detect via type-hint reflection that a controller method needs a request and/or an app object and inject them automatically.
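
Conceptually, that kind of method-level injection can be sketched with reflection - this is an illustrative sketch assuming a simple class-name-to-instance container, not Symfony’s actual implementation:

```php
<?php
// Illustrative sketch of method-level injection via reflection.
// The container here simply maps class names to prepared instances.
class Db2 {}

class SimpleContainer
{
    private $instances = [];
    public function set($class, $instance) { $this->instances[$class] = $instance; }
    public function get($class) { return $this->instances[$class]; }
}

function invokeWithInjection(SimpleContainer $container, $controller, $method)
{
    $ref = new ReflectionMethod($controller, $method);
    $args = [];
    foreach ($ref->getParameters() as $param) {
        // Resolve each type-hinted parameter from the container.
        $args[] = $container->get($param->getType()->getName());
    }
    return $ref->invokeArgs($controller, $args);
}

class ProductController
{
    public function archive(Db2 $db2) { return get_class($db2); }
}

$container = new SimpleContainer();
$container->set('Db2', new Db2());
$out = invokeWithInjection($container, new ProductController(), 'archive');
```

Note that ReflectionParameter::getType()->getName() requires PHP 7.1+; older versions used getClass() instead.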

It’s poor encapsulation. In your controller you’re saying “I don’t want to connect the database yet” which is basically exposing the state of the database connection to the controller (or other object that is aware of the potential issues). If connecting the database is a problem, solve the problem in the database class. For example, this answer on SO: http://stackoverflow.com/a/5484811/471227 is a better solution to the problem than playing object tetris and manually shuffling things around inside the application to work around it.

I don’t really understand why requiring a dependency in only one controller method is poor encapsulation. What is wrong with only one method of a class requiring a dependency? Or do you mean the controller should not need a database connection at all?

Such a controller would rather be saying “I don’t need the database connection” when constructing, and one method would be saying “I need the database connection to run”. I don’t see what this has to do with exposing the state of the database connection.

Well, that is certainly not a bad solution. But if we go further with this reasoning then we might as well instantiate all application objects in the DI container right away without caring whether they are needed or not :smile: I just think it’s a good idea not to create an object if not required.

What I mean is, if any class other than the database class is doing things based on the state of the database connection (E.g. I don’t want the database class yet because I don’t know if the database should be connected or not) then you’re breaking encapsulation because the state of the database connection (connected or not) is being used to make decisions outside the database class.

I’m not so sure, consider the following:

  • Sometimes the object will need to be created
  • On every request you will need logic to determine whether the object needs to be created or not

If the object of this exercise is performance, then in cases where the object is required, the additional logic of determining whether or not it is required is just overhead.

If we say the time taken to construct the object is C and the time taken to determine whether the object is required is R, then when the object is required the execution time is R+C, and when the determining logic runs but decides no object should be created it is just R. Now, if we assign the probability of the object being required to P, we can formulate the expected total execution time E as

E = R + (P*C)

Obviously with a 100% probability, R is just overhead, and when the object is always required, it’s actually slower to have that check in there. So the question is… what value of P is required to actually give a lower execution time?

Let’s plug in some numbers and see what happens. Let’s assume it takes twice as long to construct the object as it does to determine whether it needs to be constructed, C=2R

Now, with a 50% probability and R = 1 (the result scales the same way for any R), we can determine the execution time:

E = 1 + (0.5*2) = 2

Which evaluates to exactly breaking even (a 50% probability where constructing the object takes twice as long as determining whether it needs to be constructed) - it’s identical overall to constructing the object every time!

So let’s plug in some real numbers:

<?php
// Cost of a no-op "is the object required?" check, one million times.
function required() {
	return false;
}

$t1 = microtime(true);

for ($i = 0; $i < 1000000; $i++) {
	if (required()) {

	}
}

$t2 = microtime(true);

echo $t2 - $t1;

echo '<br />';

// Cost of constructing a bare object, one million times.
class Foo {
	public function __construct($a) {

	}
}

$t1 = microtime(true);

for ($i = 0; $i < 1000000; $i++) {
	new Foo('bar');
}

$t2 = microtime(true);

echo $t2 - $t1;


Which gives me:

0.74090385437012
0.95893001556396

There’s a difference of roughly 30% here, and obviously with more complex “is the object required” logic or a more complex constructor these numbers would change, but I wanted a bare-bones test… and it shows that creating the object is about 30% slower than checking whether it needs to be created (and this is a very conservative estimate for the requirement check!)

So on that basis, put the numbers into our formula:

E = 0.74 + (0.5*0.95) ≈ 1.22

So with a 50% probability the expected execution time is about 1.22… roughly 28% slower than just instantiating the object every time (0.95). Adjusting the percentages downwards, it’s not until the probability of the object being required drops below about 22% that “not creating the object unless it’s needed” is actually faster, and the advantage doesn’t reach a 10% speed improvement until the probability of needing the object is around 12%…

So is it worth it? Well benchmark your use-case and find out, but it’s probably not!
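
For reference, the break-even point falls straight out of the formula: the check-first approach wins when R + P*C < C, i.e. when P < (C − R)/C. A quick sketch with the measured numbers:

```php
<?php
// Break-even from E = R + P*C: checking first only wins when
// R + P*C < C, i.e. P < (C - R) / C. Numbers from the benchmark above.
$r = 0.74; // measured cost of one million "is it required?" checks
$c = 0.95; // measured cost of one million constructions

$breakEven = ($c - $r) / $c;
printf("Break-even probability: %.1f%%\n", $breakEven * 100);
// prints: Break-even probability: 22.1%
```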

My oh my, you put a lot of work into that, but ultimately it all comes down to this cherry-picked assumption.

Obviously, if someone is going to go through the effort of lazy loading something, then that thing is probably hefty. Certainly something that has only twice the cost of a mere if-statement (the equivalent of… *gasp*… two if-statements!) isn’t remotely close to being hefty, so your C=2R assumption also isn’t remotely close to a real-world situation.

Consider instead that for the cost of a mere if-statement, we could avoid loading the entire Doctrine library, something that is probably several orders of magnitude more costly.

Haha, Tom, great micro-optimization benchmarks - you almost convinced me this is not worth doing :smile: This might hold true if all object instantiation really were so quick. I agree that the constructor should do as little as possible, but I don’t think I agree it must always be blazing fast. Its role is to initialize the object into a complete and workable state - often this initialization will be fast, but sometimes we may need to connect to a DB or a remote service to do it, and I think that’s fine. If you apply that SO solution to the performance problem, then 1) the class is littered with checkConnection() calls in all its methods, and 2) performance will drop because of the checks - and it may drop much more than the one-time reflection argument check I was considering for a controller method.

But let’s not talk about micro-optimizations, since we all know this matters little - my idea was about convenience. If the DIC can detect constructor dependencies, it can do the same for method dependencies - why not? Then I don’t have to worry about an object taking a long time to instantiate when I don’t use it - and even if I kept the discipline of very fast constructors, I might be using 3rd party classes that are not optimized this way.

I wonder if we aren’t talking about different things. I don’t see how the controller class is doing things based on the state of the database connection - the state of the connection is handled by the database object. Did I say that the controller checks the state? This is what I have in mind:

class ProductController {
    private $db;
    
    public function __construct(Db $db) {
        $this->db = $db;
    }
    
    public function add() {
        // ...
    }
    
    public function delete() {
        // ...
    }
    
    public function archive(Db2 $db2) {
        // ...
    }
}

I don’t make any decisions based on the state of the database ($db2) - how so? I just require it in one method and use it - most probably passing it further to a class that does the archiving to the second database.

If you’re dependency injecting your controllers, which you seem to be, then you should only have one action per controller.

See this other discussion for more details: