Group commonly used application-wide methods in a unit-test-friendly way

Usually, in a web application there are a few commonly used functions that are convenient to have globally available so I can call them from anywhere. These functions are so simple that there's no need to create a separate class for each of them, and they don't really belong to any other specific class. So far I have defined them as static methods inside a class. Here is an example of two methods: one is completely standalone, the other needs a Db object to work:


class Utils {
	// Send redirect HTTP headers and exit the script. It converts a relative URL to an absolute one if needed.
	// This method does not have any dependencies.
	static public function redirect($url, $httpStatusCode = 302) {
		// ...
	}
	
	// Read user-set global application setting from database table.
	// This method depends on the Db class.
	static public function getSetting($settingName) {
		// ...
	}
}

// Calling is very easy:
Utils::redirect($url);
$perPage = Utils::getSetting("products_per_page");

This is procedural code, and procedural code is not always good for unit testing. While the first method could be unit-tested easily because it doesn't depend on anything else, the second one may be harder. So far I can see two solutions:

  1. Make getSetting require a Db object as a parameter, which should make unit testing easy (but if I later move the settings to a text file, I'll have to remove the parameter from all calling scripts); see the sketch after this list.

  2. Have the getSetting method invoke a factory to return the Db object, although I'm not sure whether calling new Factory() inside it wouldn't create problems of its own.
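
A minimal sketch of option 1, assuming the Db class exposes some query method (the fetchValue() call below is hypothetical):

class Utils {
	// The caller supplies the Db object, so a test can pass in a stub or mock instead.
	static public function getSetting(Db $db, $settingName) {
		// Hypothetical query API - adjust to whatever Db actually provides.
		return $db->fetchValue(
			"SELECT value FROM settings WHERE name = ?",
			array($settingName)
		);
	}
}

// Calling code now has to have a Db instance at hand:
$perPage = Utils::getSetting($db, "products_per_page");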

Are these good solutions? Maybe something completely different? Make an instance of the Utils class?

In my original procedural framework I had a file with utility functions (functions.php). This file would be included at the top of all my scripts. When I started to develop the OOP framework I moved many of the utility functions to specific classes. What was left was bundled into the abstract Functions class (Functions.php). One outcome is that the file structure of the OOP framework is much simplified, and I don't have to worry about including functions; autoload takes care of it.

To call a function:

Functions::functionName();

How about adding a second optional parameter that tells the function if it is supposed to get the data from the database or from a file?

static public function getSetting($settingName, $fileName = null) {
    if (func_num_args() == 2) {
        // second argument given: do file stuff
    } else {
        // do db stuff
    }
}

Call:

Utils::getSetting($settingName, $fileName);

How is that different from the static methods example I posted above? And most importantly, is unit testing of those methods easy?

This wouldn’t work well because the calling code doesn’t know if a specific setting needs to be fetched from a database or from a file. Later I might change the storage type for the settings and I don’t want to change all of the calling code in that case.
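
One hedged way to keep the calling code ignorant of where a setting lives is to hide the lookup behind a small interface; all the names below are made up for illustration:

interface SettingsStorage {
	public function get($settingName);
}

class DbSettings implements SettingsStorage {
	private $db;

	public function __construct(Db $db) {
		$this->db = $db;
	}

	public function get($settingName) {
		// read from the settings table via $this->db
	}
}

class FileSettings implements SettingsStorage {
	private $settings;

	public function __construct($fileName) {
		// e.g. an INI file with one "name = value" pair per line
		$this->settings = parse_ini_file($fileName);
	}

	public function get($settingName) {
		return isset($this->settings[$settingName]) ? $this->settings[$settingName] : null;
	}
}

The calling code would only see SettingsStorage, so switching from the database to a file later would mean changing one construction site rather than every call.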

My personal opinion is that if your function has any external dependencies (within it or passed to it), static isn't really a good candidate. Static was designed primarily with the idea that your class/method has zero dependencies, both on the class itself and externally.

Here is an example of what I'd consider an acceptable static method (granted, many will still disagree with using static at all, but the code below is still easily testable):

class FileType
{
  public static function getExtension($filename)
  {
    // pretend I did error checking on file length, etc.
    return substr($filename, strrpos($filename, '.'));
  }
}

Both the class and the getExtension method require nothing but a string with the file name in it. There aren't any class variables or external dependencies to worry about (thus testing is very straightforward).

Once you introduce dependencies, you are better off with an instantiated object so you can mock the dependencies/implementation. If you still want to ensure only one connection is established, there are ways of introducing a singleton pattern within an instantiated class that still give you the benefits of mocking (there is an example of that somewhere in this forum… I'd have to spend some time searching for it).
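
As a rough sketch of that instantiated version, with the dependency handed in through the constructor so a test can pass a mock Db instead of a real connection:

class Settings {
	private $db;

	public function __construct(Db $db) {
		$this->db = $db;
	}

	public function get($settingName) {
		// uses $this->db here; in a test, $this->db is a mock/stub
	}
}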

Anytime you talk testing and your class/method has dependencies on external sources (be it other classes, a database, another server, whatever), DI is a great option and, in my opinion, a must-have. Otherwise, your tests are going to be fragile and will need to be updated on a regular basis as changes are made.

Agreed.

It's not different except semantically. "Functions" is not really a class at all but more of a "collection" of code that happens to be gathered into a class. The class is abstract because there can never be a "Functions" type object. On the other hand, Functions could be inherited by every other class. I realize that some people don't like inheritance (I'm not sure why, being quite green in OOP) but it makes sense to me. For example: just about every object wants to send an error message sometimes. Shouldn't they inherit that ability from a common ancestor?

And most importantly, is unit testing of those methods easy?

What is “unit testing?”

@cpradio What is “DI” (I’m alphabet soup challenged :blush: )

Dependency Injection

As others have said, Dependency Injection is a far better way of achieving this. Rather than having arbitrary code call Utils::getSetting("products_per_page"), you should pass the result of that call into the method or class. E.g.

Instead of:


class Foo {

	public function bar() {
		$x = Utils::getSetting("products_per_page");
	}
}

You should use:


class Foo {

	public function bar($x) {
		
	}
}


This way, your application is more flexible. You have the ability to have the value of $x come from any source, not just the return value of Utils::getSetting().
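
For instance, using the classes from the snippets above, the caller decides where the value comes from:

$foo = new Foo();
$foo->bar(Utils::getSetting("products_per_page")); // value from the database today
$foo->bar(20);                                     // a hard-coded value in a unit test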

There’s a really good blog post here: http://misko.hevery.com/2008/12/15/static-methods-are-death-to-testability/ about why static methods are bad for testability, and my own thoughts on the impacts of their use in PHP here: http://r.je/static-methods-bad-practice.html

Funny story (this may relate to you, but it seriously made me re-think inheritance so I would use it correctly): the company I worked for hired a contractor to write their initial service. The service was fairly simple: accept a standard insurance stream from vendors and give them a quote (or an error if they failed any validation). Now for the funny part: his inheritance had all classes inheriting from an Error object, so as you walked through the code you could easily come to the conclusion that all logic/processing was based on an Error (think about that statement for a bit). You are declaring that your code is built upon an Error… so you are expecting it to fail?

In reality, none of the classes that inherited the Error class were “an error”. They all “could throw” an error. Big difference there. When you inherit another class, your new class should be the same type as the class it inherits. You should be able to read it as TextExtension (class) is an Extension (class) or MySQL is a Database, and so on. TextExtension is not an Error, MySQL is not an Error, they both however can throw Errors (which means Dependency Injection [DI] may be suitable).
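
A sketch of that relationship expressed with injection rather than inheritance (ErrorLogger is a made-up name, standing in for whatever error/reporting class the application actually has):

class ErrorLogger {
	public function log($message) {
		// write to a log file, display a message, etc.
	}
}

class MySQL {
	// MySQL is a Database; it merely *uses* an error reporter.
	private $errors;

	public function __construct(ErrorLogger $errors) {
		$this->errors = $errors;
	}

	public function connect() {
		// on failure: $this->errors->log("Could not connect");
	}
}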

Since you grew up on file cards (and likely a green screen), I'll forgive you for asking this :lol: Seriously, though, good question. To start, you really need to know there are multiple kinds of tests; Unit Tests and Integration Tests are the common ones. How they differ is based (to some level) on your code's dependencies. I won't go into all of the detail, but here is some further reading that may help you understand the differences between Unit and Integration Tests.

From the reading, please take note of the following characteristic of Integration Tests: "different pieces of the system work together". This is important: you are testing the system more as a "whole" instead of individual functions/methods of classes. So you may be testing myclass.Process() instead of testing each method that .Process() calls within itself.

Now for Unit Testing, smaller, concise methods/functions are invaluable. Why? It is easy to test them, and to test them fully (including all edge cases). Granted, keep in mind that just because you wrote a small method doesn't mean you won't have 547 test cases (take my word on it, including edge cases). I personally thought SitePoint did a decent job with their PHPUnit article; however, it was written in 2010, so it may be a bit outdated (so I provided a link to PHPUnit as well).
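
As a small illustration, a unit test for the earlier FileType::getExtension() could be as short as this (a sketch assuming a reasonably recent PHPUnit; older versions extend PHPUnit_Framework_TestCase instead):

use PHPUnit\Framework\TestCase;

class FileTypeTest extends TestCase {
	public function testReturnsExtensionIncludingTheDot() {
		$this->assertSame('.txt', FileType::getExtension('notes.txt'));
	}
}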

Hope that helps.

Yes, this makes sense. The only problem, I think, would be methods that sit somewhere between needing and not needing dependencies - I mean a method that is fully standalone now but might need to be extended, and acquire dependencies, in the future.

But for simple things like your getExtension() example I don't think static is bad, because I don't have to mock it in a test - I can always use the real one, can't I? I just don't know whether I wouldn't prefer to have one class for all such simple static methods; I wouldn't like to have a separate class (in a separate file) just to hold a single method like getExtension().

OK, that sounds reasonable!

I see the point about flexibility. But does this solve the problem of calling the static method? In your example you avoided the static call altogether by shifting the responsibility for getting $x to the caller. So the Foo class is more flexible, but now the caller has the same problem of getting $x somehow. The question is: how does it accomplish that?

Well, that's part of the problem. Many times we get requirements that simply state that, for the foreseeable future, we want to get the value from X and only X. So do you spend the time implementing DI now, even though it is unnecessary, and mock your tests accordingly, or do you provide a static implementation because you can?

To help with that decision, let’s look at the pros and cons:

The pros to using static in this scenario:

  - It's easy to write and quickly test a static implementation that has no dependencies and doesn't implement DI.
  - It meets the requirements as stated (thus the business won't have a problem with the time you spent).

The cons:

  - When the business changes where you get your data (now from two possible sources), you are screwed. You now have to update the method, find all uses of it and update them as well.
  - Plus you have to update all your tests, the implementation, etc.

I agree, I don’t believe it to be bad either, but just wanted to point out, others find the word “static” to be forbidden (never useful), blah blah blah. :smiley:

You don't have to have a file for each method, but you should have some organization to them. Group together what makes sense to be together: getExtension($filename), getFilename($filename), getDirectoryPath($filename), etc. could all be in the same file. I like to think of it this way: if I were reading my code out loud, would it make sense to another person who hasn't seen it? Example: I have a class named FileHelper that contains getExtension receiving a $filename, getFilename that also receives a $filename, and getDirectoryPath which also receives a $filename. Stating that I have a class named FileHelper that has getSetting doesn't really make sense…
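
In code, that grouping might look like this (just a sketch, bodies omitted):

class FileHelper {
	public static function getExtension($filename) { /* ... */ }
	public static function getFilename($filename) { /* ... */ }
	public static function getDirectoryPath($filename) { /* ... */ }
}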

There's no hard and fast rule; it depends entirely on what the caller is doing. However, whatever it's doing, it presumably has to pass some information into the callee, so it needs to have some knowledge of what the callee needs. Should the caller get the value from the static method? Probably not!

One important factor is always going to be whether the callee needs multiple method calls on the dependency. If you have a piece of code that requires multiple calls to retrieve values from an object, pass the fully constructed object to either the method or the constructor of the class in question. If you need one very specific value, pass just that value.
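
A rough sketch of that rule of thumb (Pager, Report and Config are hypothetical names):

class Pager {
	// Needs exactly one value, so pass the value itself:
	public function __construct($perPage) { /* ... */ }
}

class Report {
	// Needs several settings over its lifetime, so pass the whole object:
	public function __construct(Config $config) { /* ... */ }
}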

In a full DI/IoC (Inversion of Control) system, there would be a very top-level part of the application which constructed the entire object tree for the request and passed any configuration/dependencies into each class as it created it. This part of the application is essentially static, because it knows what dependencies every class it creates has (and the new keyword is effectively static); it also has access to shared objects and configuration values. However, this is a better separation of concerns, as it allows the entire rest of the application to be non-static and more flexible, making it easier to test.
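
A hedged sketch of such a top-level bootstrap - the class names and constructor arguments are only examples, reusing the Settings and Foo sketches from earlier in the thread:

// index.php - the one place that knows how the object tree is wired together.
$db       = new Db("localhost", "user", "password"); // constructor arguments are made up
$settings = new Settings($db);
$foo      = new Foo();

$foo->bar($settings->get("products_per_page"));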

It’s all about single responsibilities and separation of concerns. Any class which does processing on data should not be concerned with where that data came from. Another, different class, with its own responsibility should be concerned with loading data but have no bearing on how that data is then used.

File cards and (would you believe it?) buttons, lights and switches on a control panel; green screens would arrive next. I think dinosaur tracks were still fresh!

This was my second computer.

Hope that helps.

Very much so! I started to look into OOP a long time ago but left coding for other work. When I came back, OOP was like entering a complex new world, one I want very much to explore.

Unless you have written a “god” class, yes. According to mythology even gods fail on occasion. :lol:

In reality, none of the classes that inherited the Error class were “an error”. They all “could throw” an error. Big difference there. When you inherit another class, your new class should be the same type as the class it inherits. You should be able to read it as TextExtension (class) is an Extension (class) or MySQL is a Database, and so on. TextExtension is not an Error, MySQL is not an Error, they both however can throw Errors (which means Dependency Injection [DI] may be suitable).

Thanks! A very interesting explanation of how to organize the functionality of the code. In the “real world” there are both vertical integration (inheritance) and horizontal value chains (DI). Each one has its uses, advantages and disadvantages.

Again, thanks!

Why have a class at all if you are not going to use the benefits of having an object, but only to replicate namespaces like in the olden days? Why not just a set of plain, normal functions under a namespace?


namespace Utilities;

# \Utilities\Whatever();
function Whatever() {}

Yes, I posted an example with a class because it was a namespace workaround for PHP 5.2, but yes, functions in a real namespace would serve the same purpose.

I'll blame my .NET background, where everything you code is inside a class of some sort; there is no such thing as a free-standing function in a namespace. Granted, that's my excuse :slight_smile: though I don't necessarily disagree with that thought process.

Just because namespaces were added in PHP 5.3, is that a good reason not to go all-out OOP?

I hate complexity. Using namespaces for procedural functions when they can be handled in abstract classes just adds one more bit of unnecessary complexity, IMO.

One major problem I have with the procedural framework I built up over the years is the complexity of the file structure. Initially it seemed to make sense to keep things organized, and in practice it works quite well for ongoing web apps, but it is a pain to get new ones started. One advantage I discovered with OOP is that you throw all the classes into a single folder and all you need to include is autoload.php; it takes care of the rest, including the Functions class I refer to above. If hiding all the uglies and KISS are the name of the game, then the Functions class is better than an additional namespace.
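
For reference, the autoload.php mentioned above can be as small as this minimal sketch (assuming one class per file, with the file named after the class; the classes/ folder is just an example):

// autoload.php
spl_autoload_register(function ($className) {
	$file = __DIR__ . '/classes/' . $className . '.php';
	if (is_file($file)) {
		require $file;
	}
});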