A crazy idea - mark_script_ready()

This isn’t an April Fool’s joke. Let’s get that out of the way first. And this could be a stupid idea - let that be known second.

Drupal, WordPress and many other frameworks go through a somewhat lengthy process before they really do anything, and this is repeated on every page load. Wouldn’t it be nice to be able to bookmark the moment where the script begins branching out in reaction to whatever was in the request?

That’s the idea behind the mark_system_ready() function, or maybe statement. It tells the PHP engine to take its current state and save it. The next time this script is executed, it clones that state and starts from there.

To take full advantage of this function, a CMS or framework would need to move as much request-independent code as possible ahead of the call - but it could be a real speed win. Or it might be an impossible task, depending on how the PHP engine actually works. Or the dumbest idea you’ll hear this month.
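Purely to illustrate the shape of it, a front controller might end up looking something like the sketch below. None of this exists - mark_system_ready() is imaginary, and the file and class names are just placeholders:

<?php
// Hypothetical front controller - mark_system_ready() is not a real PHP function,
// and the paths/classes below are placeholders for illustration only.

// Request-independent work: constants, class definitions, container wiring.
require __DIR__ . '/vendor/autoload.php';
define('APP_ROOT', __DIR__);
$container = require APP_ROOT . '/config/container.php';

// Snapshot the engine state here; later requests would resume from this point
// instead of repeating everything above.
mark_system_ready();

// Only request-dependent work happens after the snapshot.
$response = $container->get('router')->dispatch($_SERVER['REQUEST_URI'], $_GET, $_POST);
echo $response;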

Thoughts?

It’s not the dumbest thing, but it’s also nothing new. It’s called caching… :wink:


I was thinking the same thing. Don’t most frameworks already do this?

function reload_saved_state() {
	$requestHash = md5(serialize([$_GET, $_POST, $_SERVER['REQUEST_URI']]));
	return unserialize(file_get_contents('./cache/' . $requestHash));
}


// Pass in the bootstrapped application object so its state can be snapshotted.
function mark_system_ready($app) {
	$requestHash = md5(serialize([$_GET, $_POST, $_SERVER['REQUEST_URI']]));
	file_put_contents('./cache/' . $requestHash, serialize($app));
}

As with most smart-ass responses, this one only indicates the reader was incapable of grasping the question put before them.

I’m well aware of what a cache is. Even if you cache, you still have to redeclare the constants, actually execute the code that defines the classes and functions, and so on, so no, caching is not the same thing.

The idea is essentially to mark a point where the program can return to after completion to await the next HTTP request. PHP does all this setup work and throws it away every time exit is reached. Even if caches are present, they still have to be loaded back in, and serialized objects still have to be reconstituted.
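To make that concrete, here’s a rough sketch (the file names and the ContainerBuilder class are made up): even when the cache is warm, everything below still runs on every single request, which is exactly the work a saved-state snapshot would skip.

<?php
// Illustrative only - file names and ContainerBuilder are hypothetical.

// Paid on every request, cache or no cache:
define('APP_ROOT', __DIR__);                 // constants redeclared
require APP_ROOT . '/src/Container.php';     // class-defining code re-executed
require APP_ROOT . '/src/Router.php';

$cacheFile = APP_ROOT . '/cache/container.ser';
if (is_file($cacheFile)) {
    // Cache hit: the object graph still has to be reconstituted from disk.
    $container = unserialize(file_get_contents($cacheFile));
} else {
    // Cache miss: build from scratch and store for next time.
    $container = (new ContainerBuilder())->build();
    file_put_contents($cacheFile, serialize($container));
}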

So you’re talking about more of a “hibernate” for the application here, so that on the next request it can start at this location with all of the “pre-work” already done?

Yes. (And btw, that prework would likely include loading the caches)

Yeah, that makes sense. I don’t know that I’ve seen any language really do this… I’m trying to think if .NET has anything close to this ability. It is quite an interesting idea, but presents a real challenge of using it correctly as a backend developer.

You have to make sure what you hibernate is not related to a specific user or state. So you have to be really careful or you may introduce some serious bugs in your application.
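A contrived example of the kind of bug this can introduce (not taken from any real framework): a service that remembers the “current user” in a property is harmless when the object dies with the request, but leaks data between users once the same instance survives to serve the next request.

<?php
// Contrived example - fine in the run-once-per-request model,
// dangerous once the same instance lives across requests.
class Greeter {
    private $currentUser;

    public function greet(array $get) {
        // Lazily remember the user the first time we see one.
        if ($this->currentUser === null) {
            $this->currentUser = isset($get['user']) ? $get['user'] : 'guest';
        }
        return 'Hello, ' . $this->currentUser;
    }
}

$greeter = new Greeter();                          // survives "hibernation"
echo $greeter->greet(['user' => 'alice']), "\n";   // Hello, alice
echo $greeter->greet(['user' => 'bob']), "\n";     // still Hello, alice - state leaked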

Save the state after that logic has been processed; there’s no need to define the constants if the logic that uses them isn’t run.

Further to my above Eureka, as well as extracting the Laravel bare HTML output, I have also used CodeIgniter 3.0 to extract the bare HTML. It is working fine and gets 100% Pingdom Performance Grade.

CodeIgniter:

  // RENDER TO SCREEN
  $this->load->view($data['page_new'], $data, FALSE);

  // if 200 === http_response_code()
  //     WRITE TO "_CACHE_JJ/current/"
  // else
  //     WRITE TO "_CACHE_JJ/Found"
  $this->_new_cache_jj($data, $delete = FALSE);

Cached web-pages that bypass PHP & MySQL?

http://www.johns-jokes.com/_CACHE_JJ/current/

.htaccess
A rewrite rule checks whether the requested file exists in the cache folder and, if it does, serves the cached file directly.
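Presumably something along these lines (a sketch only - the actual rules and file naming on the site may well differ):

# Serve a pre-rendered copy straight from the cache folder when one exists,
# so the request never touches PHP or MySQL at all.
RewriteEngine On
RewriteCond %{REQUEST_METHOD} =GET
RewriteCond %{DOCUMENT_ROOT}/_CACHE_JJ/current%{REQUEST_URI} -f
RewriteRule ^ /_CACHE_JJ/current%{REQUEST_URI} [L]

# Everything else falls through to the normal PHP front controller.
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^ index.php [QSA,L]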

Question:
Does Apache load PHP & MySQL regardless?

My previous post was being composed before your reply was read.

Can you clarify your statement and perhaps show a simple example?

So I had another think about what you meant and came up with some potential solutions, as I think you may actually have a good point. As step 1, all I did was essentially skip the autoloader.

In a very small application that doesn’t use a database, sessions or anything else but loads ~30 files, I had a ~30% speed increase. There are more caveats than I have the time to go into here (superglobals, shared sessions) and hurdles I haven’t even considered yet but here’s my current implementation:

  • Run a PHP script as a server that runs indefinitely. This can hold a MySQL connection, load all the files and do all the bootstrap work.
  • Have that server register a socket.
  • Have a minimalist client script which connects via that socket to the server and just calls functions on it. This allows the same PHP script instance to serve multiple requests.

Some code:

class SocketServer {
	private $sockFile;
	private $socket;
	private $functions = [];

	public function __construct($sockFile = '/tmp/mysock2') {
		$this->sockFile = $sockFile;
		// Remove a stale socket file left over from a previous run.
		if (file_exists($this->sockFile)) unlink($this->sockFile);
	}

	public function addFunction($name, \Closure $function) {
		$this->functions[$name] = $function;
	}

	public function start() {
		$this->socket = socket_create(AF_UNIX, SOCK_STREAM, 0);
		if (!$this->socket) throw new \Exception('Could not create socket');
		socket_set_option($this->socket, SOL_SOCKET, SO_REUSEADDR, 1);
		if (!socket_bind($this->socket, $this->sockFile)) throw new \Exception('Could not bind socket ' . $this->sockFile . ' (' . socket_last_error($this->socket) . ')');
		if (!socket_listen($this->socket, 3)) throw new \Exception('Could not listen on socket ' . $this->sockFile . ' (' . socket_last_error($this->socket) . ')');

		$this->listen();
	}

	private function listen() {
		while ($spawn = socket_accept($this->socket)) {
			$message = socket_read($spawn, 2048);

			// Messages look like "functionName#SOCKETMESSAGE#serializedData".
			list($function, $data) = explode('#SOCKETMESSAGE#', $message, 2);
			if ($function == 'SOCKETCLOSE') $this->stop();

			try {
				if (isset($this->functions[$function])) $output = $this->functions[$function](unserialize($data));
				else $output = 'Invalid socket call';
			}
			catch (\Exception $e) {
				$output = $e->getMessage();
			}

			if (!socket_write($spawn, $output, strlen($output))) echo socket_last_error($this->socket);
			socket_close($spawn);
		}
	}

	public function stop() {
		socket_close($this->socket);
		die;
	}
}



//This was the index.php bootstrap file for my framework
set_time_limit(0);

chdir('../framework');
require_once 'Conf/Core.php';
$conf = new \Config\Core;

foreach ($conf->autoInclude as $file) require_once $file;

//Create the DIC
$dic = new $conf->dic(new $conf->dicConfig);

//Use the DIC to construct the autoloader
$autoLoader = $dic->create($conf->autoloader);

//Now, instead of constructing the entry point and echoing the output, allow a socket connection to get the output from the entry point
$server = new SocketServer();

$server->addFunction('getoutput', function($data) use ($dic, $conf) {
	$entryPoint = $dic->create($conf->entryPoint, [$data['get'], $data['post'], $data['server']]);
	return $entryPoint->output();
});

$server->start();

Then the client script looks like this:

class Client {
	private $sockFile;
	private $socket;

	public function __construct($sockFile = '/tmp/mysock2') {
		$this->sockFile = $sockFile;

		do {
			$this->socket = socket_create(AF_UNIX, SOCK_STREAM, 0) or die("Could not create socket\n");
			$connected =  @socket_connect($this->socket, $this->sockFile);

			//If there's no server running, run it.
			if (!$connected) {
				exec(sprintf("%s > %s 2>&1 & echo $! >> %s", 'php server.php', 'serverlog.txt', 'server.pid'));
				sleep(1);
			}
		} while (!$connected);
	}

	public function sendMessage($function, $data = null) {
		$message = $function . '#SOCKETMESSAGE#' . serialize($data);
		socket_write($this->socket, $message, strlen($message));
		return socket_read($this->socket, 1024, PHP_BINARY_READ);
	}

	public function __destruct() {
		socket_close($this->socket);
	}
}

$client = new Client();
echo $client->sendMessage('getoutput', ['get' => $_GET, 'post' => $_POST, 'server' => $_SERVER]);

All requests that used to go to index.php now go to client.php and it all just falls into place. Clearly there are issues with sessions, database concurrency… oh and if you use TCP rather than UNIX sockets the performance is about 800 times worse than just running everything through the original index.php that was both the client and server.

My framework is set up so that $_GET, $_POST and other superglobals are not referenced anywhere but the very top level and are then passed into the framework entry point. Obviously, if you’re using superglobals at arbitrary points in the code this method will not be workable. You won’t even be able to overwrite them with $_GET = $data['get']; because requests will interfere with each other and give you very unpredictable results. And if your DIC is backwards like most of them and creates shared instances of objects by default, you’ll also be in a world of pain because application state will be carried across each request. If your DIC is sensible, you’ll have shared objects which are actually shared among all requests. Neat.

Been playing around with this idea some more and I now have a 500% performance increase. The above was actually a 50% decrease when opcache was enabled.

Using the same server/client architecture:

ab -n 100 -c10 http://mvc.dev/currencyconverter/ 
This is ApacheBench, Version 2.3 <$Revision: 1638069 $>
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Licensed to The Apache Software Foundation, http://www.apache.org/

Benchmarking mvc.dev (be patient).....done


Server Software:        Apache/2.4.12
Server Hostname:        mvc.dev
Server Port:            80

Document Path:          /currencyconverter/
Document Length:        1431 bytes

Concurrency Level:      10
Time taken for tests:   0.010 seconds
Complete requests:      100
Failed requests:        50
   (Connect: 0, Receive: 0, Length: 50, Exceptions: 0)
Total transferred:      168828 bytes
HTML transferred:       129139 bytes
Requests per second:    10031.10 [#/sec] (mean)
Time per request:       0.997 [ms] (mean)
Time per request:       0.100 [ms] (mean, across all concurrent requests)
Transfer rate:          16538.38 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0   0.1      0       0
Processing:     0    1   0.3      1       2
Waiting:        0    1   0.3      1       2
Total:          1    1   0.3      1       2

Percentage of the requests served within a certain time (ms)
  50%      1
  66%      1
  75%      1
  80%      1
  90%      1
  95%      2
  98%      2
  99%      2
 100%      2 (longest request)

Using a combined architecture where each request includes all files and does all the bootstrapping:

ab -n 100 -c10 http://mvc.dev/currencyconverter/ 
This is ApacheBench, Version 2.3 <$Revision: 1638069 $>
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Licensed to The Apache Software Foundation, http://www.apache.org/

Benchmarking mvc.dev (be patient).....done


Server Software:        Apache/2.4.12
Server Hostname:        mvc.dev
Server Port:            80

Document Path:          /currencyconverter/
Document Length:        176 bytes

Concurrency Level:      10
Time taken for tests:   0.055 seconds
Complete requests:      100
Failed requests:        0
Total transferred:      37700 bytes
HTML transferred:       17600 bytes
Requests per second:    1824.92 [#/sec] (mean)
Time per request:       5.480 [ms] (mean)
Time per request:       0.548 [ms] (mean, across all concurrent requests)
Transfer rate:          671.87 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0   0.1      0       0
Processing:     4    5   2.8      4      15
Waiting:        4    5   2.8      4      15
Total:          4    5   2.9      4      15

Percentage of the requests served within a certain time (ms)
  50%      4
  66%      4
  75%      4
  80%      5
  90%     12
  95%     15
  98%     15
  99%     15
 100%     15 (longest request)

Impressive.

Definitely not a dumb idea, and I think Tom’s suggestion here is the right approach. I think that’s how Java servlets work already. The servlet is loaded and initialized just once, and the server invokes service() for each HTTP request. Which means you can load all your classes just once; you can configure your DIC and instantiate a lot of your services just once; and you can connect to the database just once and leave the connection open. It certainly makes a lot more sense than booting and terminating the same application over and over.

Unfortunately, PHP and its numerous kinds of global state don’t fit well with this model. It’ll be interesting to see what kind of workarounds Tom or others can come up with.

Or maybe we should all switch to Java. :-p

It’s a very clever trick indeed and I was impressed too when I first saw it. Its Achilles heel, of course, is that it only works when the page is 100% static. Even the tiniest dynamic bit, such as switching between “Log in” and “Hello, {{user}}”, forces you to always go to the PHP application.


Could be you know something they don’t… or it could be the other way around. :wink:

The good DICs implement the concept of “scope”. There can be, for example, a “request” scope. So if you ask the DIC for the same request-dependent service multiple times, you’ll get the same shared instance, but only within the context of your HTTP request. Different requests will get different instances.

Some examples:
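For instance, a hand-rolled sketch of the idea (not any particular library’s API): services registered with a “request” scope are shared within a single request, and rebuilt once the application server resets the scope for the next one.

<?php
// Minimal illustration of container vs request scope - not a real library's API.
class ScopedContainer {
    private $factories = [];
    private $scopes = [];
    private $instances = ['container' => [], 'request' => []];

    public function register($name, callable $factory, $scope = 'container') {
        $this->factories[$name] = $factory;
        $this->scopes[$name] = $scope;
    }

    public function get($name) {
        $scope = $this->scopes[$name];
        if (!isset($this->instances[$scope][$name])) {
            $this->instances[$scope][$name] = $this->factories[$name]($this);
        }
        return $this->instances[$scope][$name];
    }

    // Called by the application server between HTTP requests.
    public function resetRequestScope() {
        $this->instances['request'] = [];
    }
}

$dic = new ScopedContainer();
$dic->register('config', function () { return ['locale' => 'en']; });                // container scope
$dic->register('requestContext', function () { return new \stdClass; }, 'request');  // request scope

$a = $dic->get('requestContext');
$dic->resetRequestScope();   // next request begins
$b = $dic->get('requestContext');
var_dump($a === $b);         // bool(false) - each request gets its own instance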

Indeed, but a lot of the popular DICs, such as Symfony’s, default to retrieving the same instance unless they’re told not to. This behaviour will cause a lot of things to break when using an application server (as above), because objects will be treated as shared when they probably shouldn’t be. So when you use the DIC to create an instance of an MVC model, all users get the same one, which in most cases isn’t what you want to happen.

When you only have one instance of a class in the application, the scope makes zero difference; defaulting to retrieving the same instance will cause strange side effects as soon as that is no longer the case.

Your complaint depends on services neglecting to define their scope.

And DICs have picked singleton/container scope as the default because that’s what the vast majority of services will want to be. I recall one of the lead developers of Drupal 8 trying to tell you exactly the same thing, but I can see you’re still not convinced. I guess everyone’s still doing it backwards - except Tom.

I guess it depends on whether you’re using a DIC as a DIC or as a service locator. If you’re using it as a service locator then it does make sense to retrieve the same instance each time; if you’re using it as a DIC then most objects will want to be unique instances. It’s only because the container happens to be destroyed and recreated on each request that this doesn’t cause any problems.

As I said to that same Drupal developer, as an analogy it’s like requesting a browser window from the DIC. In your world, the next time you do that you get the same window: the same history for back/forward, DOM, JavaScript state, URL and address bar, and making changes to one window will then affect the other. This isn’t apparent and doesn’t cause a single issue when you only have one window open at a time. Yes, there are some services that are shared - browser history, extensions, configuration, etc. - however, these are a small percentage of the actual application.

No, it doesn’t.

I, and Drupal, and Symfony, and Java Spring, and probably many more, all seem to disagree with you there.

Argumentum ad populum isn’t a reason to do anything. Besides, if we’re going down that route, most Java devs have dropped Spring for Guice, which doesn’t default to shared instances.