My next framework

For the last two weeks I’ve been rewriting my old framework at my leisure, taking ideas I’ve picked up in the last year and a half, plus some I never found time to incorporate, and putting them into a new framework. This comes after taking a long look at the existing ones and finding none of them to my liking.

The main reason for that varies from framework to framework, but one cause they all shared was a lack of proper AJAX support. Most have AJAX support, but it’s tacked on as an afterthought. I’m baking this into the core. It will continue to work without javascript, mind you, but there will be parts that specifically cater to js to enhance the site.

I’m still at a point where I can avert problems without too much pain. Only the first 20 or so files are being worked on. So I’m going to go over things here for review and comment, to see if anyone spots any obvious trainwrecks.

First, the pattern is EDMVC - Event Driven Model View Control. Base flow is as follows.

  1. Apache looks for the file and either finds the cache or nothing. Assuming no cache, PHP is started.
  2. First object to be instantiated is Core - the core services module. This holds the “global” variables and objects of the project, as well as essential methods such as autoload. It sets the environment, loads the core settings either from the database or from cache, then determines the proper page controller for the request and starts it.
  3. The Page controller determines the requested event based on path, validates it, and if required authenticates the user’s permission to perform the event. It then fires the event.
  4. Any remaining path parsing is performed by the event so that settings can be embedded into the URL, then the event does what it does. Events can be view based or model based - but it isn’t proper to think of them as views or models. They exist between the control and these elements. An event will by its conclusion create a responder or another event and pass it back to the page control.
  5. The page controller fires the event if it received one, or commands the responder to send out its reply and returns flow back to core.
  6. Core calls the shutdown function and does final cleanup.
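
Steps 3 to 5 above can be sketched roughly as follows. This is only an illustration of the "event returns another event or a responder" loop; every class name here (SketchPage, SketchEvent and so on) is made up for the example and is not taken from the framework.

```php
<?php
// Hypothetical sketch: an event returns either another event (which the page
// controller fires next) or a responder (which ends the chain).

interface SketchResponder {
    public function respond(): string;
}

interface SketchEvent {
    /** @return SketchEvent|SketchResponder */
    public function fire();
}

class TextResponder implements SketchResponder {
    private $body;
    public function __construct(string $body) { $this->body = $body; }
    public function respond(): string { return $this->body; }
}

class SketchIndexEvent implements SketchEvent {
    public function fire() { return new TextResponder('index page'); }
}

class SketchSaveEvent implements SketchEvent {
    // On success a save hands control to the index event instead of replying itself.
    public function fire() { return new SketchIndexEvent(); }
}

class SketchPage {
    public function run(SketchEvent $event): string {
        $result = $event->fire();
        // Keep firing while events hand back further events.
        while ($result instanceof SketchEvent) {
            $result = $result->fire();
        }
        if (!$result instanceof SketchResponder) {
            throw new LogicException('An event must end in a responder or another event.');
        }
        return $result->respond();
    }
}

echo (new SketchPage())->run(new SketchSaveEvent()), "\n";
```

The page controller never needs to know which kind of event it is holding; it only checks what came back.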

With that outline in mind, some other notes and principles.
Coupling - Decoupling: All class objects are loosely coupled back to core and can be decoupled simply by giving them the information they need to function. For example, the HTMLResponder class references the Core::paths map - but it has a setPaths() method that allows you to assign a set of paths to use. You’ll need to call that method if you want to use it outside the framework. By similar means all of the classes can work without Core.
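
A minimal sketch of that decoupling idea, assuming the responder falls back to the Core map only when no paths were injected. SketchCore, SketchHtmlResponder, templatePath() and the directory names are placeholders for illustration; only setPaths() and the Core paths map come from the description above.

```php
<?php
// Hypothetical stand-in for the framework's Core; only the static paths map matters here.
class SketchCore {
    public static $paths = ['templates' => '/srv/framework/templates'];
}

class SketchHtmlResponder {
    private $paths = null;

    // Supplying paths explicitly removes the dependency on Core.
    public function setPaths(array $paths): void {
        $this->paths = $paths;
    }

    public function templatePath(string $name): string {
        // Fall back to the Core map only when nothing was injected.
        $paths = $this->paths ?? SketchCore::$paths;
        return $paths['templates'] . '/' . $name . '.phtml';
    }
}

$inside = new SketchHtmlResponder();
echo $inside->templatePath('home'), "\n";

$outside = new SketchHtmlResponder();
$outside->setPaths(['templates' => '/tmp/tpl']);
echo $outside->templatePath('home'), "\n";
```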

Restful State: $_REQUEST does not appear in the code. It is evil, do not use it. Initial events care about whether they are POST or GET based and will raise a bad request exception if accessed by the wrong method.
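
Something like the following would enforce that rule. It is a sketch under assumptions: the class names and the assertMethod() helper are invented here, and the server array is passed in rather than read from $_SERVER so the check can be exercised outside a web request.

```php
<?php
class BadRequestSketchException extends RuntimeException {}

abstract class MethodBoundEvent {
    // Each concrete event declares the single HTTP method it accepts.
    abstract protected function allowedMethod(): string;

    public function assertMethod(array $server): void {
        $actual = strtoupper($server['REQUEST_METHOD'] ?? 'GET');
        if ($actual !== $this->allowedMethod()) {
            throw new BadRequestSketchException(
                "Expected {$this->allowedMethod()}, got {$actual}"
            );
        }
    }
}

class SaveSketchEvent extends MethodBoundEvent {
    protected function allowedMethod(): string { return 'POST'; }
}

$event = new SaveSketchEvent();
try {
    // In a real request this would be $_SERVER.
    $event->assertMethod(['REQUEST_METHOD' => 'GET']);
} catch (BadRequestSketchException $e) {
    echo $e->getMessage(), "\n";
}
```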

DRY: The code follows DRY principle - don’t repeat yourself.

No PHP is best: The framework does not reside in the htdocs directory - it lives elsewhere except for a single landing.php file. The framework by default wants write access to the htdocs directory so it can cache its work where Apache can find it without starting PHP on the next page request, or at least without instantiating the whole framework (for example, a file might be given a small PHP header to expire the file and force the framework to rewrite once an hour). Security is also improved by having only one exposed PHP file.

JIT Loading: Just In Time - No object is loaded until its use is unavoidable. This includes starting sessions, the database, what have you. Overhead is not incurred until it must be incurred.
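
One common way to get this behaviour is to register factories and build each object on first use. The sketch below assumes that approach; LazyServices and its methods are hypothetical names, and an ArrayObject stands in for a real session or database handle.

```php
<?php
// Hypothetical lazy service registry: factories are stored, objects are built on first use.
class LazyServices {
    private $factories = [];
    private $instances = [];

    public function register(string $name, callable $factory): void {
        $this->factories[$name] = $factory;
    }

    public function get(string $name) {
        if (!array_key_exists($name, $this->instances)) {
            if (!isset($this->factories[$name])) {
                throw new InvalidArgumentException("Unknown service: {$name}");
            }
            // The cost of building the service is paid here, and only once.
            $this->instances[$name] = ($this->factories[$name])();
        }
        return $this->instances[$name];
    }

    public function isLoaded(string $name): bool {
        return array_key_exists($name, $this->instances);
    }
}

$built = 0;
$services = new LazyServices();
$services->register('session', function () use (&$built) {
    $built++;
    return new ArrayObject(['user' => 'guest']);
});

var_dump($services->isLoaded('session')); // bool(false): nothing built yet
$services->get('session');
$services->get('session');
echo $built, "\n"; // 1: the factory ran once
```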

View: The view consists of templates and Responder classes. The Responder does the actual coupling of template to data. Template files are *.phtml files written in braceless syntax. Unlike Code Igniter or some other frameworks out there, at NO point does HTML ever get echoed out in any way from anywhere but a template file. This is one of the laws of the framework - if it’s HTML it is in some template somewhere and can be edited.

Responders aren’t just templates though. Javascript responders prep outgoing javascript snippets and responses. XML Responders compose, well, XML responses. It is possible for a page to parse multiple responses - an email responder would compose an email and send it even while an HTML responder tells the user the process is complete. Any outgoing data passes through a responder.

Model: PDO is at the heart, with a Database class on top of it with enhanced query methods. The conversion to PDO is a major switch for my framework and I haven’t done too much on this side except the bare minimum to get the database reads done. That said, I plan to continue having table, row and field objects. The table and row objects will each implement array so that PHP can foreach over their contents cleanly. Field objects contain most of the validation brute work. In the rewrite I want javascript to be able to quickly shoot an isValid query to each field object onBlur of the field rather than waiting for form submit.
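
As a rough illustration of the row and field idea, here is a sketch that assumes "implement array" means one of PHP's Traversable interfaces (IteratorAggregate is used here). SketchRow, SketchField and the e-mail rule are invented for the example; only isValid comes from the post.

```php
<?php
// Hypothetical field object: owns the validation rule for one column.
class SketchField {
    private $name;
    private $rule;

    public function __construct(string $name, callable $rule) {
        $this->name = $name;
        $this->rule = $rule;
    }

    public function name(): string { return $this->name; }

    public function isValid($value): bool {
        return (bool) ($this->rule)($value);
    }
}

// Hypothetical row object: IteratorAggregate is what lets foreach walk it.
class SketchRow implements IteratorAggregate {
    private $values;

    public function __construct(array $values) { $this->values = $values; }

    public function getIterator(): Traversable {
        return new ArrayIterator($this->values);
    }
}

$email = new SketchField('email', function ($value) {
    return filter_var($value, FILTER_VALIDATE_EMAIL) !== false;
});

$row = new SketchRow(['id' => 7, 'email' => 'someone@example.com']);
foreach ($row as $column => $value) {
    echo $column, ' = ', $value, "\n";
}

var_dump($email->isValid('not-an-email')); // bool(false)
```

An onBlur check from javascript would then only need an event that looks up the field and returns the result of isValid.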

I also plan to place Collection objects into the works. A collection is a set of related tables. Finally any table can be a field, recursively.

I have other ideas jumbling that aren’t sorted, this is hardly final.

Weirdest structural thing I’ve done that I am starting to get comfortable with is the fact that there is a loadOrder.ini file in the js directory. As mentioned before the framework, and all of its files, reside outside the htdocs directory. This includes the javascript. The loadOrder.ini file tells PHP which javascript files are going to be used and in what order to load them. When the framework resolves the header of the html page file it uses this ini file to write the script tags. It is also used one other time, by the collator. Once debug mode is switched off the framework points to index.js. This “file” is just a collation of all javascript files that the loadOrder.ini file calls out in the order they are called out (Note, if you create an index.js file and put it in the load order you’ll trip an illegal script name Exception).

It’s odd, but it feels right to have the file that talks about the javascript load order in the js directory alongside the javascripts. Also, it’s a file that doesn’t need referencing except during development (since after dev the cache will get written and Apache will start hitting that file).
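
To make the loadOrder.ini idea concrete, here is a minimal sketch. The ini layout (a scripts[] list), the /js/ URL prefix, and both function names are assumptions; the post does not show the actual file format.

```php
<?php
// Assumed loadOrder.ini layout; the real file format is not shown in the post.
$ini = <<<INI
scripts[] = prototype.js
scripts[] = scriptaculous.js
scripts[] = forms.js
INI;

function loadOrderScripts(string $ini): array {
    $parsed = parse_ini_string($ini);
    $scripts = $parsed['scripts'] ?? [];
    if (in_array('index.js', $scripts, true)) {
        // index.js is reserved for the collated output.
        throw new DomainException('Illegal script name: index.js');
    }
    return $scripts;
}

function scriptTags(array $scripts, bool $debug): string {
    // Debug mode links every file; otherwise only the collated index.js.
    $files = $debug ? $scripts : ['index.js'];
    $tags = '';
    foreach ($files as $file) {
        $tags .= '<script type="text/javascript" src="/js/'
            . htmlspecialchars($file, ENT_QUOTES) . '"></script>' . "\n";
    }
    return $tags;
}

echo scriptTags(loadOrderScripts($ini), true);
echo scriptTags(loadOrderScripts($ini), false);
```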

I’ll look into that. In any event, Database and Cache(s) will be treated as separate model data services. I believe that is the best approach.

Yes, in the absence of a load balancer. I’ve not seen a good way to deal with sessions when a load balancer is used other than employ the db.

Anyway, at this point I’m commenting on things that I haven’t fully framed out so sometimes the answer I should be giving is “I don’t know.”

That makes sense now. Sounds interesting. Out of curiosity, how do you handle events with multiple outcomes, e.g. success/failure? I’m guessing you supply a different callback in the response? If so, that’s pretty neat in terms of keeping related code together and making it very easy to handle js/non-js browsers.

I’d love to see an example of this HTML/JS/PHP reusability… it sounds like a strange concept.

The reason AJAX gets ‘tacked on’ to the end of most frameworks is because JS should never be explicitly relied on, in a general-usage site. If you’re overcomplicating your PHP code just to make your JS code easier to write/execute, you’re looking at things from the entirely wrong angle IMO… especially if someone presses the button saying ‘disable Javascript’.

Don’t get me wrong, I like Javascript, I like ajax interactions, when done right, they make for a snappy web application… but the JS belongs in your web directory, just like images and CSS and whatnot, and has nothing to do with the server side access of your application.

If I’m reading this correctly you’re including all the JS on every single page? That doesn’t seem very scalable to me. What if there are 100 views all with some JS functions? Really redundant to include them on every page and goes directly against your JIT loading point. IMHO, you should allow the views to state which js files to include.

Agreed.

Yes, it is redundant for the browser to have the javascript of the whole site in memory, but it’s also faster, and speed is the prime concern here.

Now if you are under the impression that I have to put this script binding code in every single page template you’re just insulting my intelligence. Only one template holds this code.

I think TomB’s point is that with RIA, JavaScript can become quite verbose. Some page requests will simply not need all that code. If the site was large enough and you had tens of thousands of SLOC in JS, loading that for every page is not ideal.

Loading only the required JS files per view makes more sense. If you are worried about having to pull 30 JS files and it being slow, you are absolutely correct, but then the solution is to pull all those JS files for that page request into a single file, re-write the <script> tags to use the cached JS file and return that to the user agent. Same benefit as your solution but multiple cache files are used, instead of one God file.
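
A rough sketch of that per-view bundling, assuming the cache file is named after a hash of the requested file list. bundleName(), buildBundle() and the directory arguments are hypothetical.

```php
<?php
function bundleName(array $files): string {
    // Same files in the same order always map to the same cache file.
    return 'bundle-' . substr(sha1(implode("\n", $files)), 0, 12) . '.js';
}

function buildBundle(array $files, string $sourceDir, string $cacheDir): string {
    $target = $cacheDir . '/' . bundleName($files);
    if (!is_file($target)) {
        $parts = [];
        foreach ($files as $file) {
            $parts[] = file_get_contents($sourceDir . '/' . $file);
        }
        // A semicolon between files guards against a missing trailing semicolon.
        file_put_contents($target, implode(";\n", $parts));
    }
    // The caller rewrites its script tag to point at this one file.
    return basename($target);
}

// Two views with different needs get two different cache files.
echo bundleName(['prototype.js', 'dragdrop.js']), "\n";
echo bundleName(['prototype.js', 'autocomplete.js']), "\n";
```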

Cheers,
Alex

Some will be by nature. For example, an autocompleter and its PHP callback are by nature coupled. If AJAX isn’t available the PHP call isn’t likely to be used either.

My plan is to have javascript Event objects exist as children of the root event since they are by nature a “step beyond” the base state.

CMSs are a different story, because ones like Drupal store everything in the database. Routing tables, mapping URIs to pages for instance, so by design it needs a DB almost immediately upon loading. Personally I opt to store routing files/maps in files, so a DB isn’t required until after the expected controller and action have been invoked, avoiding senseless overhead.

My approach is to store everything in the database then write to a file cache everything that will not change in production. It really doesn’t matter as long as there is only 1 authoritative copy.

No, that’s pretty much it. I know PHP can do non-web related tasks, but that’s beyond the scope of what I’m trying to do. I’ve always seen this project as a CMS Framework which is designed to be easily and quickly modified by programmers. Joomla & Drupal are easily modified and customized by end users, but for that functionality to be available the programmer has to jump through a LOT of hoops. In my experience they present too many options for many clients who would prefer to have as much as possible taken care of for them by the developer.

Also, the event model exists because it complements javascript well and should make it easier to orchestrate and manage crosstalk between the languages. The first task ahead of this framework is a fairly sophisticated web application on a corporate intranet where javascript will be required and very heavily used.

The JS side doesn’t get much chatter because it’s based on prototype.js and to be honest there’s not much there needing changes.

The database holds a page and events map. Core builds a master map from the database or loads it from file cache if in production.

The URL string is exploded on ‘/’ after redundant ‘//’ have been filtered to prevent URL weirdness like ‘/some//weird///path’ - this matches how UNIX behaves when it encounters such files.
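
That filtering step could look something like this. splitPath() is a made-up name, and trimming the leading and trailing slash is my assumption about how empty segments are avoided.

```php
<?php
function splitPath(string $path): array {
    // Collapse runs of slashes, as a UNIX filesystem would.
    $clean = preg_replace('#/+#', '/', $path);
    // Drop the leading and trailing slash so explode() yields no empty segments.
    $clean = trim($clean, '/');
    return $clean === '' ? [] : explode('/', $clean);
}

print_r(splitPath('/some//weird///path'));
```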

The last element of the path is checked for an extension. If it has one it becomes the “event”. The core then compares the path hierarchy against the exploded path and matches as much as it can.

The page map in the database is a tree, so pages can contain each other. Also, page properties descend to their children. The most critical of these is the class of the page - the system allows you to choose which class handles the page so long as it extends from Page. Eventually blocks will be worked in as well. If the path doesn’t check out the remainder path is stored in a variable and handed off to the last found page. The event is then invoked for the page.
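
Here is how I read that matching process, as a sketch. The nested-array tree, resolvePage() and the page names are all hypothetical, and I am assuming the whole last segment (extension included) is kept as the event name.

```php
<?php
// Hypothetical page tree: nested arrays keyed by path segment.
$pages = [
    'class' => 'Page',
    'children' => [
        'news' => [
            'class' => 'NewsPage',
            'children' => [
                'archive' => ['children' => []],
            ],
        ],
    ],
];

function resolvePage(array $tree, array $segments): array {
    $event = null;
    $last = end($segments);
    if ($last !== false && strpos($last, '.') !== false) {
        // A final segment with an extension names the event.
        $event = array_pop($segments);
    }

    $node = $tree;
    $class = $tree['class'] ?? 'Page';
    $matched = [];
    // Walk down the tree for as long as the next segment is a known child page.
    while ($segments && isset($node['children'][$segments[0]])) {
        $node = $node['children'][$segments[0]];
        $class = $node['class'] ?? $class; // children inherit the handling class
        $matched[] = array_shift($segments);
    }

    return [
        'page' => '/' . implode('/', $matched),
        'class' => $class,
        'event' => $event,
        'remainder' => $segments, // left for the event to interpret
    ];
}

print_r(resolvePage($pages, ['news', 'archive', '2009', 'list.html']));
```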

The Page class validates the event against the events allowed for its context. If it checks out then the event sets up.

It is the event’s responsibility to handle the remaining path that core couldn’t find a match for. If it fails it should throw FileNotFoundException, which Page will catch. Event parsing is done in a try/catch block for this reason: if Page catches the exception, it starts the reserved ExceptionEvent and gives it the exception object and the active event object to formulate the error response.

Events map back to the database, but not every event necessarily has its own object. For example, the IndexEvent is pretty much going to be the same each time it’s fired. The entry for the event may specify different models and templates to pull, but the actual process is fairly static. If there is something different about the process for a particular event then the programmer can freely create and assign a child class.

So multiple pages will call the same event object to handle a given event, but that object gets contextual information from the database based on the page, which can vastly alter the output. Events are on one table in the database, Pages on another, and there is a cross-reference table of event ids to page ids that maps what events are allowed on what pages and what overrides are used for that iteration of the event (a different template, or even a different event handling class from normal).

In direct answer to your question: since events are contextual, you wouldn’t have a GET “/event/7371.json”. Given how the parser works, that path maps to the event page and its IndexEvent, and that index event would need code to figure out what “7371.json” means. Or, if there is no event page, the root page of the system would ask its IndexEvent what “event/7371.json” means.

It is possible to write lookup code and have it fire another event since events can be chained, but to save time and processor overhead access permissions are only checked against the first event - and it is presumed that the user has access to all subsequently called events.

Also, it is only the initial event that is checked for GET or POST matching. For example, the SaveEvent will on success return the user to a new page which is built using the IndexEvent.

Finally, events do not have to be registered in the database, but if they aren’t the outside world (including javascript) cannot reach them. At the moment ExceptionEvent is the only event that is like this since it handles exceptions. It is also pertinent to note that the final exception handler in Core invokes this event even though it isn’t a page.

I’m also debating whether to have model raised events. Summaries call out as a major candidate to wrap around select queries that have a particular data view and possibly as a comfortable place to hold the pagination state of the summary.

I’m always curious to hear how others feel some frameworks missed the mark in terms of AJAX?

Most frameworks that claim to simplify AJAX do indeed make it easy, but the implementation/design is very hackish.

Using any framework that followed MVC, I “could” simply:

  1. Implement an action that generated a full display (master template, contents, etc)

  2. Implement a second ajax action that generated only the portion which I wish to make dynamic via AJAX

The alternative approach would be to check conditionally (inside a single action) whether a request was AJAX, avoid rendering the master templates, etc., and just return the contents.

This is assuming you render the master template per action, or perhaps have a base controller class that does all this for you and simply asks for the content specifics from each individual action (a la template method pattern).
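
For reference, the single-action approach usually boils down to something like the sketch below. The function names and the bare-bones master template are placeholders; the X-Requested-With header is the one Prototype.js sends with its Ajax requests.

```php
<?php
function isAjaxRequest(array $server): bool {
    // Prototype.js sends X-Requested-With: XMLHttpRequest with its Ajax calls.
    return strtolower($server['HTTP_X_REQUESTED_WITH'] ?? '') === 'xmlhttprequest';
}

function renderAction(array $server, string $content): string {
    if (isAjaxRequest($server)) {
        // AJAX callers only want the fragment.
        return $content;
    }
    // Everyone else gets the fragment wrapped in the master template.
    return "<html><body>\n" . $content . "\n</body></html>";
}

echo renderAction(['HTTP_X_REQUESTED_WITH' => 'XMLHttpRequest'], '<p>News</p>'), "\n";
echo renderAction([], '<p>News</p>'), "\n";
```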

So how exactly does your framework optimize AJAX handling?

Cheers,
Alex

I for one will be interested in taking it out for a test drive. I have also revamped some of my stuff to better support ajax/js but it looks like you have formalized some things that I just hacked together.

You might also want to glance through the latest PHP Doctrine 2.0 documentation. They have what seems to be a nice database access layer that looks like it would be a good match for your collections and table relations stuff. The full blown doctrine with all its fancy entity classes runs on top, so you can use the access layer without the full product.

Memcache is a great option for sessions over a load balancer.

Out of interest, how is your user-authentication for specific pages going to work?

Until there are 10,000 concurrent users.

I’ve been burned in the past by taking the db for granted. Accessing the db is about the slowest thing the application can do.

Why would authentication require the database after the initial log in? Can’t you just use sessions for that?

If the user is authenticated there will be db. No way around that. When I say 10,000 concurrent users I’m referring to guests, bots, crawlers and the like. Ideally the pages should not necessarily need the db to fulfill these requests. By taking that approach resources are freed for the authenticated users.

I think I’ve finalized the starting process. Been a headache - I didn’t realize how many dependencies, especially hidden dependencies, I had in the old framework until I started a concerted effort to make sure none of them carried over.

This function is the result - main. It is the only function of the framework; everything else is in a class, and all of those classes can be extended to customize their behavior in the future.

The structure of the function gives some insight into the structure of the framework.



/**
 * The main function of Gazelle is a bootstrapper. Its job is to get the framework up
 * and running while doing as little as possible. It has the distinction of being the
 * only function in Gazelle - the rest of the framework is composed of classes.
 * 
 * @param $path Filepath of the calling landing file.
 * @param $projectNamespace Namespace of the calling landing file, which should be the project.
 */
function main( $path, $projectNamespace ) {
	define ('START', microtime(true));

	$config = parse_ini_file($path.'/../config.ini', true);
		
	if ($config === false) {
		die('Parse error in configuration file. Gazelle cannot start.');
	}
		
	// TODO cache code here.

	require($config['loadSystem']['factory']['path']);
	require($config['loadSystem']['loader']['path']);
	
	$loader = $config['loadSystem']['factory']['name']::getLoader(
		'Gazelle', realpath($path.'/'.$config['paths']['framework']),
		$projectNamespace, realpath($path.'/'.$config['paths']['project'])
	);
	
	if ( !$loader->find($config['services']['ErrorHandler'], false )) {
		throw new FatalException('An Error Handler class must be defined in the configuration.');
	}	

	// $context is optional here: PHP 8 no longer passes it to error handlers.
	set_error_handler( function( $code, $message, $file, $line, $context = null ) use ( $config ) {
		$config['services']['ErrorHandler']::setError($code, $message, $file, $line, $context);
	});
		
	set_exception_handler( function( \Exception $e ) use ( $config ){
		$config['services']['ErrorHandler']::exception( $e );
	});
		
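	if (isset($config['services']['Debug'])) {
		$debugger = new $config['services']['Debug']();
	} else {
		// No debugger configured: define the variable so the registry call below is valid.
		$debugger = null;
		assert_options(ASSERT_ACTIVE, false);
	}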
	
	$registry = new $config['services']['Registry']( 
		$config, 
		$loader,
		realpath($path.'/'.$config['paths']['framework']), 
		realpath($path.'/'.$config['paths']['project']),
		$projectNamespace,
		$debugger 
	);

	$dispatcher = new $config['services']['EventDispatcher']( $registry );
	$dispatcher->parseRequest();
}

The only code that comes before main is 2 lonely lines in the landing.php file.


require(dirname(dirname(dirname(__FILE__))).'/main.php');
Gazelle\main( dirname(__FILE__), __NAMESPACE__ );

I think main is done but if anyone sees potential structure problems from this let me know.

Drag & Drop functionality itself (and the whole of the scriptaculous library) is part of the core of the site’s files. If a site is going to use it then it probably should be part of the load order.

Each page can add its own on load library but usually does not. As importantly, so can each module (a page may have a number of modules attached) and modules usually do for their events.

Event callbacks that would require the server to harvest information anyway are embedded into the PHP. They aren’t hosted in the js file. In your drag ‘n’ drop example, the callback when the file gets dropped on the trashbin would be in the PHP controller as part of its response.

So much storm and fury to suit the less than 5% of the users out there who actually do that. Talk about getting priorities wrong.

Don’t get me wrong, I like Javascript, I like ajax interactions, when done right, they make for a snappy web application… but the JS belongs in your web directory, just like images and CSS and whatnot, and has nothing to do with the server side access of your application.

No, that’s wrong. Javascript is controlling code. It executes client side but it is a part of the application and best treated as such. This framework is being written to host javascript/PHP solutions that are complex enough to be unworkable if javascript isn’t available. That has long been a limit of AJAX as a tack on - if it can’t be done without it we don’t do it. That doesn’t mean writing to exclude non-javascript clients intentionally though it might sound that way. Instead they are no longer considered “baseline” and will degrade gracefully if used.

The .htdocs directory in this framework starts empty except for one file, landing.php, which is the rewrite target for Apache (by default an .http.conf include file is used, not an .htaccess file, because it’s faster that way). If the system determines its output can be cached it will write the results into the .htdocs directory to the web folder. So, eventually, the javascript files do end up in .htdocs. Along the way in production they are stripped of their comment text and almost all of their whitespace (full minifying is buggy with the prototype.js framework).

Also, unlike EVERY PHP framework in existence, this one is designed to exist as a central library on the server that services multiple projects. I do NOT want separate copies of the javascript in each of those projects’ htdocs directories because it vastly increases the chances of getting incompatibilities between the javascripts of different projects.

That said, the framework doesn’t outright fail if javascript isn’t present. Javascript events are unobtrusive and not attached using onClick and the like; they are instead attached using Event.observe(window, 'load'). The framework is being built so that the project coder doesn’t have to care whether javascript or the browser initiated the request.

That does require a level of complexity in the framework that simply tacking on AJAX as an afterthought doesn’t involve, but the abstraction of creating events without having to constantly write extra code for AJAX tack ons will be worth it.

Ok then perhaps I’m not getting it :stuck_out_tongue: are you saying the response to each event is stored on the server?

If so, say I wanted drag and drop functionality on some elements.

  1. If you’re not using on* events and you’re not doing it in a .js file, where are you binding the events?
  2. Are you saying that when the drag starts, it fires an event to the server to find out what it should do next? That seems like it would be really slow for the end user… and doesn’t that remove any kind of caching for these events?

I see where you’re going, here’s where you’re both mistaken. The site makes heavy use of the fact that when the prototype.js script receives a response with the header text/javascript it evaluates it immediately.

Hence the callbacks of many functions are embedded in their respective php controller and event files, not in the js files. The reason for this is to simply make life easier - having related code in the same file helps immensely. Also having the javascript callbacks be contained within wrapper PHP functions to deliver them means they can become subject to the same inheritance pattern as the PHP class files. :smiley:
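
To show what I mean by a callback living in the PHP event file, here is a sketch. SketchJsResponder and TrashDropSketchEvent are made-up names and the element id scheme is an assumption; the generated script uses Prototype’s $() and Element#remove.

```php
<?php
// Hypothetical responder that ships a JavaScript callback as the response body.
class SketchJsResponder {
    private $script;

    public function __construct(string $script) { $this->script = $script; }

    public function headers(): array {
        // This content type is what makes Prototype evaluate the body on arrival.
        return ['Content-Type: text/javascript; charset=utf-8'];
    }

    public function body(): string { return $this->script; }
}

class TrashDropSketchEvent {
    public function fire(int $fileId): SketchJsResponder {
        // ... delete the file on the server here ...
        // json_encode keeps the value safe to embed in the generated script.
        $id = json_encode('file-' . $fileId);
        return new SketchJsResponder("\$({$id}).remove();");
    }
}

$responder = (new TrashDropSketchEvent())->fire(42);
echo implode("\n", $responder->headers()), "\n\n", $responder->body(), "\n";
```

Because the callback is produced by a PHP method, a child class can override fire() and change the javascript that goes out.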

The core javascript, the stuff I was referring to when I mentioned the loadOrder.ini, is the stuff that will be used repeatedly. Prototype.js, scriptaculous and a project of mine for Web Forms 2.0 emulation for the moment. About 32K of stuff compressed.