Object Persistence in PHP?

Umm…

> This argument only holds water if we are talking about an application on a single server.

:agree:

That matches what I’ve found myself, but it’s the only thing in this entire thread that I can agree with you on. This, however, I have to disagree with you on…

> but allowing it in the platform as an option would be nice and would give developers
> more options when their applications have performance problems.

That, I believe, wouldn’t particularly help with performance issues, especially in regard to PHP, even if it’s just an option. The huge benefit of PHP is its architecture, something the language has built on and prospered from.

It’s due to this architecture that the only real option to scale with PHP is from the hardware side of things. That is something I believe in firmly - software is only ever capable of so much, so yes… throwing more hardware at it is my answer.

Trying to gain 0.1% more performance from your horrendously under-pressure application with a cache of persistable data is bordering on lunacy; you have to remember that all data dies at some point in the application’s life.

You therefore have to maintain that cache, regardless of how it’s persisted; that maintenance is a resource hog in and of itself. It’s that resource hogging that, at the moment, PHP doesn’t have to deal with - thank God!!

Really…

Maybe I’ve got the wrong end of the stick here, but I don’t see the advantage of storing objects in memory. The amount of time it takes to load even a complex page is minute compared to the average time a user spends viewing it. Doesn’t this mean a lot of wasted memory, where information sits doing nothing but waiting for the user to click on the next page? Basically, doesn’t this just swap the hardware cost from buying more/better CPUs to more/better RAM?

You still have to find it before you have a reference to it. The HTTP request is just a stream of text. Normally, this will contain some sort of unique identifier for an object (presumably embedded in the URL). Translating such an identifier to an object is what databases excel at.

The only alternative to using the request to identify your object, is to maintain application state at the server …

I don’t see why one would put static data in mutable storage in the first place. Configuration information is metadata. It’s best expressed through code - not configuration files. And certainly not configuration files persisted in a database.

> That, I believe, wouldn’t particularly help with performance issues, especially in regard to PHP, even if it’s just an option.

You’re free to think that, but I’m more interested in the reason why you think that. It’s unclear why it helps so many other platforms, but wouldn’t help PHP.

> Trying to gain 0.1% more performance from your horrendously under-pressure application with a cache of persistable data is bordering on lunacy

Yeah, that is lunacy; fortunately, the performance boost received from local caches is dramatically greater than 0.1%. For example, I work on a fairly busy e-commerce site written in Java. If you turn the cache off, server loads become high and response times degrade on average by about 40%. Turn it on and the application works great with mild server loads.

> That maintenance is a resource hog in and of itself; it’s that resource hogging that, at the moment, PHP doesn’t have to deal with - thank God!!

I’m not sure if you are talking about the increase in memory usage when caching, or about development time, so I’ll respond to both. 1.) PHP hogs different resources - CPU cycles and IO. These are harder to deal with than memory, which is cheap to add. 2.) Adding caching to an application isn’t hard. Usually you are already going to be using some sort of ORM solution (even if it’s just a simple DAO); from there, your cache is simply a hash between the object and the key(s) you used to grab it in the first place. It can be as simple as:


class ProductDAO {

    private $cache = array();

    public function getProduct($id) {
        if (!isset($this->cache[$id])) {
            // loadProduct() does the actual database fetch
            $this->cache[$id] = $this->loadProduct($id);
        }
        return $this->cache[$id];
    }
}
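To make that idea concrete, here is a self-contained version of the same identity-map sketch, with the database fetch stubbed out so it can run anywhere (the class and property names are mine; a real DAO would query the database in loadProduct()):

```php
<?php
// Self-contained sketch: the second lookup for the same id is served
// from the in-memory cache instead of hitting the "database" again.
class CachingProductDAO {
    private $cache = array();
    public $loads = 0; // counts "database" hits, for demonstration only

    public function getProduct($id) {
        if (!isset($this->cache[$id])) {
            $this->cache[$id] = $this->loadProduct($id);
        }
        return $this->cache[$id];
    }

    private function loadProduct($id) {
        $this->loads++; // pretend this was an expensive query
        return (object) array('id' => $id, 'name' => "Product $id");
    }
}

$dao = new CachingProductDAO();
$a = $dao->getProduct(42);
$b = $dao->getProduct(42); // second call is served from the cache
echo $dao->loads;          // 1
```

Note that this caches per request only; the point under debate is whether such a cache should be able to persist *across* requests, as it does on other platforms.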

> It’s best expressed through code - not configuration files. And certainly not configuration files persisted in a database.

There are a few issues here. 1.) Expressing it through code is not fundamentally different from expressing it in config files. For PHP I think it’s truly a matter of preference, but in compiled languages config files are superior, since they can be changed without re-compiling the source. 2.) For many apps, configuration information needs to be mutable so that end users can change it. 3.) Putting configuration details in code doesn’t change the fact that you have to load the configuration details on each request, and often have to include configuration details that are not required by the request.
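For what it’s worth, a minimal sketch of “configuration expressed as code” as discussed in point 1 (all names here are made up for illustration):

```php
<?php
// Hypothetical "configuration as code": a class of constants needs no
// file-format parsing at request time and is naturally immutable. The
// trade-offs above still apply: end users can't edit it, and it is
// loaded on every request like any other code.
class AppConfig {
    const DB_DSN    = 'mysql:host=localhost;dbname=shop';
    const DB_USER   = 'shop';
    const CACHE_TTL = 300; // seconds
}

echo AppConfig::CACHE_TTL; // 300
```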

> is to maintain application state at the server …

Sure, which is done in the vast majority of applications regardless.

The problem, as I’m sure you’re aware, is that even usability suffers since nowadays many users are running the same application in multiple tabs or windows. In the test management tool I’m currently using, if I have two test specifications open at the same time (because I want to compare the old one to the new one), the existing test spec may be destroyed if I save the wrong one. (What I do to avoid it is open one in Firefox and the other in Opera.) BTW, this is not a PHP app, but it’s the same problem: lots of info is stored in session and the app just can’t handle two current objects of the same kind.

What I do is try to avoid using sessions except for authentication.

Exactly. And PHP is not a compiled language. What makes sense in Java doesn’t necessarily extend to PHP.

In that case, it belongs in the model layer, and then it’s not immutable.

If loading up code becomes a problem, you can apply an opcode cache.

Those applications have a serious problem. I’m not arguing that you’re right though.

Yes, but how would you implement PRG?

Also, I have no idea what metric you are using for “success”.
To add some oil to the fire :wink:

http://www.google.com/trends?q=php%2C+java%2C+.net%2C+ruby&ctab=0&geo=all&date=all

Can you name just one PHP e-commerce application that has been as much of a success as OSCommerce? :smiley:

Stefano

This looks like an interesting question, but I don’t understand what you mean. :confused: What relationship between PRG and sessions are you implying?

It’s quite nifty to store the form data in sessions and redisplay it after the redirect in the correct form fields.

Are you referring to a valid or invalid form submission?

Yes, I should have been clearer; the problem is how to handle an invalid submitted form. You can do validation on the client side to prevent invalid submission in the first place, but then you have to write the same validation code twice (in PHP + JavaScript), and it won’t work for complex rules that depend on other server-side resources.

Both, actually. When the form is invalid, the form data must be redisplayed after the redirect. When the form is valid, you want to show your user a message that the form was valid and that something happened. Sure, you could trigger such a message with a query string, but then someone can trigger that message by browsing their history, and if it’s a message saying that something was deleted, it’s not so good. :slight_smile:
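A rough sketch of that session-backed, self-destructing message (the flash_* function names are mine, not a standard API):

```php
<?php
// Sketch: store a one-shot message (and optionally the submitted form
// data) in the session before redirecting, then consume it exactly once
// on the next request so Reload/Back cannot replay it.
session_start();

function flash_set($message, array $formData = array()) {
    $_SESSION['flash'] = array('message' => $message, 'form' => $formData);
}

function flash_take() {
    if (!isset($_SESSION['flash'])) {
        return null;
    }
    $flash = $_SESSION['flash'];
    unset($_SESSION['flash']); // self-destruct: readable only once
    return $flash;
}

// After a POST: flash_set('Entry saved'); header('Location: /entries');
// On the next GET: $flash = flash_take(); // null on any later reload
```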

My point of view: persisting resource data is crucial once you hit high volume. If we were to cut our local cache of huge composite objects that are frequently used, then basically just about every one of our database servers would buckle.
This has, as already said here, little to do with application state. I would store as little as possible in sessions; basically that means I store authentication state and probably a user id. From there I look data up when needed, and since this data is persisted it’s quick to fetch.

In the situation of PRG you can often make decisions based on POST data or referer data; for more complex needs then yes, sessions are a solution.
But really, the less state you maintain the easier you make it, and the easy route is always what we want, so I would always try to find simple solutions and only get advanced when there really aren’t any other options.

Regarding CPU cycles: has anyone here had a PHP app that actually comes close to using a lot of CPU? IO is the only problem I’ve ever seen with a PHP app. I bet we could run most of our stuff on 10-year-old CPUs, but in terms of memory it needs lots (try to avoid disk IO as much as possible).

IMO this is a bad idea. Most browsers out there now feature tabbed browsing, and power users have certainly already adopted such browsers (FF, IE7, Opera).

You run the risk that power users work in multiple tabs (or just multiple windows, or even frames/iframes) concurrently. They all share the same session, and you will have a race condition on your hands. Your users will start receiving strange confirmations or errors because the tabs will steal “messages” from each other. You run the risk that power users deem your application/site unreliable.

The problem is amplified by the fact that you’ll have to protect the session with an exclusive lock so that only one request accesses the session data at any one time. This means that the next request (if any has been received) is likely already queued up. When you send the redirect to the browser, it will respond with a new request which will land at the end of the queue, behind requests for other tabs, windows, iframes etc., increasing the likelihood that one of these requests will interfere with the logic.

The session should not be used for page to page communication. It is for session state.

I can see I was too quick in starting a new thread; I see now how the two are interrelated. In [URL="http://www.theserverside.com/tt/articles/article.tss?l=RedirectAfterPost"]Michael Jouravlev’s recommended method[/URL], session state is used for page to page communication. A “GUI object” can have a message attached to it. I’ve just implemented messages in session, and discovered that I had to make the message self-destruct after it was used once. So yes, it’s unnatural, but it could still be the best choice.

As for interference between windows or tabs, Jouravlev solves that problem, too, or most of it, by storing objects in session by object ID. That should prevent interference unless you’re editing the same object twice, which is a problem in any case. On the other hand, this seems to imply that you need to assign an ID to an object that hasn’t been saved to the database yet. That could be tricky.
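As a sketch, keying session state by object ID might look like this (assuming PHP sessions; the function names are mine, and uniqid() is just one possible way to mint a temporary ID for an unsaved object):

```php
<?php
// Sketch of per-object session state: two tabs editing different
// objects write to different slots and no longer clobber each other.
session_start();

function edit_state_put($objectId, array $data) {
    $_SESSION['edit'][$objectId] = $data;
}

function edit_state_get($objectId) {
    return isset($_SESSION['edit'][$objectId])
        ? $_SESSION['edit'][$objectId]
        : null;
}

// An object that hasn't been saved yet has no database ID, so a
// temporary one has to be minted (this is the tricky part noted above):
$tempId = 'new_' . uniqid();

edit_state_put(7, array('title' => 'Old spec'));       // tab 1
edit_state_put($tempId, array('title' => 'New spec')); // tab 2
```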

Tac

One issue with not redirecting is that when the form is invalid, the user could not only hit the Back button but also the Reload button and resubmit the same invalid values. Since the values are invalid, there’s been no manipulation of the underlying data store. I have a vague memory of someone here objecting to the POST data dialog popping up and using redirects to avoid that, but I think it makes more sense to redirect after receiving valid data and storing it.

As for Findus’ message problem, this is very much like placing a security token in a hidden field and comparing its submission to a value stored in a session. (Chris Shiflett has explained this a number of times, but I can’t find a URL right now.) Same issue: you have to destroy that session variable. If the user reloads the form with invalid values, the security token may be regenerated and would then be deemed invalid. So if you want to include that sort of security mechanism, you’re going to have to do this. I think it would also solve the problem of multiple tabs: once the form values have been found valid, the security token is no longer in the session, and the form in the other tab, if resubmitted, won’t validate. (I think… haven’t tested it.)
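A sketch of that single-use token (the function names are mine; md5(uniqid(...)) was a common way to generate such a token on the PHP of this era):

```php
<?php
// Sketch of a single-use form token: issue it into a hidden field,
// then destroy it on the first check so a resubmit from another tab
// (or the back button) no longer matches.
session_start();

function token_issue() {
    $token = md5(uniqid(mt_rand(), true));
    $_SESSION['token'] = $token;
    return $token; // echo into <input type="hidden" name="token" ...>
}

function token_check($submitted) {
    $valid = isset($_SESSION['token']) && $submitted === $_SESSION['token'];
    unset($_SESSION['token']); // single use, valid or not
    return $valid;
}
```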

Summing up what I’ve seen so far, here is “PRG lite”:

  • Always redirect on valid form submission
  • Don’t redirect on invalid form submission.
  • After a valid form submission, instead of displaying a confirmation message, make sure the result of the form submission is apparent from the contents of the result page.
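The three rules above might be sketched like this (validate/save/renderForm are placeholder names, not a real framework API):

```php
<?php
// Hypothetical controller skeleton for the "PRG lite" rules above.
function handlePost(array $post) {
    $errors = validate($post);
    if ($errors) {
        // Rule 2: invalid -> re-render the form in place, with the
        // submitted values and errors. No redirect, no session needed.
        return renderForm($post, $errors);
    }
    save($post);
    // Rule 1: valid -> redirect. Rule 3: the target page itself shows
    // the new entry, so no separate confirmation message is required.
    header('Location: /entries');
    exit;
}

function validate(array $post) {
    return trim($post['title']) === '' ? array('title' => 'Required') : array();
}
function save(array $post) { /* write to the data store */ }
function renderForm(array $values, array $errors) {
    return 'form redisplayed with ' . count($errors) . ' error(s)';
}
```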

This is what I’ve been doing for the past few years. The benefit of this procedure is that it doesn’t require session data. The downside is that the user may be confused by using the reload or back buttons after invalid form submissions. That may be a real problem sometimes (depending on the skill level of the users), but nowhere near as big a problem as getting the ugly re-submit message when hitting reload on the result page.

This is basically the solution I’m using as well. It is a problem for forms without an identity (e.g. creating a new entry), but in my experience a race condition isn’t as likely to happen for these forms. When a user enters a form for creating an entry, they are likely to complete that task before beginning a new, similar one.