PHP daemon vs cron job

Dear Tango,
The GPS data comes in via a Java listener, so I am only left with a few options. Let me review and test them and then share the results with you guys.

Can’t you pre-process the data at the java listener before you insert it into the database?

Dear Scallio,
I have tried that, but it is too complex: there are too many things to process, so even receiving the data slows down and we miss a lot of it. What options do you have?

You could receive the data in Java in one thread and have another thread (or multiple threads) pre-process it. Once the data is pre-processed, the pre-processor(s) could hand it off to the process that inserts it into the database. That way you have a clear separation of concerns and you're keeping all the work in one place (Java) instead of spreading it over two (Java and PHP).
Besides, IMHO Java is a lot better suited to pre-processing like that than PHP is.

So basically
receive (1 thread) ===> process (1 or more threads) ===> insert in database (1 thread)
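A minimal sketch of that pipeline in Java, using bounded `BlockingQueue`s to hand data between the stages. Everything here is made up for illustration: the records are plain strings, the "pre-processing" is just an upper-casing placeholder, and the "insert" stage only collects the records instead of talking to a real database.

```java
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CopyOnWriteArrayList;

public class GpsPipeline {
    private static final String POISON = "__END__"; // sentinel that shuts a stage down

    // Runs the three-stage pipeline over a fixed batch of fake records and
    // returns what the "insert" stage collected.
    static List<String> runPipeline(List<String> rawRecords) throws InterruptedException {
        // Bounded queues give back-pressure: a slow stage blocks the faster
        // one instead of letting memory grow without limit.
        BlockingQueue<String> rawQueue = new ArrayBlockingQueue<>(100);
        BlockingQueue<String> processedQueue = new ArrayBlockingQueue<>(100);
        List<String> inserted = new CopyOnWriteArrayList<>();

        // Stage 2: pre-processing (stand-in for POI / geofence / alert logic).
        Thread processor = new Thread(() -> {
            try {
                String rec;
                while (!(rec = rawQueue.take()).equals(POISON)) {
                    processedQueue.put(rec.toUpperCase());
                }
                processedQueue.put(POISON); // pass the shutdown signal along
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        // Stage 3: "insert into the database" (here it just collects records).
        Thread inserter = new Thread(() -> {
            try {
                String rec;
                while (!(rec = processedQueue.take()).equals(POISON)) {
                    inserted.add(rec); // stand-in for the real INSERT
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        processor.start();
        inserter.start();

        // Stage 1: the receiver; here it just feeds in the given records.
        for (String raw : rawRecords) {
            rawQueue.put(raw);
        }
        rawQueue.put(POISON);

        processor.join();
        inserter.join();
        return inserted;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(runPipeline(List.of("gps-a", "gps-b", "gps-c")));
        // prints [GPS-A, GPS-B, GPS-C]
    }
}
```

To add more pre-processor threads you would start several `processor` threads and only forward the poison pill to the inserter once all of them have finished; that detail is left out here to keep the sketch short.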

Dear Scallio,
The problem is that there are many devices. On top of that there are too many things to process, e.g. POIs, geofences, alerts, so I guess all this will take up time. We have broken it into functions in Java and tried it, but the response was too slow. The problem is that the processing time will grow and delay inserting the data into the database.

Hence the multiple processing threads :slight_smile:

I won’t go so far as to state this with 100% certainty, but I’m fairly certain that if Java can’t handle it, PHP definitely can’t handle it.

Dear Scallio,
Here we don’t mind the delay, as this will only be used for reporting purposes; the other one is used for live data. That is why I don’t mind a delay in this part. Do you have any knowledge of multiple threads in Java?

The problem you face, though, is that if you allow a queue to build up within PHP, then MySQL (or any other database) will also start to become locked for pulling data, because there will be an exponential rise in the amount of data waiting to go into it.

Yup, it gets messy real fast :smiley:
It’s doable as long as you don’t need concurrency; that gets pretty nasty. However, explaining how concurrency works is too much for a forum post (and TBH my Java is starting to get a bit rusty), so if you’re interested I suggest you google it and start a new thread in the Java forum if anything isn’t clear.

When is the data coming in? Is it all the time, 24/7?
If there are breaks in the incoming data (like say there is no data at night), you might want to create the reports then. If that’s not possible and you do run everything at the same time you’ll still clog up the database like tangoforce just said.
It sounds like you’re handling a whole lot of data here, so if I were you I’d really sit down and think about how to handle it, and what to handle when. This is not something you can just try and hope for the best.

I’m sure that someone else has spotted that doing something and then sleeping for one second is not the same as executing every second. You do need cron for that, and something like a semaphore to prevent overlap.

Also, keep in mind that *nix is NOT a real-time system. When I wrote real-time systems for the telephone company and laboratory automation systems for the pharmaceutical industry, we used true real-time systems (HP and DEC at the time). The first thing the process did was re-queue itself and then do the work; cron is close enough to that for what you want to do.

Daft question, but what’s the definition of a real-time system? I always thought that all OSes work in real time? You’ve got me curious now!

When going down the cron-route I’d rather use lockfiles with locking than just checking for their existence.

$lockFile = dirname(__FILE__).'/cron.lock';

// Open (or create) the lock file.
if (FALSE === ($lockFH = @fopen($lockFile, 'w'))) {
      exit; // Cannot open the lock file at all.
}

// Try to take an exclusive lock without blocking; if another
// instance already holds it, bail out immediately.
if (TRUE !== @flock($lockFH, LOCK_EX | LOCK_NB)) {
      fclose($lockFH);
      exit;
}

// Do fancy things...

@flock($lockFH, LOCK_UN);
@fclose($lockFH);
@unlink($lockFile);

This way an existing file (e.g. one left behind by a cancelled run) does no harm, as it isn’t locked anymore.
While the script is running it holds the lock, so no other process (ok, not exactly, but in PHP terms) can get the lock and thus exits early.
I’ve been using this method for years now, with transactions running for more than a minute on a one-minute cron, and have had no problems so far; everything stays in sync.

Dear Scallio,
The problem is that the data is coming in 24/7; there is no break. You are right that my reads will lock up my database, and then there will be a whole long queue waiting to insert while I keep reading. I am also pretty lost. First I thought of processing it on a 5-minute basis, but come to think of it, that has the same problem, because I will do a read and lock the DB. So it looks like I have to find other alternatives.

The only other alternative you have is to have different servers receiving information from different GPS devices. Realistically, this might be your best option, because it will break down the workload as you’ll have multiple machines handling their own devices.

As long as they are all accessible by your main system (for pulling data from them) you’ll theoretically be ok.

Dear Tango,
How will that help? Because at the end of the day I still have only a single database to read from, right? It will still lock, I guess.

No, you are not understanding me correctly.

If you have multiple machines, each one can process its data and then store it in a database. You then need to become creative about how you pull data from your systems.

Say you have 50 GPS devices and 10 servers. Each server handles the data for 5 GPS devices. You want to search for data from two devices which are on different machines. Simple: you send a query to them all to select data where GPS-device=‘<device_number>’.

You’ll then get the data back from two different machines to your main system where you can analyse it on screen.

I’ll admit it could be a lot of work and very complex, but it could save you if you truly have so much data coming in that one machine can’t handle it.

If you have that much data, then you need to delegate the work to slave systems to handle it. You’ll need a master system that can then interrogate them for the required information which can then be used for your purposes.
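The master/slave fan-out described above could be sketched like this in Java. The per-server databases are stood in for by in-memory maps so the merge logic can be shown on its own; in reality each one would be a separate MySQL instance receiving the same query, and all names here are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

public class FanOutQuery {
    // Each "server" holds the data for its own set of GPS devices.
    // Stand-ins for the real slave databases.
    static final List<Map<String, List<String>>> SERVERS = List.of(
            Map.of("device-1", List.of("a1", "a2")),
            Map.of("device-2", List.of("b1")),
            Map.of("device-3", List.of("c1", "c2")));

    // The master sends the same "SELECT ... WHERE gps_device = ?" to every
    // server and concatenates whatever comes back.
    static List<String> queryAll(String deviceId) {
        List<String> merged = new ArrayList<>();
        for (Map<String, List<String>> server : SERVERS) {
            merged.addAll(server.getOrDefault(deviceId, List.of()));
        }
        return merged;
    }

    public static void main(String[] args) {
        System.out.println(queryAll("device-3")); // [c1, c2]
    }
}
```

Since each device lives on exactly one server, most of the fan-out queries return nothing and only one server contributes rows; querying several devices at once simply merges the per-server results on the master.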

Dear Tango,
So basically you are asking me to load balance at the receiving end itself, right? And your advice is not to use PHP or cron because of the table locking problem, right?

Dear Tango,
So in what situation will the PHP script you suggested be best applicable?

MySQL with the InnoDB storage engine does not lock the whole table; it supports row-level locking.
Could you please specify in detail what data is received and how?

There are plenty of options for handling large amounts of data, but I don’t think a “Newbie” (no offence! I’m referring to your nick “newtomysql”) will be able to handle them, so the best bet would be to get someone who knows what they’re doing.