Displaying Records In Order of Hits

Hello,

I am attempting to display five records from a database, but in order by hits.
Each record has an assigned “hit counter”, and I have a query that is supposed to display them.

Here’s the code:

<?php

$like_1 = $link->query("SELECT * FROM likes WHERE hits > 8 ORDER BY hits DESC");

while ($like_2 = $like_1->fetch_object()) {

    echo "<a href=\"like.php?like={$like_2->id}\">{$like_2->content} | {$like_2->hits}</a><br>\n";

}

?>

The table is composed of five columns: content, hits, lip, id, and ltime.
Content is a short string of text, hits is how many people have seen it, id is the record's id number, and lip and ltime can be ignored here.

I want it to display them in order from the highest to the lowest (according to hits), but it isn’t working!
Any ideas?

Thanks,
Eric

Doing it this way can cause some performance problems because each time somebody accesses the page, or file or whatever, you need to perform an UPDATE on that record to increase the hit count.

You’d be better off creating a separate table for the hits and just inserting a new record every time somebody accesses the page, or whatever it is you’re counting. All you’d need is a PK field for the record and an INT field for the ID of the item you’re counting.

Then you just perform a query on the hits table and count the number of records that contain the item’s id.
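A minimal sketch of that counting query, assuming the hits table has a `likes_id` column pointing back at the item (column names are illustrative):

```sql
-- Count the hits recorded for one item (here, the item with id 42)
SELECT COUNT(*) AS hit_count
FROM hits
WHERE likes_id = 42;
```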

Good idea. I’ll try that and see what happens. :slight_smile:

Planning ahead it might be a good idea to add a date_added field to this hits table too, if you’re curious when each hit occurred. :slight_smile:
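If the hits table already exists, something along these lines would retrofit that column (assuming MySQL; the TIMESTAMP default fills in the insert time automatically):

```sql
-- Record when each hit happened; defaults to the time of the INSERT
ALTER TABLE hits
  ADD COLUMN date_added TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP;
```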

Yes, I have the time for that.

Having a separate table for the hits, how would I cross-reference the two tables to display information from the likes table, according to the hits from the hits table? As in display all the records from the likes table whose id matches that of the hits table whose hits are higher than 10?

Assuming these are your tables

CREATE TABLE `hits` (
  `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `likes_id` int(11) NOT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB;

CREATE TABLE `likes` (
  `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `text` varchar(10) NOT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB;


SELECT likes_id
FROM hits
GROUP BY likes_id
HAVING COUNT(*) > 10
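To actually pull the matching rows out of the likes table, rather than just their ids, one way (a sketch against the two tables above) is to join and aggregate in a single query:

```sql
-- Show each like's text alongside its hit count,
-- keeping only items with more than 10 recorded hits
SELECT l.id, l.text, COUNT(h.id) AS hit_count
FROM likes l
INNER JOIN hits h ON h.likes_id = l.id
GROUP BY l.id, l.text
HAVING COUNT(h.id) > 10
ORDER BY hit_count DESC;
```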

:eek:

You criticize the overhead of an UPDATE, and then suggest burning away hard disk space by making a monstrous table of nothingness? Mein gott man… Maybe it’s because I’m used to 100K page views a day, but that has to be the most absurd suggestion I’ve ever heard for tracking something as simple as hits. Don’t even want to THINK about the seek time on that after a week. Hello IOWAIT.

I’d take “UPDATE likes SET hits=hits+1 WHERE id=?” over what you just suggested ANY DAY on disk space alone… much less memory use when he tries to pull a total.

As to the OP’s question, your logic looks ok, though I advise against using fetch_object as it’s excess overhead for no good reason (fetching the row as an array is faster), and the double quotes make the code needlessly complex.


$results=$link->query('
	SELECT * FROM likes
	WHERE hits > 8
	ORDER BY hits DESC
'); 

while ($data = $results->fetch_assoc()) { 
	echo '
		<a href="like.php?like=',$data['id'],'">
			',$data['content'],' | ',$data['hits'],'
		</a><br />';
}

Though could you be more specific on what’s going wrong? Is it outputting nothing? Is it displaying them out of order? Echo out mysqli::$error after the query; what’s it saying?

I find that recording statistics is best done in a DBMS that supports fire-and-forget writes (i.e. send the update but don’t wait for an answer, which costs less time), like MongoDB. Since most hosts don’t offer that (yet), MySQL is probably still the way to go. With MySQL I agree with DS60 above, but I’d like to add that if you use MyISAM tables, you’re probably better off using UPDATE LOW_PRIORITY for queries like these, to prevent the updates from clogging up the entire table (and MySQL with it, if you’re very unlucky).
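For what it’s worth, the LOW_PRIORITY modifier just slots in before the table name; it only has an effect on engines with table-level locking, such as MyISAM:

```sql
-- Let pending SELECTs finish before this write takes the table lock
UPDATE LOW_PRIORITY likes
SET hits = hits + 1
WHERE id = 42;
```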

Thanks for the help everyone. I decided to go with keeping it all in one table as I had thought originally to do. I also edited some stuff and the ordering works properly now.

man, that’s needlessly harsh

storing raw data is rarely a bad decision

if you ever wanted to track hits by hour of day, or by customer status, or by geolocation, or by anything other than the simplistic SET hits=hits+1, you will be glad you stored the raw data
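for example, with raw rows and an assumed date_added timestamp column, a by-hour breakdown is a one-liner:

```sql
-- Hits per hour of day for one item, assuming a date_added column
SELECT HOUR(date_added) AS hr, COUNT(*) AS hits
FROM hits
WHERE likes_id = 42
GROUP BY HOUR(date_added)
ORDER BY hr;
```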

it might’ve been an absurd idea back in the day when a gigabyte was a million dollars, but you can get a terabyte for seventy bucks today

deathshadow60,

I have an API script that is accessed over 300,000 times per day and this is the way that I store my stats. That stats table currently has over 10 million records in it and it takes mere seconds to find what I’m looking for with a simple SELECT query.

I used to keep my stats in the manner that you suggest and it placed a tremendous load on the server. It was also MyISAM, which didn’t help. So I had to switch to InnoDB and a separate table, which has worked out incredibly well, not only in speed but in flexibility.

When my stats table gets too big I just back it up, TRUNCATE it and then start counting again. There’s no need for me to keep two-year-old stats in a live database.

This also gives me the flexibility not only to store hits, but to timestamp them, record the visitor’s IP, the ID of the developer who sent the user to the script, etc., and store all of that along with the hit. You can’t do that with what you’re advocating, other than by overwriting the existing timestamp, and cfpmedia.com mentioned he was interested in timestamping them.
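A hits table along those lines might look something like this (a sketch; the column names are illustrative, not the poster’s actual schema):

```sql
CREATE TABLE `hits` (
  `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `likes_id` int(11) NOT NULL,
  `ip` varchar(45) NOT NULL,             -- 45 chars is enough for IPv6
  `developer_id` int(11) DEFAULT NULL,   -- who referred the user to the script
  `date_added` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
  PRIMARY KEY (`id`),
  KEY `likes_id` (`likes_id`)            -- index the FK for fast per-item counts
) ENGINE=InnoDB;
```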

That’s 4 megs of logging a DAY (assuming two 32 bit integers, with the associated overhead of table indexing effectively doubling that) – and that’s before taking file-system blocks into account (a read/write operation that should take just as long as just incrementing the appropriate counter). That’s IOWAIT hell compared to a simple read/write, especially when it comes time to show that result. After 30 years of doing databases, the mere notion of “Just dump it to the end of a record” on something like that… OUCH?

If performing a simple update operation on ONE record was taking any more time than flooding a log file, there was something HORRENDOUSLY wrong with how you were handling that… especially given how badly ALL database systems slow down the more records a table has.

That I could see – but at that point why not just open up your apache log files?

Even with wanting that information I’d probably try to index it by a mix of IP address and browser ID string, just to keep the database size down to something that isn’t going to result in that pesky 99% IOWAIT…

But then that’s my opinion of things like hit counters in the first place – alongside other idiocies like google analytics; waste of code and time for something the server should already either be logging, or on the whole is a waste of time to bother keeping track of.

Of course, I’m overly paranoid about wasteful IOWAIT – my specialty being taking sites choking out quad Xeon’s and running them cleanly on Atom 330’s after neutering the bloated/half-assed coding practices that seem so common these days. 99% of the time people go to throw more CPU or RAM at a problem, it’s often IOWAIT that goes overlooked – when it’s the biggest bottleneck in a modern system.

Goes hand in hand with my disgust that almost 20 years ago I could have 500 people network booting off a single 386/33 running Netware 3.12 sharing a two-billion-record financial database slamming the server with continuous requests all day without a hitch – while today a crappy little 2000-user/million-post forum where no more than 150 of them are online at the same time can drag a Quad Core Xeon with 8 gigs of RAM to its knees… progress… RIGHT.

Because mine isn’t a hit counter, it’s a web service that needs to keep track of all sorts of user and developer information, such as which user accessed the script and what desktop program they were using when they accessed it. This site has over 11,000 users and needs to track their stats. They depend on it.

The reason I suggested what I did to cfpmedia was because he may have wanted to store extra information at some point, and he would have been stuck keeping it all in the same record. There is zero room for growth in doing it that way.

That’s his call and he knows what’s right for his system. I wanted to bring it to his attention. If he knows he will never need to store anything other than the hit, then he can do it in a single record. However, if he wants some room for growth then the smart money is on another table. I mean, it is a relational database. :wink:

You can always gain performance by adding servers, but if you’re stuck with no room for growth in your database design then you have bigger problems. It’s best to plan ahead.