Trending algorithm in PHP?

I’m working on a personal project that organizes songs uploaded by the frequency of votes in a certain amount of time, and maybe some sort of base score (probably how many likes something has, or plays). I’ve seen some trending/hot algorithms, but the problem is that they have downvoting, while my system does not. Can anyone give me insight on how to perform something like this in PHP?

This is my idea so far:

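$s = $row['song_stat_plays'];                 // base: one point per play
$s = $s + (2 * $row['song_stat_downloads']);  // downloads count double
$s = $s * (3 * $row['song_votes']);           // scale by votes (zero votes gives zero)
$TrendScore = floor(log(max($s,1)));          // clamp to 1 so log() is defined, then bucket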

Simple answer: Figure out the system first, then figure out how to do it in PHP.
Write down your system in English (or your chosen language; you know what I mean). Make all decisions (even if you change them later, make A decision for now).
Once you’ve defined your rules, it’s basic word-problem solving. Take each step/rule and reduce it to code. When you get to the end, you’ve got your code.

I’m sure writing it down would be great, but it looks to be derived from an actual mathematical equation.

So… figure out your own algorithm? I don’t know what you’re expecting here…

I just wanted to see if someone did something similar and kinda guide me through what they did differently. But eh. I’ll figure it out. thanks.

I’ve not seen nor ever put together such an algo myself, but I’ve been thinking about it.

The first thing I’d do is think about what kind of information I might need.

Number one is a way to store datetimes for each criterion I was interested in, so I could group by week / month / year etc. if I wanted to.

You already mentioned

  • play
  • download
  • vote

I’m guessing you would need song name, and maybe, though it would make things more complex, info such as artist, genre, record label etc.
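As a rough illustration of storing a datetime per play, download, and vote, here is a minimal sketch that keeps one record per event and counts them by week or month. The array layout, the field names (`song_id`, `type`, `created_at`), and the function `countByPeriod` are assumptions made up for this example; in a real project the events would more likely be rows in a database table.

```php
<?php
// Hypothetical event log: one entry per play/download/vote, each with its own timestamp.
$events = [
    ['song_id' => 1, 'type' => 'play',     'created_at' => '2014-03-03 10:15:00'],
    ['song_id' => 1, 'type' => 'vote',     'created_at' => '2014-03-04 18:30:00'],
    ['song_id' => 1, 'type' => 'download', 'created_at' => '2014-03-11 09:00:00'],
    ['song_id' => 2, 'type' => 'play',     'created_at' => '2014-03-11 12:45:00'],
];

/**
 * Count events per song, per type, per period.
 * $periodFormat is a date() format string:
 * 'o-\WW' = ISO year and week, 'Y-m' = month, 'Y' = year.
 */
function countByPeriod(array $events, string $periodFormat): array
{
    $counts = [];
    foreach ($events as $e) {
        $period = date($periodFormat, strtotime($e['created_at']));
        $key = $e['song_id'] . '|' . $e['type'] . '|' . $period;
        $counts[$key] = ($counts[$key] ?? 0) + 1;
    }
    return $counts;
}

$perWeek  = countByPeriod($events, 'o-\WW');
$perMonth = countByPeriod($events, 'Y-m');
```

Because every event keeps its own timestamp, the same data can be regrouped by any period later without changing what is stored.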

As far as weighting values with multiplication and / or log goes, sorry, but I have no idea ATM.

Well, the max() just ensures that the minimum score is 0: log(1) is 0, while log(0) is invalid because ln x is undefined at x = 0.
Flooring the ln is interesting, though, because it essentially sets major thresholds at e^y.

Translation: your $s values get flattened into groups bounded by e^y, so the cut-offs are roughly 2.7, 7.4, 20, 55, 148, 403, 1097, 2981, 8103, 22026, 59874, etc.
Essentially it becomes harder (exponentially so) to reach the next ‘level’ of score.
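To see where those levels fall, here is a small sketch that wraps the scoring line from the question in a function and lists the smallest integer $s that reaches each level. The function name `trendScore` and the `$thresholds` array are only illustrative.

```php
<?php
// The scoring line from the question, wrapped in a function for illustration.
function trendScore(int $s): int
{
    // max() keeps log() away from 0 and negatives; floor() buckets the result.
    return (int) floor(log(max($s, 1)));
}

// Smallest integer $s that reaches each level: ceil(e^level).
$thresholds = [];
for ($level = 1; $level <= 11; $level++) {
    $thresholds[$level] = (int) ceil(exp($level));
}

foreach ($thresholds as $level => $min) {
    printf("level %2d needs \$s >= %d\n", $level, $min);
}
```

Each level needs about 2.72 times the $s of the one before it, which is why the gaps widen so quickly.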
This behavior has a couple of interesting points. On a 64-bit system, $s has to exceed PHP_INT_MAX (about 9.2x10^18) before it can reach group 44, so at that point your integers overflow into floats. By then, however, each new level needs $s to grow by several times 10^18 points, which is not likely to happen. If every person who viewed also downloaded and voted, and did each only once, $s would be 9n^2 for n people, and you would need 1,194,980,949 people to reach group 44; in other words, roughly the entire population of China or India having clicked the three buttons. You’d also have a song that was more popular than any YouTube video ever except for Gangnam Style, and YouTube counts multiple views. Reaching group 48 would take about 8.8 billion people doing the same, which is more than every man, woman, and child on the planet, and group 49 would take about 14.6 billion. So there are significant limitations in place. (Generally you can say 0 <= $TrendScore < 49.)
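The head-count figures can be checked with a short sketch. It assumes, as the post does, that n people each play, download, and vote exactly once, so the formula from the question gives $s = (n + 2n) * (3n) = 9n^2. The function name `peopleNeeded` is made up for this example.

```php
<?php
// Assumption: n people each play, download, and vote exactly once,
// so $s = (n + 2n) * (3n) = 9 * n^2 under the formula in the question.
function peopleNeeded(int $level): float
{
    // Solve 9 * n^2 >= e^level for n.
    return ceil(sqrt(exp($level) / 9));
}

// PHP converts an overflowing integer to a float instead of wrapping around.
$overflowed = PHP_INT_MAX + 1;
var_dump(is_float($overflowed));

// Highest level whose lower bound still fits in a native integer.
$maxIntLevel = (int) floor(log(PHP_INT_MAX));

printf("group 44 needs about %.0f people\n", peopleNeeded(44));
printf("group 48 needs about %.0f people\n", peopleNeeded(48));
```

On a 32-bit build PHP_INT_MAX is much smaller, so the overflow into floats happens at a far lower level.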

StarLion is right.
You must write down the logic flow in English first.
Then you can figure out the algorithm.

I moved 10 posts to a new topic: What is the usefulness of Entities Relationship Diagrams and Object Oriented Analysis Design?

I feel like if I stored the dateTime of each play, download, and vote, it would take up way too much space. I feel like I should store one dateTime, which would be when it was most recently liked.

I mean I just need one algorithm in this project. So maybe using different methodologies might not be necessary.

I wouldn’t worry about saving space, MySQL can handle a lot of data.

I would be interested to know how a “trend” can be determined on a single datetime.

As Mittineague says, a single timestamp is not a trend. The number of timestamps per unit of time can be a trend (considered a popularity trend), but a single timestamp isn’t a trend; it’s a “freshness”. If 100,000,000 people viewed a video yesterday, and 1 person viewed another video today, which is more trending?
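The difference between freshness and a count over time can be shown with a minimal sketch. The function `eventsInWindow`, the sample timestamps, and the 48-hour window are assumptions chosen for illustration; three views stand in for the 100,000,000 in the example.

```php
<?php
/**
 * Count how many Unix timestamps fall inside the last $windowSeconds before $now.
 */
function eventsInWindow(array $timestamps, int $now, int $windowSeconds): int
{
    $count = 0;
    foreach ($timestamps as $t) {
        if ($t > $now - $windowSeconds && $t <= $now) {
            $count++;
        }
    }
    return $count;
}

$now = strtotime('2014-03-12 12:00:00');
$day = 86400;

// Song A: several views yesterday.
$songA = [$now - 30 * 3600, $now - 29 * 3600, $now - 28 * 3600];
// Song B: a single view an hour ago.
$songB = [$now - 3600];

// Freshness only compares the latest timestamp, so B looks better.
$fresherIsB = max($songB) > max($songA);

// A count over the last 48 hours favours A instead.
$aCount = eventsInWindow($songA, $now, 2 * $day);
$bCount = eventsInWindow($songB, $now, 2 * $day);
```

Which window to use is a design decision: a short one rewards whatever was touched most recently, a long one rewards overall popularity.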

Trending can only be done by taking snapshots of data at particular times over time. Some refer to the process as data warehousing (OLAP). Of course, the Wikipedia article is talking about the enterprise kind of DW, but the concept is the same. To get a trend, you have to have a “picture of the moment” and store it. Then you can analyse those pictures of data to see if what you are looking at is changing (for good or bad). Straight OLTP data won’t get you this alone. You need OLAP.
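As a small-scale sketch of the snapshot idea, the code below compares stored "pictures of the moment" and reports how much each counter changed between them. The daily schedule, the array layout, and the function name `snapshotDeltas` are assumptions for this example, not part of any particular warehousing tool.

```php
<?php
// Hypothetical snapshots of one song's cumulative counters, e.g. saved once a day.
$snapshots = [
    '2014-03-10' => ['plays' => 100, 'downloads' => 10, 'votes' => 5],
    '2014-03-11' => ['plays' => 130, 'downloads' => 12, 'votes' => 9],
    '2014-03-12' => ['plays' => 190, 'downloads' => 20, 'votes' => 21],
];

/**
 * For each snapshot after the first, return the change in every metric
 * since the previous snapshot.
 */
function snapshotDeltas(array $snapshots): array
{
    ksort($snapshots); // ISO dates sort chronologically as strings
    $deltas = [];
    $previous = null;
    foreach ($snapshots as $date => $current) {
        if ($previous !== null) {
            foreach ($current as $metric => $value) {
                $deltas[$date][$metric] = $value - $previous[$metric];
            }
        }
        $previous = $current;
    }
    return $deltas;
}

$deltas = snapshotDeltas($snapshots);
```

Here the votes gained per day go from 4 to 12, so the song is picking up speed; a falling delta would point the other way.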
