I am into long overdue project with personal weather station data.
My weatherstation produces, every minute, a record containing just over 100 fields of data - where perhaps only 30-40 contains real data, the rest is just “filled” because Ambient’s Virtual Weather Station (VWS) can gather data from a huge number of different weather stations, in different formats/types.
The csv files from VWS are populating a MySQL database via VWSql, a product which ‘just’ reads the csv file and pumps the data into the MySQL database.
So, it is only one table.
Sometimes I get some bad data, which I know how to look out for. So, my idea is to write a trigger that checks for bad data, and then just shovel such possibly bad data into a table with exactly the same fields but with another name.
If good data, I want to update hourly, dayly, monthly, yearly tables tables with with statistics. Like average, max, min temps, max wind, wind direction, UV, solar incidence, barometrics, precipitation, [there are 7 basic readings made], and maybe some derived data like air density (which is calculated from the basic seven readings) - It all adds up to some 30-40 fields.
My thought was to store these updated values, in the different calculated tables every minute - I think that this goes against some kind of database rule like: Don’t store calculated values - get them on the fly by SQL SELECT statements…
On the otherhand, it may be faster - especially for another site, in a city, where there could be lots and lots of visitors to my web site.
I’d like to present each visitor with current weather, minute and hourly average, todays average, … and comparisions with last two-five years, and ten years ago (I only have 5+ years of data so far).
Users would only through PHP code see result sets from views (security).
The main table uses a DATETIME field as unique (incl minutes - YYYYMMDDHHMM format).
I have had a look at splitting up just the datetime, which by itself is quite a bit of coding. So, I thought it would be simpler SQL, but on the other hand quite a bit of programming for the stored procedures/triggers.
Database space is not that much of a problem - the main table grows with just around 140 MB per year.
So, what are your recomendations? Large SQL SELECT statements or large trigger/stored procedures with simpler SELECTs?
Regards,
“Bagtjesen”
(The old stubborn man (also in a positive way) - in dialects in north of Sweden)