Query to find a boolean boundary within the data

I am collecting environmental data from my hen house and storing it locally in a table. Updates occur every minute. In addition to temperature, etc, one of the data streams I am monitoring is the state of the chicken door on the hen house, i.e. whether it is open or closed. In the table, I am using a boolean (tinyint) for the data open = 1 and closed = 0. There is also a timestamp column that defaults to CURRENT_TIMESTAMP when a new row is inserted.


CREATE TABLE IF NOT EXISTS `coop_data` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `inside_temp` float DEFAULT '0',
  `photocell_voltage` float NOT NULL DEFAULT '0',
  `door_is_open` tinyint(1) NOT NULL,
  `calling_for_heat` tinyint(1) NOT NULL,
  `calling_for_ac` tinyint(1) NOT NULL,
  `added` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB  DEFAULT CHARSET=utf8 COLLATE=utf8_bin;


I would like to be able to find the point where the data for the door changes state, for example:


// col door_is_open
0
0
0
0
0
1 <-- want to find this point
1
1
1
1
1
...
1
1
1
1
1
1
0 <-- and this one
0
0
0
0

Of course, this boundary will occur twice a day, open in the morning, close in the evening. The data will display on my website. I am not new to writing sql, but I’ve never had to deal with this type of a query before. Any suggestions on a query that would find these two points for any given day?

I am sure that I probably need a BETWEEN on there on the added column,


select id, door_is_open as state, added from coop_data where added between ('2012-01-19 00:00:00' and '2012-01-19 23:59:59') and

and that’s as far as I got. Any suggestions?

try this (untested) –

SELECT t.id
     , t.door_is_open AS state
     , t.added 
  FROM coop_data AS t 
 WHERE t.added >= '2012-01-19' 
   AND t.added  < '2012-01-20' 
   AND t.door_is_open <>
       ( SELECT door_is_open
           FROM coop_data
          WHERE added =
                ( SELECT MAX(added)
                    FROM coop_data
                   WHERE added < t.added ) )

Cool, I see what you are trying to do there, using the ‘not equal’ operator to find all the rows not equal to the results of the nested selects. However, running the query makes mysql timeout.

add an index on the added column

Awesome! Works well, although it makes my server think for a bit. I noticed I have an anomaly in my data as well, but I’m not sure I can filter it with the query.

What’s happening is the door closes, then opens briefly because of the sensor’s margin of error, and then finally closes again for good. The inverse of this is also true. I end up with data like this:


1
1
1
1
1
1
0
1
0
0
0
0
0
0
0
...
0
0
0
0
0
0
1
0
1
1
1
1
1
1
1
1
1
1

Is it possible to write the query to account for that?

i’m certain there is, but i cannot do it without help

it’ll be in chapter 24 “regions, runs, gaps, sequences, and series” in joe celko’s “sql for smarties”

sorry, i do not have time right now to research it further

Not a problem! Thanks so much for the help! Now that I have the query, I’ll play with it and see what I can come up with. Thanks again.