How should missing data be handled in graphs?

Posted: Sat Dec 19, 2009 10:51 am
by YJB
I see a bit of inconsistent behavior when data is missing:

20091218220929 thb0 202 37 0050 10100 1 10100
20091218221045 thb0 202 37 0050 10100 1 10100
<GAP>
20091219020156 thb0 156 51 0055 10110 1 10110
<GAP>
20091219034626 thb0 147 54 0055 10110 1 10110
20091219034704 thb0 147 54 0055 10110 1 10110

While the humidity clearly shows a GAP in the graph, the temperature just pretends that nothing is wrong:

[Image: temperature and humidity graph]

I see two options:
1) Both temperature and humidity show a gap (preferred).
2) Both temperature and humidity draw a line between points, ignoring the missing measurements in between.

Re:How should missing data be handled in graphs?

Posted: Sat Dec 19, 2009 1:57 pm
by skyewright
YJB wrote: I see a bit of inconsistent behavior when data is missing
I think it's more a case of a difference in the display type for the two sensors.

The temperature is a line plot, so it is drawn as a line connecting the known points. If there is a gap, the plot simply draws a straight line from the last data point before the gap to the first one after it.

The humidity is set to 'Points', so it shows a mark on the graph for each known data point. If there is a gap in the data, there are no data points in the gap, so nothing is drawn there.

To get your option 1), you could change the graph definition for temperature to plot as 'Points' too.

Does that help?

Re:How should missing data be handled in graphs?

Posted: Sat Dec 19, 2009 5:08 pm
by wfpost
Mathematically speaking, this is about continuous functions:

http://en.wikipedia.org/wiki/Continuous_function

Which criteria should apply?
1. Weierstrass definition (epsilon-delta) of continuous functions
2. Heine definition of continuity

Aside from the drawing mechanism of Gnuplot, which I guess does not allow plotting discontinuous functions, implementing the two criteria above in Meteohub would be a lot of work.

But maybe I am wrong about this.

Re:How should missing data be handled in graphs?

Posted: Sat Dec 19, 2009 8:08 pm
by YJB
Thanks, both of your comments make sense and do indeed explain the different behavior.

After thinking about this a bit more, I also realized that this would be somewhat complicated, since it would also require a definition of what exactly counts as missing data: 5 minutes, 15 minutes, 1 hour?
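
One way to make that definition explicit is a configurable threshold: any two consecutive readings further apart than the threshold are treated as separated by missing data. Below is a minimal sketch in Python, not anything Meteohub actually does. It assumes the first field of each raw line is a timestamp in YYYYMMDDhhmmss form and simply takes the third field as the plotted value; the names `parse_line`, `split_on_gaps` and `format_with_breaks` are made up for illustration. The last helper writes a blank line between segments, which, as far as I know, is how a gnuplot data file marks a break that a line plot will not draw across.

```python
from datetime import datetime, timedelta

# Raw readings copied from the first post (without the <GAP> markers).
RAW = """\
20091218220929 thb0 202 37 0050 10100 1 10100
20091218221045 thb0 202 37 0050 10100 1 10100
20091219020156 thb0 156 51 0055 10110 1 10110
20091219034626 thb0 147 54 0055 10110 1 10110
20091219034704 thb0 147 54 0055 10110 1 10110
"""


def parse_line(line):
    """Return (timestamp, value) for one raw line.

    Assumption: field 0 is YYYYMMDDhhmmss and field 2 is the value to plot.
    """
    fields = line.split()
    return datetime.strptime(fields[0], "%Y%m%d%H%M%S"), int(fields[2])


def split_on_gaps(samples, max_gap):
    """Split time-ordered samples into runs with no step longer than max_gap."""
    segments = []
    current = []
    previous_time = None
    for time, value in samples:
        if previous_time is not None and time - previous_time > max_gap:
            # The step from the previous reading is too long: start a new run.
            segments.append(current)
            current = []
        current.append((time, value))
        previous_time = time
    if current:
        segments.append(current)
    return segments


def format_with_breaks(segments):
    """Render the segments as text, with a blank line between segments."""
    blocks = []
    for segment in segments:
        rows = [f"{time:%Y%m%d%H%M%S} {value}" for time, value in segment]
        blocks.append("\n".join(rows))
    return "\n\n".join(blocks) + "\n"


samples = [parse_line(line) for line in RAW.splitlines() if line.strip()]
segments = split_on_gaps(samples, timedelta(minutes=15))
print(format_with_breaks(segments))
```

With a 15 minute threshold the five readings fall into three segments, matching the two gaps marked in the first post. With a threshold of 4 hours they would all be joined into a single line, which shows how much the choice of threshold matters.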