How should missing data be handled in graphs?

Discussion of the Meteohub software package

Moderator: Mattk

YJB
Platinum Boarder
Posts: 387
Joined: Thu Feb 19, 2009 5:53 pm
Location: Venhuizen, Netherlands

How should missing data be handled in graphs?

Post by YJB »

I see a bit of inconsistent behavior when data is missing:

20091218220929 thb0 202 37 0050 10100 1 10100
20091218221045 thb0 202 37 0050 10100 1 10100
<GAP>
20091219020156 thb0 156 51 0055 10110 1 10110
<GAP>
20091219034626 thb0 147 54 0055 10110 1 10110
20091219034704 thb0 147 54 0055 10110 1 10110

While the humidity graph clearly shows a gap, the temperature graph just pretends that nothing is wrong:

Image

I see 2 options:
1) Both temperature and humidity show a gap (preferred).
2) Both temperature and humidity draw a line between points, ignoring the missing measurements in between.
skyewright
Platinum Boarder
Posts: 873
Joined: Fri Jan 25, 2008 6:27 pm
Location: Isle of Skye, Scotland

Re: How should missing data be handled in graphs?

Post by skyewright »

YJB wrote: I see a bit of inconsistent behavior when data is missing

I think it's more a case of a difference in the display type for the two sensors.

The temperature is a line plot, so it is drawn as a line connecting the known points. If there is a gap, a straight line joins the last data point before the gap to the first one after it.

The humidity is set to 'Points', so it shows a mark on the graph for each known data point. If there is a gap in the data, there are no data points in the gap and nothing is drawn there.

To get your option 1), you could change the graph definition for temperature to plot as 'Points' too.

Does that help?
wfpost
Platinum Boarder
Posts: 591
Joined: Thu Jun 12, 2008 2:24 pm
Location: HONSOLGEN

Re: How should missing data be handled in graphs?

Post by wfpost »

Mathematically speaking, this is about continuous functions:

http://en.wikipedia.org/wiki/Continuous_function

Which criteria should apply?
1. Weierstrass definition (epsilon-delta) of continuous functions
2. Heine definition of continuity

Aside from the drawing mechanism of Gnuplot, which I guess does not allow plotting discontinuous functions, implementing the two criteria above in Meteohub would be a lot of work.

But maybe I am wrong about this.
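
On the Gnuplot point: gnuplot itself does not join points across a blank line in a data file when plotting with lines, so a broken line is possible at the drawing level. Whether Meteohub's graph definitions expose this is not established in this thread. A minimal sketch in Python, assuming the plot data is written as "timestamp value" rows; the function name `to_gnuplot_blocks`, the 15-minute limit, and the choice of the third log field as the value are all illustrative assumptions, not Meteohub's actual mechanism:

```python
from datetime import datetime, timedelta

def to_gnuplot_blocks(rows, max_gap):
    """Build data-file text, adding a blank line wherever two
    consecutive samples are further apart than max_gap."""
    out = []
    prev = None
    for stamp, value in rows:
        t = datetime.strptime(stamp, "%Y%m%d%H%M%S")
        if prev is not None and t - prev > max_gap:
            out.append("")  # blank line: gnuplot starts a new line segment here
        out.append(f"{stamp} {value}")
        prev = t
    return "\n".join(out) + "\n"

# Timestamps and raw third field taken from the log lines in the first post.
rows = [
    ("20091218220929", 202),
    ("20091218221045", 202),
    ("20091219020156", 156),
    ("20091219034626", 147),
    ("20091219034704", 147),
]

# Hypothetical limit: anything more than 15 minutes apart counts as a gap.
text = to_gnuplot_blocks(rows, timedelta(minutes=15))
print(text)
```

With this data the sample at 02:01:56 ends up alone between two blank lines, so with a plain line style it may not be visible at all; a style that also draws point markers would still show it.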
YJB
Platinum Boarder
Posts: 387
Joined: Thu Feb 19, 2009 5:53 pm
Location: Venhuizen, Netherlands

Re: How should missing data be handled in graphs?

Post by YJB »

Thanks, both of your comments make sense and do indeed explain the different behavior.

After thinking about this a bit more, I also realized that this would be rather complicated, since it would also require a definition of what exactly counts as missing data: 5 minutes, 15 minutes, 1 hour?
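
To make the threshold question concrete, here is a small Python sketch that counts how many intervals in the log excerpt from the first post would be treated as gaps under different limits. The helper `count_gaps` and the extra 1-minute limit are assumptions for illustration only; nothing here reflects how Meteohub actually decides.

```python
from datetime import datetime, timedelta

# Timestamps copied from the log excerpt in the first post.
STAMPS = [
    "20091218220929",
    "20091218221045",
    "20091219020156",
    "20091219034626",
    "20091219034704",
]

def count_gaps(stamps, threshold):
    """Count consecutive sample pairs that are more than threshold apart."""
    times = [datetime.strptime(s, "%Y%m%d%H%M%S") for s in stamps]
    return sum(1 for a, b in zip(times, times[1:]) if b - a > threshold)

for minutes in (1, 5, 15, 60):
    gaps = count_gaps(STAMPS, timedelta(minutes=minutes))
    print(f"{minutes} min: {gaps} gap(s)")
```

For this excerpt, 5 minutes, 15 minutes and 1 hour all report the same 2 gaps, while a 1-minute limit reports 3, because the first two samples are 76 seconds apart. So the limit has to be judged against the sensor's normal reporting interval.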