Problems After 4.6c Upgrade
Moderator: Mattk
Re:Problems After 4.6c Upgrade
Anybody else out there with Meteohub on Sheevaplug who has a system freeze every 24 hours or so?
I don't think this is an application-level error, as no operation in Linux user space can make a filesystem go wild. I also monitored your system data, which reports the number of processes, system load, and storage usage. None of these values shows anything interesting before your system suddenly freezes.
Can you move the plug to another environment? I spent a week ghost hunting with an ALIX that froze from time to time until it turned out that its power supply handled spikes from the mains very badly. Changing the power supply solved all the problems.
Is someone touching the SD card during operation, or anything physical like that?
I am not reluctant to suspect bugs in my software (it of course has plenty of them), but the symptoms do not match what an application error can do.
Does anybody out there have an idea?
Re:Problems After 4.6c Upgrade
Rainman, could you please put the SheevaPlug behind an online UPS, which reshapes the sine wave, to rule out power spikes stalling the whole thing?
Re:Problems After 4.6c Upgrade
Hi, actually it has been on a UPS the entire time it has been running.
The part that keeps nagging me into thinking it is an application error is that the second unit, which is literally hundreds of miles away, is having the exact same issues at the very same time. The occurrence has been closer to once a week than every 24 hours or so.
The Sheeva itself (the one close to me) is in an area where it is not touching anything on the sides and is not in any traffic (it would not be accidentally bumped). It is propped up a little on one end from the bottom for better air flow.
Add: Both were upgraded to 4.6f yesterday, and the issue has not occurred again yet. The last occurrence was yesterday at 5:00am EDT for both.
I know it's vexing, and I greatly appreciate your support!
- YJB
- Platinum Boarder
- Posts: 387
- Joined: Thu Feb 19, 2009 5:53 pm
- Location: Venhuizen, Netherlands
- Contact:
Re:Problems After 4.6c Upgrade
I'm running a Sheeva without any issues (4.6e (Build 90)), current uptime 14 days. I've got 11 RFX sensors attached (apart from a few system sensors), not the biggest implementation, but I'm sure not the smallest either.
I also see the following message:
[419320.040000] TCP(wget:6380): Application bug, race in MSG_PEEK.
[771520.520000] TCP(wget:8830): Application bug, race in MSG_PEEK.
But since I'm not experiencing any issues, I don't think it's something to worry about (at least it is apparently not related to your issues).
It might be an idea to use the serial console (using the USB connector) and see if there is any system output before the freeze. At that time you might still be able to find something out using the console.
My 1st guess was going to be a filesystem full issue, but that doesn't seem to be the case, if I read Boris' observations.
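The two kernel messages quoted above follow a regular shape: seconds since boot in brackets, then the process name and PID. As a rough sketch (not part of Meteohub; the names `parse_line`, `PeekWarning` and `uptime_days` are made up for illustration), something like this could pull those fields out of saved dmesg output and show how far into the uptime each message appeared:

```python
import re
from typing import NamedTuple, Optional

# Matches kernel lines of the form quoted in this thread, e.g.
# [419320.040000] TCP(wget:6380): Application bug, race in MSG_PEEK.
KERNEL_LINE = re.compile(
    r"^\[\s*(?P<uptime>\d+\.\d+)\]\s+"
    r"TCP\((?P<proc>[^:]+):(?P<pid>\d+)\):\s+(?P<msg>.*)$"
)


class PeekWarning(NamedTuple):
    """Hypothetical container for one parsed kernel message."""
    uptime_seconds: float
    process: str
    pid: int
    message: str


def parse_line(line: str) -> Optional[PeekWarning]:
    """Return the parsed fields, or None if the line has another shape."""
    m = KERNEL_LINE.match(line.strip())
    if m is None:
        return None
    return PeekWarning(float(m["uptime"]), m["proc"], int(m["pid"]), m["msg"])


def uptime_days(seconds: float) -> float:
    """Convert the bracketed seconds-since-boot value to days."""
    return seconds / 86400.0


if __name__ == "__main__":
    sample = [
        "[419320.040000] TCP(wget:6380): Application bug, race in MSG_PEEK.",
        "[771520.520000] TCP(wget:8830): Application bug, race in MSG_PEEK.",
    ]
    for text in sample:
        w = parse_line(text)
        if w is not None:
            print(f"{w.process} (pid {w.pid}) at {uptime_days(w.uptime_seconds):.2f} days uptime")
```

The bracketed value is time since boot, not wall-clock time, so it only tells you how long the box had been up when the message was logged.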
Re:Problems After 4.6c Upgrade
Thanks YJB, good idea on the Serial Console, worth a shot.
Are you performing regular uploads to FTP site or Weather Networks?
Doh! Can't use the console because I have a Weather Station plugged into the port!:S
- YJB
- Platinum Boarder
- Posts: 387
- Joined: Thu Feb 19, 2009 5:53 pm
- Location: Venhuizen, Netherlands
- Contact:
Re:Problems After 4.6c Upgrade
Lol, I didn't realize that about the USB port. On the other hand, when the plug is hung, you might still be able to connect the serial cable and see if the console responds?
I've got 24 ftp push jobs, 4 jobs every minute, the rest ranging from 5 minutes to daily. I'm uploading to wunderground every 5 minutes, I believe that's an ftp job as well.
Every 10 minutes I'm pulling most of the sensor data to my sql database using the http interface, that's, generally speaking, 4 queries with a total of let's say 140 entries.
-
- Platinum Boarder
- Posts: 873
- Joined: Fri Jan 25, 2008 6:27 pm
- Location: Isle of Skye, Scotland
Re:Problems After 4.6c Upgrade
I realise that the problem might be something else, but...
Rainman32 wrote: Hi, actually it has been on a UPS the entire time it has been running.
What sort of UPS are you using?
The ALIX power supply problem that Boris mentioned was happening with the unit on a UPS, but it was an 'off-line' model (APC Back-UPS) rather than an 'on-line' one. The lockup was happening at the tiny switchover time that exists with an off-line UPS. Temporarily moving the ALIX to an on-line UPS (APC Smart-UPS) proved where the problem was. As a long term solution the basic power supply unit was replaced by a better model; after that the off-line UPS was again sufficient (I prefer the off-line model as it has a significantly lower background energy overhead).
Re:Problems After 4.6c Upgrade
Hi skyewright, thanks for your input.
I did not understand the term at first; mine is actually not an online model. I found that another term commonly used for 'off-line' in the UPS world is 'Standby UPS', and that is what I have. So there is that.
Although if the Sheeva were so sensitive to power fluctuations, I would expect there to be more reports of this. The other thing that still doesn't fit is the other unit, which is located 285 kilometers away and locks up at the very same time; it is not on any type of UPS at all.
YJB, you are blowing my whole upload theory out of the water! And I had to think about it again: the console actually goes through the mini USB on the side, not the main USB port, so that is good to use. Thanks for keeping me straight!
Re:Problems After 4.6c Upgrade
OK, I have a serial console connected that I will leave running. Also, I just had a nice power glitch that kicked on every UPS in the office. The Sheeva is still humming along, with no issues at this time.
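For anyone else leaving a console capture running while waiting for a freeze, here is a minimal sketch of one way to keep the last lines before a lockup. It assumes the console output is already available as an iterable of text lines (how you get that from the serial adapter is left open), and `log_with_timestamps` is a made-up helper, not a Meteohub or SheevaPlug tool. Each line gets a UTC wall-clock stamp and is flushed straight away, so the log can later be lined up with the time the box stopped responding.

```python
import time
from typing import Callable, IO, Iterable


def log_with_timestamps(
    source: Iterable[str],
    sink: IO[str],
    now: Callable[[], float] = time.time,
) -> int:
    """Copy non-empty lines from source to sink, prefixed with a UTC timestamp.

    Returns the number of lines written. The sink is flushed after every
    line so that as little as possible is lost if the capture stops abruptly.
    """
    written = 0
    for raw in source:
        line = raw.rstrip("\r\n")
        if not line:
            continue  # skip blank lines from the console
        stamp = time.strftime("%Y-%m-%d %H:%M:%S", time.gmtime(now()))
        sink.write(f"{stamp} {line}\n")
        sink.flush()
        written += 1
    return written
```

The `now` parameter only exists so the clock can be replaced when trying the function out; in normal use it can be left at its default.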
Re:Problems After 4.6c Upgrade
Time for an update; there has been activity going on behind the scenes, in case anyone is following this.
The error has been identified as an SD card issue; I was able to capture it in dmesg using the serial console. Actually, ssh still worked at this point too, as I caught it early enough. The errors are attached, as I couldn't get them to display properly.
So at this point I have to say that my theory of it having something to do with uploads is disproven. The apparently synchronized failures with the remote unit look like a fairly large coincidence, perhaps based on uptime and workload, as both units are configured the same and have been experiencing issues, and thus reboots, around the same times.
A search for the errors finds that this has been reported on other Sheeva devices used in different applications and affects a variety of cards, including other class 6 and 10 cards, so it is hit and miss. There are, however, some cards that are known not to experience the issue, specifically the Transcend TS4GSDHC150.
There is a possibility that kernel updates correct the issue, and I am experimenting, but this is not recommended or supported; the fix is to use a card that is known to work.
Based on the above, the issue I brought up really belongs in a Sheeva thread and is not a firmware issue. I will restart/continue it there for further observations.
Thanks so much for your support with this Boris! [file name=errors-f63d5d36200cb362d8b95344e58d968a.txt size=37344]http://www.meteohub.de/joomla/images/fb ... 8d968a.txt[/file]
Re:Problems After 4.6c Upgrade
Hi black23,
can you tell me your preferences for cpu usage?
Thx Stregus
black23 wrote:Hi,
I want to report the same problem. Since the update to 4.6c, my Sheeva runs very slowly after 6:00 and CPU usage is extremely high. After a reboot everything looks good. This problem has repeated every day since the update.
Previous version was ok.
Boris, please help us.
Thanks
Config: Sheeva, MH 4.6c, WS 23XX
Re:Problems After 4.6c Upgrade
This thread deals with 4.6b/c/d problems related to TCP/IP socket connections.
4.6e/f/g have that fixed. Please just update.