Message boards : Number crunching : Large number of Abandoned WU's on Raspberry Pi 3

Sam Osborne
Joined: 16 Oct 20
Posts: 8
Credit: 6,373,933
RAC: 0
Message 4503 - Posted: 18 Nov 2020, 17:16:13 UTC

There's no particular stderr output either, so I can't figure this one out.

It's happening on a device which until today had been crunching away wonderfully too:

https://universeathome.pl/universe/result.php?resultid=119155301
https://universeathome.pl/universe/result.php?resultid=119154719
https://universeathome.pl/universe/result.php?resultid=119153859

and the full list: https://universeathome.pl/universe/results.php?userid=219532&offset=0&show_names=0&state=6&appid=

Any thoughts / ideas? I'm at a loss.

RPI 4 continues to crunch away happily...

Keith Myers
Joined: 10 May 20
Posts: 308
Credit: 4,733,484,700
RAC: 13,061
Message 4504 - Posted: 18 Nov 2020, 17:45:37 UTC - in response to Message 4503.  

Glitch? Try rebooting the device.

A proud member of the OFA (Old Farts Association)

Sam Osborne
Joined: 16 Oct 20
Posts: 8
Credit: 6,373,933
RAC: 0
Message 4505 - Posted: 18 Nov 2020, 18:13:05 UTC - in response to Message 4504.  

It's affecting four devices (three added today), but I'll give that a go once the WCG tasks I switched to in lieu of U@H finish.

Thanks

Sam Osborne
Joined: 16 Oct 20
Posts: 8
Credit: 6,373,933
RAC: 0
Message 4506 - Posted: 18 Nov 2020, 18:28:07 UTC - in response to Message 4505.  

BoincTasks is showing that the server is down quite a lot; I'm wondering if that is causing it?

Keith Myers
Joined: 10 May 20
Posts: 308
Credit: 4,733,484,700
RAC: 13,061
Message 4507 - Posted: 19 Nov 2020, 5:15:37 UTC - in response to Message 4506.  

??? Which server is down??

The Universe server has not been down today.

A proud member of the OFA (Old Farts Association)

Brummig
Joined: 23 Mar 16
Posts: 95
Credit: 23,431,842
RAC: 23
Message 4508 - Posted: 19 Nov 2020, 8:57:58 UTC

"Abandoned" means the server thinks the device detached from the project. I've seen it on this project whenever one of my Pies trashes its SD card and needs to be rebuilt. All the WUs that were running get marked as "Abandoned" when I re-attach the rebuilt Pi. Does that give you any clues to work on?

The Universe@Home server was briefly uncontactable yesterday; I had a few uploads stall, though no serious problems.

Sam Osborne
Joined: 16 Oct 20
Posts: 8
Credit: 6,373,933
RAC: 0
Message 4509 - Posted: 19 Nov 2020, 11:48:51 UTC - in response to Message 4508.  

That is useful up to a point. I'm wondering whether the hostname "raspberry pi", which is the name of all four RPI3s, caused this issue. I'll try updating the hostnames first.

In addition, these were new installs. I'll let them run and complete a full WU before setting up BoincTasks, to make sure it's not BT that is freaking it out.

(I broke a cable in the rack so won't be back up for a day or so (oops)).

Sam Osborne
Joined: 16 Oct 20
Posts: 8
Credit: 6,373,933
RAC: 0
Message 4510 - Posted: 20 Nov 2020, 14:02:25 UTC - in response to Message 4509.  

So I think I found the cause.

When setting them up, I think I used the non-secure protocol for the project_attach (http:// instead of https://).

This resulted in communication issues with the server, and is likely why the server appeared to be down: it probably didn't go down at all; comms was the issue.

Having got two Pis up and running today, I can see that they are now processing correctly with the project and crunching away.
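
For anyone scripting the attach step on several Pis, here is a minimal Python sketch of upgrading an http:// project URL to https:// before building the attach command. It only builds the argument list and does not run anything. The project URL is assumed from the result links earlier in the thread, `YOUR_ACCOUNT_KEY` is a placeholder, and the helper names `to_https` and `attach_command` are made up for illustration. Whether the http:// URL was really the root cause is not confirmed.

```python
from typing import List
from urllib.parse import urlsplit, urlunsplit


def to_https(url: str) -> str:
    """Return url with an http scheme upgraded to https; other URLs are unchanged."""
    parts = urlsplit(url)
    if parts.scheme == "http":
        parts = parts._replace(scheme="https")
    return urlunsplit(parts)


def attach_command(project_url: str, account_key: str) -> List[str]:
    """Build (but do not run) a boinccmd --project_attach command with the https URL."""
    return ["boinccmd", "--project_attach", to_https(project_url), account_key]


if __name__ == "__main__":
    # Assumed project URL and a placeholder account key, for illustration only.
    print(attach_command("http://universeathome.pl/universe/", "YOUR_ACCOUNT_KEY"))
```

The list returned by `attach_command` could be passed to `subprocess.run` on each Pi once the URL and account key have been checked.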

Sam

Krzysztof Piszczek - wspieram ...
Project administrator
Project developer
Project tester
Joined: 4 Feb 15
Posts: 841
Credit: 144,180,465
RAC: 2
Message 4511 - Posted: 20 Nov 2020, 14:40:08 UTC - in response to Message 4510.  

Our server automatically redirects all connections to https, so it should not be a problem if you use http instead of https, but... as we can see, it looks like the difference is still important...
Krzysztof 'krzyszp' Piszczek

Member of Radioactive@Home team
My Patreon profile
Universe@Home on YT

Sam Osborne
Joined: 16 Oct 20
Posts: 8
Credit: 6,373,933
RAC: 0
Message 4512 - Posted: 20 Nov 2020, 18:49:43 UTC - in response to Message 4511.  

That's the only thing I can think it could be. It was definitely contacting the server via HTTP and being redirected to the HTTPS version, but it seems that checkpoints fail and either reset the WU to zero or simply stop it progressing.

Important thing is that it's all up and running and I can get back to it!

S
Copyright © 2024 Copernicus Astronomical Centre of the Polish Academy of Sciences
Project server and website managed by Krzysztof 'krzyszp' Piszczek