Message boards : Number crunching : Too many errors (may have bug)
Profile Ananas

Joined: 26 Mar 15
Posts: 52
Credit: 1,737,270
RAC: 0
Message 647 - Posted: 27 Oct 2015, 18:03:08 UTC
Last modified: 27 Oct 2015, 18:14:24 UTC

This is an old BOINC bug: the server sends out results that never had a chance of being valid, because of these limits:

max # of error/total/success tasks 3, 10, 6

Especially with the long running ones, the error limit is too low.

I have one workunit with the following wingman errors:

Not started by deadline - canceled
Aborted by user
Error while computing (an unreliable box that has ~1/3rd errors)
Completed, can't validate (mine *sigh*)
Error while computing (ERR_RESULT_START, result not even started)
Error while downloading (ERR_RESULT_DOWNLOAD)

None of the errors was caused by the result itself, but it is still invalid:

http://universeathome.pl/universe/workunit.php?wuid=2850372

The other one looks very similar.

http://universeathome.pl/universe/workunit.php?wuid=2850252
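To make the arithmetic behind the limits concrete, here is a minimal sketch that tallies the outcomes listed above against the quoted limits (3, 10, 6). It is not BOINC server code. Two things are assumptions made only for illustration: the set of outcomes that count toward the error limit (`COUNTS_AS_ERROR`) and the strict "greater than" comparison in `error_limit_exceeded`. All function and variable names are hypothetical.

```python
from collections import Counter

# Limits quoted in the post: max # of error/total/success tasks 3, 10, 6
MAX_ERROR, MAX_TOTAL, MAX_SUCCESS = 3, 10, 6

# Outcomes listed in the post for the first workunit.
outcomes = [
    "Not started by deadline - canceled",
    "Aborted by user",
    "Error while computing",
    "Completed, can't validate",
    "Error while computing",
    "Error while downloading",
]

# ASSUMPTION: which outcomes count toward the error limit.
# This classification is a guess for illustration, not taken from BOINC.
COUNTS_AS_ERROR = {
    "Aborted by user",
    "Error while computing",
    "Error while downloading",
}


def tally(results):
    """Count error outcomes and total results for one workunit."""
    counts = Counter(results)
    errors = sum(n for name, n in counts.items() if name in COUNTS_AS_ERROR)
    return {"errors": errors, "total": len(results)}


def error_limit_exceeded(results, max_error=MAX_ERROR):
    """ASSUMPTION: the workunit fails once errors are strictly above the limit."""
    return tally(results)["errors"] > max_error


summary = tally(outcomes)
print(summary, "error limit exceeded:", error_limit_exceeded(outcomes))
```

Under these assumptions the list already holds four errors against a limit of three, so no further result could rescue the workunit, even though none of the failures came from the application itself.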


I have now aborted three tasks (_5 and _6) because, with those settings, the risk of wasting a lot of CPU time is much too high.

p.s.: The _6 results are always doomed if a workunit has had nothing but error results so far. BOINC still sends one out (this is the ancient BOINC server-side bug).

p.p.s.: A second really old BOINC bug is that hosts with only 2% valid results can still get the full quota. This has been reported many times, but no one in Berkeley cares.
Profile Nemrod

Joined: 21 Feb 15
Posts: 15
Credit: 1,620,571
RAC: 0
Message 673 - Posted: 31 Oct 2015, 18:46:22 UTC

14.81% of my tasks end in errors, and I also think I am wasting CPU time on this project.
Profile Krzysztof Piszczek - wspieram ...
Project administrator
Project developer
Project tester
Joined: 4 Feb 15
Posts: 841
Credit: 144,180,465
RAC: 0
Message 675 - Posted: 31 Oct 2015, 19:08:25 UTC - in response to Message 673.  

"14.81% of my tasks end in errors, and I also think I am wasting CPU time on this project."

I see that one of your computers has a problem with errors. On the other one, some WUs were cancelled because the tasks were not finished on time (or were never started).
Krzysztof 'krzyszp' Piszczek

Member of Radioactive@Home team
Profile Nemrod

Joined: 21 Feb 15
Posts: 15
Credit: 1,620,571
RAC: 0
Message 676 - Posted: 31 Oct 2015, 20:21:08 UTC - in response to Message 675.  

There were a few tasks whose deadline fell on the same day they were downloaded. Because of that, I aborted other tasks. Once I also had tasks that restarted from scratch every time they were launched. Still, it has to be said that quite a lot of tasks end with an error. This practically never happened before, and it practically never happens in other projects either.
Profile Ananas

Joined: 26 Mar 15
Posts: 52
Credit: 1,737,270
RAC: 0
Message 687 - Posted: 2 Nov 2015, 2:20:26 UTC
Last modified: 2 Nov 2015, 2:22:38 UTC

I don't want to be misunderstood: my point was not that the application is faulty in any way.

I just think that the error quota should be increased, as a lot of hosts seem to have trouble with things that are not connected to the application itself.

With the current setting, the risk that a successful result never makes it into the science database is (too) high.

Copyright © 2024 Copernicus Astronomical Centre of the Polish Academy of Sciences
Project server and website managed by Krzysztof 'krzyszp' Piszczek