
UV reionization tasks FAILING

Dr Who Fan
Joined: 20 Feb 15
Posts: 18
Credit: 719,759
RAC: 12
Message 1513 - Posted: 4 Sep 2016, 14:25:26 UTC

Looking over my task list, almost ALL (18 of 19 so far) of the UV reionization tasks I have received HAVE FAILED, and they have failed for MULTIPLE wing-persons as well.

LINK: http://universeathome.pl/universe/results.php?userid=1998&offset=0&show_names=0&state=6&appid=

Krzysztof Piszczek - wspieram Polski projekt BOINC
Project administrator
Project developer
Project tester
Joined: 4 Feb 15
Posts: 646
Credit: 84,336,631
RAC: 249,332
Message 1514 - Posted: 4 Sep 2016, 14:36:29 UTC - in response to Message 1513.  

A look at your computers shows that only one of the three has those errors. I strongly suspect that something is wrong with your configuration (although I don't have any machine with Win 8.1 for tests).
Krzysztof 'krzyszp' Piszczek

Member of Radioactive@Home project team
My Patreon profile
Dr Who Fan
Joined: 20 Feb 15
Posts: 18
Credit: 719,759
RAC: 12
Message 1515 - Posted: 4 Sep 2016, 18:20:04 UTC - in response to Message 1514.  

A look at your computers shows that only one of the three has those errors. I strongly suspect that something is wrong with your configuration (although I don't have any machine with Win 8.1 for tests).

THAT PC HAS NO PROBLEMS WITH ANY OTHER Universe task or ANY OTHER PROJECT'S TASKS - ONLY the UV reionization tasks.

THE PROBLEM IS NOT WIN 8.1. If you re-read what I said, the tasks are FAILING FOR EVERYONE CRUNCHING THEM!

Conan
Joined: 4 Feb 15
Posts: 44
Credit: 2,530,446
RAC: 0
Message 1517 - Posted: 4 Sep 2016, 22:47:09 UTC
Last modified: 4 Sep 2016, 22:51:01 UTC

While I can't see the problems that Dr Who Fan is seeing (I am using Linux at the moment, as my Windows machine is doing other stuff), I do have one invalid task, WU 6749023, on which all other hosts have also failed. For some reason it can't be validated.

No problems with other tasks, except for the 'error' counter, whose entries are not errors but server cancellations and have nothing to do with me.
State: All (466) · In progress (10) · Validation pending (13) · Validation inconclusive (0) · Valid (405) · Invalid (1) · Error (37)

Sorry, Dr Who Fan, but I have no access to the link you provided, so I can't see whether it is the same problem that my invalid work unit has suffered.

Conan
Bill F
Joined: 23 Jun 16
Posts: 29
Credit: 1,008,732
RAC: 1,263
Message 1518 - Posted: 4 Sep 2016, 23:39:11 UTC

Also seeing a high failure rate, with associated UV application errors, on my Windows-based AMD system. The BOINC Event Log mentions "missing results".

Reset Universe@Home and turned off that application in Preferences for now.

Bill F
Seti since 1999
Krzysztof Piszczek - wspieram Polski projekt BOINC
Project administrator
Project developer
Project tester
Joined: 4 Feb 15
Posts: 646
Credit: 84,336,631
RAC: 249,332
Message 1519 - Posted: 5 Sep 2016, 7:52:32 UTC - in response to Message 1515.  


THE PROBLEM IS NOT WIN 8.1. If you re-read what I said, the tasks are FAILING FOR EVERYONE CRUNCHING THEM!

Not really...
The fail rate for this application is in fact quite high on Windows machines (about 14-15%), but the errors nearly always happen at the start of crunching, so they take essentially no CPU time.
On Linux machines the error rate is at the 0.5% level.

The source code is exactly the same on both (in fact on all) operating systems and architectures.
Krzysztof 'krzyszp' Piszczek

Member of Radioactive@Home project team
My Patreon profile
zombie67 [MM]
Joined: 4 Feb 15
Posts: 11
Credit: 44,425,235
RAC: 0
Message 1527 - Posted: 6 Sep 2016, 6:04:21 UTC - in response to Message 1519.  

Let's put aside the differing rates of failure for now. Do we know the cause of the failures? Is anything being done to fix the problem?
Dublin, California
Team: SETI.USA
Krzysztof Piszczek - wspieram Polski projekt BOINC
Project administrator
Project developer
Project tester
Joined: 4 Feb 15
Posts: 646
Credit: 84,336,631
RAC: 249,332
Message 1528 - Posted: 6 Sep 2016, 6:13:58 UTC - in response to Message 1527.  
Last modified: 6 Sep 2016, 6:15:48 UTC

Let's put aside the differing rates of failure for now. Do we know the cause of the failures? Is anything being done to fix the problem?

Nothing on my side. The error does not occur on any computer I have access to.
Of course, I can compile the app with full debug options, but this would produce tens of gigabytes of output stored on the server and seriously slow it down, and I'm afraid it wouldn't return anything useful, as the error occurs only on some configurations.

Edit:
It looks like the problem exists on some low-end CPUs, but I'm not quite sure about this.
Krzysztof 'krzyszp' Piszczek

Member of Radioactive@Home project team
My Patreon profile
Conan
Joined: 4 Feb 15
Posts: 44
Credit: 2,530,446
RAC: 0
Message 1529 - Posted: 6 Sep 2016, 13:33:06 UTC
Last modified: 6 Sep 2016, 13:35:29 UTC

The code in "(unknown error) - exit code -1073741795 (0xc000001d)" indicates that an illegal instruction was attempted (according to information from the internet).

So we need to find which instruction the work unit is trying to execute that is failing, as BHSpin v2 work units run fine on this same computer.
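For anyone who wants to check the number themselves, here is a minimal Python sketch (not taken from the project's code) of how the negative exit code maps onto the hex value. BOINC reports the exit code as a signed 32-bit integer, while Windows documents its NTSTATUS codes as unsigned hex, and 0xC000001D is STATUS_ILLEGAL_INSTRUCTION. The names `to_ntstatus`, `describe_exit_code` and `KNOWN_NTSTATUS` are made up for this illustration, and the lookup table is deliberately tiny.

```python
def to_ntstatus(exit_code: int) -> int:
    """Reinterpret a signed 32-bit exit code as an unsigned 32-bit value."""
    return exit_code & 0xFFFFFFFF


# A few well-known NTSTATUS values (illustrative, far from exhaustive).
KNOWN_NTSTATUS = {
    0xC0000005: "STATUS_ACCESS_VIOLATION",
    0xC000001D: "STATUS_ILLEGAL_INSTRUCTION",
    0xC00000FD: "STATUS_STACK_OVERFLOW",
}


def describe_exit_code(exit_code: int) -> str:
    """Format an exit code as hex plus its NTSTATUS name, if we know it."""
    status = to_ntstatus(exit_code)
    name = KNOWN_NTSTATUS.get(status, "unknown")
    return f"0x{status:08x} ({name})"


print(describe_exit_code(-1073741795))  # 0xc000001d (STATUS_ILLEGAL_INSTRUCTION)
```

This only confirms what the code means. It does not say which instruction the CPU rejected; that would need a look at the binary itself.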

Conan
Krzysztof Piszczek - wspieram Polski projekt BOINC
Project administrator
Project developer
Project tester
Joined: 4 Feb 15
Posts: 646
Credit: 84,336,631
RAC: 249,332
Message 1530 - Posted: 6 Sep 2016, 16:14:32 UTC - in response to Message 1529.  
Last modified: 6 Sep 2016, 16:16:48 UTC


So we need to find which instruction the work unit is trying to execute that is failing, as BHSpin v2 work units run fine on this same computer.
Conan

And this is the same source code; the difference is in the input parameters, so the cause is not easy to find.

Edit:
The fail rate for Windows has dropped to 10%...
Krzysztof 'krzyszp' Piszczek

Member of Radioactive@Home project team
My Patreon profile
fruehwf
Joined: 5 Jul 16
Posts: 31
Credit: 18,447,833
RAC: 0
Message 1531 - Posted: 6 Sep 2016, 19:23:18 UTC
Last modified: 6 Sep 2016, 19:25:39 UTC

This seems to be an interesting Error Description:

<core_client_version>7.6.22</core_client_version>
<![CDATA[
<message>
couldn't start app: Can't get shared memory segment name: shmget() failed
</message>
]]>

WU:
http://universeathome.pl/universe/workunit.php?wuid=6647888

Task:
http://universeathome.pl/universe/result.php?resultid=14690935

My Machine: AuthenticAMD
AMD Opteron(tm) Processor 6320 [Family 16 Model 2 Stepping 3]
Microsoft Windows Server 2008 "R2"
Enterprise x64 Edition, Service Pack 1, (06.01.7601.00)
failed with:

<core_client_version>7.6.22</core_client_version>
<![CDATA[
<message>
(unknown error) - exit code -1073741795 (0xc000001d)
</message>
]]>

Win Message:

Problem signature:
Problem Event Name: APPCRASH
Application Name: Universe-UV_1_windows_x86_64.exe
Application Version: 0.0.0.0
Application Timestamp: 57a4b652
Fault Module Name: Universe-UV_1_windows_x86_64.exe
Fault Module Version: 0.0.0.0
Fault Module Timestamp: 57a4b652
Exception Code: c000001d
Exception Offset: 0000000000001559
OS Version: 6.1.7601.2.1.0.274.10
Locale ID: 2055
Additional Information 1: b0ca
Additional Information 2: b0ca1f9f868a1308b48e71a70652ca08
Additional Information 3: 526f
Additional Information 4: 526f34dde85cc2f9141163488289e93d
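If several people post reports like the one above, the fields can be compared more easily once they are parsed. The following is a small Python sketch under my own assumptions (the function name `parse_problem_signature` is hypothetical, and only a few lines of the report are repeated in the sample): it splits each "Key: value" line and converts the hex fields to integers. As far as I know, the Exception Offset is relative to the start of the fault module, so an identical offset across hosts would point at the same spot in the executable.

```python
def parse_problem_signature(text: str) -> dict:
    """Split 'Key: value' lines of a Windows problem signature into a dict."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        # Skip lines without a colon and heading lines with nothing after it.
        if sep and value.strip():
            fields[key.strip()] = value.strip()
    return fields


# Excerpt of the report quoted in this post.
REPORT = """\
Problem signature:
Problem Event Name: APPCRASH
Application Name: Universe-UV_1_windows_x86_64.exe
Exception Code: c000001d
Exception Offset: 0000000000001559
"""

sig = parse_problem_signature(REPORT)
exception_code = int(sig["Exception Code"], 16)
fault_offset = int(sig["Exception Offset"], 16)
print(sig["Application Name"], hex(exception_code), hex(fault_offset))
```

With the offset in hand, someone with the matching binary could disassemble around that address to see which instruction is involved.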


On this machine every UV task fails.
http://universeathome.pl/universe/show_host_detail.php?hostid=47221
On another machine:
GenuineIntel
Intel(R) Core(TM) i5-3340M CPU @ 2.70GHz [Family 6 Model 58 Stepping 9]
Microsoft Windows 8.1
Enterprise x64 Edition, (06.03.9600.00)
http://universeathome.pl/universe/show_host_detail.php?hostid=45391

Everything is all right.

HTH

Franz
Krzysztof Piszczek - wspieram Polski projekt BOINC
Project administrator
Project developer
Project tester
Joined: 4 Feb 15
Posts: 646
Credit: 84,336,631
RAC: 249,332
Message 1533 - Posted: 7 Sep 2016, 15:39:54 UTC

I updated the Windows applications yesterday to a version with debug flags enabled.
Please let me know if this helps (it helped on the host I have access to).
The new version is 0.03.
Krzysztof 'krzyszp' Piszczek

Member of Radioactive@Home project team
My Patreon profile
boboviz
Joined: 21 Feb 15
Posts: 51
Credit: 271,939
RAC: 0
Message 1534 - Posted: 7 Sep 2016, 17:34:39 UTC - in response to Message 1533.  

Please let me know if this helps (it helped on the host I have access to).


It's difficult to help you if we can't download WUs... :-(
JumpinJohnny
Joined: 1 Dec 15
Posts: 8
Credit: 443,667
RAC: 0
Message 1605 - Posted: 30 Sep 2016, 22:42:06 UTC

I got just one Universe Ultraviolet reionization v0.02 task today.
It had a very low estimated time. I was curious, so I ran it by itself. It failed immediately.
http://universeathome.pl/universe/result.php?resultid=16169870
universe_uv_160803_4_4_2000_1-999999_22300_3
stderr reads: (unknown error) - exit code -1073741795 (0xc000001d)

It failed all three times and has been sent out again to a fourth user.

I'll keep trying to test them when sent.