Message boards : News : BHDB application

scole of TSBT
Joined: 22 Feb 15
Posts: 24
Credit: 250,365,396
RAC: 416
Message 2648 - Posted: 10 Mar 2018, 0:58:03 UTC
Last modified: 10 Mar 2018, 1:07:51 UTC

Remember, it was clearly announced this was a new app and the first WUs were for testing. If a WU errors out and it upsets you, maybe you should wait to run the app.

BTW, I've got over 125 threads on it and if the WUs error out, no problem. I'm glad to help the project.

Gibson Praise
Joined: 26 Feb 15
Posts: 3
Credit: 56,424,411
RAC: 0
Message 2649 - Posted: 10 Mar 2018, 1:37:23 UTC - in response to Message 2647.  
Last modified: 10 Mar 2018, 1:42:07 UTC


<message>
Maximum disk usage exceeded
</message>

FYI .. 6 bad, 0 valid, all with this message.

Is this something configured on the server side? stderr reports a peak disk usage in the neighborhood of 500 MB on my errored tasks. I can certainly handle more per task.

Jim1348
Joined: 28 Feb 15
Posts: 253
Credit: 200,562,581
RAC: 0
Message 2650 - Posted: 10 Mar 2018, 1:50:09 UTC - in response to Message 2649.  

stderr reports a peak disk usage in the neighborhood of 500 MB on my errored tasks. I can certainly handle more per task.

For the ones that error, I see a peak disk usage of about that also (up to 580 MB thus far).

But on the ones that are completed and validated, I see much less peak disk usage; about 35 to 39 MB.
https://universeathome.pl/universe/result.php?resultid=33192132

Krzysztof Piszczek - wspieram ...
Project administrator
Project developer
Project tester
Joined: 4 Feb 15
Posts: 841
Credit: 144,180,465
RAC: 2
Message 2651 - Posted: 10 Mar 2018, 1:53:18 UTC - in response to Message 2649.  

Look above... This is what I found :(
Basically, in the "Disk" options, set "Use at most" to at least 1 GB per CPU thread to keep tasks from failing.

It looks like BM (BOINC Manager) calculates the required space incorrectly, but I could be wrong... Tests in progress.

But, for your information: the server tells the Manager that each task requires 900 MB of space, yet that much is never used in practice...
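As a rough illustration of the rule of thumb above (at least 1 GB of "Use at most" per CPU thread), here is a minimal Python sketch. The function name `suggested_disk_limit_gb` is hypothetical, and the 1 GB figure is only the workaround value suggested in this post, not an official requirement.

```python
import os


def suggested_disk_limit_gb(threads: int, gb_per_thread: float = 1.0) -> float:
    """Return a 'Use at most' value in GB for the given number of CPU threads.

    gb_per_thread defaults to the 1 GB per thread suggested in the post.
    """
    if threads < 1:
        raise ValueError("threads must be at least 1")
    return threads * gb_per_thread


if __name__ == "__main__":
    # os.cpu_count() may return None on some platforms, so fall back to 1.
    threads = os.cpu_count() or 1
    limit = suggested_disk_limit_gb(threads)
    print(f"{threads} threads: set 'Use at most' to at least {limit:.0f} GB")
```

For a host running 125 threads, as mentioned earlier in the thread, this would come to at least 125 GB.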
Krzysztof 'krzyszp' Piszczek

Member of Radioactive@Home team
My Patreon profile
Universe@Home on YT

Jim1348
Joined: 28 Feb 15
Posts: 253
Credit: 200,562,581
RAC: 0
Message 2652 - Posted: 10 Mar 2018, 1:55:52 UTC - in response to Message 2651.  

Tests in progress..

Get some sleep.

mmonnin
Joined: 2 Jun 16
Posts: 169
Credit: 317,253,046
RAC: 6
Message 2653 - Posted: 10 Mar 2018, 2:13:34 UTC

Only 9 of 26 tasks started right away. I suspended the other 15 when I saw they were all group 3, and later aborted them with 0.0 CPU time. The errors are all 196 (0xc4) EXIT_DISK_LIMIT_EXCEEDED. 477 of 502 GB free. Only about 480-530 MB used, per the task info.

Before creating more and more batches, can the old ones be completely canceled? They are still being sent out to users until they reach 4 errors.

Krzysztof Piszczek - wspieram ...
Project administrator
Project developer
Project tester
Joined: 4 Feb 15
Posts: 841
Credit: 144,180,465
RAC: 2
Message 2654 - Posted: 10 Mar 2018, 3:01:50 UTC - in response to Message 2653.  

I will not generate more batches for now. Most of the tasks from the previous one are cancelled (except the ones already sent).
At the moment the percentage of successfully finished tasks is growing, and I will wait to get stats from the work units already generated and computed.

There are 120k tasks, half of which will take 7-10 minutes of work, whereas we usually finish 60-70k "full time" jobs.

Let's see the results; give me some time, please :)
Krzysztof 'krzyszp' Piszczek

Member of Radioactive@Home team
My Patreon profile
Universe@Home on YT

Jesse Viviano
Joined: 20 Dec 15
Posts: 4
Credit: 18,837,655
RAC: 0
Message 2655 - Posted: 10 Mar 2018, 5:03:14 UTC

This program is quite the disk space hog. I had to modify my computing preferences to raise or remove the disk space limits to get tasks to run to completion without running into the disk space exceeded errors.

mmonnin
Joined: 2 Jun 16
Posts: 169
Credit: 317,253,046
RAC: 6
Message 2656 - Posted: 10 Mar 2018, 5:07:08 UTC

Grabbed another set. 12 of 32 were batch 3/4. :(

mmonnin
Joined: 2 Jun 16
Posts: 169
Credit: 317,253,046
RAC: 6
Message 2657 - Posted: 10 Mar 2018, 5:19:17 UTC
Last modified: 10 Mar 2018, 5:58:32 UTC

At 23%, the .dat2 and .dat3 files are over 100 MB and growing, but BOINC is showing <3 MB.

At 43 min, the same disk errors. The 19 others errored at 48 min.
https://universeathome.pl/universe/result.php?resultid=33270269
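To compare what is actually on disk with what BOINC reports, one option is to sum the file sizes in the task's slot directory. This is a minimal sketch, not BOINC's own accounting; the function name `directory_size_mb` is hypothetical, and 1 MB is taken as 1024 * 1024 bytes, which matches the limits shown in the event logs in this thread.

```python
import os


def directory_size_mb(path: str) -> float:
    """Sum the sizes of all regular files under path, in MB (1 MB = 1024**2 bytes)."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            full = os.path.join(root, name)
            try:
                total += os.path.getsize(full)
            except OSError:
                # A running task can delete or rename files between listing and stat.
                continue
    return total / (1024 ** 2)
```

Calling it on the slot directory of a running task (the exact path depends on the BOINC installation) and repeating the call every few minutes would show how fast the .dat2 and .dat3 files grow.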

pututu
Joined: 7 Jun 16
Posts: 9
Credit: 121,795,337
RAC: 0
Message 2658 - Posted: 10 Mar 2018, 6:03:12 UTC

Still got this error message: 196 (0xc4) EXIT_DISK_LIMIT_EXCEEDED

See one WU with error on multiple computers:
https://universeathome.pl/universe/workunit.php?wuid=14662143

Thyme Lawn
Joined: 15 Oct 17
Posts: 11
Credit: 4,735,011
RAC: 0
Message 2659 - Posted: 10 Mar 2018, 8:39:06 UTC

I've had a few batch 5 tasks fail with EXIT_DISK_LIMIT_EXCEEDED, with plenty more in progress. One has just failed at almost 99%, and BOINC Manager's event log shows the underlying problem:

10/03/2018 08:22:16 | Universe@Home | Aborting task universe_bhdb_180109_5_21874979_20000_1-999999_875200_1: exceeded disk limit: 961.11MB > 858.31MB

Workunits have <rsc_disk_bound>900000000.000000</rsc_disk_bound> (858.31MB).

Disk usage in that task's slot directory leading up to the failure was primarily in the data1.dat2 and data1.dat3 files, with both weighing in at over 460MB (i.e. their total disk usage was above the limit).
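The conversion between the byte value in `<rsc_disk_bound>` and the 858.31MB shown in the event log can be checked with a few lines of Python. This is only a sketch of the arithmetic: it assumes the log's "MB" means 1024 * 1024 bytes, and it uses 460 MB per file as a lower bound because the post only says "over 460MB".

```python
MIB = 1024 * 1024  # bytes per "MB" as the event log appears to count them


def bytes_to_mb(n_bytes: float) -> float:
    """Convert a byte count to MB using 1 MB = 1024 * 1024 bytes."""
    return n_bytes / MIB


rsc_disk_bound = 900_000_000.0  # value from the workunit's <rsc_disk_bound>
limit_mb = bytes_to_mb(rsc_disk_bound)
print(f"{limit_mb:.2f}MB")  # 858.31MB

# Lower bounds for the two large files in the slot directory.
dat2_mb = 460.0
dat3_mb = 460.0
print(dat2_mb + dat3_mb > limit_mb)  # True: the two files alone exceed the limit
```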
"The ultimate test of a moral society is the kind of world that it leaves to its children." - Dietrich Bonhoeffer

Jim1348
Joined: 28 Feb 15
Posts: 253
Credit: 200,562,581
RAC: 0
Message 2660 - Posted: 10 Mar 2018, 10:09:13 UTC - in response to Message 2659.  

Disk usage in that task's slot directory leading up to the failure was primarily in the data1.dat2 and data1.dat3 files, with both weighing in at over 460MB (i.e. their total disk usage was above the limit).

Good find. I suggest that they just increase the disk limits. Most people probably have plenty of space. The science should come first, I would hope.

nexiagsi16v
Joined: 28 Feb 15
Posts: 23
Credit: 42,229,680
RAC: 59
Message 2661 - Posted: 10 Mar 2018, 11:39:20 UTC

Some of the WUs are running fine with the same maximum disk usage setting.
The WUs that crash on my system crash on other systems, too.

I just had a WU reach 100% where one data file was only 135 MB. I couldn't check the other file in time, because the WU was already at 100% and was uploaded the next moment ... :-/

https://universeathome.pl/universe/result.php?resultid=33277772

The question is, why do some WUs need so much disk space? Can you resend some of these invalid ones with a higher disk usage limit? If they become valid, we have found a way... if they become invalid (for a reason other than the disk usage error), there's a problem with the WUs themselves and they will never become valid.

Bye, Norman

PDW
Joined: 11 Mar 15
Posts: 19
Credit: 390,816,392
RAC: 6,268
Message 2662 - Posted: 10 Mar 2018, 12:33:57 UTC - in response to Message 2661.  

Is the parameter SS not being used at the moment?

In the log.txt file:

00:00:00 00:00:00 PARAMIN: SS = 1
00:00:00 00:00:00 PARAMIN unknown parameter: name: SS; value: 1

This appears in WUs that work, and I have seen different values set for SS.
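To check whether other parameters are being rejected the same way, the log can be scanned for the "unknown parameter" message. This is a minimal sketch that assumes every such line has the same shape as the two lines quoted above; the function name `unknown_parameters` is hypothetical.

```python
import re

# Matches lines such as:
#   00:00:00 00:00:00 PARAMIN unknown parameter: name: SS; value: 1
UNKNOWN_PARAM = re.compile(
    r"PARAMIN unknown parameter: name: (?P<name>[^;]+); value: (?P<value>.*)$"
)


def unknown_parameters(lines):
    """Return (name, value) pairs for every 'unknown parameter' line."""
    found = []
    for line in lines:
        match = UNKNOWN_PARAM.search(line.rstrip("\n"))
        if match:
            found.append((match.group("name").strip(), match.group("value").strip()))
    return found


sample = [
    "00:00:00 00:00:00 PARAMIN: SS = 1",
    "00:00:00 00:00:00 PARAMIN unknown parameter: name: SS; value: 1",
]
print(unknown_parameters(sample))  # [('SS', '1')]
```

Passing an open `log.txt` file object instead of `sample` works the same way, since the function only iterates over lines.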

alex
Joined: 21 Feb 15
Posts: 64
Credit: 65,733,511
RAC: 330
Message 2663 - Posted: 10 Mar 2018, 13:50:47 UTC

It is really a 50:50 chance of getting a scientifically useful result; that's not good enough.
I hope you find the problem soon. I'll come back every day or so and try some WUs.

Jim1348
Joined: 28 Feb 15
Posts: 253
Credit: 200,562,581
RAC: 0
Message 2664 - Posted: 10 Mar 2018, 15:15:06 UTC - in response to Message 2663.  

This is really a test that just started. They often go on for months. It looks like this will be finished much sooner.

CmdrSpota
Joined: 22 Nov 16
Posts: 1
Credit: 3,550,658
RAC: 6
Message 2665 - Posted: 10 Mar 2018, 16:41:58 UTC

There are many more workunits that produce errors with the disk space message ...

10-Mar-2018 00:30:40 [Universe@Home] Aborting task universe_bhdb_180109_4_3894997_20000_1-999999_895200_0: exceeded disk limit: 670.96MB > 667.57MB
10-Mar-2018 00:42:35 [Universe@Home] Aborting task universe_bhdb_180109_4_3884997_20000_1-999999_885200_1: exceeded disk limit: 749.81MB > 667.57MB
10-Mar-2018 01:18:20 [Universe@Home] Aborting task universe_bhdb_180109_4_3819997_20000_1-999999_820200_1: exceeded disk limit: 701.19MB > 667.57MB
10-Mar-2018 02:06:00 [Universe@Home] Aborting task universe_bhdb_180109_4_3889997_20000_1-999999_890200_0: exceeded disk limit: 686.42MB > 667.57MB
10-Mar-2018 02:29:50 [Universe@Home] Aborting task universe_bhdb_180109_4_3814997_20000_1-999999_815200_1: exceeded disk limit: 689.97MB > 667.57MB
10-Mar-2018 05:58:21 [Universe@Home] Aborting task universe_bhdb_180109_4_3644997_20000_1-999999_645200_2: exceeded disk limit: 670.52MB > 667.57MB
10-Mar-2018 08:03:31 [Universe@Home] Aborting task universe_bhdb_180109_5_21044979_20000_1-999999_45200_1: exceeded disk limit: 904.93MB > 858.31MB
10-Mar-2018 08:51:13 [Universe@Home] Aborting task universe_bhdb_180109_3_30849970_20000_1-999999_850200_3: exceeded disk limit: 505.29MB > 476.84MB
10-Mar-2018 09:56:45 [Universe@Home] Aborting task universe_bhdb_180109_3_10504990_20000_1-999999_505200_4: exceeded disk limit: 550.82MB > 476.84MB
10-Mar-2018 11:32:05 [Universe@Home] Aborting task universe_bhdb_180109_3_10089990_20000_1-999999_90200_2: exceeded disk limit: 479.50MB > 476.84MB
10-Mar-2018 15:06:35 [Universe@Home] Aborting task universe_bhdb_180109_5_61049939_20000_1-999999_50200_1: exceeded disk limit: 865.53MB > 858.31MB
10-Mar-2018 16:06:10 [Universe@Home] Aborting task universe_bhdb_180109_5_61229939_20000_1-999999_230200_1: exceeded disk limit: 883.49MB > 858.31MB
10-Mar-2018 16:24:02 [Universe@Home] Aborting task universe_bhdb_180109_4_10064990_20000_1-999999_65200_3: exceeded disk limit: 683.80MB > 667.57MB

Maybe an error in calculating the "real" file size to be written ...

On this drive I have more than 200 GB free ... so that is NOT the problem ...
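The abort messages above can be grouped by batch to see which limit applies to each and by how much it was exceeded. This is a sketch under two assumptions: the number right after `180109_` in the task name is the batch number (it matches the "batch 5" tasks described earlier in the thread), and each limit was configured as a round number of decimal megabytes, as the 900000000-byte `<rsc_disk_bound>` quoted earlier suggests. The sample uses three of the lines from the log above.

```python
import re
from collections import defaultdict

MIB = 1024 * 1024

ABORT = re.compile(
    r"Aborting task universe_bhdb_\d+_(?P<batch>\d+)_\S+: "
    r"exceeded disk limit: (?P<used>[\d.]+)MB > (?P<limit>[\d.]+)MB"
)


def summarize(lines):
    """Per batch: the logged limit, the implied decimal-MB bound, and the worst overage."""
    by_batch = defaultdict(list)
    for line in lines:
        match = ABORT.search(line)
        if match:
            pair = (float(match.group("used")), float(match.group("limit")))
            by_batch[int(match.group("batch"))].append(pair)
    summary = {}
    for batch, pairs in sorted(by_batch.items()):
        limit = pairs[0][1]
        worst = max(used for used, _limit in pairs)
        summary[batch] = {
            "limit_mb": limit,
            # Assumes the bound was set as a whole number of decimal megabytes.
            "limit_decimal_mb": round(limit * MIB / 1e6),
            "worst_overage_mb": round(worst - limit, 2),
        }
    return summary


sample = [
    "10-Mar-2018 00:42:35 [Universe@Home] Aborting task universe_bhdb_180109_4_3884997_20000_1-999999_885200_1: exceeded disk limit: 749.81MB > 667.57MB",
    "10-Mar-2018 08:03:31 [Universe@Home] Aborting task universe_bhdb_180109_5_21044979_20000_1-999999_45200_1: exceeded disk limit: 904.93MB > 858.31MB",
    "10-Mar-2018 09:56:45 [Universe@Home] Aborting task universe_bhdb_180109_3_10504990_20000_1-999999_505200_4: exceeded disk limit: 550.82MB > 476.84MB",
]
for batch, info in summarize(sample).items():
    print(batch, info)
```

On these lines the implied bounds come out as 500, 700 and 900 decimal MB for batches 3, 4 and 5, so the limit grows with the batch number but some tasks still write more than it allows.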

Saenger
Joined: 4 Feb 15
Posts: 4
Credit: 68,029,795
RAC: 132
Message 2666 - Posted: 10 Mar 2018, 16:42:45 UTC

Here are some messages from my BOINC. I personally don't think 858.31 MB is excessive disk usage;
it probably would be for RAM, but on disk even a few GB shouldn't be a problem.
Sa 10 Mär 2018 15:46:49 CET | Universe@Home | [checkpoint] result universe_bhdb_180109_5_31044969_20000_1-999999_45200_2 checkpointed
Sa 10 Mär 2018 15:47:32 CET | Universe@Home | Aborting task universe_bhdb_180109_5_31044969_20000_1-999999_45200_2: exceeded disk limit: 859.77MB > 858.31MB
Sa 10 Mär 2018 15:47:32 CET | Universe@Home | [task] task_state=ABORT_PENDING for universe_bhdb_180109_5_31044969_20000_1-999999_45200_2 from request_abort
Sa 10 Mär 2018 15:47:32 CET | Universe@Home | [task] result state=COMPUTE_ERROR for universe_bhdb_180109_5_31044969_20000_1-999999_45200_2 from CS::report_result_error
Sa 10 Mär 2018 15:47:32 CET | Universe@Home | [task] result state=ABORTED for universe_bhdb_180109_5_31044969_20000_1-999999_45200_2 from abort_task
Sa 10 Mär 2018 15:47:33 CET | Universe@Home | [task] Process for universe_bhdb_180109_5_31044969_20000_1-999999_45200_2 exited, status 49664, task state 5
Sa 10 Mär 2018 15:47:33 CET | Universe@Home | [task] task_state=ABORTED for universe_bhdb_180109_5_31044969_20000_1-999999_45200_2 from handle_exited_app
Sa 10 Mär 2018 15:47:33 CET | Universe@Home | Computation for task universe_bhdb_180109_5_31044969_20000_1-999999_45200_2 finished
Sa 10 Mär 2018 15:47:33 CET | Universe@Home | [task] result state=COMPUTE_ERROR for universe_bhdb_180109_5_31044969_20000_1-999999_45200_2 from CS::app_finished
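The "status 49664" in the log above can be unpacked if it is a raw POSIX wait status, where a normally exiting process has its exit code in bits 8 to 15. That the client logs the raw wait status here is an assumption, and the thread does not say what the resulting code means; the sketch only shows the arithmetic and compares it with the 196 (0xc4) reported for the result.

```python
def exit_code_from_wait_status(status: int) -> int:
    """Extract the exit code from a POSIX wait status of a normally exited process."""
    return (status >> 8) & 0xFF


status = 49664  # from the "[task] Process for ... exited" line
code = exit_code_from_wait_status(status)
print(code, hex(code))  # 194 0xc2

# The error reported for the result elsewhere in the thread, for comparison.
print(196, hex(196))  # 196 0xc4
```

Under that assumption, the process itself exited with 194, while the result is recorded with 196 (EXIT_DISK_LIMIT_EXCEEDED), so the two numbers describe different layers of the same abort.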

It's about this WU; the stderr contains only this:
<core_client_version>7.2.42</core_client_version>
<![CDATA[
<message>
Maximum disk usage exceeded
</message>
<stderr_txt>

</stderr_txt>
]]>

Greetings from Sänger

Jim1348
Joined: 28 Feb 15
Posts: 253
Credit: 200,562,581
RAC: 0
Message 2667 - Posted: 10 Mar 2018, 17:06:52 UTC - in response to Message 2661.  

The question is, why do some WUs need so much disk space? Can you resend some of these invalid ones with a higher disk usage limit? If they become valid, we have found a way... if they become invalid (for a reason other than the disk usage error), there's a problem with the WUs themselves and they will never become valid.

The errors usually occur right at the end of the run, as they are finalizing something. Hopefully a little more disk space will fix it, though it could be something more serious as you suggest.




Copyright © 2024 Copernicus Astronomical Centre of the Polish Academy of Sciences
Project server and website managed by Krzysztof 'krzyszp' Piszczek