BHDB application
Author | Message |
---|---|
Joined: 22 Feb 15 Posts: 24 Credit: 250,365,396 RAC: 0 |
Remember, it was clearly announced this was a new app and the first WUs were for testing. If a WU errors out and it upsets you, maybe you should wait to run the app. BTW, I've got over 125 threads on it and if the WUs error out, no problem. I'm glad to help the project. |
Joined: 26 Feb 15 Posts: 3 Credit: 56,424,411 RAC: 0 |
FYI: 6 bad, 0 valid, all with this message. Is this something configured on the server side? stderr reports peak disk usage in the neighborhood of 500 MB on my errors. I can certainly handle more per task. |
Joined: 28 Feb 15 Posts: 253 Credit: 200,562,581 RAC: 0 |
"stderr references in the neighborhood of 500 MB for peak disk usage on my errors. I can certainly handle more per task." For the ones that error, I also see a peak disk usage of about that (up to 580 MB thus far). But on the ones that completed and validated, I see much lower peak disk usage: about 35 to 39 MB. https://universeathome.pl/universe/result.php?resultid=33192132 |
Joined: 4 Feb 15 Posts: 847 Credit: 144,180,465 RAC: 0 |
Look above... This is what I found :( Basically, in the "Disk" options, set "Use at most" to at least 1 GB per CPU thread to prevent tasks from failing. It looks like BM calculates the required space incorrectly, but I could be wrong... Tests in progress. But, for your information: the server tells the Manager that a task requires 900 MB of space, and it never uses that much in practice... Krzysztof 'krzyszp' Piszczek Member of Radioactive@Home team My Patreon profile Universe@Home on YT |
Joined: 28 Feb 15 Posts: 253 Credit: 200,562,581 RAC: 0 |
"Tests in progress.." Get some sleep. |
Joined: 2 Jun 16 Posts: 169 Credit: 317,253,046 RAC: 0 |
Only 9 of 26 tasks started right away. I suspended the other 15 when I saw they were all group 3. I later aborted them with 0.0 CPU time. All failed with 196 (0xc4) EXIT_DISK_LIMIT_EXCEEDED. 477 of 502 GB free. Only about 480-530 MB used according to the per-task info. Before creating more and more batches, can the old ones be completely canceled? They are still being sent out to users until they reach 4 errors. |
Joined: 4 Feb 15 Posts: 847 Credit: 144,180,465 RAC: 0 |
I will not generate more batches for now. Most of the tasks from the previous one are cancelled (except the ones already sent). At the moment the percentage of successfully finished tasks is growing, and I will wait for stats from the work units already generated and computed. There are 120k tasks, half of which will take 7-10 minutes of work, whereas we usually finish 60-70k "full time" jobs. Let's see the results; give me some time, please :) Krzysztof 'krzyszp' Piszczek Member of Radioactive@Home team My Patreon profile Universe@Home on YT |
Joined: 20 Dec 15 Posts: 4 Credit: 18,837,655 RAC: 0 |
This program is quite the disk space hog. I had to modify my computing preferences to raise or remove the disk space limits to get tasks to run to completion without running into the disk space exceeded errors. |
Joined: 2 Jun 16 Posts: 169 Credit: 317,253,046 RAC: 0 |
Grabbed another set. 12 of 32 were batch 3/4. :( |
Joined: 2 Jun 16 Posts: 169 Credit: 317,253,046 RAC: 0 |
At 23%, .dat2 and .dat3 are over 100 MB and growing, but BOINC is showing <3 MB. At 43 min, the same disk errors. The 19 others errored at 48 min. https://universeathome.pl/universe/result.php?resultid=33270269 |
Joined: 7 Jun 16 Posts: 9 Credit: 121,795,337 RAC: 0 |
I still get this error message: 196 (0xc4) EXIT_DISK_LIMIT_EXCEEDED. Here is one WU that errored on multiple computers: https://universeathome.pl/universe/workunit.php?wuid=14662143 |
Joined: 15 Oct 17 Posts: 11 Credit: 4,735,011 RAC: 0 |
I've had a few batch 5 tasks fail with EXIT_DISK_LIMIT_EXCEEDED, with plenty more in progress. One has just failed at almost 99%, and BOINC Manager's event log shows the underlying problem: 10/03/2018 08:22:16 | Universe@Home | Aborting task universe_bhdb_180109_5_21874979_20000_1-999999_875200_1: exceeded disk limit: 961.11MB > 858.31MB Workunits have <rsc_disk_bound>900000000.000000</rsc_disk_bound> (858.31MB). Disk usage in that task's slot directory leading up to the failure was primarily in the data1.dat2 and data1.dat3 files, with both weighing in at over 460MB (i.e. their combined disk usage was above the limit). "The ultimate test of a moral society is the kind of world that it leaves to its children." - Dietrich Bonhoeffer |
Joined: 28 Feb 15 Posts: 253 Credit: 200,562,581 RAC: 0 |
"Disk usage in that task's slot directory leading up to the failure was primarily in the data1.dat2 and data1.dat3 files, with both weighing in at over 460MB (i.e. their total disk usage was above the limit)." Good find. I suggest that they just increase the disk limits. Most people probably have plenty of space. The science should come first, I would hope. |
Joined: 28 Feb 15 Posts: 23 Credit: 42,229,680 RAC: 0 |
Some of the WUs are running fine with the same maximum disk usage specs. The WUs that crash on my system crash on other systems, too. I now have a WU at 100% where one data file was only 135 MB. I couldn't check the other file quickly enough, because the WU was at 100% and uploaded the next moment... :-/ https://universeathome.pl/universe/result.php?resultid=33277772 The question is, why do some WUs need so much disk space? Can you resend some of the invalid ones with a higher limit on allowed disk usage? If they become valid, we have found a way... if they become invalid (not because of the disk usage error), there's a problem with the WUs and they will never become valid. Bye, Norman |
Joined: 11 Mar 15 Posts: 19 Credit: 390,816,392 RAC: 0 |
Is the parameter SS not being used at the moment? In the log.txt file: 00:00:00 00:00:00 PARAMIN: SS = 1 00:00:00 00:00:00 PARAMIN unknown parameter: name: SS; value: 1 This is in WUs that work, and I have seen different values set for SS. |
Joined: 21 Feb 15 Posts: 64 Credit: 65,733,511 RAC: 0 |
It is really a 50:50 chance of getting a scientifically useful result; that's not good enough. I hope you find the problem soon. I'll come back every day or so and try some WUs. |
Joined: 28 Feb 15 Posts: 253 Credit: 200,562,581 RAC: 0 |
This is really a test that just started. They often go on for months. It looks like this will be finished much sooner. |
Joined: 22 Nov 16 Posts: 1 Credit: 3,550,658 RAC: 0 |
There are many more workunits producing errors with the disk space message ... 10-Mar-2018 00:30:40 [Universe@Home] Aborting task universe_bhdb_180109_4_3894997_20000_1-999999_895200_0: exceeded disk limit: 670.96MB > 667.57MB 10-Mar-2018 00:42:35 [Universe@Home] Aborting task universe_bhdb_180109_4_3884997_20000_1-999999_885200_1: exceeded disk limit: 749.81MB > 667.57MB 10-Mar-2018 01:18:20 [Universe@Home] Aborting task universe_bhdb_180109_4_3819997_20000_1-999999_820200_1: exceeded disk limit: 701.19MB > 667.57MB 10-Mar-2018 02:06:00 [Universe@Home] Aborting task universe_bhdb_180109_4_3889997_20000_1-999999_890200_0: exceeded disk limit: 686.42MB > 667.57MB 10-Mar-2018 02:29:50 [Universe@Home] Aborting task universe_bhdb_180109_4_3814997_20000_1-999999_815200_1: exceeded disk limit: 689.97MB > 667.57MB 10-Mar-2018 05:58:21 [Universe@Home] Aborting task universe_bhdb_180109_4_3644997_20000_1-999999_645200_2: exceeded disk limit: 670.52MB > 667.57MB 10-Mar-2018 08:03:31 [Universe@Home] Aborting task universe_bhdb_180109_5_21044979_20000_1-999999_45200_1: exceeded disk limit: 904.93MB > 858.31MB 10-Mar-2018 08:51:13 [Universe@Home] Aborting task universe_bhdb_180109_3_30849970_20000_1-999999_850200_3: exceeded disk limit: 505.29MB > 476.84MB 10-Mar-2018 09:56:45 [Universe@Home] Aborting task universe_bhdb_180109_3_10504990_20000_1-999999_505200_4: exceeded disk limit: 550.82MB > 476.84MB 10-Mar-2018 11:32:05 [Universe@Home] Aborting task universe_bhdb_180109_3_10089990_20000_1-999999_90200_2: exceeded disk limit: 479.50MB > 476.84MB 10-Mar-2018 15:06:35 [Universe@Home] Aborting task universe_bhdb_180109_5_61049939_20000_1-999999_50200_1: exceeded disk limit: 865.53MB > 858.31MB 10-Mar-2018 16:06:10 [Universe@Home] Aborting task universe_bhdb_180109_5_61229939_20000_1-999999_230200_1: exceeded disk limit: 883.49MB > 858.31MB 10-Mar-2018 16:24:02 [Universe@Home] Aborting task universe_bhdb_180109_4_10064990_20000_1-999999_65200_3: exceeded disk limit: 683.80MB > 667.57MB Maybe there is an error in calculating the "real" file size to be written ... On this drive I have more than 200 GB free ... so that is NOT the problem ... |
Joined: 4 Feb 15 Posts: 4 Credit: 68,029,795 RAC: 0 |
Here are some messages from my BOINC. Personally, I don't think 858.31MB is excessive disk usage; it probably would be if it were RAM, but for disk even a few GB shouldn't be a problem. Sa 10 Mär 2018 15:46:49 CET | Universe@Home | [checkpoint] result universe_bhdb_180109_5_31044969_20000_1-999999_45200_2 checkpointed It's about this WU; the stderr contains only this: <core_client_version>7.2.42</core_client_version> Greetings from Sänger |
Joined: 28 Feb 15 Posts: 253 Credit: 200,562,581 RAC: 0 |
"The question is, why do some WUs need so much disk space? Can you resend some of the invalid ones with a higher limit on allowed disk usage? If they become valid, we have found a way... if they become invalid (not because of the disk usage error), there's a problem with the WUs and they will never become valid." The errors usually occur right at the end of the run, as they are finalizing something. Hopefully a little more disk space will fix it, though it could be something more serious, as you suggest. |
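
For anyone who wants to check the numbers in the event log lines quoted in this thread: the limits shown there are consistent with a byte value divided by 1024*1024. The quoted `<rsc_disk_bound>` of 900000000 bytes comes out as 858.31MB, and the 667.57MB and 476.84MB limits would match 700000000 and 500000000 bytes (those two byte values are inferred from the arithmetic, not quoted by anyone here). Below is a minimal Python sketch, assuming the "Aborting task ... exceeded disk limit" wording shown in the posts. The names `bound_to_mb` and `parse_abort` are made up for illustration and are not part of BOINC.

```python
import re

# Assumption: the "MB" figures in the event log are bytes / 2**20.
# This matches the 900000000 -> 858.31MB pair quoted in the thread.
BYTES_PER_MB = 1024 * 1024


def bound_to_mb(rsc_disk_bound: float) -> float:
    """Convert a workunit disk bound in bytes to the MB figure seen in the log."""
    return rsc_disk_bound / BYTES_PER_MB


# Matches lines such as:
#   ... Aborting task <name>: exceeded disk limit: 961.11MB > 858.31MB
ABORT_RE = re.compile(
    r"Aborting task (?P<task>\S+): exceeded disk limit: "
    r"(?P<used>[\d.]+)MB > (?P<limit>[\d.]+)MB"
)


def parse_abort(line: str):
    """Return (task, used_mb, limit_mb, overshoot_mb), or None if no match."""
    m = ABORT_RE.search(line)
    if m is None:
        return None
    used = float(m.group("used"))
    limit = float(m.group("limit"))
    return m.group("task"), used, limit, used - limit


if __name__ == "__main__":
    line = (
        "10/03/2018 08:22:16 | Universe@Home | Aborting task "
        "universe_bhdb_180109_5_21874979_20000_1-999999_875200_1: "
        "exceeded disk limit: 961.11MB > 858.31MB"
    )
    task, used, limit, over = parse_abort(line)
    print(f"{task}: {used:.2f}MB used, limit {limit:.2f}MB, over by {over:.2f}MB")
    print(f"900000000 bytes = {bound_to_mb(900000000):.2f}MB")
```

Running each abort line from the event log through `parse_abort` shows how far each task overshot its limit, which may help in judging how much headroom a raised bound would need.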