1)
Message boards :
Number crunching :
extreme long wu's
(Message 2577)
Posted 27 Jan 2018 by Tex1954 Post: Does any admin actually read the message boards? 8-)
2)
extreme long wu's
(Message 2573)
Posted 25 Jan 2018 by Tex1954 Post: I've been getting a TON of WUs that run for 3 to 4 days, but they do complete. Problem is, I still get the same 666.67 points for a 3 to 4 day WU as I do for a 2-hour WU... Umm, maybe something could be done to correct this points problem? 8-)
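To put a number on the complaint above, here is a minimal sketch of the credit-per-hour arithmetic. The 666.67 figure comes from the post; the exact run times (2 hours and 3 days) are illustrative picks from the ranges mentioned, and the function name is hypothetical.

```python
# Credit earned per hour when every task gets the same fixed credit.
# 666.67 is the figure quoted in the post; run times are illustrative.
FIXED_CREDIT = 666.67


def credit_per_hour(run_hours: float, credit: float = FIXED_CREDIT) -> float:
    """Return the credit earned per hour of run time."""
    if run_hours <= 0:
        raise ValueError("run_hours must be positive")
    return credit / run_hours


short = credit_per_hour(2)        # a 2-hour WU
long_ = credit_per_hour(3 * 24)   # a 3-day WU

print(f"2-hour WU: {short:.2f} credits/hour")
print(f"3-day WU:  {long_:.2f} credits/hour")
print(f"ratio:     {short / long_:.0f}x")
```

With flat credit, the ratio is simply the ratio of the run times, which is why long tasks look so unattractive to anyone watching points.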
3)
Erroneous validations or incorrect CPU time?
(Message 1676)
Posted 19 Oct 2016 by Tex1954 Post: This is weird... I KNOW an i3-2120 is not 14 times faster than an X5680... unless this is an AVX instruction thing?
http://universeathome.pl/universe/workunit.php?wuid=7360350
The short-task computer (49522) has this in stderr.out:
<core_client_version>7.5.0</core_client_version>
<![CDATA[
<stderr_txt>
04:49:57 (4972): Can't set up shared mem: -1. Will run in standalone mode.
09:19:32 (1637): Can't set up shared mem: -1. Will run in standalone mode.
10:16:30 (7498): Can't set up shared mem: -1. Will run in standalone mode.
10:46:40 (7498): called boinc_finish(0)
</stderr_txt>
]]>
So, I'm wondering why that WU gets the same credit... or is the run time wrong due to a restart or something? I see this in a bunch of WUs on various other systems... but the X5680 WU has a clean stderr file... 8-)
4)
What do you guys make of this?
(Message 1675)
Posted 19 Oct 2016 by Tex1954 Post: Well, I don't do that and haven't had any problems with the latest batch... 8-) Thank you, Krzysztof, for your speedy reply & fix. Much appreciated.
5)
What do you guys make of this?
(Message 1601)
Posted 30 Sep 2016 by Tex1954 Post: What about all the download errors?
Lin-1240V2-1
5044 Universe@Home 9/30/2016 8:09:03 AM [error] MD5 check failed for universe_bh2_160803_9_10_20000_1-999999_679000
5045 Universe@Home 9/30/2016 8:09:03 AM [error] expected 61877bac4fe69388c25e46ce9a594f35, got 8e2d98ca66085d450811579c5958518b
5046 Universe@Home 9/30/2016 8:09:03 AM [error] Checksum or signature error for universe_bh2_160803_9_10_20000_1-999999_679000
5047 Universe@Home 9/30/2016 8:09:03 AM [error] MD5 check failed for universe_bh2_160803_9_10_20000_1-999999_684000
5048 Universe@Home 9/30/2016 8:09:03 AM [error] expected 1627d98f2d66bd1a9382c1f2cacb6c53, got 64ae30ef49df63871fca168c107238a4
5049 Universe@Home 9/30/2016 8:09:03 AM [error] Checksum or signature error for universe_bh2_160803_9_10_20000_1-999999_684000
Reboot/reset doesn't fix the problem... It seems to happen only on Linux setups. 8-)
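For anyone who wants to reproduce the "expected ..., got ..." comparison by hand, here is a minimal sketch using Python's standard hashlib. It is not the BOINC client's own code; the function names and the temporary sample file are hypothetical, and the expected hash in the second check is simply the one from the log above.

```python
import hashlib
import tempfile
from pathlib import Path


def md5_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path: Path, expected_md5: str) -> bool:
    """Compare a file's MD5 with the expected hex digest."""
    return md5_of_file(path) == expected_md5.strip().lower()


# Demo on a throwaway file (hypothetical stand-in for a downloaded input file).
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "sample.bin"
    sample.write_bytes(b"hello")
    ok = matches_expected(sample, hashlib.md5(b"hello").hexdigest())
    # The expected hash from the log will not match this sample file.
    bad = matches_expected(sample, "61877bac4fe69388c25e46ce9a594f35")

print(ok, bad)
```

Running the same check on the actual downloaded file in the project directory would show whether the file on disk is damaged or whether the server is advertising a stale hash.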
6)
Computer not receiving Work units
(Message 1526)
Posted 6 Sep 2016 by Tex1954 Post: I can't get WU's either on any machine even if I reset the project...
440 Universe@Home 9/5/2016 11:54:27 PM Resetting project
441 Universe@Home 9/5/2016 11:54:33 PM Master file download succeeded
442 Universe@Home 9/5/2016 11:54:38 PM Sending scheduler request: To fetch work.
443 Universe@Home 9/5/2016 11:54:38 PM Requesting new tasks for CPU
444 Universe@Home 9/5/2016 11:54:40 PM Scheduler request completed: got 0 new tasks
445 Universe@Home 9/5/2016 11:54:40 PM No tasks sent

2059 Universe@Home 9/5/2016 11:50:56 PM Requesting new tasks for CPU
2060 Universe@Home 9/5/2016 11:50:58 PM Scheduler request completed: got 0 new tasks
2061 Universe@Home 9/5/2016 11:50:58 PM No tasks sent
2062 Universe@Home 9/5/2016 11:51:13 PM Resetting project
2063 Universe@Home 9/5/2016 11:51:16 PM Master file download succeeded
2064 Universe@Home 9/5/2016 11:51:22 PM Sending scheduler request: To fetch work.
2065 Universe@Home 9/5/2016 11:51:22 PM Requesting new tasks for CPU
2066 Universe@Home 9/5/2016 11:51:25 PM Scheduler request completed: got 0 new tasks
2067 Universe@Home 9/5/2016 11:51:25 PM No tasks sent
Yikes! 8-)
7)
Long running work units
(Message 1191)
Posted 24 May 2016 by Tex1954 Post: My 4770k runs a task in ~4900s, but today I've got 6 tasks >17000s long. Yes, they are "_10_" tasks.
Looking at all those, I see a progression of WUs from _5_ up to _10_ on all my setups. Currently, my 2P 24T setup has all those _10_ WUs y'all are abandoning, I guess. So far, at 8 hours of run time, they are about 68% complete. Should I abandon them? I think not... I'm here to help the project. However, it was and still is of some concern that the long tasks earn the same points as the short tasks, and that motivates folks to abandon them. For those only interested in points, I suppose that is to be expected.
I KNOW there is a way to help compensate point-wise for long tasks... one only has to identify them and apply a multiplier to the points, even for fixed-point setups. Other projects do this... but their LONG tasks are identified ahead of time, which is perhaps something this project is unable to predict.
Anyway, point production = electricity used in many people's minds, and I'm sure it would benefit the project to pick some average time breakpoints on a certain CPU (via FLOPS/Sec or something) and adjust point output using a simple multiple, like 4 hours on a 3770 = 333, 4-7.9 = 666, 8-11.9 = 999 and so forth. In fact, one could simply use a Time(seconds)/FLOPS/Sec value like this: round((Time/FLOPs) /4) * 333 or something simple like that to determine points... Even use a fixed CPU average like 3750 or 4750 for FLOPS so people could not cheat, and apply it to the fastest (lowest) time of the two, giving the same points to both (assumes Primary and Wingman). 8-)
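The breakpoint idea above can be sketched in a few lines. This is only an illustration of the proposal, not anything the project implements: the function and constant names are hypothetical, and it uses floor-plus-one rather than the post's round() so that the results line up with the listed breakpoints (under 4 hours = 333, 4-7.9 = 666, 8-11.9 = 999). Normalizing to a reference CPU is left out.

```python
import math

BASE_CREDIT = 333.0   # credit for the first tier, from the post
TIER_HOURS = 4.0      # width of each tier in hours, from the post


def tiered_credit(runtimes_s):
    """Credit for a workunit, based on the fastest reported run time.

    runtimes_s: run times in seconds, e.g. [primary, wingman].
    Both hosts would be granted the returned credit.
    """
    if not runtimes_s:
        raise ValueError("need at least one run time")
    fastest_hours = min(runtimes_s) / 3600.0
    tier = math.floor(fastest_hours / TIER_HOURS) + 1
    return BASE_CREDIT * tier


print(tiered_credit([4900, 5200]))     # short task, first tier
print(tiered_credit([17334, 20000]))   # longer task, second tier
print(tiered_credit([8 * 3600]))       # exactly 8 hours, third tier
```

Using the faster of the two results, as the post suggests, means a host cannot raise the credit by running slowly or padding its reported time.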
8)
Long running work units
(Message 1165)
Posted 2 May 2016 by Tex1954 Post: This is what BM reports, but I don't really believe the numbers... Just look at the CPU usage...
CPU usage is normal, as in 98% or higher. I see no problems in the longer-running WUs, only that they run longer... except for the ones that never finish! Hope you can figure those out. 8-)
9)
Long running work units
(Message 1163)
Posted 2 May 2016 by Tex1954 Post: Well, hope it helps. One other thing I notice (like others mentioned) is that when the task properties are viewed, they report only 0.06 GFLOPS/Sec? If the tasks are not doing much math, that is why Credit New is so flaky.
Computer: Linux-DX5680
Project Universe@Home
Name universe_bh_362_16_20000_1-999999_335600_1
Application Universe BHspin 0.09
Workunit name universe_bh_362_16_20000_1-999999_335600
State Running
Received 5/1/2016 10:50:38 PM
Report deadline 5/15/2016 10:50:23 PM
Estimated app speed 0.06 GFLOPs/sec
Estimated task size 807 GFLOPs
CPU time at last checkpoint 01:36:19
CPU time 01:42:02
Elapsed time 01:42:53
Estimated time remaining 02:20:58
Fraction done 42.160%
Virtual memory size 12.80 MB
Working set size 3.46 MB
Directory slots/1
Process ID 2447
This is on a task near completion. If your WUs are not doing a lot of math, what are they doing? They use less than 4 MB of RAM... just curious... 8-)
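The two "Estimated" figures in the task properties are tied together by simple division, which the sketch below works through. It is a simplified model: the real client also blends in the fraction done as a task runs, so the numbers it displays will not match this exactly. The function names are hypothetical; 807 and 0.06 are taken from the properties above.

```python
def estimated_runtime_seconds(task_size_gflops: float,
                              app_speed_gflops_per_s: float) -> float:
    """Run time implied by the estimated task size and app speed."""
    if app_speed_gflops_per_s <= 0:
        raise ValueError("app speed must be positive")
    return task_size_gflops / app_speed_gflops_per_s


def format_hms(seconds: float) -> str:
    """Format a duration in seconds as h:mm:ss."""
    total = int(round(seconds))
    hours, rem = divmod(total, 3600)
    minutes, secs = divmod(rem, 60)
    return f"{hours}:{minutes:02d}:{secs:02d}"


estimate = estimated_runtime_seconds(807, 0.06)
print(format_hms(estimate))   # 3:44:10
```

That lands close to the roughly four hours implied by the elapsed and remaining times shown. In this model a low "app speed" mostly reflects how the task size was estimated against measured run times, not necessarily how little math the application does.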
10)
Long running work units
(Message 1161)
Posted 2 May 2016 by Tex1954 Post: My 2P setup just finished a couple dozen 12+ hour tasks like this one... and only 333 points?
http://universeathome.pl/universe/workunit.php?wuid=4907485
I think even Credit New would give more points... but we're all in the same boat, I guess... 8-)
11)
Long running work units
(Message 1159)
Posted 1 May 2016 by Tex1954 Post: I now have 4 long ones on each of two computers (8 total) that have run for 6 days on one setup and over 1 day on the other. Both are running Linux. I let them run because I was curious... I'll abort them all after I report what is in the slots directories now...
Both setups are server grade, with an E3-1230V3 and an E3-1240V2 CPU. Both run Linux Mint 17.3.
In any case, I checked the slot files and nothing is changing except some stuff in the error.dat files...
error.dat =
error: function Lzahbf(M,Mc) should not be called for HM stars
error: function Lzahbf(M,Mc) should not be called for HM stars
unexpected remnant case for K=5-6: 254568
error.dat2 =
error: bondi() accreted mass (2.801104) larger than envelope mass (2.786657) (190181)
error.dat3 =
error: bondi() accreted mass (2.801104) larger than envelope mass (2.786657) (190181)
The boinc_mmap_file has some binary junk in it... Nothing in the stderr.txt file.
log.txt =
00:00:00 00:00:00 PROGRAM START: Sun Apr 24 23:43:52 2016
00:00:00 00:00:00 no checkpoint.dat file found
00:00:00 00:00:00 cleaning checkpoints
00:00:00 00:00:00 gw_cpfile: source file "data0.dat2" not present
00:00:00 00:00:00 gw_cpfile: source file "data1.dat2" not present
00:00:00 00:00:00 gw_cpfile: source file "data2.dat2" not present
00:00:00 00:00:00 gw_cpfile: source file "error.dat2" not present
00:00:00 00:00:00 reading checkpoint: istart: -1; pp: 0; n: -1
00:00:00 00:00:00 checkpoint read
00:00:00 00:00:00 default values set
00:00:00 00:00:00 Reading param.in file
00:00:00 00:00:00 PARAMIN: num_tested = 20000
00:00:00 00:00:00 PARAMIN: hub_val = 1000
00:00:00 00:00:00 PARAMIN: idum = -943500
00:00:00 00:00:00 PARAMIN: OUTPUT = 3
00:00:00 00:00:00 PARAMIN: Sal = -2.3
00:00:00 00:00:00 PARAMIN: Mmina = 5.0
00:00:00 00:00:00 PARAMIN: Mminb = 3.0
00:00:00 00:00:00 PARAMIN: Fa = 1
00:00:00 00:00:00 PARAMIN: ZZ = 0.0001
00:00:00 00:00:00 param.in file read
00:00:00 00:00:00 idum: -943500; num_tested: 20000
00:03:53 00:03:53 making checkpoint: j: 1000; iidd: 118910
00:03:53 00:00:00 gw_cpfile: data0.dat appended to data0.dat2
00:03:53 00:00:00 gw_cpfile: data1.dat appended to data1.dat2
00:03:53 00:00:00 gw_cpfile: data2.dat appended to data2.dat2
00:03:53 00:00:00 gw_cpfile: error.dat appended to error.dat2
00:03:53 00:00:00 gw_cpfile: data0.dat appended to data0.dat3
00:03:53 00:00:00 gw_cpfile: data1.dat appended to data1.dat3
00:03:53 00:00:00 gw_cpfile: data2.dat appended to data2.dat3
00:03:53 00:00:00 gw_cpfile: error.dat appended to error.dat3
00:07:50 00:03:57 making checkpoint: j: 2000; iidd: 242302
00:07:50 00:00:00 gw_cpfile: data0.dat appended to data0.dat2
00:07:50 00:00:00 gw_cpfile: data1.dat appended to data1.dat2
00:07:50 00:00:00 gw_cpfile: data2.dat appended to data2.dat2
00:07:50 00:00:00 gw_cpfile: error.dat appended to error.dat2
00:07:50 00:00:00 gw_cpfile: data0.dat appended to data0.dat3
00:07:50 00:00:00 gw_cpfile: data1.dat appended to data1.dat3
00:07:50 00:00:00 gw_cpfile: data2.dat appended to data2.dat3
00:07:50 00:00:00 gw_cpfile: error.dat appended to error.dat3
00:00:00 00:00:00 PROGRAM START: Sun May 1 06:16:53 2016
00:00:00 00:00:00 reading checkpoint: istart: 2000; pp: 242302; n: 2
00:00:00 00:00:00 checkpoint read
00:00:00 00:00:00 default values set
00:00:00 00:00:00 Reading param.in file
00:00:00 00:00:00 PARAMIN: num_tested = 20000
00:00:00 00:00:00 PARAMIN: hub_val = 1000
00:00:00 00:00:00 PARAMIN: idum = -943500
00:00:00 00:00:00 PARAMIN: OUTPUT = 3
00:00:00 00:00:00 PARAMIN: Sal = -2.3
00:00:00 00:00:00 PARAMIN: Mmina = 5.0
00:00:00 00:00:00 PARAMIN: Mminb = 3.0
00:00:00 00:00:00 PARAMIN: Fa = 1
00:00:00 00:00:00 PARAMIN: ZZ = 0.0001
00:00:00 00:00:00 param.in file read
00:00:00 00:00:00 idum: -943500; num_tested: 20000
00:00:00 00:00:00 random number generator initialised: 242302
00:00:00 00:00:00 PROGRAM START: Sun May 1 06:28:15 2016
00:00:01 00:00:01 reading checkpoint: istart: 2000; pp: 242302; n: 2
00:00:01 00:00:00 checkpoint read
00:00:01 00:00:00 default values set
00:00:01 00:00:00 Reading param.in file
00:00:01 00:00:00 PARAMIN: num_tested = 20000
00:00:01 00:00:00 PARAMIN: hub_val = 1000
00:00:01 00:00:00 PARAMIN: idum = -943500
00:00:01 00:00:00 PARAMIN: OUTPUT = 3
00:00:01 00:00:00 PARAMIN: Sal = -2.3
00:00:01 00:00:00 PARAMIN: Mmina = 5.0
00:00:01 00:00:00 PARAMIN: Mminb = 3.0
00:00:01 00:00:00 PARAMIN: Fa = 1
00:00:01 00:00:00 PARAMIN: ZZ = 0.0001
00:00:01 00:00:00 param.in file read
00:00:01 00:00:00 idum: -943500; num_tested: 20000
00:00:01 00:00:00 random number generator initialised: 242302
00:00:00 00:00:00 PROGRAM START: Sun May 1 06:57:59 2016
00:00:00 00:00:00 reading checkpoint: istart: 2000; pp: 242302; n: 2
00:00:00 00:00:00 checkpoint read
00:00:00 00:00:00 default values set
00:00:00 00:00:00 Reading param.in file
00:00:00 00:00:00 PARAMIN: num_tested = 20000
00:00:00 00:00:00 PARAMIN: hub_val = 1000
00:00:00 00:00:00 PARAMIN: idum = -943500
00:00:00 00:00:00 PARAMIN: OUTPUT = 3
00:00:00 00:00:00 PARAMIN: Sal = -2.3
00:00:00 00:00:00 PARAMIN: Mmina = 5.0
00:00:00 00:00:00 PARAMIN: Mminb = 3.0
00:00:00 00:00:00 PARAMIN: Fa = 1
00:00:00 00:00:00 PARAMIN: ZZ = 0.0001
00:00:00 00:00:00 param.in file read
00:00:00 00:00:00 idum: -943500; num_tested: 20000
00:00:00 00:00:00 random number generator initialised: 242302
This task is one of four on this setup that have been running over 6 days. I've restarted it a couple of times to see if something changes... it seems the completion percent is moving up... 8-)
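A log like the one above is easier to judge with a quick summary of how many times the program started and which checkpoint each start resumed from. The sketch below is one way to do that; it assumes only the message wording visible in this log.txt, the function name is hypothetical, and the sample text is a shortened excerpt of the lines above.

```python
import re

# Patterns based only on the message wording seen in the posted log.txt.
CHECKPOINT_RE = re.compile(r"making checkpoint: j: (\d+)")
RESUME_RE = re.compile(r"reading checkpoint: istart: (-?\d+)")


def summarize_log(text: str) -> dict:
    """Count program starts and collect checkpoint/resume positions."""
    checkpoints = [int(j) for j in CHECKPOINT_RE.findall(text)]
    resumes = [int(i) for i in RESUME_RE.findall(text)]
    return {
        "program_starts": text.count("PROGRAM START:"),
        "last_checkpoint": checkpoints[-1] if checkpoints else None,
        "resume_points": resumes,
    }


# Shortened excerpt of the log posted above.
SAMPLE = """\
00:00:00 00:00:00 PROGRAM START: Sun Apr 24 23:43:52 2016
00:00:00 00:00:00 reading checkpoint: istart: -1; pp: 0; n: -1
00:03:53 00:03:53 making checkpoint: j: 1000; iidd: 118910
00:07:50 00:03:57 making checkpoint: j: 2000; iidd: 242302
00:00:00 00:00:00 PROGRAM START: Sun May 1 06:16:53 2016
00:00:00 00:00:00 reading checkpoint: istart: 2000; pp: 242302; n: 2
"""

summary = summarize_log(SAMPLE)
print(summary)
```

In the full log, every restart on May 1 resumes from istart 2000, the checkpoint written under eight minutes into the first run, and no later checkpoint appears.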
12)
Long running work units
(Message 1150)
Posted 21 Apr 2016 by Tex1954 Post: I have observed the same thing. It seems AMD runs faster on these "10" tasks, but some i3 and i5 CPUs also seem to go faster. It's very interesting that I can find a minor correlation with the OS (Win vs. Linux).
I see the same thing on my E3-1240V2 CPUs as I do on the 2P X5680 setup. However, they ALL get the same 333 points... how weird... In many cases, the i3/i5 CPUs that do significantly better are running Windows instead of Linux, so some compiler thing may be going on as well... 'Tis for the developers to clear this up, I think... :D
13)
Excessively Long Estimated Finish Times
(Message 997)
Posted 31 Dec 2015 by Tex1954 Post: Everything was fine on all my setups until a couple days ago... I don't think there is anything I could do on my end... Well, we will see how things go and I will upgrade to 7.6.22 and see if that does anything... 8-)
14)
Something is messy on the project prefs page
(Message 993)
Posted 30 Dec 2015 by Tex1954 Post: Well, I guess it's okay then... At least I understand it better now. Seems my setups use that override file anyway. This question came up with another project that could not read the globals or didn't have them set up either. Thanks! 8-)
15)
Excessively Long Estimated Finish Times
(Message 992)
Posted 30 Dec 2015 by Tex1954 Post: Okay, I'm getting a LOT of tasks that have estimated completion times in DAYS rather than hours like these:
This has the effect of telling the BOINC client that a 1-day cache is already FULL, and it won't fetch any more work from ANY project if all the cores are loaded up with them. Believe me, I have a 24-core setup that STOPPED fetching ANY new WUs from ANY project because it had 24 of these 4-day-long tasks loaded up.
The REALLY BAD THING is that the tasks complete in no more than 4.8 hours!!! I've scanned hundreds of my tasks and the LONGEST I could find took 17334 seconds.
So, for whatever reason, something has changed in the estimated completion times, and it isn't on my end. This is totally screwing up my setups, especially those that have trouble getting WUs from other projects that have FEW WUs to go around.
As the tasks near completion, the estimated time naturally decreases, but very slowly until about 4 hours are done. Meanwhile, BOINC thinks my 1-day cache is full... what a lie...
Please fix this. 8-)
One setup has 62 of these 4-day-long tasks in the cache... guess it won't be running anything else for years... sheesh...
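The effect described above can be shown with a deliberately simplified model of a work buffer: if the estimated queued time per core already meets the buffer setting, no more work is requested. This is an assumption-laden sketch, not the BOINC client's actual work-fetch logic, and the function names are hypothetical. The numbers (24 cores, 24 tasks, 4-day estimates, 4.8-hour actual run times, 1-day cache) come from the post.

```python
def queued_days_per_core(estimates_hours, n_cores: int) -> float:
    """Estimated queued work per core, in days."""
    if n_cores <= 0:
        raise ValueError("n_cores must be positive")
    return sum(estimates_hours) / n_cores / 24.0


def would_fetch_work(estimates_hours, n_cores: int, buffer_days: float) -> bool:
    """True if the queue looks shorter than the configured buffer."""
    return queued_days_per_core(estimates_hours, n_cores) < buffer_days


inflated = [4 * 24.0] * 24   # 24 tasks, each estimated at 4 days
actual = [4.8] * 24          # the same tasks at their real worst-case run time

print(would_fetch_work(inflated, n_cores=24, buffer_days=1.0))
print(would_fetch_work(actual, n_cores=24, buffer_days=1.0))
```

Under the inflated estimates the queue looks like 4 days per core against a 1-day buffer, so nothing new is fetched. With the real run times the same queue is about 0.2 days per core.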
16)
Something is messy on the project prefs page
(Message 988)
Posted 29 Dec 2015 by Tex1954 Post: I'm not sure about all this preferences setting where one project affects them all. I understand the idea, but there is NO reason I can think of why a CPU running several projects has to use the SAME local preferences for EACH project. Also, I am not 100% sure which preferences are considered GLOBAL and which LOCAL. Suffice it to say, I don't think it wise for ANY project to mess with any other project's settings under ANY circumstances, except the truly global ones such as CPUs, CPU %, time between switching tasks, and the disk use parameters. Certainly it should NOT share project preferences at all.
Somewhere along the road this project insisted that my systems were set up to use WORK location preferences (which are project specific) while my actual location was DEFAULT for this project. It is still doing that!!!!
So, to perhaps circumvent the error, I created a WORK preference, and now the error messages seem to have stopped.
This really still needs to be fixed. 8-)
17)
upload problem
(Message 987)
Posted 29 Dec 2015 by Tex1954 Post: Same here... I keep getting this when it tries to report the 14 completed tasks:
Win7-5930K
819 Universe@Home 12/28/2015 11:23:12 PM Reporting 14 completed tasks
820 Universe@Home 12/28/2015 11:23:12 PM Requesting new tasks for CPU
821 Universe@Home 12/28/2015 11:23:17 PM [error] Can't parse task in scheduler reply: unexpected XML tag or syntax
822 Universe@Home 12/28/2015 11:23:17 PM [error] No close tag in scheduler reply
825 Universe@Home 12/28/2015 11:24:02 PM Reporting 14 completed tasks
826 Universe@Home 12/28/2015 11:24:02 PM Requesting new tasks for CPU
827 Universe@Home 12/28/2015 11:24:06 PM [error] Can't parse task in scheduler reply: unexpected XML tag or syntax
828 Universe@Home 12/28/2015 11:24:06 PM [error] No close tag in scheduler reply
18)
No New Work
(Message 986)
Posted 28 Dec 2015 by Tex1954 Post: Same here. I get some tasks once in a while, and the server shows tons ready to send, but I can't get any. Tried project resets and everything else I could think of... no joy... Seems the server status isn't being updated in real time, if at all... 8-)
PS: Getting some of this lately... maybe that's why? Sometimes detach/attach cures the problem for a while, then it messes up again...
Win7-5930K
58 Universe@Home 12/28/2015 4:47:41 PM Sending scheduler request: To fetch work.
59 Universe@Home 12/28/2015 4:47:41 PM Requesting new tasks for CPU
60 Universe@Home 12/28/2015 4:47:46 PM [error] Can't parse workunit in scheduler reply: unexpected XML tag or syntax
61 Universe@Home 12/28/2015 4:47:46 PM [error] No close tag in scheduler reply
19)
i need help
(Message 925)
Posted 16 Dec 2015 by Tex1954 Post: Getting probs again... Had to re-attach again just to report completed tasks, then got about 7 compute errors; only 2 tasks running now.
Win7-DX5680
209 Universe@Home 12/15/2015 9:29:04 PM Sending scheduler request: To fetch work.
210 Universe@Home 12/15/2015 9:29:04 PM Requesting new tasks for CPU
211 Universe@Home 12/15/2015 9:29:06 PM Scheduler request failed: HTTP internal server error
8-)
20)
i need help
(Message 908)
Posted 13 Dec 2015 by Tex1954 Post: Getting more problems again on the 2P and 1P servers...
Win7-DX5680
1389 Universe@Home 12/13/2015 2:59:03 AM Sending scheduler request: To fetch work.
1390 Universe@Home 12/13/2015 2:59:03 AM Requesting new tasks for CPU and NVIDIA GPU
1391 Universe@Home 12/13/2015 2:59:08 AM [error] Can't parse file info in scheduler reply: unexpected XML tag or syntax
1392 Universe@Home 12/13/2015 2:59:08 AM [error] No close tag in scheduler reply
1393 Universe@Home 12/13/2015 4:16:02 AM Sending scheduler request: To fetch work.
1394 Universe@Home 12/13/2015 4:16:02 AM Requesting new tasks for CPU and NVIDIA GPU
1395 Universe@Home 12/13/2015 4:16:13 AM [error] Can't parse file info in scheduler reply: unexpected XML tag or syntax
1396 Universe@Home 12/13/2015 4:16:13 AM [error] No close tag in scheduler reply
and this on another 1P server...
Win7-1230V21
1360 Universe@Home 12/13/2015 4:16:01 AM Sending scheduler request: Requested by user.
1361 Universe@Home 12/13/2015 4:16:01 AM Requesting new tasks for CPU
1362 Universe@Home 12/13/2015 4:16:02 AM Scheduler request failed: HTTP internal server error
1363 Universe@Home 12/13/2015 4:19:40 AM Sending scheduler request: To fetch work.
1364 Universe@Home 12/13/2015 4:19:40 AM Requesting new tasks for CPU
1365 Universe@Home 12/13/2015 4:19:49 AM Scheduler request failed: HTTP internal server error
Oh well, seems it isn't "MY" problem... I hope... 8-)
PS: Seems the 1P has started working again; the 2P is still hung up...
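When pasting event-log excerpts like the ones above, it can help to tally the failures first. The sketch below is a small, hypothetical helper that assumes the line layout seen in these posts (sequence number, project, date and time, message) and counts the "[error]" and "Scheduler request failed" messages. The sample lines are copied from the post above.

```python
import re
from collections import Counter

# Layout assumed from the posted excerpts: seq, project, date time AM/PM, message.
LINE_RE = re.compile(
    r"^(?P<seq>\d+)\s+(?P<project>\S+)\s+"
    r"(?P<when>\d{1,2}/\d{1,2}/\d{4} \d{1,2}:\d{2}:\d{2} [AP]M)\s+"
    r"(?P<message>.*)$"
)


def failure_summary(lines) -> Counter:
    """Count error and failed-request messages in event-log lines."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # host names and free text are skipped
        message = match.group("message")
        if message.startswith("[error]") or message.startswith("Scheduler request failed"):
            counts[message] += 1
    return counts


SAMPLE = [
    "Win7-DX5680",
    "1389 Universe@Home 12/13/2015 2:59:03 AM Sending scheduler request: To fetch work.",
    "1391 Universe@Home 12/13/2015 2:59:08 AM [error] Can't parse file info in scheduler reply: unexpected XML tag or syntax",
    "1392 Universe@Home 12/13/2015 2:59:08 AM [error] No close tag in scheduler reply",
    "1395 Universe@Home 12/13/2015 4:16:13 AM [error] Can't parse file info in scheduler reply: unexpected XML tag or syntax",
    "1362 Universe@Home 12/13/2015 4:16:02 AM Scheduler request failed: HTTP internal server error",
]

counts = failure_summary(SAMPLE)
for message, n in counts.most_common():
    print(n, message)
```

A tally like this makes it easier to see at a glance that the failures come from the scheduler's replies rather than from anything happening on the host.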