Computer not receiving Work units
Author | Message |
---|---|
Joined: 8 Feb 16 Posts: 19 Credit: 22,093,689 RAC: 0 |
I recently switched both my Raspberry Pis (ARM/Linux) to Ultraviolet reionization. One of them has only received a couple of tasks, so it may not yet qualify as reliable. But this host http://universeathome.pl/universe/show_host_detail.php?hostid=30574 has 47 consecutive valid tasks, so why is it not receiving new tasks? Also, please check this unit (for the same host): http://universeathome.pl/universe/workunit.php?wuid=6629026. Why did it take 9 days for the second task for this unit to be sent? Tom |
Joined: 31 Jul 16 Posts: 4 Credit: 321,900 RAC: 0 |
My question is this: if my computer(s) have been labeled BAD computers and then receive little or no work, and are bypassed by the GOOD computers, will it not take an extremely long time to get the 10 valid WUs needed to transition to a GOOD computer? What with the month or more wait for a WU to be resent to a wingman, and the large number of CANCELLED BY SERVER WUs this project has, which seem to count against you? Just sayin... Thanks |
Joined: 10 Mar 15 Posts: 11 Credit: 309,361 RAC: 0 |
"Definitely there is HR problem..." Well, that's not working either. I created a Linux VM to run UV tasks. A mere 2 errors from 405 more than meets your standard of reliability. But of course, no tasks are available despite the server status. Quarkstar WUs are even harder to get, even though 57 are "available". With the failing Win tasks, the rapid retirement/turnover of apps, retiring the Android app without warning, lack of OSX support and now this debacle, this project is rapidly becoming my least favorite. |
Joined: 4 Feb 15 Posts: 847 Credit: 144,180,465 RAC: 0 |
"Definitely there is HR problem..." Firstly, the Android app was separate before because in the past it computed a smaller number of simulations than other systems; now it is available with the BHspin2 application. QuarkStars has no new tasks available because the previous batch isn't finished, and we don't want to generate tasks without a real science target and waste your power. When all results come back and we finish analysing them, new tasks will be generated. As you can see with the UV application, the combination of HR, reliable hosts and priorities doesn't work as expected (probably nobody has joined all these requirements together before), and we are looking all the time for the proper balance between them. Also, Mac support will be added soon, but I can't do everything at the same time, and adding another platform to HR tasks will (probably) make the current problem worse. At least I have to be sure how to resolve the current problem before I add the next one... Apologies if not everything is working smoothly, but I'm trying my best in this situation... Krzysztof 'krzyszp' Piszczek Member of Radioactive@Home team My Patreon profile Universe@Home on YT |
Joined: 16 Apr 16 Posts: 15 Credit: 4,409,800 RAC: 0 |
Perhaps you should stop generating new work for each of the applications until the ready-to-send queue for an application is completely empty. Hopefully this will flush out all the unsent results that seem to be stuck in there, or at least show how big this problem is. Once the queue is empty, create a small batch of new work and then wait until the queue is empty again. This will help you manage what is in progress and prevent users from having a growing number of results waiting for validation. This is not a permanent solution, but it will give you time to fix the settings problem. |
Joined: 4 Feb 15 Posts: 847 Credit: 144,180,465 RAC: 0 |
The new batches are helping computers to get reliable status (there are no special conditions for them except HR), so I need to generate them anyway. Also, I see on the server side that the number of computers without tasks is going lower every day. Krzysztof 'krzyszp' Piszczek |
Joined: 16 Apr 16 Posts: 15 Credit: 4,409,800 RAC: 0 |
"The new batches are helping computers to get reliable status (is no special conditions for it except HR), so I need to generate them anyway." You are missing the point. It looks to me as if there are a lot of older unsent results. At the same time, new work is generated and sent out. If you let the queue run dry, then all those old results will either flush from the queue or show that there is a problem with sending them. You are avoiding fixing the problem that some results do not get sent out and stay in the queue forever. It is better to have some users without work for a short while than to let the problems grow indefinitely. |
Joined: 31 Jul 16 Posts: 4 Credit: 321,900 RAC: 0 |
Just want to say THANKS for all the work you do for the project, krzyszp. ryan |
Joined: 4 Feb 15 Posts: 847 Credit: 144,180,465 RAC: 0 |
No, I didn't miss the point. I'm constantly checking the database and filtering problematic WUs to see if there is a clear pattern. I don't need (at the moment) to run the server dry to find them; SQL queries are doing it for me :) Obviously there is still a possibility that it will be necessary, but I believe now is not the time to do it. Krzysztof 'krzyszp' Piszczek |
Joined: 4 Feb 15 Posts: 847 Credit: 144,180,465 RAC: 0 |
"Just want to say THANKS for all the work you do for the project, krzyszp." Thank you :) I really enjoy doing something for a real science project :) Krzysztof 'krzyszp' Piszczek |
Joined: 3 Apr 15 Posts: 2 Credit: 27,565,438 RAC: 0 |
Well, I don't want to add to the discontent about not getting WUs (but I'm going to :) ). I can't seem to get any UV WUs. I get plenty of spin2. At one time I did get both UV and Quark, but nothing for a month. Did I get marked as unreliable somehow? |
Joined: 4 Feb 15 Posts: 847 Credit: 144,180,465 RAC: 0 |
Your computer is marked as not reliable, OR there is no proper wingman available for it, OR all tasks for the particular app are already designated for other platforms (as with e.g. QuarkStars). And I see you have BHspin2 on your machines. Krzysztof 'krzyszp' Piszczek |
Joined: 4 Feb 15 Posts: 49 Credit: 15,956,546 RAC: 0 |
Thanks Krzyszp for all the behind-the-scenes work that you are doing. I noticed today that when I requested UV tasks (the server shows over 79,000 at the moment) I get the message that there is no work available. I doubt that 79,000 work units are pre-assigned to other platforms (I am using Linux), and I have gotten work before (the last lot a few days ago). I have stopped BHSpinv2 at the moment as I wanted to build up my hour total on the UV tasks, but now I can't get them. The 9 work units I received on the 30th are all still Pending, as I am the only person to have run them. No work has been sent to any other volunteers (wingmen) as yet. All the errors that I have (15 in total for UV and BHSpin, out of hundreds run) are due to the server cancelling work units (even 1 that had started running), not due to any error on my host. I will wait (nothing else to do really) and run some Primegrid in the meantime. Thanks again Conan |
Joined: 2 Jun 16 Posts: 169 Credit: 317,253,046 RAC: 0 |
"For analysis purposes we need the whole batch of results in a particular series, because even if we have 95% of the series computed we still need to wait for the last 5% to start analysis..." Is this really working? Or is the validation task not part of this? Many times tasks wait several days after my task has been sent back just for the wingman task to be sent out. If the science needs to be validated, waiting so long to send out work slows down progress. I'd imagine it also adds load to the servers, with more work generations out there. I've still got 33 tasks that I completed over 2 weeks ago. The wingman timed out several days ago, but the task wasn't sent out again immediately. I've seen other projects send out a duplicate wingman task within an hour of a timeout, some several hours before it. |
Joined: 28 Feb 15 Posts: 4 Credit: 25,011,420 RAC: 0 |
This morning (France): State: All (1214) • In progress (128) • Validation pending (250) • Validation inconclusive (0) • Valid (781) • Invalid (0) • Error (55). Application: All (1214) • Universe Ultraviolet reionization (273) • Universe BHspin (3) • Universe BHspin v2 (938) • Universe QuarkStars (0). With Universe BHspin v2 (938): In progress (128) • Validation pending (172) • Validation inconclusive (0) • Valid (584) • Invalid (0) • Error (53). With Universe Ultraviolet reionization (273): In progress (0) • Validation pending (79) • Validation inconclusive (0) • Valid (192) • Invalid (0) • Error (2). Oops!!! 10 minutes later, valid 781 becomes 779: State: All (1213) • In progress (128) • Validation pending (251) • Validation inconclusive (0) • Valid (779) • Invalid (0) • Error (55). Application: All (1213) • Universe Ultraviolet reionization (273) • Universe BHspin (3) • Universe BHspin v2 (937) • Universe QuarkStars (0). With Universe Ultraviolet reionization (273): In progress (0) • Validation pending (79) • Validation inconclusive (0) • Valid (192) • Invalid (0) • Error (2). With Universe BHspin v2 (937): In progress (128) • Validation pending (172) • Validation inconclusive (0) • Valid (584) • Invalid (0) • Error (53). |
Joined: 8 Feb 16 Posts: 19 Credit: 22,093,689 RAC: 0 |
Received lots of new tasks on both my Raspberry Pis about an hour ago. If you made changes server side: it worked! Thanks, Tom |
Joined: 4 Feb 15 Posts: 847 Credit: 144,180,465 RAC: 0 |
"Received lots of new tasks on both my Raspberry Pis about an hour ago. If you made changes server side: it worked!" Yes, manually reversing the priority for UV tasks helps. Krzysztof 'krzyszp' Piszczek |
Joined: 28 Feb 15 Posts: 253 Credit: 200,562,581 RAC: 0 |
On my Ubuntu 16.04 machine (i7-4770), I am now getting both the BHspin v2 and the UV reionization tasks, so it is working here too. |
Joined: 4 Feb 15 Posts: 49 Credit: 15,956,546 RAC: 0 |
Thanks Krzyszp, also received quite a bit of work for the UV application. Thank you Conan |
Joined: 21 Feb 15 Posts: 52 Credit: 318,272 RAC: 0 |
"Thanks Krzyszp, also received quite a bit of work for the UV application." +1 But now: "Tasks are committed to other platforms" |
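
A note on the last message: the thread names two conditions that can leave a host without work even when the server shows thousands of tasks ready to send. The host may not yet count as reliable, or the remaining tasks may already be tied to another platform by HR. The Python sketch below is only a toy model of that interaction, not BOINC's scheduler code. The class names, the `hr_class` labels, and the threshold of 10 valid results (a figure quoted by one poster above, not taken from project documentation) are all assumptions made for illustration.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Iterable, Optional

# Assumption: figure quoted by a poster in the thread, not an official setting.
RELIABLE_THRESHOLD = 10


@dataclass(frozen=True)
class Host:
    hr_class: str        # hypothetical platform label, e.g. "linux-arm"
    valid_results: int   # valid results returned so far


@dataclass(frozen=True)
class Workunit:
    hr_class: Optional[str]  # None until the first task has been sent to some platform
    needs_reliable: bool     # True if only "reliable" hosts may receive it


def refusal_reason(host: Host, wu: Workunit) -> Optional[str]:
    """Return why this toy scheduler would skip the workunit, or None if sendable."""
    if wu.needs_reliable and host.valid_results < RELIABLE_THRESHOLD:
        return "host not reliable"
    if wu.hr_class is not None and wu.hr_class != host.hr_class:
        return "committed to other platform"
    return None


def summarize(host: Host, queue: Iterable[Workunit]) -> Counter:
    """Count how many queued workunits are sendable or blocked, by reason."""
    return Counter(refusal_reason(host, wu) or "sendable" for wu in queue)


if __name__ == "__main__":
    pi = Host(hr_class="linux-arm", valid_results=47)
    queue = [Workunit("linux-x86", False)] * 3 + [Workunit(None, True)]
    print(summarize(pi, queue))
```

In this model a host with 47 valid results can still be refused most of a large queue if those workunits were first sent to a different platform, which is one way to read the "Tasks are committed to other platforms" message.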