Upload fails
Author | Message |
---|---|
Send message Joined: 4 Apr 15 Posts: 46 Credit: 43,128,567 RAC: 0 |
I had 4 WUs that were stuck and couldn't upload after they were done and crunched. But I did get 3 of them to manually upload via the "Retry Now" button on the Transfer tab in the BOINC advanced view desktop program. The 4th WU is having trouble. I have clicked Retry Now 5 times, but it just defaults to retry every 3 hours. If I click on the properties of the stuck WU it says. ........ Application They are only sending out BH units right now as they are very small compared to the ULX units, so they upload and download very quickly. During the Pentathlon a lot of people got the ULX units, and they can be much bigger, so they take longer to upload. Add in that a lot of people downloaded 7 days' worth of tasks and didn't return them until the last day; it keeps others in the dark about how well you are doing, as it is a 'race' of sorts. The back-off thing is a part of BOINC, but I think the problem is with the Project not giving the Server enough time for each transfer to finish. Mine get to 100% and then sit there and back off; I click retry and it goes to 0% and then back to 100% as it resends the whole data set, and then it backs off again, rinse and repeat. The Server seems to be closing the connection before it recognizes it really does have the data. |
Send message Joined: 4 Apr 15 Posts: 46 Credit: 43,128,567 RAC: 0 |
Whoa there partner! You had best hold off until Tuesday on making new ULX tasks! Monday you will be getting hammered again with all the final bunkers! :D MOST of the time Projects get notified that they were selected to be a part of the Pentathlon about a week ahead of time. Sometimes Projects say 'no' but it's probably too late and they suffer for it; I have no idea if Universe got notified or what they said if they did. IF they had disabled the ULX tasks that means there may not have been nearly enough BH tasks since they are shorter. Right now the Server says it has 625k BH tasks available and almost 32k ULX tasks in progress with zero available; we would need that number from a few days ago to get a better idea. |
Send message Joined: 30 Oct 16 Posts: 183 Credit: 18,395,933 RAC: 0 |
Rosetta was pretty lucky because of COVID. The load from people who wanted to run a BOINC project vs f@h caused them to up their output. Considering a large portion of the current Pentathlon members were already running Rosetta, it didn't cause nearly the same issues. The admin there said Coronavirus caused an increase of users by a factor of 4. Their servers didn't fall over, but I think their data is small to transfer, and they have a 10Gbit backbone, not 1. Bigger project with more money I guess. They also have 314TB of disk. |
Send message Joined: 30 Oct 16 Posts: 183 Credit: 18,395,933 RAC: 0 |
"The back-off thing is a part of Boinc but I think the problem is with the Project not giving the Server enough time for each transfer to finish. Mine get to 100% and then sit there and back-off, I click retry and it goes to 0% and then back to 100% as it resends the whole data set again and then backs off again, rinse and repeat. The Server seems to be closing the connection before it recognizes it really does have the data." Are you sure it has the data? All you're seeing is that it left your machine. The server or backbone could have been too busy to accept it. |
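For readers unfamiliar with the back-off behaviour discussed here, the general idea can be sketched in a few lines of Python. This is only an illustration of capped exponential back-off with random jitter, not BOINC's actual code or constants: the function names, the 60-second base, and the 3-hour cap (borrowed from the "retry every 3 hours" observation earlier in the thread) are all assumptions.

```python
import random


def backoff_ceilings(attempts, base=60.0, cap=3 * 3600.0):
    """Upper bound (seconds) on the wait before each retry: doubles, then hits the cap.

    base and cap are illustrative values, not BOINC's real settings.
    """
    return [min(cap, base * 2 ** n) for n in range(attempts)]


def backoff_delays(attempts, base=60.0, cap=3 * 3600.0, seed=None):
    """Actual waits: a random value up to each ceiling, so clients do not retry in lockstep."""
    rng = random.Random(seed)
    return [rng.uniform(0.0, ceiling) for ceiling in backoff_ceilings(attempts, base, cap)]


if __name__ == "__main__":
    for n, ceiling in enumerate(backoff_ceilings(10), start=1):
        print(f"retry {n}: wait at most {ceiling / 60:.0f} min")
```

With these assumed values the ceiling reaches the 3-hour cap after nine failed attempts, which would look much like a transfer that "just defaults to retry every 3 hours". Clicking Retry Now skips the wait but does nothing about the cause of the failure.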
Send message Joined: 30 Oct 16 Posts: 183 Credit: 18,395,933 RAC: 0 |
"If they had disabled the ULX tasks that means there may not have been nearly enough BH tasks since they are shorter. Right now the Server says it has 625k BH tasks available and almost 32k ULX tasks in progress with zero available, we would need that number from a few days ago to get a better idea." Shorter?! The ULX I've seen are much quicker to process than the BH ones. 2500 seconds vs 15000 seconds. |
Send message Joined: 5 May 18 Posts: 6 Credit: 11,915,800 RAC: 0 |
"If they had disabled the ULX tasks that means there may not have been nearly enough BH tasks since they are shorter. Right now the Server says it has 625k BH tasks available and almost 32k ULX tasks in progress with zero available, we would need that number from a few days ago to get a better idea." Yeah, the BH tasks take about double the time of ULX to process. |
Send message Joined: 30 Oct 16 Posts: 183 Credit: 18,395,933 RAC: 0 |
"If they had disabled the ULX tasks that means there may not have been nearly enough BH tasks since they are shorter. Right now the Server says it has 625k BH tasks available and almost 32k ULX tasks in progress with zero available, we would need that number from a few days ago to get a better idea." I make it 6 times longer (15000 seconds vs 2500 seconds is a rough average from my completed tasks). And it's the other way round from what was suggested above. "If they had disabled the ULX tasks that means there may not have been nearly enough BH tasks since they are shorter." surely means the BH are shorter, or the sentence doesn't make sense. Since BH are actually longer, disabling the ULX would not have caused a shortage of work, since there are more BH tasks, and they're much longer. They make up the majority of the processing time, but a very small amount of the data transfers. |
Send message Joined: 4 Feb 15 Posts: 846 Credit: 144,180,465 RAC: 0 |
1. Current ULXs are 10 times shorter than previous batches... I decided to make them pretty short, as the result files were terribly big in the past few series; this quickly caused large problems on both sides, server and users. 2. We have a 1Gb line shared with the storage server, not 10Gb like Rosetta... But there is also a problem with the disks on the main server; both of these create a bottleneck. Usually it is not a problem, but with 5 times bigger load we start struggling. I need to make some changes on the main server (e.g. move the /home folder to RAID 0 disks, or move some disk-intensive tasks to another machine, as I have done with the database). 3. I was informed about the competition about a week (or even more) before the start. I just did not expect troubles lasting longer than 6-12 hours, expecting only a short delay because of bunkers. Krzysztof 'krzyszp' Piszczek Member of Radioactive@Home team My Patreon profile Universe@Home on YT |
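To get a feel for why a shared 1Gb line becomes a bottleneck when bunkers are released, here is a rough back-of-the-envelope calculation in Python. The task count of 32,000 comes from the "almost 32k ULX tasks in progress" figure mentioned in this thread; the 100 MB per result file is purely a hypothetical size chosen for illustration, and the function name and `efficiency` parameter are likewise made up for this sketch.

```python
def transfer_seconds(total_bytes, link_gbit_per_s, efficiency=1.0):
    """Idealised time to move total_bytes over a link of the given speed.

    efficiency (0..1] is a hypothetical fudge factor for protocol overhead
    and sharing the line with other traffic.
    """
    bits = total_bytes * 8
    return bits / (link_gbit_per_s * 1e9 * efficiency)


if __name__ == "__main__":
    tasks = 32_000                 # "almost 32k ULX tasks in progress" (from the thread)
    bytes_per_result = 100 * 10**6  # hypothetical 100 MB result file
    total = tasks * bytes_per_result
    for gbit in (1, 10):
        hours = transfer_seconds(total, gbit) / 3600
        print(f"{gbit:>2} Gbit/s: about {hours:.1f} hours")
```

Under these assumptions the uploads alone would keep a dedicated 1 Gbit/s line saturated for roughly 7 hours (25,600 seconds), versus under an hour at 10 Gbit/s, and that is before accounting for the line being shared with the storage server or for the disk bottleneck described above.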
Send message Joined: 4 Apr 15 Posts: 46 Credit: 43,128,567 RAC: 0 |
"If they had disabled the ULX tasks that means there may not have been nearly enough BH tasks since they are shorter. Right now the Server says it has 625k BH tasks available and almost 32k ULX tasks in progress with zero available, we would need that number from a few days ago to get a better idea." Universe ULX v0.15 (x86_64-pc-linux-gnu): run time 2,537.91 s, CPU time 2,517.00 s, credit 100.00. Universe BHspin v2 v0.19 (x86_64-pc-linux-gnu): run time 4,756.02 s, CPU time 4,756.02 s, credit pending. You are sorta right... my BH tasks DO take longer than my ULX tasks, but only twice as long. This is using an NVIDIA GeForce GTX 1080 Ti (4095MB) driver: 390.13 OpenCL: 1.2 |
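The disagreement over "6 times" versus "twice as long" comes down to simple division on the run times quoted in this thread. A minimal Python check, using only those figures (the function name is just for illustration):

```python
def runtime_ratio(bh_seconds, ulx_seconds):
    """How many times longer a BH task runs than a ULX task."""
    return bh_seconds / ulx_seconds


if __name__ == "__main__":
    # Rough averages quoted by one poster: 15000 s (BH) vs 2500 s (ULX).
    print(round(runtime_ratio(15000, 2500), 2))
    # Single-task run times quoted by another poster: 4,756.02 s vs 2,537.91 s.
    print(round(runtime_ratio(4756.02, 2537.91), 2))
```

The first pair gives a ratio of 6.0 and the second about 1.87, so both posters are reporting their own machines correctly. The gap is consistent with different hardware or different ULX batch sizes rather than an arithmetic slip.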
Send message Joined: 30 Oct 16 Posts: 183 Credit: 18,395,933 RAC: 0 |
"2,537.91 2,517.00 100.00 Universe ULX v0.15 x86_64-pc-linux-gnu" I'm exactly right for my tasks. We may be looking at different versions of ULX; Krzysztof said he'd made them smaller. Or I'm looking at a different CPU: I have 4 wildly different computers, and I just glanced through my completed tasks on the server and took a rough average. "This is using a NVIDIA GeForce GTX 1080 Ti (4095MB) driver: 390.13 OpenCL: 1.2" Using a what?! I thought Universe was CPU only? Just checked my preferences on the site; there is an option for AMD, but I don't think it does anything. I've switched it on just in case! But there's no Nvidia option. |
Send message Joined: 5 May 18 Posts: 6 Credit: 11,915,800 RAC: 0 |
2,537.91 2,517.00 100.00 Universe ULX v0.15 x86_64-pc-linux-gnu There are no gpu tasks. |
Send message Joined: 4 Feb 15 Posts: 846 Credit: 144,180,465 RAC: 0 |
Look at this task: https://universeathome.pl/universe/result.php?resultid=96881601 and compare it to this one (on the same machine): https://universeathome.pl/universe/result.php?resultid=96791290 Check "Peak disk usage" on both tasks - the result file is about 1/3 of peak usage.
Our applications don't use GPU :( Krzysztof 'krzyszp' Piszczek Member of Radioactive@Home team My Patreon profile Universe@Home on YT |
Send message Joined: 30 Oct 16 Posts: 183 Credit: 18,395,933 RAC: 0 |
I wish you could make it do so. That would be FAST! I get through a phenomenal amount of Milkyway and Einstein tasks on my 5 GPUs. |
Send message Joined: 30 Oct 16 Posts: 183 Credit: 18,395,933 RAC: 0 |
There's an option for AMD in the preferences. I think it was planned at some point but never happened? |
Send message Joined: 22 Feb 15 Posts: 24 Credit: 250,365,396 RAC: 0 |
Krzysztof, Ignore the P3DN whiners and complainers. A few from P3DN understand the situation but there are some who continue to accuse and call people cheaters. It's been explained already. They only gripe when they are behind. Grow up! Thank you for your continued support to the Boinc community. -scole of TSBT |
Send message Joined: 4 Feb 15 Posts: 846 Credit: 144,180,465 RAC: 0 |
Yes. We had a developer who tried to make a GPU app for the project; unfortunately he resigned because he struggled to port the application's algorithm to GPU. This is not a surprise, as our application base has been written since 2002, when nobody expected GPUs with compute capabilities... It is a very common problem with science apps that not all algorithms can be ported to GPU. But... we are still thinking about it, slowly changing and cleaning up the code, and maybe... But this will definitely not happen this year :( Krzysztof 'krzyszp' Piszczek Member of Radioactive@Home team My Patreon profile Universe@Home on YT |
Send message Joined: 22 Dec 19 Posts: 7 Credit: 52,762,433 RAC: 0 |
I got a good 100 uploads. I don't think my servers were able to upload anything at all! I do have other projects I'm running, so for the meantime, I'm adjusting this project to a lower importance. If I can't get my WUs uploaded, I'll prioritize projects where I can. |
Send message Joined: 10 Oct 19 Posts: 5 Credit: 100,594,667 RAC: 0 |
Krzysztof, Exactly! Ignore the fools from P3D and keep up the good work Krzysztof. |
Send message Joined: 30 Oct 16 Posts: 183 Credit: 18,395,933 RAC: 0 |
"Yes. We got a developer who try to make GPU app for the project, unfortunately he resign because he got struggles to port application algorithm to GPU. This is not a surprise as our application base was written since 2002 when nobody expect GPU's with compute possibilities..." That is a shame. But I will be adding 48 cores shortly on two Xeon servers I'm building from scrap parts. If the Pentathlon hasn't eaten all your tasks or melted your ethernet I'll do some more. |
Send message Joined: 30 Oct 16 Posts: 183 Credit: 18,395,933 RAC: 0 |
"I got a good 100 uploads. I don't think my servers were able to upload anything at all!" I just let mine do what they can. Every computer has 2 or more projects running, so if one is stuck, something else happens. Once things clear, BOINC will make up for lost time on that project to fulfill my weighting requests. |