Message boards : Number crunching : Upload fails
mikey
Joined: 4 Apr 15
Posts: 46
Credit: 43,128,567
RAC: 0
Message 4251 - Posted: 8 May 2020, 20:36:21 UTC - in response to Message 4232.  

I had 4 WUs that were stuck and couldn't upload after they were done and crunched. But I did get 3 of them to manually upload via the "Retry Now" button on the Transfer tab in the BOINC advanced view desktop program. The 4th WU is having trouble. I have clicked Retry Now 5 times, but it just defaults to retry every 3 hours. If I click on the properties of the stuck WU it says:
Application: Universe BHspin v2 0.19
Name: universe_bh2_190723_308_1524908476_20000_1-999999_910100


They are only sending out BH units right now, as they are very small compared to the ULX units, so they upload and download very quickly. During the Pentathlon a lot of people got the ULX units, and they can be much bigger, so they take longer to upload. Add in that a lot of people downloaded 7 days' worth of tasks and didn't return them until the last day. That keeps others in the dark about how well you are doing; it is a 'race' of sorts.

The back-off thing is a part of BOINC, but I think the problem is with the Project not giving the Server enough time for each transfer to finish. Mine get to 100% and then sit there and back off. I click retry and it goes to 0% and then back to 100% as it resends the whole data set again, and then it backs off again, rinse and repeat. The Server seems to be closing the connection before it recognizes it really does have the data.
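
For anyone unfamiliar with the back-off behaviour described here, a rough sketch may help. It is a generic capped exponential back-off, not BOINC's actual code: the 60-second starting delay and the doubling factor are made-up values, and the 3-hour cap is only borrowed from the "retry every 3 hours" mentioned earlier in the thread.

```python
def backoff_delays(attempts, base_s=60.0, factor=2.0, cap_s=3 * 3600.0):
    """Return the wait (in seconds) before each retry after repeated failures.

    Hypothetical parameters: base_s, factor and cap_s are illustrative
    assumptions, not values taken from the BOINC client.
    """
    # Each failure multiplies the wait by `factor` until it hits the cap.
    return [min(cap_s, base_s * factor ** n) for n in range(attempts)]


if __name__ == "__main__":
    for n, delay in enumerate(backoff_delays(10), start=1):
        print(f"failure {n}: wait {delay / 60:.0f} min before retrying")
```

With these assumed numbers the wait hits the 3-hour cap on the ninth failure and stays there, so a transfer that keeps failing would settle into one attempt every 3 hours.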
ID: 4251 · Rating: 0
mikey
Joined: 4 Apr 15
Posts: 46
Credit: 43,128,567
RAC: 0
Message 4252 - Posted: 8 May 2020, 20:43:12 UTC - in response to Message 4250.  

Whoa there partner! You had best hold off until Tuesday on making new ULX tasks! Monday you will be getting hammered again with all the final bunkers! :D

In the future, if you get notified you will be in a competition, you may wish to remove all ULX tasks in advance. ;) (I don't know that you did get notified, just IF you get notified).


I don't know about Universe, but Rosetta which is also in the competition has the same number of tasks in the field as they always do. How come everyone went mad on Universe but not Rosetta? I think it's more the ULX that changed the load, not the Pentathlon.


Rosetta was pretty lucky because of COVID. The load from people that wanted to run a BOINC project vs f@h caused them to up their output. Considering a large portion of the current Pentathlon members were already running Rosetta, it didn't cause nearly as many issues.


MOST of the time Projects get notified that they were selected to be a part of the Pentathlon about a week ahead of time. Sometimes Projects say 'no' but it's probably too late and they suffer for it. I have no idea if Universe got notified or what they said if they did. IF they had disabled the ULX tasks that means there may not have been nearly enough BH tasks since they are shorter. Right now the Server says it has 625k BH tasks available and almost 32k ULX tasks in progress with zero available; we would need that number from a few days ago to get a better idea.
ID: 4252 · Rating: 0
Mr P Hucker
Joined: 30 Oct 16
Posts: 182
Credit: 18,395,933
RAC: 118
Message 4255 - Posted: 8 May 2020, 21:36:20 UTC - in response to Message 4250.  

Rosetta was pretty lucky because of COVID. The load from people that wanted to run a BOINC project vs f@h caused them to up their output. Considering a large portion of the current Pentathlon members were already running Rosetta, it didn't cause nearly as many issues.


The admin there said Coronavirus caused an increase of users by a factor of 4. Their servers didn't fall over, but I think their data is small to transfer, and they have a 10Gbit backbone, not 1. Bigger project with more money I guess. They also have 314TB of disk.
ID: 4255 · Rating: 0
Mr P Hucker
Joined: 30 Oct 16
Posts: 182
Credit: 18,395,933
RAC: 118
Message 4256 - Posted: 8 May 2020, 21:38:32 UTC - in response to Message 4251.  

The back-off thing is a part of BOINC, but I think the problem is with the Project not giving the Server enough time for each transfer to finish. Mine get to 100% and then sit there and back off. I click retry and it goes to 0% and then back to 100% as it resends the whole data set again, and then it backs off again, rinse and repeat. The Server seems to be closing the connection before it recognizes it really does have the data.


Are you sure it has the data? All you're seeing is it left your machine. The server or backbone could have been too busy to accept it.
ID: 4256 · Rating: 0
Mr P Hucker
Joined: 30 Oct 16
Posts: 182
Credit: 18,395,933
RAC: 118
Message 4257 - Posted: 8 May 2020, 21:41:09 UTC - in response to Message 4252.  

If they had disabled the ULX tasks that means there may not have been nearly enough BH tasks since they are shorter. Right now the Server says it has 625k BH tasks available and almost 32k ULX tasks in progress with zero available, we would need that number from a few days ago to get a better idea.


Shorter?! The ULX I've seen are much quicker to process than the BH ones. 2500 seconds vs 15000 seconds.
ID: 4257 · Rating: 0
[H]auntjemima

Joined: 5 May 18
Posts: 6
Credit: 11,915,800
RAC: 0
Message 4258 - Posted: 8 May 2020, 21:42:05 UTC - in response to Message 4257.  

If they had disabled the ULX tasks that means there may not have been nearly enough BH tasks since they are shorter. Right now the Server says it has 625k BH tasks available and almost 32k ULX tasks in progress with zero available, we would need that number from a few days ago to get a better idea.


Shorter?! The ULX I've seen are much quicker to process than the BH ones. 2500 seconds vs 15000 seconds.


Yeah, the BH tasks are about double the time to process of ULX.
ID: 4258 · Rating: 0
Mr P Hucker
Joined: 30 Oct 16
Posts: 182
Credit: 18,395,933
RAC: 118
Message 4259 - Posted: 8 May 2020, 21:48:53 UTC - in response to Message 4258.  

If they had disabled the ULX tasks that means there may not have been nearly enough BH tasks since they are shorter. Right now the Server says it has 625k BH tasks available and almost 32k ULX tasks in progress with zero available, we would need that number from a few days ago to get a better idea.


Shorter?! The ULX I've seen are much quicker to process than the BH ones. 2500 seconds vs 15000 seconds.


Yeah, the BH tasks are about double the time to process of ULX.


I make it 6 times longer (15000 seconds vs 2500 seconds is a rough average from my completed tasks). And it's the other way round from what was suggested above. "If they had disabled the ULX tasks that means there may not have been nearly enough BH tasks since they are shorter." surely means the BH are shorter, or the sentence doesn't make sense. Since BH are actually longer, disabling the ULX would not have caused a shortage of work, since there are more BH tasks, and they're much longer. They make up the majority of the processing time, but a very small amount of the data transfers.
ID: 4259 · Rating: 0
Profile Krzysztof Piszczek - wspieram ...
Project administrator
Project developer
Project tester
Joined: 4 Feb 15
Posts: 841
Credit: 144,180,465
RAC: 2
Message 4260 - Posted: 8 May 2020, 22:12:34 UTC - in response to Message 4259.  

1. Current ULX tasks are 10 times shorter than previous batches. I decided to make them pretty short because the result files were terribly big in the past few series, which quickly caused large problems on both sides, server and users.
2. We have a 1Gb line shared with the storage server, not 10Gb like Rosetta. But there is also a problem with the disks on the main server; together they create a bottleneck. Usually it is not a problem, but with a 5 times bigger load we start struggling. I need to make some changes on the main server (e.g. move the /home folder to RAID 0 disks, or move some disk-intensive tasks to another machine, as I have done with the database).
3. I was informed about the competition about a week (or even more) before the start. I just did not expect trouble lasting longer than 6-12 hours; I expected only a short delay because of bunkers.
Krzysztof 'krzyszp' Piszczek

Member of Radioactive@Home team
My Patreon profile
Universe@Home on YT
ID: 4260 · Rating: 0
mikey
Joined: 4 Apr 15
Posts: 46
Credit: 43,128,567
RAC: 0
Message 4261 - Posted: 8 May 2020, 23:18:59 UTC - in response to Message 4257.  

If they had disabled the ULX tasks that means there may not have been nearly enough BH tasks since they are shorter. Right now the Server says it has 625k BH tasks available and almost 32k ULX tasks in progress with zero available, we would need that number from a few days ago to get a better idea.


Shorter?! The ULX I've seen are much quicker to process than the BH ones. 2500 seconds vs 15000 seconds.


Run time (s)   CPU time (s)   Credit    Application
2,537.91       2,517.00       100.00    Universe ULX v0.15 x86_64-pc-linux-gnu
4,756.02       4,756.02       pending   Universe BHspin v2 v0.19 x86_64-pc-linux-gnu

You are sorta right...my BH tasks DO take longer than my ULX tasks but only twice as long
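
As a quick check on the "twice" versus "6 times" disagreement, the ratios can be worked out directly. The sketch below assumes the first figure on each task line above is the run time in seconds, and uses the rough 2500 s and 15000 s averages quoted from Mr P Hucker; the helper name is made up for illustration.

```python
def runtime_ratio(bh_seconds, ulx_seconds):
    """How many times longer a BH task runs than a ULX task."""
    return bh_seconds / ulx_seconds


# Run times from the two task lines above (assumed to be seconds).
mikey = runtime_ratio(4756.02, 2537.91)
# Rough averages quoted earlier in the thread.
hucker = runtime_ratio(15000.0, 2500.0)

print(f"mikey's tasks:  BH takes {mikey:.2f}x as long as ULX")
print(f"Hucker's tasks: BH takes {hucker:.2f}x as long as ULX")
```

So both claims can hold for the tasks each person looked at: a bit under 2x on one set of figures and 6x on the other, which fits the idea that the ULX batches or the hosts differ.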

This is using an NVIDIA GeForce GTX 1080 Ti (4095MB) driver: 390.13 OpenCL: 1.2
ID: 4261 · Rating: 0
Mr P Hucker
Joined: 30 Oct 16
Posts: 182
Credit: 18,395,933
RAC: 118
Message 4262 - Posted: 8 May 2020, 23:37:38 UTC - in response to Message 4261.  
Last modified: 8 May 2020, 23:40:03 UTC

2,537.91 2,517.00 100.00 Universe ULX v0.15 x86_64-pc-linux-gnu

4,756.02 4,756.02 pending Universe BHspin v2 v0.19 x86_64-pc-linux-gnu

You are sorta right...my BH tasks DO take longer than my ULX tasks but only twice as long


I'm exactly right for my tasks. We may be looking at different versions of ULX, Krzysztof said he'd made them smaller. Or I'm looking at a different CPU - I have 4 wildly different computers, I just glanced through my completed tasks on the server and took a rough average.

This is using a NVIDIA GeForce GTX 1080 Ti (4095MB) driver: 390.13 OpenCL: 1.2


Using a what?! I thought Universe was CPU only? Just checked my preferences on the site, there is an option for AMD, but I don't think it does anything. I've switched it on just in case! But there's no Nvidia option.
ID: 4262 · Rating: 0
[H]auntjemima

Joined: 5 May 18
Posts: 6
Credit: 11,915,800
RAC: 0
Message 4263 - Posted: 8 May 2020, 23:39:02 UTC - in response to Message 4262.  

2,537.91 2,517.00 100.00 Universe ULX v0.15 x86_64-pc-linux-gnu

4,756.02 4,756.02 pending Universe BHspin v2 v0.19 x86_64-pc-linux-gnu

You are sorta right...my BH tasks DO take longer than my ULX tasks but only twice as long


I'm exactly right for my tasks. We may be looking at different versions of ULX, Krzysztof said he'd made them smaller.

This is using a NVIDIA GeForce GTX 1080 Ti (4095MB) driver: 390.13 OpenCL: 1.2


Using a what?! I thought Universe was CPU only? Just checked my preferences on the site, there is an option for AMD, but I don't think it does anything. I've switched it on just in case! But there's no Nvidia option.


There are no gpu tasks.
ID: 4263 · Rating: 0
Profile Krzysztof Piszczek - wspieram ...
Project administrator
Project developer
Project tester
Joined: 4 Feb 15
Posts: 841
Credit: 144,180,465
RAC: 2
Message 4264 - Posted: 8 May 2020, 23:39:10 UTC - in response to Message 4261.  


2,537.91 2,517.00 100.00 Universe ULX v0.15 x86_64-pc-linux-gnu

4,756.02 4,756.02 pending Universe BHspin v2 v0.19 x86_64-pc-linux-gnu

Look at this task:
https://universeathome.pl/universe/result.php?resultid=96881601

and compare to this (on same machine):
https://universeathome.pl/universe/result.php?resultid=96791290

Check "Peak disk usage" on both tasks - result file is about 1/3 of peak usage.


You are sorta right...my BH tasks DO take longer than my ULX tasks but only twice as long
This is using a NVIDIA GeForce GTX 1080 Ti (4095MB) driver: 390.13 OpenCL: 1.2


Our applications don't use the GPU :(
Krzysztof 'krzyszp' Piszczek

Member of Radioactive@Home team
My Patreon profile
Universe@Home on YT
ID: 4264 · Rating: 0
Mr P Hucker
Joined: 30 Oct 16
Posts: 182
Credit: 18,395,933
RAC: 118
Message 4265 - Posted: 8 May 2020, 23:41:56 UTC - in response to Message 4264.  
Last modified: 8 May 2020, 23:42:43 UTC


Our applications don't use the GPU :(


I wish you could make it do so. That would be FAST! I get through a phenomenal amount of Milkyway and Einstein tasks on my 5 GPUs.
ID: 4265 · Rating: 0
Mr P Hucker
Joined: 30 Oct 16
Posts: 182
Credit: 18,395,933
RAC: 118
Message 4266 - Posted: 8 May 2020, 23:43:48 UTC - in response to Message 4263.  


There are no gpu tasks.


There's an option for AMD in the preferences. I think it was planned at some point but never happened?
ID: 4266 · Rating: 0
scole of TSBT

Joined: 22 Feb 15
Posts: 24
Credit: 250,365,396
RAC: 416
Message 4267 - Posted: 8 May 2020, 23:49:28 UTC

Krzysztof,

Ignore the P3DN whiners and complainers. A few from P3DN understand the situation but there are some who continue to accuse and call people cheaters. It's been explained already. They only gripe when they are behind. Grow up!

Thank you for your continued support to the Boinc community.

-scole of TSBT
ID: 4267 · Rating: 0
Profile Krzysztof Piszczek - wspieram ...
Project administrator
Project developer
Project tester
Joined: 4 Feb 15
Posts: 841
Credit: 144,180,465
RAC: 2
Message 4268 - Posted: 8 May 2020, 23:52:54 UTC - in response to Message 4266.  


There are no gpu tasks.


There's an option for AMD in the preferences. I think it was planned at some point but never happened?

Yes. We had a developer who tried to make a GPU app for the project; unfortunately he resigned because he struggled to port the application's algorithm to the GPU. This is not a surprise, as our application base has been written since 2002, when nobody expected GPUs with compute capabilities...

This is a very common problem with science apps: not all algorithms can be ported to the GPU.

But...

We are still thinking about it, slowly changing and cleaning up the code, and maybe... But this definitely will not happen this year :(
Krzysztof 'krzyszp' Piszczek

Member of Radioactive@Home team
My Patreon profile
Universe@Home on YT
ID: 4268 · Rating: 0
ProDigit

Joined: 22 Dec 19
Posts: 7
Credit: 52,762,433
RAC: 2
Message 4269 - Posted: 8 May 2020, 23:53:57 UTC

I've got a good 100 uploads waiting. I don't think my servers were able to upload anything at all!
I do have other projects I'm running, so for the meantime, I'm adjusting this project to a lower importance.
If I can't get my WUs uploaded, I'll prioritize projects where I can.
ID: 4269 · Rating: 0
bluestang

Joined: 10 Oct 19
Posts: 5
Credit: 100,594,667
RAC: 0
Message 4270 - Posted: 9 May 2020, 0:43:50 UTC - in response to Message 4267.  

Krzysztof,

Ignore the P3DN whiners and complainers. A few from P3DN understand the situation but there are some who continue to accuse and call people cheaters. It's been explained already. They only gripe when they are behind. Grow up!

Thank you for your continued support to the Boinc community.

-scole of TSBT


Exactly! Ignore the fools from P3D and keep up the good work Krzysztof.
ID: 4270 · Rating: 0
Mr P Hucker
Joined: 30 Oct 16
Posts: 182
Credit: 18,395,933
RAC: 118
Message 4271 - Posted: 9 May 2020, 1:18:39 UTC - in response to Message 4268.  

Yes. We had a developer who tried to make a GPU app for the project; unfortunately he resigned because he struggled to port the application's algorithm to the GPU. This is not a surprise, as our application base has been written since 2002, when nobody expected GPUs with compute capabilities...

This is a very common problem with science apps: not all algorithms can be ported to the GPU.

But...

We are still thinking about it, slowly changing and cleaning up the code, and maybe... But this definitely will not happen this year :(


That is a shame. But I will be adding 48 cores shortly on two Xeon servers I'm building from scrap parts. If the Pentathlon hasn't eaten all your tasks or melted your ethernet I'll do some more.
ID: 4271 · Rating: 0
Mr P Hucker
Joined: 30 Oct 16
Posts: 182
Credit: 18,395,933
RAC: 118
Message 4272 - Posted: 9 May 2020, 1:21:11 UTC - in response to Message 4269.  

I've got a good 100 uploads waiting. I don't think my servers were able to upload anything at all!
I do have other projects I'm running, so for the meantime, I'm adjusting this project to a lower importance.
If I can't get my WUs uploaded, I'll prioritize projects where I can.


I just let mine do what they can. Every computer has 2 or more projects running, so if one is stuck, something else happens. Once things clear, BOINC will make up for lost time on that project to fulfill my weighting requests.
ID: 4272 · Rating: 0




Copyright © 2024 Copernicus Astronomical Centre of the Polish Academy of Sciences
Project server and website managed by Krzysztof 'krzyszp' Piszczek