Message boards : News : Longer work units

Das_Greff
Joined: 30 Jul 19
Posts: 2
Credit: 11,717,933
RAC: 0
Message 5552 - Posted: 19 May 2022, 6:12:07 UTC - in response to Message 5524.  
Last modified: 19 May 2022, 6:12:55 UTC

I'm in the same situation: results keep piling up in the transfer queue.

I keep getting 502 and 503 responses to POST requests to "http://universeathome.pl/universe_cgi/file_upload_handler"
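For readers scripting around this: 502 and 503 are server-side status codes that usually signal a temporary condition, so the normal response is to wait and retry later. Below is a minimal sketch of that idea with a simple doubling delay. The names `is_transient` and `backoff_delays` and the 60 s and 1 h values are illustrative assumptions; this is not BOINC's actual backoff algorithm.

```python
# Status codes treated as temporary server-side failures (assumed set).
TRANSIENT_STATUSES = {502, 503, 504}


def is_transient(status):
    """Return True if the HTTP status suggests retrying later."""
    return status in TRANSIENT_STATUSES


def backoff_delays(attempts, base=60, cap=3600):
    """Return the wait in seconds before each retry: doubling, capped at `cap`."""
    return [min(cap, base * 2 ** n) for n in range(attempts)]


if __name__ == "__main__":
    for status in (200, 502, 503):
        print(status, "retry later" if is_transient(status) else "do not retry")
    print(backoff_delays(5))
```

A cap keeps a long outage from pushing the next retry out indefinitely, which matters when results have deadlines.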

I have been offline for a couple of weeks.

I rarely read the forums.

My processing clusters generally run without any need to intervene.
I noticed the problem yesterday when I found a significant number of completed tasks not uploading.
They all appear to be stalled.
This appears to be a problem going back several days.
ID: 5552 · Rating: 0
Eoin Moore 1971
Joined: 7 Feb 19
Posts: 3
Credit: 3,914,000
RAC: 0
Message 5580 - Posted: 20 May 2022, 0:29:41 UTC - in response to Message 5552.  
Last modified: 20 May 2022, 0:31:37 UTC

I also have this problem: many completed WUs have been ready for upload for over a week.

Do WUs completed before the deadline, but not uploaded before it, still contribute, or are they cancelled / wasted?

(edit) The upload delay seems to affect only my W10 box. Other machines seem to be uploading results normally.
ID: 5580 · Rating: 0
pawg
Joined: 10 Mar 15
Posts: 25
Credit: 16,679,590
RAC: 0
Message 5585 - Posted: 20 May 2022, 6:48:34 UTC - in response to Message 5580.  

Click "retry transfers" and all the tasks should now upload without any issue.
ID: 5585 · Rating: 0
Das_Greff
Joined: 30 Jul 19
Posts: 2
Credit: 11,717,933
RAC: 0
Message 5608 - Posted: 21 May 2022, 9:53:25 UTC - in response to Message 5585.  

Yep, the whole queue is empty now.
ID: 5608 · Rating: 0
Eoin Moore 1971
Joined: 7 Feb 19
Posts: 3
Credit: 3,914,000
RAC: 0
Message 5611 - Posted: 21 May 2022, 14:39:55 UTC

Unfortunately, I still have more than 40 pending results. Manual retries just get "project backoff".

Some results had deadlines as early as May 11th-13th.

Oh well, I will just wait for them to clear and hope the results are still accepted as valid, even though they are so late.
ID: 5611 · Rating: 0
rsNeutrino
Joined: 1 Nov 17
Posts: 29
Credit: 291,940,933
RAC: 0
Message 5612 - Posted: 21 May 2022, 19:16:05 UTC - in response to Message 5611.  

Looking at your host list (https://universeathome.pl/universe/hosts_user.php?userid=87792), the last contact with your AMD PC was at "19 May 2022, 16:50:56 UTC", while your two Android-based devices look fine.
Unfortunately, there are no tasks in progress anymore, so if there are still some in your BOINC list, they are definitely too late.
Some tasks did get through on the 19th, but the CPU time looks very wrong: https://universeathome.pl/universe/results.php?hostid=570210&offset=0&show_names=0&state=5&appid=
They took ten times longer than they should... as if the PC is extremely slow, overloaded, or thrashing, which could also hurt communication with the server.
(RAM full? HDD/SSD full, slow, or broken? Hardware misbehaving and spamming interrupts, slowing everything down? Check, for example, CPU-Z benchmarks against a reference.)
I would check whether something else is blocking the internet connection of BOINC and/or the Universe project specifically on your PC; the project's servers have been fine since yesterday. I suggest restarting your PC and looking for error messages in the BOINC log viewer.
ID: 5612 · Rating: 0
Eoin Moore 1971
Joined: 7 Feb 19
Posts: 3
Credit: 3,914,000
RAC: 0
Message 5613 - Posted: 21 May 2022, 22:34:49 UTC - in response to Message 5612.  

Hi rsNeutrino,

thanks for the detailed reply.

You are correct, and I had not noticed: some of the CPU time figures seem very long (even, I should note in passing, for the last few validated WUs; one was 50,000+ seconds...).

Will try all you recommend.

Other BOINC projects working normally (except for many validation inconclusive WUs on MW@H), communicating normally, and not taking an inordinate time to complete WUs, as far as I can tell.

Lots of space left on SSD, RAM usage seldom more than 50% even with productivity packages running. Hard to see what the problem is.

I don't usually babysit my BOINC install, but will be watching it more carefully for a while.

Regards,

Eoin
ID: 5613 · Rating: 0
Grant (SSSF)
Joined: 23 Apr 22
Posts: 167
Credit: 69,772,000
RAC: 0
Message 5614 - Posted: 22 May 2022, 0:15:53 UTC - in response to Message 5613.  
Last modified: 22 May 2022, 0:27:10 UTC

"Other BOINC projects working normally (except for many validation inconclusive WUs on MW@H), communicating normally, and not taking an inordinate time to complete WUs, as far as I can tell."
There are no signs of any other BOINC projects on that system with the account you are posting here with.
Are you running multiple instances of BOINC on it? That could explain the excessively long runtimes, with the other BOINC instance & its projects getting all the CPU time.

EDIT- OK, things here at the Universe forums are rather odd. On every other project I've seen, when you click on someone's account, it shows you all of the projects they are with.


Being a laptop, the most likely cause is thermal throttling. I would check your CPU temps & clock speeds while it is trying to process Universe work.
I would also reduce the size of your cache: 2 weeks to return something that takes 4 hrs or less to process is rather excessive (even with the past couple of weeks' contest insanity).
If you run multiple projects, you're better off with no cache. Even with a single project, if it's reliable then there's no need for more than a few hours of cache. Here I generally run with an 8-hour cache.
Grant
Darwin NT
ID: 5614 · Rating: 0
mikey
Joined: 4 Apr 15
Posts: 46
Credit: 43,128,567
RAC: 0
Message 5618 - Posted: 22 May 2022, 3:15:56 UTC - in response to Message 5614.  

"EDIT- OK, things here at the Universe forums are rather odd. On every other project I've seen, when you click on someone's account, it shows you all of the projects they are with."


Whether to show that or not is the project's choice on the server side.
ID: 5618 · Rating: 0
Khali
Joined: 5 May 22
Posts: 3
Credit: 446,667
RAC: 0
Message 5624 - Posted: 23 May 2022, 16:26:26 UTC

Has the huge backlog been cleared up yet? I am still getting a project backoff every time I try to upload. Here is the log of my last few attempts.

5/23/2022 12:22:41 PM | | Project communication failed: attempting access to reference site
5/23/2022 12:22:42 PM | | Internet access OK - project servers may be temporarily down.
5/23/2022 12:22:42 PM | Universe@Home | Temporarily failed upload of universe_bh2_190723_400_528884472_20000_1-999999_885100_1_r1963607943_1: transient HTTP error
5/23/2022 12:22:42 PM | Universe@Home | Backing off 00:25:18 on upload of universe_bh2_190723_400_528884472_20000_1-999999_885100_1_r1963607943_1
5/23/2022 12:22:42 PM | Universe@Home | Temporarily failed upload of universe_bh2_190723_400_528884472_20000_1-999999_885100_1_r1963607943_0: transient HTTP error
5/23/2022 12:22:42 PM | Universe@Home | Backing off 00:19:13 on upload of universe_bh2_190723_400_528884472_20000_1-999999_885100_1_r1963607943_0
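
With many stalled uploads it can help to pull the backoff timers out of the log rather than read them line by line. Below is a minimal sketch that assumes the event log has been saved as text in the "timestamp | project | message" layout shown above. The function name `parse_backoffs` is hypothetical and not part of BOINC.

```python
import re
from datetime import timedelta

# Matches messages such as "Backing off 00:25:18 on upload of <file name>".
BACKOFF_RE = re.compile(r"Backing off (\d+):(\d{2}):(\d{2}) on upload of (\S+)")


def parse_backoffs(log_text):
    """Return {file_name: timedelta} for every upload backoff line in log_text."""
    backoffs = {}
    for line in log_text.splitlines():
        # Each event-log line has three fields separated by "|".
        parts = [part.strip() for part in line.split("|", 2)]
        if len(parts) != 3:
            continue
        match = BACKOFF_RE.search(parts[2])
        if match:
            hours, minutes, seconds, name = match.groups()
            backoffs[name] = timedelta(
                hours=int(hours), minutes=int(minutes), seconds=int(seconds)
            )
    return backoffs


SAMPLE = """\
5/23/2022 12:22:42 PM | | Internet access OK - project servers may be temporarily down.
5/23/2022 12:22:42 PM | Universe@Home | Backing off 00:25:18 on upload of universe_bh2_190723_400_528884472_20000_1-999999_885100_1_r1963607943_1
5/23/2022 12:22:42 PM | Universe@Home | Backing off 00:19:13 on upload of universe_bh2_190723_400_528884472_20000_1-999999_885100_1_r1963607943_0
"""

if __name__ == "__main__":
    for name, delay in parse_backoffs(SAMPLE).items():
        print(f"{name}: retry in {delay}")
```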
ID: 5624 · Rating: 0
mikey
Joined: 4 Apr 15
Posts: 46
Credit: 43,128,567
RAC: 0
Message 5625 - Posted: 23 May 2022, 18:40:24 UTC - in response to Message 5624.  

All of mine cleared several days ago but that "transient HTTP error" means their website is down or unreachable at the moment.
ID: 5625 · Rating: 0
Keith Myers
Joined: 10 May 20
Posts: 310
Credit: 4,733,484,700
RAC: 0
Message 5627 - Posted: 23 May 2022, 19:13:01 UTC - in response to Message 5624.  

"Has the huge backlog been cleared up yet? I am still getting a project backoff every time I try to upload. Here is the log of my last few attempts."


Long ago.

A proud member of the OFA (Old Farts Association)
ID: 5627 · Rating: 0
Khali
Joined: 5 May 22
Posts: 3
Credit: 446,667
RAC: 0
Message 5629 - Posted: 23 May 2022, 22:48:46 UTC - in response to Message 5625.  

"All of mine cleared several days ago but that 'transient HTTP error' means their website is down or unreachable at the moment."


I have been getting that error for several days now. Obviously the website is not down. Anyone know what the fix might be?
ID: 5629 · Rating: 0
Keith Myers
Joined: 10 May 20
Posts: 310
Credit: 4,733,484,700
RAC: 0
Message 5631 - Posted: 24 May 2022, 2:19:03 UTC - in response to Message 5629.  
Last modified: 24 May 2022, 2:20:03 UTC

Close down BOINC and delete all http_temp_xxx files in the BOINC directory.

Change these entries in the cc_config.xml file

<max_file_xfers>18</max_file_xfers>
<max_file_xfers_per_project>8</max_file_xfers_per_project>

Set max_file_xfers to whatever it takes to cover all your attached projects. If you are running only Universe, make it 8, the same value as max_file_xfers_per_project.
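
For anyone who prefers to script the change, here is a minimal sketch that sets both entries inside the `<options>` section of cc_config.xml and leaves everything else in the file alone. The function name `set_transfer_limits` is hypothetical, the path to cc_config.xml is something you must supply for your own install, and the defaults are just the values from the snippet above. Run it while BOINC is shut down.

```python
import xml.etree.ElementTree as ET
from pathlib import Path


def set_transfer_limits(config_path, total=18, per_project=8):
    """Set max_file_xfers and max_file_xfers_per_project in cc_config.xml."""
    path = Path(config_path)
    if path.exists():
        tree = ET.parse(str(path))
        root = tree.getroot()
    else:
        # No config file yet: start from an empty <cc_config> document.
        root = ET.Element("cc_config")
        tree = ET.ElementTree(root)

    options = root.find("options")
    if options is None:
        options = ET.SubElement(root, "options")

    for tag, value in (
        ("max_file_xfers", total),
        ("max_file_xfers_per_project", per_project),
    ):
        node = options.find(tag)
        if node is None:
            node = ET.SubElement(options, tag)
        node.text = str(value)

    tree.write(str(path), encoding="utf-8")


if __name__ == "__main__":
    # Hypothetical location; point this at your own BOINC data directory.
    set_transfer_limits("cc_config.xml", total=8, per_project=8)
```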

Universe tasks upload 5 separate files for each completed task. If you have left the cc_config.xml file at its default of 2 simultaneous transfers, the upload won't succeed in a single scheduler connection. You need to blast all the files upstream in that one first connection.

By getting rid of all the dormant http_temp_xxx files you start a brand new scheduler connection, and you should be able to upload all the finished result files in one go. By now the leftover http_temp_xxx files are holding the scheduler database open for files that no longer exist, and they are preventing you from getting your uploads home.
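
The cleanup step can be scripted as well. This is a minimal sketch that assumes the leftover files match the glob `http_temp_*` directly inside the BOINC directory, which is inferred from the "http_temp_xxx" naming above. The function name `remove_http_temp_files` is hypothetical. It defaults to a dry run so you can review the list before anything is deleted, and it should only be run with BOINC closed.

```python
from pathlib import Path


def remove_http_temp_files(data_dir, dry_run=True):
    """Return the http_temp_* files in data_dir; delete them unless dry_run."""
    leftovers = sorted(
        path for path in Path(data_dir).glob("http_temp_*") if path.is_file()
    )
    if not dry_run:
        for path in leftovers:
            path.unlink()
    return leftovers


if __name__ == "__main__":
    # Hypothetical location; replace with your own BOINC directory.
    for path in remove_http_temp_files(".", dry_run=True):
        print("would delete", path)
```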

A proud member of the OFA (Old Farts Association)
ID: 5631 · Rating: 0

Copyright © 2024 Copernicus Astronomical Centre of the Polish Academy of Sciences
Project server and website managed by Krzysztof 'krzyszp' Piszczek