Message boards : Number crunching : Server Thread

DukeBox

Joined: 28 Feb 22
Posts: 1
Credit: 273,740,667
RAC: 0
Message 5422 - Posted: 14 May 2022, 13:09:22 UTC

I do not mind these kinds of (temporary) mass-crunching actions; however, when they start impacting others and the stability of the project, they should be stopped.
I've tried for 3 days to upload my data and most of it is still in retry status. I'm sorry for Krzysztof, but you have lost me as a participant.
ID: 5422 · Rating: 0
frankhagen

Joined: 12 Aug 17
Posts: 21
Credit: 58,957,280
RAC: 0
Message 5423 - Posted: 14 May 2022, 13:18:22 UTC - in response to Message 5422.  

And another unhappy camper has moved on.

That is exactly what happens every time: a short burst of loads of results, and afterwards participation by regular participants plummets.

But it is the decision of the project owners. They get what they called for.
ID: 5423 · Rating: 0
Grant (SSSF)

Joined: 23 Apr 22
Posts: 167
Credit: 69,772,000
RAC: 0
Message 5448 - Posted: 16 May 2022, 5:26:39 UTC

Things are seriously bad at present.
Web site barely responding, forums often not responding.

I've had uploads that took 30 min of elapsed time to go through; in other words, several hours passed between the task completing and it actually being returned to the project. On top of that, it took dozens of Updates to finally get a Scheduler response that wasn't an error, so that I could report work and get more allocated.
Then came another half hour or more of retrying pending transfers to get the new tasks to eventually download.
Grant
Darwin NT
ID: 5448 · Rating: 0
frankhagen

Joined: 12 Aug 17
Posts: 21
Credit: 58,957,280
RAC: 0
Message 5454 - Posted: 16 May 2022, 11:37:36 UTC
Last modified: 16 May 2022, 11:38:53 UTC

FUBAR!

and the congrats go to

sILLYgERMANY.
ID: 5454 · Rating: 0
[DPC]Fractus

Joined: 21 Feb 22
Posts: 1
Credit: 8,886,667
RAC: 0
Message 5456 - Posted: 16 May 2022, 12:55:00 UTC

I don't really mind the uploads getting stuck for a day or so, but BOINC prevents me from getting new tasks because of the stalled uploads:
Mon 16 May 2022 02:09:32 PM CEST | Universe@Home | Not requesting tasks: too many uploads in progress
I'll run out of tasks in a few hours. Shame.
ID: 5456 · Rating: 0
pawg

Joined: 10 Mar 15
Posts: 25
Credit: 16,679,590
RAC: 0
Message 5461 - Posted: 16 May 2022, 15:56:30 UTC

2 days of chaos left
ID: 5461 · Rating: 0
Greger

Joined: 29 Oct 15
Posts: 4
Credit: 1,018,962,467
RAC: 0
Message 5463 - Posted: 16 May 2022, 17:24:15 UTC

I agree with frankhagen.

This happens over and over each year and people get tired of these events. I hold 9339 tasks right now: 50 of them are trying to download and the rest are trying to upload.
If a project needs to stress-test its servers, it would be great if that were done outside production.
I doubt that the gains from these short spikes cover what projects lose in the long run when people move to another project or even stop using BOINC altogether. Some would not care, or would not be able to reach the forum to get this info, so they step out.

This project has long deadlines, so it will get sorted out after this mess, but some projects have very short deadlines. The solutions admins could apply, such as adding more servers, increasing bandwidth, or increasing the batch or dataset size per unit, would only help temporarily. In the end they fall back to normal when the event is done, because the extra capacity is no longer needed and the wasted space or the cost of hardware or service would be too high.

For those who like these events: you can try Archive Team or a similar project, where you can make better use of your hosts than by hammering the servers of BOINC projects.
ID: 5463 · Rating: 0
Jean-David Beyer

Joined: 27 Nov 20
Posts: 12
Credit: 21,243,433
RAC: 0
Message 5469 - Posted: 16 May 2022, 21:44:20 UTC - in response to Message 5399.  

Me too. Some uploads take about 30 minutes of upload time to upload a 20-byte file. Here is a recent bunch.
Mon 16 May 2022 05:15:10 PM EDT | Universe@Home | Computation for task universe_bh2_190723_400_3612156388_20000_1-999999_160100_1 finished
Mon 16 May 2022 05:15:13 PM EDT | Universe@Home | Started upload of universe_bh2_190723_400_3612156388_20000_1-999999_160100_1_r817589796_0
Mon 16 May 2022 05:15:13 PM EDT | Universe@Home | Started upload of universe_bh2_190723_400_3612156388_20000_1-999999_160100_1_r817589796_1
Mon 16 May 2022 05:15:18 PM EDT |  | Project communication failed: attempting access to reference site
Mon 16 May 2022 05:15:18 PM EDT | Universe@Home | Temporarily failed upload of universe_bh2_190723_400_3612156388_20000_1-999999_160100_1_r817589796_0: transient HTTP error
Mon 16 May 2022 05:15:18 PM EDT | Universe@Home | Backing off 00:02:08 on upload of universe_bh2_190723_400_3612156388_20000_1-999999_160100_1_r817589796_0
Mon 16 May 2022 05:15:18 PM EDT | Universe@Home | Temporarily failed upload of universe_bh2_190723_400_3612156388_20000_1-999999_160100_1_r817589796_1: transient HTTP error
Mon 16 May 2022 05:15:18 PM EDT | Universe@Home | Backing off 00:02:36 on upload of universe_bh2_190723_400_3612156388_20000_1-999999_160100_1_r817589796_1
Mon 16 May 2022 05:15:18 PM EDT | Universe@Home | Started upload of universe_bh2_190723_400_3612156388_20000_1-999999_160100_1_r817589796_2
Mon 16 May 2022 05:15:18 PM EDT | Universe@Home | Started upload of universe_bh2_190723_400_3612156388_20000_1-999999_160100_1_r817589796_3
Mon 16 May 2022 05:15:20 PM EDT |  | Internet access OK - project servers may be temporarily down.
Mon 16 May 2022 05:15:35 PM EDT |  | Project communication failed: attempting access to reference site
Mon 16 May 2022 05:15:35 PM EDT | Universe@Home | Temporarily failed upload of universe_bh2_190723_400_3612156388_20000_1-999999_160100_1_r817589796_2: transient HTTP error
Mon 16 May 2022 05:15:35 PM EDT | Universe@Home | Backing off 00:02:46 on upload of universe_bh2_190723_400_3612156388_20000_1-999999_160100_1_r817589796_2
Mon 16 May 2022 05:15:35 PM EDT | Universe@Home | Temporarily failed upload of universe_bh2_190723_400_3612156388_20000_1-999999_160100_1_r817589796_3: transient HTTP error
Mon 16 May 2022 05:15:35 PM EDT | Universe@Home | Backing off 00:02:24 on upload of universe_bh2_190723_400_3612156388_20000_1-999999_160100_1_r817589796_3
Mon 16 May 2022 05:15:35 PM EDT | Universe@Home | Started upload of universe_bh2_190723_400_3612156388_20000_1-999999_160100_1_r817589796_4
Mon 16 May 2022 05:15:35 PM EDT | Universe@Home | Started upload of universe_bh2_190723_400_3612156388_20000_1-999999_160100_1_r817589796_5
Mon 16 May 2022 05:15:36 PM EDT |  | Internet access OK - project servers may be temporarily down.
Mon 16 May 2022 05:18:23 PM EDT |  | Project communication failed: attempting access to reference site
Mon 16 May 2022 05:18:23 PM EDT | Universe@Home | Temporarily failed upload of universe_bh2_190723_400_3612156388_20000_1-999999_160100_1_r817589796_4: transient HTTP error
Mon 16 May 2022 05:18:23 PM EDT | Universe@Home | Backing off 00:03:36 on upload of universe_bh2_190723_400_3612156388_20000_1-999999_160100_1_r817589796_4
Mon 16 May 2022 05:18:23 PM EDT | Universe@Home | Temporarily failed upload of universe_bh2_190723_400_3612156388_20000_1-999999_160100_1_r817589796_5: transient HTTP error
Mon 16 May 2022 05:18:23 PM EDT | Universe@Home | Backing off 00:02:37 on upload of universe_bh2_190723_400_3612156388_20000_1-999999_160100_1_r817589796_5
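
For anyone who wants to put numbers on a log like the one above, here is a minimal sketch (an illustration added in editing, not something from the original post). It assumes the event log has been saved as plain text in the same `timestamp | project | message` layout, and it only tallies the failed uploads and the back-off time the client announced. The function name `summarize_backoffs` and the shortened sample are illustrative.

```python
import re
from collections import Counter

# Patterns for the message part of lines such as:
#   "... | Universe@Home | Backing off 00:02:08 on upload of <file>"
#   "... | Universe@Home | Temporarily failed upload of <file>: <reason>"
BACKOFF_RE = re.compile(r"Backing off (\d+):(\d{2}):(\d{2}) on upload of (\S+)")
FAILED_RE = re.compile(r"Temporarily failed upload of (\S+): (.+)$")


def summarize_backoffs(lines):
    """Return (failure counts by reason, total announced back-off in seconds)."""
    failures = Counter()
    backoff_seconds = 0
    for line in lines:
        # Assumed layout: "timestamp | project | message"; keep only the message.
        message = line.split("|", 2)[-1].strip()
        failed = FAILED_RE.search(message)
        if failed:
            failures[failed.group(2)] += 1
            continue
        backoff = BACKOFF_RE.search(message)
        if backoff:
            hours, minutes, seconds = (int(backoff.group(i)) for i in (1, 2, 3))
            backoff_seconds += hours * 3600 + minutes * 60 + seconds
    return failures, backoff_seconds


# Three lines copied from the log above.
sample = [
    "Mon 16 May 2022 05:15:18 PM EDT | Universe@Home | Temporarily failed upload of universe_bh2_190723_400_3612156388_20000_1-999999_160100_1_r817589796_0: transient HTTP error",
    "Mon 16 May 2022 05:15:18 PM EDT | Universe@Home | Backing off 00:02:08 on upload of universe_bh2_190723_400_3612156388_20000_1-999999_160100_1_r817589796_0",
    "Mon 16 May 2022 05:15:18 PM EDT | Universe@Home | Backing off 00:02:36 on upload of universe_bh2_190723_400_3612156388_20000_1-999999_160100_1_r817589796_1",
]
failures, total = summarize_backoffs(sample)
print(dict(failures), total)
```

Feeding the whole saved log through the same function would show how much of the wall-clock delay is announced back-off as opposed to time spent waiting on the server.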
ID: 5469 · Rating: 0
Khali

Joined: 5 May 22
Posts: 3
Credit: 446,667
RAC: 0
Message 5474 - Posted: 17 May 2022, 0:30:26 UTC

I find the whole situation utterly ridiculous. I have not been able to upload or download anything from the server for 6 days now.
ID: 5474 · Rating: 0
vaughan

Joined: 4 Feb 15
Posts: 7
Credit: 158,219,834
RAC: 0
Message 5476 - Posted: 17 May 2022, 2:32:27 UTC - in response to Message 5474.  

I agree with Khali.

The main project page reckons the SSDs can cope but clearly the project is hopelessly overwhelmed. I'm moving on to other projects such as PrimeGrid and SRBase that are able to cope with the workload.
ID: 5476 · Rating: 0
Grant (SSSF)

Joined: 23 Apr 22
Posts: 167
Credit: 69,772,000
RAC: 0
Message 5480 - Posted: 17 May 2022, 6:09:04 UTC - in response to Message 5476.  
Last modified: 17 May 2022, 6:23:36 UTC

vaughan wrote: "The main project page reckons the SSDs can cope but clearly the project is hopelessly overwhelmed. I'm moving on to other projects such as PrimeGrid and SRBase that are able to cope with the workload."
The project is able to cope with the usual workload, when it's not being smashed by a group of people having a competition.
Give it 3-4 days and things should be back to normal, once the relentless load (what is effectively a DDoS attack) comes to an end and the servers can catch up.
Grant
Darwin NT
ID: 5480 · Rating: 0
Grant (SSSF)

Joined: 23 Apr 22
Posts: 167
Credit: 69,772,000
RAC: 0
Message 5481 - Posted: 17 May 2022, 6:20:07 UTC
Last modified: 17 May 2022, 6:52:12 UTC

And things are now even worse today than they were yesterday. 30 min+ for uploads to go through, and that's with over an hour of manual intervention.
Forums timing out even more often. Website even more sluggish.

And looking at the server graph, there is barely a graph there, as the servers can't even supply occasional status updates. What is there shows just how bad things are: the amount of work being done is now less than prior to the competition, down from roughly 250k-300k per day to around 100k.
The servers are so overwhelmed that people can't upload work, and if you can't upload it then you can't report it, and if you can't report it you can't get more work.


EDIT- 55min Elapsed time (so far) for one upload that just can't get through.
Grant
Darwin NT
ID: 5481 · Rating: 0
Jean-David Beyer

Joined: 27 Nov 20
Posts: 12
Credit: 21,243,433
RAC: 0
Message 5486 - Posted: 17 May 2022, 12:11:51 UTC - in response to Message 5481.  

It is so bad that first I set the project to no new tasks.
Now I have additionally suspended the entire project so as to generate no new uploads. I will not resume the project until all my uploads have gone through; I have at least two screens full of them. Occasionally one or two go through, iff I babysit them.

It is so bad I could not even post this: took three tries.
ID: 5486 · Rating: 0
wildhagen

Joined: 25 Feb 22
Posts: 6
Credit: 77,625,333
RAC: 0
Message 5490 - Posted: 17 May 2022, 14:36:03 UTC

This is getting completely ridiculous. I cannot upload or report ANYTHING for days on end now.

When will we be able to upload again???
ID: 5490 · Rating: 0
supdood

Joined: 4 May 18
Posts: 11
Credit: 12,895,467
RAC: 0
Message 5492 - Posted: 17 May 2022, 14:45:52 UTC

What's getting ridiculous is the continued griping by folks who clearly haven't bothered to read the messages posted before them. Yes, this is frustrating for those of us who were happily contributing before the competition came and overloaded the server capacity. No, the servers don't suck: they are handling over 3x the normal load and have been doing so continuously for nearly two weeks. That seems pretty robust to me. No, the admin isn't an amateur. This one might surprise you, but no, your foul language and selfish attitude don't help anything. There are other BOINC projects worthy of your CPU cycles, or take a break for a few days! Either is better than ranting and cussing about it on a message board.

The competition ends at 19 May 2022 00:00 UTC. I'm guessing that we should see server loads decreasing within several hours after that time as participants finish the tasks in their queue.

In summary, calm down, wait it out, and engage with SETI Germany to determine how their competition can have a lesser impact on projects going forward.
ID: 5492 · Rating: 0
Krzysztof Piszczek - wspieram ...
Project administrator
Project developer
Project tester
Joined: 4 Feb 15
Posts: 847
Credit: 144,180,465
RAC: 0
Message 5493 - Posted: 17 May 2022, 15:47:45 UTC - in response to Message 5489.  

Quoted from Message 5489: "I'm tired of this bullshit project. Fucking server sucks, the whole thing is run by amateurs who don't know what the fuck they're doing. Can't upload shit. Website nearly unresponsive. No sign of any action by the "administrator". Fuck off, I'm gone to put my time into a project that's worth the effort."

Bye
Krzysztof 'krzyszp' Piszczek

Member of Radioactive@Home team
My Patreon profile
Universe@Home on YT
ID: 5493 · Rating: 0
SuicideCabbage

Joined: 13 Nov 17
Posts: 1
Credit: 134,548,667
RAC: 0
Message 5497 - Posted: 17 May 2022, 17:04:13 UTC

Did you happen to forget to increase the rather low default number of maximum sockets available on the machine? When I do win the lottery and get a pipe, the transfer speeds are decent, so it doesn't appear to be a bandwidth issue. I've also noticed that if one upload is successful, I normally manage to get 2-3 out right after that. I'd look into that, as well as maybe some other tuning recommendations in the BOINC documentation: https://boinc.berkeley.edu/trac/wiki/MultiHost#Increasingnetworkperformance
The server seems to be strong enough, even if it can't quite keep up the feeder. The bottleneck is for sure the network, but I am not convinced it's a bandwidth limitation; it looks more like a misconfigured setting on the host itself. See https://levelup.gitconnected.com/linux-kernel-tuning-for-high-performance-networking-high-volume-incoming-connections-196e863d458a for another quick overview on tuning the kernel for a high number of connections. If you determine this is an issue and want more information, private message me.
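
As a starting point for checking the kind of limits mentioned above, here is a small sketch (added in editing, not from the original post). It assumes a Linux host and only reads a few kernel settings that are commonly discussed for servers handling many incoming connections. Whether any of them is actually the bottleneck on this project's server is an assumption, and the selection of settings is illustrative, not a tuning recommendation.

```python
from pathlib import Path

# Linux settings often looked at when a server accepts many incoming
# connections. That any of them limits this server is an assumption.
KERNEL_SETTINGS = {
    "net.core.somaxconn": "/proc/sys/net/core/somaxconn",
    "net.ipv4.tcp_max_syn_backlog": "/proc/sys/net/ipv4/tcp_max_syn_backlog",
    "fs.file-max": "/proc/sys/fs/file-max",
}


def read_kernel_settings(settings=KERNEL_SETTINGS):
    """Return {name: value or None}; None means the file could not be read."""
    values = {}
    for name, path in settings.items():
        try:
            values[name] = Path(path).read_text().strip()
        except OSError:
            # Not Linux, not permitted, or not exposed in this environment.
            values[name] = None
    return values


for name, value in read_kernel_settings().items():
    print(f"{name} = {value if value is not None else 'unavailable'}")
```

The script changes nothing; it just prints the current values so they can be compared against whatever the tuning guides linked above suggest.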
ID: 5497 · Rating: 0
Krzysztof Piszczek - wspieram ...
Project administrator
Project developer
Project tester
Joined: 4 Feb 15
Posts: 847
Credit: 144,180,465
RAC: 0
Message 5498 - Posted: 17 May 2022, 17:27:13 UTC - in response to Message 5497.  
Last modified: 17 May 2022, 17:31:17 UTC

The problem now is the very large number of concurrent HTTP connections.
Our server handles about 300 simultaneous connections, and that is the maximum it can do.

Setting the limit higher makes the server very unstable (in terms of the daemons, not the machine or the operating system) and causes more problems than it solves.

I have already implemented most (if not all) of the methods of increasing server performance documented by BOINC, but we are limited by the network and by what the system can handle.
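
Why a fixed connection cap acts as a hard throughput ceiling can be sketched with Little's law (throughput = concurrency / average time per request). This is a rough illustration added in editing, not a measurement from this server: the 300 comes from the post above, while the request durations are made-up values chosen only to show the trend.

```python
def max_requests_per_second(concurrent_connections, avg_seconds_per_request):
    """Little's law: throughput = concurrency / average time in the system."""
    if avg_seconds_per_request <= 0:
        raise ValueError("average request time must be positive")
    return concurrent_connections / avg_seconds_per_request


CONNECTION_LIMIT = 300  # figure quoted in the post
for avg_seconds in (0.5, 2.0, 10.0):  # hypothetical request durations
    rate = max_requests_per_second(CONNECTION_LIMIT, avg_seconds)
    print(f"{avg_seconds:>4} s per request -> at most {rate:.0f} requests/s")
```

The slower each request becomes under load, the fewer requests fit through the same 300 slots, which would be consistent with clients retrying for hours once the server is saturated.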
Krzysztof 'krzyszp' Piszczek

Member of Radioactive@Home team
My Patreon profile
Universe@Home on YT
ID: 5498 · Rating: 0
Copyright © 2024 Copernicus Astronomical Centre of the Polish Academy of Sciences
Project server and website managed by Krzysztof 'krzyszp' Piszczek