Message boards : News : BHDB application

scole of TSBT
Joined: 22 Feb 15
Posts: 24
Credit: 250,365,396
RAC: 416
Message 2668 - Posted: 10 Mar 2018, 18:21:42 UTC
Last modified: 10 Mar 2018, 18:45:00 UTC

I have plenty of disk space. Increase <rsc_disk_bound>900000000.000000</rsc_disk_bound> to
<rsc_disk_bound>1800000000.000000</rsc_disk_bound> and let's see how they run.

Gunnar Hjern
Joined: 4 Nov 16
Posts: 20
Credit: 118,453,585
RAC: 0
Message 2669 - Posted: 10 Mar 2018, 19:33:41 UTC - in response to Message 2664.  
Last modified: 10 Mar 2018, 19:47:02 UTC

Well, at least it seems that things have started to go better now.
I have 8 valids and 13 pending, so I'll keep on crunching at least until tomorrow.

I've noticed that no task with a number lower than
180109_5_89xxxxxx_xxx...
completes, but I haven't had any errors with tasks above that number, so I've now ditched all the ones below it.

Some "lemons" are still coming in, but now most new tasks have numbers above
180109_5_100xxxxxx_xxx...

Let's see what's gonna happen next.

Nice weekend to all of U!

//Gunnar

PDW
Joined: 11 Mar 15
Posts: 19
Credit: 390,816,392
RAC: 6,268
Message 2670 - Posted: 10 Mar 2018, 19:50:21 UTC - in response to Message 2669.  

I've had _48 and _59 complete and validate and have 6 more _59 WUs that have completed and are pending.
Haven't had any above _59 yet.

pututu
Joined: 7 Jun 16
Posts: 9
Credit: 121,795,337
RAC: 0
Message 2671 - Posted: 10 Mar 2018, 23:52:46 UTC

I've yet to find which WU numbers are stable. When I download WUs, I manually check whether each one already has more than one error and, if so, abort it. See some examples below.
https://universeathome.pl/universe/workunit.php?wuid=14675480
https://universeathome.pl/universe/workunit.php?wuid=14676630

Can the project admin set the maximum number of errors in the "max # of error/total/success tasks" setting to something like 1 or 2? It is currently set to 4, which I think is too high.

Gunnar Hjern
Joined: 4 Nov 16
Posts: 20
Credit: 118,453,585
RAC: 0
Message 2672 - Posted: 11 Mar 2018, 0:20:59 UTC - in response to Message 2671.  

Hi Pututu!

The wuids that you list don't tell me much. It's rather the WU names,
those that you find on the topmost row of those linked pages,
that I've found tell more about which version they are:

The wuid=14675480
has the name:
"universe_bhdb_180109_5_70479930_20000_1-999999_480200"
and in my experience tasks with a lower number than
...180109_5_89xxxxxx_...
are not likely to complete! :-(

Tasks with that number or higher, on the other hand, have been 100% successful for me this evening! :-)
As the task names are clearly visible in the rightmost column in the client, it's easy to find and ditch all the lower-numbered WUs.
Since approx. 18:00 this evening I've completed no fewer than 73 WUs, of which 43 are validated and 30 pending!

It's obvious that this project is now working again! :-) :-) :-)

Nice weekend!!!

//Gunnar
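
The rule of thumb in the post above can be written down as a small filter. This is only a sketch: the field layout (`universe_bhdb_<batch>_<series>_<number>_...`) is inferred from the one name quoted in this thread, reading "89xxxxxx" as "89,000,000 and up" is an assumption, and the function names are made up for illustration. It is a heuristic from one user's machines, not a project rule.

```python
import re
from typing import NamedTuple, Optional

# Assumed layout, inferred only from the name quoted in this thread:
#   universe_bhdb_<batch>_<series>_<number>_...
WU_NAME = re.compile(r"^universe_bhdb_(\d+)_(\d+)_(\d+)_")

# "89xxxxxx" read as an eight-digit number starting with 89 (assumption).
THRESHOLD = 89_000_000


class WuInfo(NamedTuple):
    batch: str
    series: int
    number: int


def parse_wu_name(name: str) -> Optional[WuInfo]:
    """Split a BHDB work unit name into its leading fields, or return None."""
    m = WU_NAME.match(name)
    if m is None:
        return None
    return WuInfo(m.group(1), int(m.group(2)), int(m.group(3)))


def likely_to_fail(name: str) -> bool:
    """Apply the 'series 5 below 89xxxxxx' rule of thumb to a task name."""
    info = parse_wu_name(name)
    if info is None or info.series != 5:
        return False  # the rule says nothing about other tasks
    return info.number < THRESHOLD


if __name__ == "__main__":
    quoted = "universe_bhdb_180109_5_70479930_20000_1-999999_480200"
    print(parse_wu_name(quoted))
    print(likely_to_fail(quoted))
```

For the name quoted above the number field is 70479930, which is below the threshold, so the filter flags it.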

mmonnin
Joined: 2 Jun 16
Posts: 169
Credit: 317,253,046
RAC: 6
Message 2673 - Posted: 11 Mar 2018, 2:13:58 UTC - in response to Message 2672.  
Last modified: 11 Mar 2018, 2:15:52 UTC

Gunnar Hjern wrote:
Hi Pututu!

The wuids that you list don't tell me much. It's rather the WU names,
those that you find on the topmost row of those linked pages,
that I've found tell more about which version they are:

The wuid=14675480
has the name:
"universe_bhdb_180109_5_70479930_20000_1-999999_480200"
and in my experience tasks with a lower number than
...180109_5_89xxxxxx_...
are not likely to complete! :-(

Tasks with that number or higher, on the other hand, have been 100% successful for me this evening! :-)
As the task names are clearly visible in the rightmost column in the client, it's easy to find and ditch all the lower-numbered WUs.
Since approx. 18:00 this evening I've completed no fewer than 73 WUs, of which 43 are validated and 30 pending!

It's obvious that this project is now working again! :-) :-) :-)

Nice weekend!!!

//Gunnar


All tasks should work, not just some.

I've had two of the 999.. tasks error.
https://universeathome.pl/universe/result.php?resultid=33321174

Jim1348
Joined: 28 Feb 15
Posts: 253
Credit: 200,562,581
RAC: 0
Message 2674 - Posted: 11 Mar 2018, 2:57:06 UTC - in response to Message 2672.  
Last modified: 11 Mar 2018, 3:06:44 UTC

Gunnar Hjern wrote:
The wuid=14675480
has the name:
"universe_bhdb_180109_5_70479930_20000_1-999999_480200"
and in my experience tasks with a lower number than
...180109_5_89xxxxxx_...
are not likely to complete! :-(

Tasks with that number or higher, on the other hand, have been 100% successful for me this evening! :-)

That rule mostly works the same for me on my i7-4770 (Ubuntu 16.04). However, on my Ryzen 1700 (Lubuntu 17.10) there are a lot of errors even above that number, but a few successes even below.

I think that Krzysztof will be working on it for a while to get it right.

Krzysztof Piszczek - wspieram ...
Project administrator
Project developer
Project tester
Joined: 4 Feb 15
Posts: 841
Credit: 144,180,465
RAC: 2
Message 2675 - Posted: 11 Mar 2018, 6:43:09 UTC

Yes, I found another error. Because saving and zipping the result files takes a moment, the Manager thinks that something is wrong with the application and kills it. This is explained here:
I tested the workunit standalone and found no issues. In looking through the client code it looks like this condition occurs when the client finds that the boinc finish file has been written to disk but the science application process is still running. Since the finish file was written then there must be a hang in boinc_finish somewhere. Or it could be a bug or race condition in the client causing a false positive.

So, in app version 0.03 I have added a 2-second "sleep" before the call to the boinc_finish() function to prevent this.
Also, I just discovered that under some conditions the temporary result files can grow to... 1.3GB!

So, from series "6" (short WUs, 5 to 15 minutes) and batches above "6" (normal, long WUs) I had to set the limit to 1.5GB.

At the moment I see that series 6 and higher finish properly...
Krzysztof 'krzyszp' Piszczek

Member of Radioactive@Home team
My Patreon profile
Universe@Home on YT
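
The disk figures mentioned in this thread can be compared with a few lines of arithmetic. This is a sketch under stated assumptions: "GB" is taken as a decimal gigabyte (10^9 bytes), the 900000000 and 1800000000 values are the rsc_disk_bound figures quoted earlier in the thread, and the helper name `fits` is made up for illustration.

```python
GB = 1_000_000_000  # decimal gigabyte; an assumption about how "GB" is meant


def fits(peak_bytes: float, disk_bound_bytes: float) -> bool:
    """Return True if the peak temporary-file size stays within the bound."""
    return peak_bytes <= disk_bound_bytes


# Peak size of the temporary result files reported in the post above.
peak = 1.3 * GB

bounds = {
    "rsc_disk_bound quoted earlier (900000000)": 900_000_000.0,
    "new limit from the post above (1.5GB)": 1.5 * GB,
    "doubled bound suggested earlier (1800000000)": 1_800_000_000.0,
}

for label, bound in bounds.items():
    verdict = "fits" if fits(peak, bound) else "exceeds the bound"
    print(f"{label}: {verdict}")
```

Only the 900000000-byte bound is exceeded by a 1.3GB peak; both the 1.5GB limit and the doubled bound leave headroom.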

Jim1348
Joined: 28 Feb 15
Posts: 253
Credit: 200,562,581
RAC: 0
Message 2676 - Posted: 11 Mar 2018, 7:12:13 UTC - in response to Message 2675.  

Krzysztof 'krzyszp' Piszczek wrote:
At the moment I see that series 6 and higher finish properly...

Yes, they are all OK on both my i7-4770 and Ryzen 1700 machines. Good job.

jwalck
Joined: 28 Nov 17
Posts: 4
Credit: 35,241,140
RAC: 0
Message 2677 - Posted: 11 Mar 2018, 9:40:21 UTC

I had a bunch of failed series 5 WUs (and some successful ones too, ~30% of WUs). Series 6 seems to behave well and is steaming on now!

JagDoc
Help desk expert
Joined: 21 Feb 15
Posts: 83
Credit: 873,439,097
RAC: 988
Message 2679 - Posted: 11 Mar 2018, 12:35:03 UTC - in response to Message 2632.  

Would this application be modified for Raspberry Pis? Or only desktops and laptops?

It will be available for Armbian based devices probably tomorrow.

Any news about the app for ARM Linux?
Many Odroid-XU4, -HC1, -MC1 and -C2 are waiting for work.

alex
Joined: 21 Feb 15
Posts: 64
Credit: 65,733,511
RAC: 330
Message 2682 - Posted: 11 Mar 2018, 13:21:30 UTC

Looks good to me!
Congrats to you, Krzysztof, on a good job!

alex
Joined: 21 Feb 15
Posts: 64
Credit: 65,733,511
RAC: 330
Message 2683 - Posted: 11 Mar 2018, 13:33:56 UTC

I have updated one of my Androids to the new URL hoping for work. Unfortunately ...

Do you have plans to generate work for the Androids?

xii5ku
Joined: 9 Nov 17
Posts: 21
Credit: 563,207,000
RAC: 13
Message 2684 - Posted: 11 Mar 2018, 13:46:24 UTC - in response to Message 2675.  

Krzysztof 'krzyszp' Piszczek wrote:
Because saving and zipping the result files takes a moment, the Manager thinks that something is wrong with the application and kills it.
[...]
So, in app version 0.03 I have added a 2-second "sleep" before the call to the boinc_finish() function to prevent this.

Instead of the sleep, isn't there a way to actually wait for the compression and write to be finished? Should be possible if these are done in threads of (or child processes of) the science application, but I don't know if they are.
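
The "wait for the write instead of sleeping" idea from the post above can be illustrated in general terms. This is a minimal sketch, not the project's code: `write_results_zip` and `run` are made-up names, `finish` is a hypothetical stand-in for boinc_finish(), and nothing here shows that it would remove the race described in this thread. It only shows the pattern of closing the archive and flushing it to disk before signalling completion.

```python
import os
import tempfile
import zipfile


def write_results_zip(zip_path: str, files: dict) -> None:
    """Zip the result files and return only once the archive is on disk."""
    with open(zip_path, "wb") as raw:
        with zipfile.ZipFile(raw, "w", compression=zipfile.ZIP_DEFLATED) as zf:
            for name, data in files.items():
                zf.writestr(name, data)
        # The ZipFile is closed here, so its central directory is written.
        # Flush Python's buffer and ask the OS to push the bytes to disk.
        raw.flush()
        os.fsync(raw.fileno())


def finish(status: int) -> int:
    """Hypothetical stand-in for boinc_finish(); it only returns the status."""
    return status


def run() -> int:
    workdir = tempfile.mkdtemp()
    zip_path = os.path.join(workdir, "results.zip")
    write_results_zip(zip_path, {"result.dat": b"example payload"})
    # Completion is signalled only after the write above has returned.
    return finish(0)


if __name__ == "__main__":
    print(run())
```

In a single-process program the write is already synchronous, so the explicit flush and fsync mainly make the ordering visible; whether that is enough for a given application depends on where the delay actually comes from.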

Quasar
Joined: 1 Mar 18
Posts: 2
Credit: 825,167
RAC: 0
Message 2685 - Posted: 11 Mar 2018, 13:53:52 UTC
Last modified: 11 Mar 2018, 13:58:29 UTC

Sorry. Nearly 50 tasks have been ready for 2 days and my internet is stable, but I cannot send them, nor download nearly 100 new ones. What should I do? BOINC shows "sending complete tasks", but nothing has happened for 2 days.

Krzysztof Piszczek - wspieram ...
Project administrator
Project developer
Project tester
Joined: 4 Feb 15
Posts: 841
Credit: 144,180,465
RAC: 2
Message 2686 - Posted: 11 Mar 2018, 15:47:58 UTC - in response to Message 2679.  

JagDoc wrote:
Would this application be modified for Raspberry Pis? Or only desktops and laptops?

It will be available for Armbian based devices probably tomorrow.

Any news about the app for ARM Linux?
Many Odroid-XU4, -HC1, -MC1 and -C2 are waiting for work.

I had to sort out the problems with the app before compiling it for Android.
I will unpack my Odroid tonight or tomorrow and compile the application for it :)

Krzysztof Piszczek - wspieram ...
Project administrator
Project developer
Project tester
Joined: 4 Feb 15
Posts: 841
Credit: 144,180,465
RAC: 2
Message 2687 - Posted: 11 Mar 2018, 15:50:04 UTC - in response to Message 2684.  

Krzysztof 'krzyszp' Piszczek wrote:
Because saving and zipping the result files takes a moment, the Manager thinks that something is wrong with the application and kills it.
[...]
So, in app version 0.03 I have added a 2-second "sleep" before the call to the boinc_finish() function to prevent this.

Instead of the sleep, isn't there a way to actually wait for the compression and write to be finished? Should be possible if these are done in threads of (or child processes of) the science application, but I don't know if they are.

No, this app doesn't create child processes at all, so the 2-second wait is the only solution for the moment. Anyway, for ~2h tasks this isn't too long, I think ;) (I know, waiting for the write to finish would be more secure and elegant, but I can't do this in this app).

alex
Joined: 21 Feb 15
Posts: 64
Credit: 65,733,511
RAC: 330
Message 2688 - Posted: 11 Mar 2018, 16:12:55 UTC

I had to sort out the problems with the app before compiling it for Android.
I will unpack my Odroid tonight or tomorrow and compile the application for it :)
____________
Krzysztof 'krzyszp' Piszczek


You are my hero!

Chris
Joined: 29 Jan 18
Posts: 6
Credit: 523,333
RAC: 0
Message 2689 - Posted: 11 Mar 2018, 16:27:31 UTC

Universe@home transfers are not going through.
Is there a reason for that or am I doing something wrong?


Krzysztof Piszczek - wspieram ...
Project administrator
Project developer
Project tester
Joined: 4 Feb 15
Posts: 841
Credit: 144,180,465
RAC: 2
Message 2690 - Posted: 11 Mar 2018, 16:32:50 UTC

The server is delayed with transfers because of the large quantity of very short WUs. It should clear in about 3-4 hours.
Btw, what messages show up in the logs when you retry the upload?




Copyright © 2024 Copernicus Astronomical Centre of the Polish Academy of Sciences
Project server and website managed by Krzysztof 'krzyszp' Piszczek