1) Message boards : Number crunching : Computer not receiving Work units (Message 1573)
Posted 22 Sep 2016 by Henk Haneveld
Post:
The work distribution has stopped again.

I have not received any work on my Windows host for several days, and now my tablet cannot get any Android work either.
2) Message boards : Number crunching : Computer not receiving Work units (Message 1557)
Posted 12 Sep 2016 by Henk Haneveld
Post:
Yes, it is a deadlock situation, so I will not use the reliable hosts option any more.

Still no work. Is the disabling of this option in progress, and should I just wait until you're done, or is something else blocking work transmission?
3) Message boards : Number crunching : Computer not receiving Work units (Message 1550)
Posted 11 Sep 2016 by Henk Haneveld
Post:
If you look on your account page at "computers on this account" and then "details" and then "Application details" you will see that the valid task count is per application. So running one application to get "trusted" status overall will not work.
4) Message boards : Number crunching : Computer not receiving Work units (Message 1548)
Posted 11 Sep 2016 by Henk Haneveld
Post:
I am not getting any results at the moment on one of my hosts. Because cancelled results are marked as errors, this host has lost its "trusted" status.
And resends are not sent to a host without "trusted" status.

I therefore have a Catch-22 problem: I get no work because my host is not "trusted", and I cannot earn "trusted" status because I get no work.

This is a deadlock situation in the server setup.
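
To make the deadlock concrete, here is a minimal sketch in Python. It is a toy model, not the project's server code: the `Host` class, the `sendable` helper and the threshold of 10 consecutive valid results are assumptions for illustration (the threshold is the figure mentioned elsewhere in this thread).

```python
TRUST_THRESHOLD = 10  # assumption: consecutive valid results needed for "trusted"


class Host:
    """Hypothetical host record that tracks a streak of valid results."""

    def __init__(self, consecutive_valid=0):
        self.consecutive_valid = consecutive_valid

    @property
    def trusted(self):
        return self.consecutive_valid >= TRUST_THRESHOLD

    def report(self, outcome):
        # Any non-valid outcome, including a server-side cancel, resets the streak.
        if outcome == "valid":
            self.consecutive_valid += 1
        else:
            self.consecutive_valid = 0


def sendable(host, queue):
    """Return the queued task kinds this host is allowed to receive."""
    return [kind for kind in queue if kind != "resend" or host.trusted]


host = Host(consecutive_valid=12)
host.report("cancelled")               # streak resets, trust is lost
print(host.trusted)                    # False
print(sendable(host, ["resend"] * 3))  # [] : no work, so no way to rebuild the streak
```

As long as the queue only holds work reserved for trusted hosts, the streak can never grow again, which is the deadlock described above.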
5) Message boards : Number crunching : Distribution to Android devices (Message 1509)
Posted 4 Sep 2016 by Henk Haneveld
Post:
I noticed something interesting today. The BH2s results I downloaded yesterday were created on 1 September. The results downloaded today were created on 29 August.

Is the problem with old unsent results just a simple sorting problem in the ready to send queue?
Are newly created results put before old results?
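
A small sketch of why the ordering would matter, assuming a toy queue where a fixed number of results is created and sent each day. The function `run` and its parameters are hypothetical and do not describe how the real scheduler picks results; they only compare "oldest first" with "newest first".

```python
from collections import deque


def run(policy, days=3, created_per_day=3, sent_per_day=2):
    """Simulate a ready-to-send queue; each entry is its creation day."""
    queue = deque()
    for day in range(days):
        for _ in range(created_per_day):
            queue.append(day)  # new results always join at the right end
        for _ in range(sent_per_day):
            if not queue:
                break
            if policy == "oldest_first":
                queue.popleft()
            else:  # "newest_first"
                queue.pop()
    return list(queue)


print(run("oldest_first"))  # [2, 2, 2]
print(run("newest_first"))  # [0, 1, 2]
```

With "newest first" and more work created than sent, results from day 0 never leave the queue, which would match old results staying unsent while new ones go out.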
6) Message boards : Number crunching : Computer not receiving Work units (Message 1482)
Posted 1 Sep 2016 by Henk Haneveld
Post:
The new batches are helping computers to get reliable status (there are no special conditions for it except HR), so I need to generate them anyway.
Also, I see on the server side that the number of computers without tasks is going lower every day.

You are missing the point. It looks to me like there are a lot of older unsent results. At the same time, new work is generated and sent out.
If you let the queue run dry, then all those old results will either flush from the queue or show that there is a problem with sending them.
You are avoiding fixing the problem that some results do not get sent out and stay in the queue forever.
It is better to have some users without work for a short while than to increase the problems indefinitely.
7) Message boards : Number crunching : Computer not receiving Work units (Message 1479)
Posted 1 Sep 2016 by Henk Haneveld
Post:
Perhaps you should stop generating new work for each of the applications until the ready-to-send queue for that application is completely empty. Hopefully this will flush out all the unsent results that seem to be stuck in there, or at least show how big this problem is.

Once the queue is empty, create a small batch of new work and then wait until the queue is empty again.
This will help you manage what is in progress and prevent users from having a growing number of results waiting for validation.

This is not a permanent solution, but it will give you time to fix the settings problem.
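
The procedure above can be sketched as a simple loop. This is only an illustration under assumed numbers: `simulate`, `sendable`, `stuck`, `sent_per_cycle` and `batch_size` are hypothetical names, and "stuck" stands for results that never get sent for whatever reason.

```python
def simulate(sendable, stuck, sent_per_cycle, cycles, batch_size=4):
    """Return the queue length after each cycle when generation is gated."""
    history = []
    for _ in range(cycles):
        # Only create a new small batch once the queue is completely empty.
        if sendable + stuck == 0:
            sendable += batch_size
        sendable -= min(sendable, sent_per_cycle)
        history.append(sendable + stuck)
    return history


print(simulate(sendable=5, stuck=0, sent_per_cycle=2, cycles=5))  # [3, 1, 0, 2, 0]
print(simulate(sendable=5, stuck=3, sent_per_cycle=2, cycles=5))  # [6, 4, 3, 3, 3]
```

In the second run the queue settles at 3 and never empties, so the stuck results become visible instead of being hidden behind freshly generated work.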
8) Message boards : Number crunching : Computer not receiving Work units (Message 1473)
Posted 31 Aug 2016 by Henk Haneveld
Post:
For analysis purposes we need the whole batch of results in a particular series, because even if we have 95% of the series computed we still need to wait for the last 5% to start analysis...
For some WUs it happens that a particular WU isn't computed in time by the first host, then it goes to a second one where the same happens, and sometimes we need to wait months for a few WUs... This is the reason why we started to use reliable hosts for some of the batches (in fact, every 4th batch is now based on reliable hosts), including all UV tasks (as they are short).

This is clearly not working. If it did, it would be impossible to have results waiting for validation because the partner result has not yet been sent out. You only have to check my results page to see that I have had results in that status for more than 4 weeks. It is possible that your number of users is too low for this policy to work correctly.

Edit: May I suggest that you give resends (results with -2 and higher) priority over -0 and -1 results when sending out work.

Edit2: Also give resends a shorter return deadline.
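
A minimal sketch of both suggestions, assuming each result carries a replica number where 0 and 1 are the initial pair and 2 and higher are resends. The `Result` class, `send_order`, `deadline` and the two deadline values are hypothetical and not taken from the project's configuration.

```python
from dataclasses import dataclass
from datetime import timedelta

NORMAL_DEADLINE = timedelta(days=14)  # hypothetical value
RESEND_DEADLINE = timedelta(days=7)   # hypothetical value


@dataclass(frozen=True)
class Result:
    workunit: str
    replica: int  # 0 and 1 form the initial pair; 2 and higher are resends

    @property
    def is_resend(self):
        return self.replica >= 2


def send_order(results):
    # sorted() is stable, so results keep their queue order within each group.
    return sorted(results, key=lambda r: not r.is_resend)


def deadline(result):
    return RESEND_DEADLINE if result.is_resend else NORMAL_DEADLINE


queue = [Result("a", 0), Result("b", 2), Result("c", 1), Result("d", 3)]
print([r.workunit for r in send_order(queue)])  # ['b', 'd', 'a', 'c']
```

Resends move to the front without reordering anything else, and they get the shorter deadline.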
9) Message boards : Number crunching : Computer not receiving Work units (Message 1471)
Posted 31 Aug 2016 by Henk Haneveld
Post:
There is another problem as well: about half of the tasks are created for "reliable hosts", which means that only hosts with at least 10 consecutive correct tasks done get new jobs.

I have had several results that were "cancelled by server" listed as errors on my results page. This means that reliable hosts lose that status even though no error occurred.
If the validation process is solid, there is no real need for the use of reliable hosts.
10) Message boards : Number crunching : Distribution to Android devices (Message 1456)
Posted 27 Aug 2016 by Henk Haneveld
Post:
The problem of results stuck in "validation pending" status because their partner result is still unsent is still not fixed.
I have 30 returned results from 4 weeks ago with this problem.
11) Message boards : Number crunching : Distribution to Android devices (Message 1356)
Posted 13 Aug 2016 by Henk Haneveld
Post:

1. A lot of partner results are still waiting to be sent out. So there is either a problem with distribution or the number of users with Android devices is very low.
That affects every client. We all have pending results because their partners are waiting to be sent out.

I understand that there may be some delay, but if it takes 2 weeks to send out the wingman result, there is something wrong.
12) Message boards : Number crunching : Download errors (Message 1349)
Posted 12 Aug 2016 by Henk Haneveld
Post:
The "Process creation failed" error happens because the Windows exe file is not compiled for Win XP.

Change the major version ID from 6 to 5 and compile again.

I have fixed this for myself by editing the BHspin2 exe file using PEinfo.exe.
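
For anyone who wants to check an exe before running it, here is a rough Python sketch that reads the version fields from the PE optional header. It assumes that "major version id" refers to the MajorOperatingSystemVersion and MajorSubsystemVersion fields, which sit at offsets 40 and 48 of the optional header; the function names are my own, and the patch helper only rewrites those two fields in memory. It does not touch the checksum or anything else the exe may need.

```python
import struct


def _optional_header_offset(data):
    """Locate the PE optional header inside the raw bytes of an exe."""
    if data[:2] != b"MZ":
        raise ValueError("not an MZ executable")
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)  # e_lfanew
    if data[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("PE signature not found")
    return pe_offset + 4 + 20  # skip the signature and the COFF file header


def read_pe_versions(data):
    """Return the (major, minor) OS and subsystem versions from the header."""
    opt = _optional_header_offset(data)
    os_version = struct.unpack_from("<HH", data, opt + 40)
    subsystem_version = struct.unpack_from("<HH", data, opt + 48)
    return {"os": os_version, "subsystem": subsystem_version}


def set_major_versions(data, major):
    """Return a copy of the bytes with both major version fields replaced."""
    patched = bytearray(data)
    opt = _optional_header_offset(patched)
    struct.pack_into("<H", patched, opt + 40, major)
    struct.pack_into("<H", patched, opt + 48, major)
    return bytes(patched)
```

Usage would be along the lines of reading the file with `open(path, "rb").read()`, calling `read_pe_versions` on the bytes, and writing the result of `set_major_versions(data, 5)` to a copy of the file rather than over the original.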
13) Message boards : Number crunching : Distribution to Android devices (Message 1324)
Posted 7 Aug 2016 by Henk Haneveld
Post:
My number of results with status "waiting for validation" for Bhspin V2 keeps growing, so I did some checking.

I found 2 problems.

1. A lot of partner results are still waiting to be sent out. So there is either a problem with distribution or the number of users with Android devices is very low.

2. In those cases where there is a wingman, I found that all of those using Android version 6.01 end in "error while computing".
I use Android version 4.4.2. Tasks there run slowly but finish without problems.
14) Message boards : Number crunching : Android App run dry (Message 1177)
Posted 14 May 2016 by Henk Haneveld
Post:
4 results returned (1 valid, 3 pending)
4 more in progress

No errors.
15) Message boards : Number crunching : Android App run dry (Message 1169)
Posted 9 May 2016 by Henk Haneveld
Post:
I am willing to have a go at long run times.

Copyright © 2024 Copernicus Astronomical Centre of the Polish Academy of Sciences
Project server and website managed by Krzysztof 'krzyszp' Piszczek