21) Message boards : Number crunching : Couldn't Get Input Files (Message 5090)
Posted 16 Feb 2022 by Brummig
Post:
I've added an Android device to my U@H fleet, but it errors out on every WU with:
<core_client_version>7.18.1</core_client_version>
<![CDATA[
<message>
app_version download error: couldn't get input files:
<file_xfer_error>
  <file_name>BHspin2_20_arm-android-linux-gnu</file_name>
  <error_code>-224 (permanent HTTP error)</error_code>
  <error_message>permanent HTTP error</error_message>
</file_xfer_error>
<file_xfer_error>
  <file_name>jobbh2_20.0.xml</file_name>
  <error_code>-224 (permanent HTTP error)</error_code>
  <error_message>permanent HTTP error</error_message>
</file_xfer_error>
</message>
]]>

One of the WUs went to another Android device, and that also failed with the same error message.

WCG WUs are completing successfully.

Does anyone have any suggestions?
22) Message boards : News : Short breake. (Message 5088)
Posted 16 Feb 2022 by Brummig
Post:
Let's crunch again.

Like we did last summer ♫

Thanks, Krzysztof. What with WCG down for an extended time, and Einstein having problems with BRP4, my Pies are getting so hungry they are starting to eat their way out of their cases.
23) Message boards : Number crunching : Pi Power-Up (Message 5043)
Posted 14 Jan 2022 by Brummig
Post:
Some time ago I noticed that my Raspberry Pi 2 was significantly outperforming my Pi 3 running Universe@Home. I didn't have the time to look into it, but I just assumed it was a weak power supply on the Pi 3. The Pi 3 in question was powered from the USB socket on the back of my router (and it wasn't convenient to change this arrangement), whilst the Pi 2 was powered from a switched-mode power supply on a HAT that I designed myself. Over the holiday period I decided to look into it, and discovered my assumption was wrong.

To help me, I wrote two scripts, which I've provided below. These did indeed reveal the Pi 3 was running under-voltage, and as a result throttling the CPU, but not for the reason I thought. I discovered it also ran under-voltage when plugged into a 2 Amp wall-wart (and yes, it was bang on 5 V). There then followed much switching around of Pies, power supplies, and USB cables. This revealed that both Pies always ran under-voltage when I used a random USB cable that came with something. It made no difference whether the cable was anonymous rubbish, or came with a good quality branded product. The only cables with which they didn't run under-voltage were either of the two Anker USB cables I bought simply to allow me to charge and use a device at the same time without finding myself on a short leash that somehow shortened itself further by slowly tying itself in knots. With these cables it made no difference which (potentially adequate) power supply I used, despite the fact that these are quite long cables. I've since bought two more, and these also allow my Pies to perform to their full potential. This is probably because these Anker cables simply have a lower resistance than generic cables. As an added bonus, they are (approximately) Raspberry Pi red. Here's an Amazon link (UK site): https://www.amazon.co.uk/gp/product/B01DEMTOQ6/.
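
The cable-resistance explanation can be checked with Ohm's law: the voltage reaching the Pi is the supply voltage minus the load current times the cable's round-trip resistance. A minimal sketch follows; the helper name pi_voltage is hypothetical, and the numbers are illustrative assumptions, not measurements from these Pies or cables.

```shell
#!/bin/bash
# Hypothetical helper: voltage at the Pi = supply - current * cable resistance.
# Arguments: supply voltage (V), load current (A), round-trip cable resistance (ohm).
pi_voltage() {
  awk -v v="$1" -v i="$2" -v r="$3" 'BEGIN { printf "%.2f\n", v - i * r }'
}

# Assumed example values, not measurements:
pi_voltage 5 1 0.5   # higher-resistance cable
pi_voltage 5 1 0.1   # lower-resistance cable
```

With these assumed figures the first cable delivers 4.50 V and the second 4.90 V from the same 5 V supply, which is why the supply itself can measure bang on 5 V while the Pi still reports under-voltage.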

So you might want to use my scripts to check on your own Pies.

This isn't the end of the Pi 2 vs Pi 3 story, but I'll leave that for a second exciting, fun-packed episode.

~/bin/show-status:
#!/bin/bash

# I was fiddling around with the most convenient way (for me) to deal with the need to run vcgencmd as root.
# In the end I left it as a script where the script is called without "sudo", but the first command demands the password.
# You may want to do something different.

sudo echo ""

# Assume Bullseye...
VCGENCMD=/usr/bin/vcgencmd
if [ ! -f "$VCGENCMD" ];
then
  # Nope
  echo "Please upgrade me."
  VCGENCMD=/opt/vc/bin/vcgencmd
fi

awk 'NR==3 {print "WiFi Signal Strength=" $3 "00 %"}' /proc/net/wireless
sudo $VCGENCMD measure_volts
sudo $VCGENCMD measure_temp
sudo $VCGENCMD measure_clock arm

source <(sudo $VCGENCMD get_throttled)

# Bit Hex value   Meaning
# 0          1    Under-voltage detected
# 1          2    CPU frequency capped
# 2          4    Currently throttled
# 3          8    Soft temperature limit active
# 16     10000    Under-voltage has occurred
# 17     20000    CPU frequency capping has occurred
# 18     40000    Throttling has occurred
# 19     80000    Soft temperature limit has occurred

if [[ $(($throttled & 0x1)) -ne 0 ]];
then
  echo "Under-voltage detected"
elif [[ $(($throttled & 0x10000)) -ne 0 ]];
then
  echo "Under-voltage has occurred"
fi

if [[ $(($throttled & 0x2)) -ne 0 ]];
then
  echo "Frequency capped"
elif [[ $(($throttled & 0x20000)) -ne 0 ]];
then
  echo "Frequency capping has occurred"
fi

if [[ $(($throttled & 0x4)) -ne 0 ]];
then
  echo "Currently throttled"
elif [[ $(($throttled & 0x40000)) -ne 0 ]];
then
  echo "Throttling has occurred"
fi

if [[ $(($throttled & 0x8)) -ne 0 ]];
then
  echo "Soft temperature limit"
elif [[ $(($throttled & 0x80000)) -ne 0 ]];
then
  echo "Soft temperature limit has occurred"
fi

echo ""

~/bin/monitor-status:
#!/bin/bash
watch -n 30 "show-status"
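
The bit table in show-status can also be decoded in a loop, taking the value as an argument rather than sourcing command output. This is only a sketch of an alternative: decode_throttled is a hypothetical helper name, and the example call assumes get_throttled prints a line of the form throttled=0x50005.

```shell
#!/bin/bash
# Hypothetical helper: decode a get_throttled value passed as an argument.
# Bits 0-3 are the "now" flags; bits 16-19 are the matching "has occurred" flags.
decode_throttled() {
  local value=$(( $1 ))   # accepts hex such as 0x50005
  local bit=0
  local name
  for name in "Under-voltage" "Frequency capping" "Throttling" "Soft temperature limit"; do
    if [ $(( value & (1 << bit) )) -ne 0 ]; then
      echo "$name: active now"
    elif [ $(( value & (1 << (bit + 16)) )) -ne 0 ]; then
      echo "$name: has occurred"
    fi
    bit=$(( bit + 1 ))
  done
}

# Example call on a Pi (not run here):
# decode_throttled "$(sudo vcgencmd get_throttled | cut -d= -f2)"
```

As in the original script, a flag that is active now suppresses the matching "has occurred" message, so each condition is reported at most once.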
24) Message boards : Number crunching : Process still present 5 min after writing finish file (Message 5038)
Posted 12 Jan 2022 by Brummig
Post:
OK, thank you.
25) Message boards : Number crunching : Process still present 5 min after writing finish file (Message 5034)
Posted 12 Jan 2022 by Brummig
Post:
Three times now since December I've had tasks on different Raspberry Pies fail with:
<core_client_version>7.16.16</core_client_version>
<![CDATA[
<message>
Process still present 5 min after writing finish file; aborting</message>
<stderr_txt>
00:20:43 (10500): called boinc_finish(0)

</stderr_txt>
]]>

This is generated after the task has been crunching for a length of time that suggests it completed normally (as does the error message). The oldest of these tasks was completed successfully by two other hosts, and validated (the other two are "in progress" by wingmen).

Most tasks on the hosts in question are completing and being validated.

What's this all about? Is it something I need to fix my end?
26) Message boards : News : No tasks (Message 5033)
Posted 12 Jan 2022 by Brummig
Post:
Well done, Krzysztof, for quickly sorting out a major failure. I did see your message on BOINC Stats, though only after someone mentioned it on the BOINC forum (https://boinc.berkeley.edu/forum_thread.php?id=10279&postid=106682), which I was monitoring for news of Universe@Home. To me that would seem like a more logical place to post when the Universe@Home server is dead, unless you can have a Raspberry Pi ready and waiting to provide a "Normal service will be resumed as soon as possible" message :).
27) Message boards : Number crunching : Cannot add Universe to a new Computer (Message 4958)
Posted 29 Nov 2021 by Brummig
Post:
@jackielan2000. Is that 32 bit or 64 bit Win XP? Universe@Home on Windows is 64 bit only.
28) Message boards : Number crunching : When will I get my results validated ? (Message 4952)
Posted 22 Nov 2021 by Brummig
Post:
My advice is the same as I would give you for a glass of Guinness. Wait patiently until it's ready, and then enjoy.

All your tasks are currently listed here:

https://universeathome.pl/universe/results.php?hostid=492004

Some are awaiting validation by other hosts. They may complete a task before your host does, or they may take so long the task is passed to a third host (and if you're really unlucky, a fourth). When a task is validated, your tasks list will be updated.

As for your uncompleted tasks, if you haven't yet hit the deadline, just leave them to run.

Completed and validated tasks are eventually deleted from the list on the server, but they are not forgotten.
29) Message boards : Number crunching : Double or Triple your task throughput on Windows! (Message 4673)
Posted 26 Mar 2021 by Brummig
Post:
Yes, exactly, it's a non-standard approach to error message handling for an app run from the command line, but of course that is not how boinc is intended to be run. There are in fact several similar files in ~/.BOINC, but it looks like only stderrgui.txt is a problem. So I've followed your suggestion (thank you) and linked it to /dev/null (ln -s /dev/null stderrgui.txt), and after a few hours of running all seems OK (I'm paying particular attention to the size of the virtual disk). I'll report back if it explodes in my face again.
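
A minimal sketch of that link step, wrapped so that any existing file is replaced in one go. The helper name discard_log is hypothetical, and closing boincmgr first is an assumed precaution rather than something tested here.

```shell
#!/bin/bash
# Hypothetical helper: replace a runaway log file with a link to /dev/null.
discard_log() {
  local file=$1
  # -f removes any existing file of that name before creating the link
  ln -sf /dev/null "$file"
}

# Example, using the path from this post (assumes boincmgr is closed):
# discard_log ~/.BOINC/stderrgui.txt
```

Anything subsequently written to the linked name is thrown away, so the file can no longer grow.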

I have the byobu flavour of tmux running on an antique laptop that runs Linux without a GUI, but I think linking stderrgui.txt to /dev/null is a much neater solution for a Windows PC (assuming it continues to work). It's nice being able to simply close windows to reduce workspace clutter.
30) Message boards : Number crunching : Double or Triple your task throughput on Windows! (Message 4670)
Posted 25 Mar 2021 by Brummig
Post:
I did run one Einstein task, and there didn't seem to be much difference in speed compared with running directly under Windows. Moreover, if I can't run GPU tasks in WSL2, overall less work will get done unless I run Einstein both natively and in WSL2. But thanks for the suggestion.

It turns out it's not a great idea to run boinc in your home directory, because boinc makes that the home directory for its data. Since boinc creates the directory ~/.BOINC, running it there seems as good a place as any. I've not been able to get boinc to run as a service in WSL2, but I have changed how I run it. To avoid an ever expanding nohup.out file, I now use nohup boinc &>/dev/null &.
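
The two points above (start from the data directory, and discard output so nohup.out never appears) can be combined into one step. This is a sketch only: start_detached is a hypothetical helper name, and ~/.BOINC is assumed as the data directory.

```shell
#!/bin/bash
# Hypothetical helper: run a command from a given directory, detached from the
# terminal, with all output discarded so that nohup.out is never created.
start_detached() {
  local workdir=$1
  shift
  (
    cd "$workdir" || exit 1
    # ">/dev/null 2>&1" has the same effect as bash's "&>/dev/null"
    nohup "$@" >/dev/null 2>&1 &
  )
}

# Example, assuming the data directory described above:
# start_detached ~/.BOINC boinc
```

The subshell keeps the directory change from affecting the terminal you typed the command into.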

However, a somewhat bigger problem than nohup.out is the file ~/.BOINC/stderrgui.txt. This is generated by boincmgr. Despite boincmgr appearing to work just fine, it keeps adding the following line to stderrgui.txt:

(boincmgr:923): Gtk-CRITICAL **: 23:29:23.739: gtk_box_gadget_distribute: assertion 'size >= 0' failed in GtkScrollbar


After several hours of running boincmgr, stderrgui.txt filled almost all the available space on my SSD, and then the file started being supplemented with disk error reports. The PC rapidly became too slow to be usable and I resorted to an IEC 60320 C13 shutdown. The file had in fact grown to a whopping 32GB (no, that's not a typo). Obviously I deleted it, but to recover my real disk space I compacted the virtual disk using the instructions here:

https://stephenreescarter.net/how-to-shrink-a-wsl2-virtual-disk/

So for the moment at least, I'll stick to boinctui. If switching from boincmgr to boinctui, note that it's necessary to shut down boinc completely and delete the content of gui_rpc_auth.cfg (it's a password, so make a copy of it, just in case you need it again).
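
A minimal sketch of the password step, assuming the data directory is ~/.BOINC and that boinc has already been shut down (for example with boinccmd --quit). The helper name clear_rpc_password and the .bak suffix are hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: back up the GUI RPC password, then empty the file.
# Run only after the boinc client has been shut down completely.
clear_rpc_password() {
  local dir=$1
  local cfg="$dir/gui_rpc_auth.cfg"
  # Keep a copy of the old password in case it is needed again
  cp -p "$cfg" "$cfg.bak" || return 1
  # Truncate the file to zero length without deleting it
  : > "$cfg"
}

# Example, assuming the data directory from this post:
# clear_rpc_password ~/.BOINC
```

The backup is made first, so if the copy fails the original password file is left untouched.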
31) Message boards : Number crunching : Double or Triple your task throughput on Windows! (Message 4666)
Posted 23 Mar 2021 by Brummig
Post:
I've finally gotten around to doing this myself, and I can confirm the speed improvement. I installed the Ubuntu 20 flavour of WSL2, and then installed boinc with sudo apt install boinc. I have also installed VcXsrv, but getting that to work with WSL2 requires jumping through a few hoops, and there's no need for boinc purposes; from the command line interface you can either use boinccmd, or you can install and use boinctui. However, if you do run VcXsrv, you can run the boincmgr GUI and everything integrates very nicely into Windows; there's no need to have a Linux desktop on top of the Windows desktop, and if you follow the steps below, no need to have open terminal windows cluttering your workspace.

Now to get boinc running you can just type boinc& into the WSL terminal followed by your chosen boinc manager, but if you close the terminal window, boinc will terminate (very possibly ungracefully). However, the underlying Linux machine continues to run, so if you want to close the terminal window but leave boinc running, type instead nohup boinc &. Similarly you can use nohup to run boincmgr without it being tied to a terminal window. I don't know if the WSL boinc client closes gracefully when the Windows PC closes down, but you can do so manually by typing boinccmd --quit into the WSL terminal.

The Linux functionality I've described is of course well known and documented, but the part that was new to me is how WSL functions, namely as a Linux machine running continuously in the background with the WSL user interface simply a terminal on to that.

Are there any other projects that benefit in the same way from this speed improvement?
32) Message boards : News : No tasks (Message 4625)
Posted 4 Feb 2021 by Brummig
Post:
Have you checked your neighbours, Jim, to see if the courier left your black hole with them? If you no longer have one of your neighbours, then you know where it is.
33) Message boards : News : No tasks (Message 4529)
Posted 27 Nov 2020 by Brummig
Post:
Thank you for letting us know, Krzysztof. Best wishes for a speedy recovery.
34) Message boards : Number crunching : Large number of Abandoned WU's on Raspberry Pi 3 (Message 4508)
Posted 19 Nov 2020 by Brummig
Post:
"Abandoned" means the server thinks the device detached from the project. I've seen it on this project whenever one of my Pies trashes its SD card and needs to be rebuilt. All the WUs that were running get marked as "Abandoned" when I re-attach the rebuilt Pi. Does that give you any clues to work on?

The Universe@Home server was briefly uncontactable yesterday; I had a few uploads stall, though no serious problems.
35) Message boards : Number crunching : WUs stuck on 100% completion and then had to abort (ANDROID) (Message 4306)
Posted 18 May 2020 by Brummig
Post:
I have been seeing something similar. I'll find that the task is at 100% and BOINC has stopped running it. If I force the issue by suspending the other tasks, the task will restart from zero. I have been aborting these when I find them, because recently I had a task time out on Android. I don't know if it had been resetting back to zero endlessly, but it's a possibility. I have no problems with other projects on this host, or problems completing Universe@Home tasks on time (when they behave themselves). I think there may be a problem with what happens when Universe@Home tasks on Android get to 100%.
36) Message boards : Cafe : Spam Message Boards (Message 4013)
Posted 23 Jan 2020 by Brummig
Post:
Whilst using the search box I found a message on a spam message board called "Gungnir", all in Vietnamese:
https://universeathome.pl/universe/forum_forum.php?id=12
There is a second Vietnamese spam message board here:
https://universeathome.pl/universe/forum_forum.php?id=14
The former is in active use, whilst the latter was active in the first half of 2019. I'm guessing (from the other message boards with an ID greater than 9) that these are team message boards, a feature I didn't even know existed.
37) Message boards : Science : Offensive message function doesn't work (Message 4012)
Posted 23 Jan 2020 by Brummig
Post:
Currently the button functions, but when you click the OK button to make your report the server responds with:
You must provide at least one recipient email address.You must provide at least one recipient email address.send email failed

Your report could not be recorded. Please wait a while and try again.

If this is not a temporary error, please report it to the project developers.

I was trying to report a couple of spam posts.
38) Message boards : Number crunching : my credits are gone on boincstats. (Message 3986)
Posted 10 Jan 2020 by Brummig
Post:
Please see https://universeathome.pl/universe/forum_thread.php?id=490#3940
39) Message boards : Number crunching : No work available (on Raspberry Pi) (Message 3434)
Posted 11 Mar 2019 by Brummig
Post:
I assume it's this unfixed problem again:

https://universeathome.pl/universe/forum_thread.php?id=375

All five of my U@H hosts have now run dry.
40) Message boards : Number crunching : Project has no tasks available (Message 3370)
Posted 15 Feb 2019 by Brummig
Post:
Well, my queue is twice the length of your queue, but since my hosts are told by U@H for very long periods of time (days or weeks) that there are no tasks available, even with that queue they run dry. And then when they do get tasks, it may only be one or two, so the queue doesn't get refilled.


Copyright © 2024 Copernicus Astronomical Centre of the Polish Academy of Sciences
Project server and website managed by Krzysztof 'krzyszp' Piszczek