Message boards : Number crunching : Pi Autoshifter

Brummig
Joined: 23 Mar 16
Posts: 95
Credit: 23,431,842
RAC: 23
Message 5137 - Posted: 5 Mar 2022, 17:39:37 UTC

Previously on Universe@Home I wrote about how the USB cable used to feed power to a Raspberry Pi is critical if you want it to run at full speed: https://universeathome.pl/universe/forum_thread.php?id=615. So having replaced all my Pi power cables, was I a happy cruncher? Were my Pi 3's now outrunning my Pi 2? No, as it turns out there is another problem.

A Pi 2 can run four tasks night and day, and with no heatsink the CPU temperature will remain entirely reasonable. If you try to do the same on a Pi 3, even with the standard (small) heatsink that is often used, the CPU temperature will rise and rise until the Pi starts throttling in an attempt to not boil its own brains. However, the temperature at which it does this is rather alarming (over 80 °C). There is a strong relationship between temperature and the reliability and lifetime of electronic components, so this is bad. Of course the throttling also means the processor slows down, and by quite a lot. So much so, that a Pi 2 with all four cores running flat out may well end up outperforming a Pi 3 that is throttling itself.
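
One way to check whether a Pi has actually been throttling is to ask the firmware. The following is a rough sketch rather than part of the original setup: it decodes the bit field reported by `vcgencmd get_throttled`, and the function name `decode_throttled` is made up for this illustration. The firmware query only runs if `vcgencmd` is on the PATH.

```shell
#!/bin/bash
# decode_throttled VALUE: print one line per flag set in a get_throttled value.
# VALUE is the hex number after "throttled=", e.g. 0x50005.
# (decode_throttled is a hypothetical helper name, not a standard tool.)
decode_throttled() {
  local value=$(( $1 ))
  [ $(( value & 0x1 )) -ne 0 ]     && echo "under-voltage detected now"
  [ $(( value & 0x2 )) -ne 0 ]     && echo "ARM frequency capped now"
  [ $(( value & 0x4 )) -ne 0 ]     && echo "throttled now"
  [ $(( value & 0x8 )) -ne 0 ]     && echo "soft temperature limit active now"
  [ $(( value & 0x10000 )) -ne 0 ] && echo "under-voltage has occurred since boot"
  [ $(( value & 0x20000 )) -ne 0 ] && echo "ARM frequency capping has occurred since boot"
  [ $(( value & 0x40000 )) -ne 0 ] && echo "throttling has occurred since boot"
  [ $(( value & 0x80000 )) -ne 0 ] && echo "soft temperature limit has occurred since boot"
  return 0
}

# Only query the firmware when vcgencmd is actually available.
if command -v vcgencmd >/dev/null 2>&1; then
  raw=$(vcgencmd get_throttled)        # output looks like: throttled=0x50005
  decode_throttled "${raw#throttled=}"
fi
```

A value of 0x0 prints nothing, meaning no throttling or under-voltage has been flagged since boot.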

I really didn't want the CPUs in my Pi 3's running at close to their maximum temperature rating, but adding more capable cooling is not an option as both of them wear HATs. However, one of them does live outside, where in the winter it can get pretty cold. To take advantage of that I needed a means of automatically adjusting the workload, preferably without installing additional software packages. I couldn't get the CPU frequency to change reliably, so I gave up on that. There is a BOINC option to change the percentage of time for which each CPU works, but this works rather crudely, just turning the task on and off using a PWM-style approach. That is visually irritating if you use boinctui to monitor BOINC. So I opted to automatically adjust the percentage of cores used according to CPU temperature, taking care to play nicely with the method by which the GUI BOINC manager controls computing options. The script below is my solution, and is run by cron every fifteen minutes (*/15 * * * *) on all three of my BOINC Pies. It has been working nicely for several months now.

For various reasons my Pies are tucked away in strange places. My Pi 2 lives in the airing cupboard, a few centimetres away from the (heavily insulated) hot water cylinder, so the ambient temperature is often elevated above room temperature. Every time I have checked on it I have found it was running all four cores, and it has an RAC on U@H of 1,602. Its current CPU temperature is 68 °C (four cores running).

One of my Pi 3's lives behind the sofa in the living room. It typically runs two cores, but occasionally it drops back to one core to keep the CPU at a reasonable temperature. It has an RAC on U@H of 1,360. Its current CPU temperature is 73 °C (two cores running).

My Pi 3 that lives outside mostly runs two cores, but when the ambient temperature drops below about 5 °C it makes more and more use of a third core. It has an RAC on U@H of 1,664. Its current CPU temperature is 71 °C (two cores running, 8 °C ambient).

So as you can see, despite the slower cores the Pi 2 is comfortably outrunning the Pi 3 in the living room, and is currently a match for the Pi 3 outside. However, as we head through Spring, I fully expect the Pi 2 to start outrunning the Pi 3 in the garden. Those little heatsinks (that are not even needed on the Pi 2) really only buy you a few minutes in which to work the CPU flat out. Without substantial heatsinking, IMHO it's better to crunch using a Pi 2 than a Pi 3. I don't have a Pi 4 to test, but I would not be surprised to find the same issue.

boinc-autoshifter
#!/bin/bash

if ! [ $(id -u) = 0 ]; then
   echo "Must be run as root!"
   exit 1
fi

# Assume Bullseye...
VCGENCMD=/usr/bin/vcgencmd
if [ ! -f "$VCGENCMD" ];
then
  # Nope
  echo "Please upgrade me."
  VCGENCMD=/opt/vc/bin/vcgencmd
fi

if [ ! -f "/var/lib/boinc-client/global_prefs_override.xml" ];
then
  # Create overrides file

  echo "Creating computing preferences override of percentage of BOINC CPUs."
  echo -e "<global_preferences>\n   <max_ncpus_pct>50.000000</max_ncpus_pct>\n</global_preferences>" > /var/lib/boinc-client/global_prefs_override.xml
  chown boinc:boinc /var/lib/boinc-client/global_prefs_override.xml
fi

# Set variables to initial values

pcpus=$(grep -oPm1 "(?<=<max_ncpus_pct>)[^<]+" /var/lib/boinc-client/global_prefs_override.xml)
pcpus_new=$pcpus
pcpus_int=$(echo $pcpus | sed 's/\..*$//')
pcpus_trailer=$(echo $pcpus | sed 's/^[^.]*//')
source <($VCGENCMD measure_temp | sed "s/'C//g")

echo "Current temperature is: "$temp" °C"

# Change the number of cpus?

if awk "BEGIN {exit !($temp < 65)}";
then
  if awk "BEGIN {exit !($pcpus <= 75)}";
  then
    pcpus_new=$(expr $pcpus_int + 25)$pcpus_trailer
  else
    pcpus_new=100.000000
  fi
elif awk "BEGIN {exit !($temp > 75)}";
then
  if awk "BEGIN {exit !($pcpus > 25)}";
  then
    pcpus_new=$(expr $pcpus_int - 25)$pcpus_trailer
  else
    echo "Warning: overtemperature and unable to reduce BOINC load."
    pcpus_new=25.000000
  fi
fi

if awk "BEGIN {exit !($pcpus != $pcpus_new)}";
then
  echo "Switching percentage of BOINC CPUs from "$pcpus" to "$pcpus_new
  sed -i "s/<max_ncpus_pct>$pcpus<\/max_ncpus_pct>/<max_ncpus_pct>$pcpus_new<\/max_ncpus_pct>/g" /var/lib/boinc-client/global_prefs_override.xml
  boinccmd --read_global_prefs_override
fi
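
The stepping rule in the script above can be tried out without touching BOINC or the override file. What follows is a minimal sketch, not the script itself: `next_pct` is a hypothetical helper that mirrors the same 65 °C and 75 °C thresholds and 25-point steps, taking a temperature and the current percentage as arguments.

```shell
#!/bin/bash
# next_pct TEMP PCT: print the max_ncpus_pct value the stepping rule would pick next.
# TEMP is the CPU temperature in degrees C, PCT the current percentage of CPUs.
# (next_pct is a hypothetical name used only for this illustration.)
next_pct() {
  local temp=$1 pct=$2
  awk -v t="$temp" -v p="$pct" 'BEGIN {
    if (t < 65) {
      # Cool: step up by 25, capped at 100.
      p = (p <= 75) ? p + 25 : 100
    } else if (t > 75) {
      # Hot: step down by 25, but never below 25.
      p = (p > 25) ? p - 25 : 25
    }
    # Between 65 and 75 the value is left alone (the dead band).
    printf "%.6f\n", p
  }'
}

# Example calls (uncomment to try):
# next_pct 60.1 50.000000
# next_pct 80.2 50.000000
# next_pct 70.0 50.000000
```

For example, `next_pct 60.1 50.000000` prints 75.000000, while `next_pct 70.0 50.000000` leaves the value at 50.000000. If the real script were saved at, say, /usr/local/sbin/boinc-autoshifter (a path assumed here purely for illustration), the matching line in root's crontab would be `*/15 * * * * /usr/local/sbin/boinc-autoshifter`.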
Keith Myers
Joined: 10 May 20
Posts: 308
Credit: 4,733,484,700
RAC: 13,061
Message 5138 - Posted: 5 Mar 2022, 21:48:07 UTC - in response to Message 5137.  

You do know that they make HATs with fans installed in them to overcome the throttling issues, right?

I have 3 SoC computers that all have fans cooling the CPU and I never have throttling issues running all four cores 24/7.

A proud member of the OFA (Old Farts Association)
Brummig
Joined: 23 Mar 16
Posts: 95
Credit: 23,431,842
RAC: 23
Message 5144 - Posted: 6 Mar 2022, 12:44:09 UTC - in response to Message 5138.  
Last modified: 6 Mar 2022, 12:48:39 UTC

Yes, of course I know there are HATs with fans, but I can't fit them to these Pies due to the way these Pies are used for their primary purposes. Plus I don't want to use fans anyway, on account of the dust, noise, and reliability issues, and the increased power consumption. For me, one of the nice features of Pies is the lack of an obligatory fan.
Keith Myers
Joined: 10 May 20
Posts: 308
Credit: 4,733,484,700
RAC: 13,061
Message 5145 - Posted: 6 Mar 2022, 17:17:56 UTC - in response to Message 5144.  

You can buy fans that are not loud and last a long time. My Noctua NF-A4X10 fan on my Jetson Nano is quiet and undetectable. Been running non-stop for 3 years now.

A proud member of the OFA (Old Farts Association)
franz
Joined: 15 Jan 17
Posts: 5
Credit: 20,712,233
RAC: 0
Message 5235 - Posted: 29 Mar 2022, 9:45:27 UTC

I have 4 Odroids and 2 Raspberry Pis; I thought I'd chime in about my own experience with running BOINC on them.

First, with BOINC, in my experience it is much better (more efficient) to use all CPU cores at low frequency than to use fewer cores at high frequency, because power efficiency goes down with rising frequency. I don't know why OP had problems with limiting frequency; maybe he was trying to do it with hardware limiting. You only need to limit the Linux kernel scaling frequency; there's no need to mess with hardware limits. It's as simple as this:

for F in /sys/devices/system/cpu/cpu?/cpufreq/scaling_max_freq ; do echo MAX_FREQ >$F ; done


... where MAX_FREQ is the desired max frequency in kilohertz.
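
As a slightly more defensive variant of the one-liner above, here is a sketch that validates the value before writing it. The function name `set_max_freq` and its optional second argument (an alternative sysfs root, useful for trying it out on a scratch directory) are assumptions made for this example; on a real Pi it has to run as root.

```shell
#!/bin/bash
# set_max_freq KHZ [CPU_ROOT]: write KHZ to every core's scaling_max_freq.
# CPU_ROOT defaults to the real sysfs location; pass another directory to test.
# (set_max_freq is a hypothetical helper name, not an existing command.)
set_max_freq() {
  local khz=$1 root=${2:-/sys/devices/system/cpu}
  local f
  # Refuse anything that is not a plain whole number of kilohertz.
  case $khz in
    ''|*[!0-9]*) echo "frequency must be a whole number of kHz" >&2; return 1 ;;
  esac
  for f in "$root"/cpu[0-9]*/cpufreq/scaling_max_freq; do
    [ -e "$f" ] || continue          # skip if the glob matched nothing
    echo "$khz" > "$f" || return 1   # stop on the first failed write
  done
}

# Example (as root): cap every core at 1000 MHz, i.e. 1000000 kHz.
# set_max_freq 1000000
```

The setting does not survive a reboot, so it would need to be reapplied at startup.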

Also I must disagree with OP saying that little heatsinks only help for a few minutes. They help a lot, even the small ones, but only if air can move through them easily; if the whole Pi is in a box without holes, any heatsink will be useless. It's also likely that HATs will limit the airflow.

I have an RPi3 outside, with a small aluminium heatsink (from a transistor), no HATs, only some sensors via GPIO. It's limited to 1000 MHz, running U@H on all cores; with ambient at -2 to 10 °C (winter) it runs at around 40-45 °C, with no throttling. RAC is around 5000 (U@H tasks only). In summer it gets much hotter and starts throttling, so I underclock it more.

I can't compare RPi2 vs RPi3 as I don't have an RPi2. But the RPi4 is no contest; it's significantly faster: at the same 1000 MHz it gets about +50% RAC at U@H, with similar heat output.

And as for the "80C is alarming to people...": that might be true, but only for people who have no insight into electronics. Most electronic components are rated for 100+ °C; some transistors are even rated for 150+ °C. Pretty much all x86 laptops sold today (and in past years) have throttling set to 100 °C. So 80 °C on an RPi is perfectly safe. As for longevity, 80 °C might affect it, but not the longevity of the CPU; capacitors are more likely to suffer. More damaging to the CPU than 80 °C is frequent temperature cycling (going hot-cold-hot-cold etc). Of course you might want to reduce the temperature for other reasons (cost and/or efficiency).
Brummig
Joined: 23 Mar 16
Posts: 95
Credit: 23,431,842
RAC: 23
Message 5238 - Posted: 30 Mar 2022, 10:59:18 UTC - in response to Message 5235.  

As a designer of aerospace electronics, I'm required to derate components against their maximum temperature, which in the case of semiconductors is the maximum junction temperature as specified by the manufacturer. For microprocessors Boeing requires a 20-30 °C derating, depending on how critical the equipment is. Commercial grade microprocessors don't go anywhere near 150 °C maximum junction temperature; the BCM2837 appears to be rated at 85 °C. Modern commercial electronics is typically pushed far closer to the temperature limits than is acceptable in aerospace applications, leading to short lifespans if a component spends most of its life at elevated temperature (of course much commercial electronics spends most of its life idling or completely unpowered). One can exceed the manufacturer's rating without the component immediately failing, but don't expect it to have a very long life. That's probably OK for some kinds of ordnance, but not a good idea if you want something to run 24/7 for many years. Don't forget also that one hot component can cause "hot-boxing", taking other components beyond their rated temperature.

Keeping the temperature down improves lifetime and reliability, and that's important to me as I am not one of those people who treat electronics as disposable. As a species, we have become far too wasteful of precious resources. I would prefer not to run the processor 24/7 at the safety limit, but you're free to do that if you like, or even turn off the limiting altogether if you so desire. I'm just offering a way to get the temperature down without fitting fans or massive heatsinks.

Regarding heat cycling, I did consider that and set the temperature limits in the code to minimise it. I could improve the situation with more sophisticated control, but following monitoring I'm happy with how it's working.

As I made clear in my OP, both Pi 3s have HATs. One has room for a heatsink about 3 mm high, whilst the other has to make do with a custom-made sheet of aluminium about 1 mm thick. Both Pies are in confined spaces. I did run some experiments with and without heatsinks, and with different levels of confinement. The heatsinks did make a difference, but not a big difference. Their performance did improve when the Pi was run out of a case, but that's not a practical solution. In the case of one of the Pies I was able to improve the ventilation. Maybe one day I'll find the time to make a better case. Probably not.

I wanted to automate the work limiting, as the Pi in the garden sees wide temperature swings, even during a single day. Whilst I was able to set the Pi to a frequency and have it stay there, setting the frequency dynamically just didn't work, and trying to make it work was wasting far too much time.

Copyright © 2024 Copernicus Astronomical Centre of the Polish Academy of Sciences
Project server and website managed by Krzysztof 'krzyszp' Piszczek