Odd performance on different computers
Author | Message |
---|---|
Send message Joined: 14 Mar 20 Posts: 6 Credit: 313,036,633 RAC: 0 |
Hi, I'm crunching bh2 spin on several different types of computers and I have very odd performance :) The i7-3770K (3.7GHz, 4c8t, 32GB RAM, limited to 8GB in LXC, 4 tasks in parallel) crunches a task in less than 2 hours. The Ryzen 2700 (3.3GHz, 8c16t, 32GB RAM, unlimited memory, 12 tasks in parallel) crunches a task in almost 3.5 hours?! The Pine64+ (1.152GHz Cortex-A53, 4c4t, 2GB RAM, 4 tasks in parallel) takes about 16 hours (I can understand that one: slower cores and a very slow clock). Why is the Ryzen so much slower when it is much newer? Is there some compile-time flag optimized for Intel? I've even tried disabling SMP on the Ryzen machine but no change there, still over 3 hours per task. Cheers Ashley :) |
Send message Joined: 4 Feb 15 Posts: 847 Credit: 144,180,465 RAC: 0 |
I see only ARM machines on your account; did you use another account for the other architectures? Also, are both the Intel and AMD machines Linux based? Just for info: our tasks do not use much memory, but ULX can use some space for temporary files, and slow disks can delay the finishing time (not much, but always a little). Krzysztof 'krzyszp' Piszczek Member of Radioactive@Home team My Patreon profile Universe@Home on YT |
Send message Joined: 14 Mar 20 Posts: 6 Credit: 313,036,633 RAC: 0 |
Yeah, using a team account atm. The Ryzen is Windows based, would it make *that* much difference? And the Ryzen has an NVMe SSD whereas the i7 has slow 7,200 RPM HDDs. Since it uses very little RAM I'd expect it to thrash the cache at least, if not something else, but even there the Ryzen wins :) |
Send message Joined: 4 Feb 15 Posts: 847 Credit: 144,180,465 RAC: 0 |
"The Ryzen is Windows based, would it make *that* much difference?" Yes. Krzysztof 'krzyszp' Piszczek Member of Radioactive@Home team My Patreon profile Universe@Home on YT |
Send message Joined: 14 Mar 20 Posts: 6 Credit: 313,036,633 RAC: 0 |
Wow. In that case I may as well move BOINC to a virtualized Linux guest and see how that fares! :) Thanks for the info. |
Send message Joined: 25 Mar 16 Posts: 12 Credit: 828,261,700 RAC: 0 |
With the Ryzen using Linux, WU durations should come down to around an hour. Also, unless the i7 is doing other CPU work, running just 4 tasks on it helps run times compared with running 8 threads on SMT cores. |
Send message Joined: 4 Apr 15 Posts: 46 Credit: 43,128,567 RAC: 0 |
"Hi, I'm crunching bh2 spin on several different types of computers and I have very odd performance :)" On the i7 you have 4c8t and you run 4 tasks at a time: time to crunch is under 2 hours. On the AMD you have 8c16t and you run 12 tasks at a time: time to crunch is almost 3.5 hours. Notice anything funny there? You are running too many threads on the AMD and are using the virtual cores for 4 of the tasks, reducing the efficiency of each task. Cut it back to 8 threads at a time and see if it doesn't speed up a lot. |
Send message Joined: 28 Feb 15 Posts: 253 Credit: 200,562,581 RAC: 0 |
"Notice anything funny there...you are running too many threads on the AMD and are using the virtual cores for 4 of the tasks reducing the efficiency of each task, cut it back to 8 threads at a time and see if it doesn't speed up a lot." (1) Cutting it back to 8 threads (in effect, 8 full cores) increases the speed of each task, but it does not increase the efficiency; it reduces it. The total throughput will be greater if you use the virtual cores (all 16 of them). (2) Yes, Linux is better than Windows. Use Windows on WCG/MCM, or Rosetta, or if you want astronomy, on MilkyWay n-body or Einstein. They all do well on Windows. (3) I have crunched a lot of BHspin v2 on both Ryzens and Intel. The speed on a Ryzen 2700 is almost the same as (maybe slightly higher than) on an i7-3700 under comparable conditions, i.e., using the same percentage of the cores. (4) At the moment, I am using a Ryzen 2600 under Ubuntu 18.04.4, with 11 cores devoted to Universe and 1 core reserved for a GPU, but not in use at the moment. I am averaging 1 hour 33 minutes on BHspin v2 and about 32 minutes 31 seconds (very consistently) for ULX. |
Send message Joined: 14 Mar 20 Posts: 6 Credit: 313,036,633 RAC: 0 |
Yup, drained the Windows host, fired up a Linux VM with 12c/16GB RAM, and the tasks are much faster than on Windows (even 25-30% faster than on the i7-3770) on all of U@H, Rosetta, WCG, ... And SMT may slow each thread down by about 10-15%, BUT you get double the number of them in parallel. The final number of crunches/second is much higher. |
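The per-task versus throughput argument in the posts above can be put into numbers. The following is a minimal sketch, not anything from the project itself: the function names are made up for illustration, the runtimes are the rounded figures from the first post ("less than 2 hours", "almost 3.5 hours"), and the SMT model assumes that "slows down 10-15%" means each task takes that much longer while twice as many tasks run at once.

```python
def tasks_per_hour(parallel_tasks: int, hours_per_task: float) -> float:
    """Host throughput: tasks completed per wall-clock hour."""
    return parallel_tasks / hours_per_task


def smt_throughput_ratio(per_thread_slowdown: float) -> float:
    """Throughput with SMT relative to throughput without it.

    Assumption (not confirmed in the thread): SMT doubles the number of
    concurrent tasks and lengthens each task's runtime by the given
    fraction, e.g. 0.15 means each task takes 15% longer.
    """
    return 2.0 / (1.0 + per_thread_slowdown)


# Rounded figures from the first post; both are approximations.
i7_linux = tasks_per_hour(parallel_tasks=4, hours_per_task=2.0)
ryzen_windows = tasks_per_hour(parallel_tasks=12, hours_per_task=3.5)

print(f"i7-3770K, 4 tasks:            {i7_linux:.2f} tasks/hour")
print(f"Ryzen 2700 (Windows), 12 tasks: {ryzen_windows:.2f} tasks/hour")

for slowdown in (0.10, 0.15):
    ratio = smt_throughput_ratio(slowdown)
    print(f"SMT with {slowdown:.0%} per-thread slowdown: {ratio:.2f}x throughput")
```

Under these assumptions the Windows Ryzen still finished more tasks per hour than the i7 even though each task took longer, which is the distinction between task speed and host throughput that the thread keeps returning to.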
Send message Joined: 28 Apr 20 Posts: 2 Credit: 7,430,367 RAC: 0 |
Maybe you have heating issues? The CPU will likely downclock itself to avoid damage. I checked a few of the new AMD processors, and it seems like all of them are wattage beasts (several hundred watts). |
Send message Joined: 14 Mar 20 Posts: 6 Credit: 313,036,633 RAC: 0 |
Nope, it was purely Windows vs Linux apparently. No idea why, but it's just the way it is. And the R2700 at stock is nothing hard to cool; I had an overclocked i5-2500K cooled with this cooler before :P |
Send message Joined: 10 May 20 Posts: 310 Credit: 4,733,484,700 RAC: 0 |
"The Ryzen is Windows based, would it make *that* much difference?" Two new hosts are working on BHSpin2 tasks. Both are Linux based and both run the same CPU clocks, yet one host is about 2.5X faster than the other: AMD Ryzen 9 3950X runtimes = 37 minutes; AMD Threadripper 2920X runtimes = 93 minutes. Are there different species of tasks with very large differences in runtimes? Is a "311" task very different from a "308" task? [Edit] Does the BHspin2 application use any advanced SIMD instructions like SSE3/4 or AVX/2? [Edit2] Neither host is overcommitted on CPU core usage. A proud member of the OFA (Old Farts Association) |
Send message Joined: 10 May 20 Posts: 310 Credit: 4,733,484,700 RAC: 0 |
"The Ryzen is Windows based, would it make *that* much difference?" The main difference between the two hosts was the OS: Ubuntu 20.04 for the 3950X and Ubuntu 18.04 for the 2920X. As soon as I upgraded the 2920X to Ubuntu 20.04, the times on the two hosts pretty much matched. The 3950X does have a 100 MHz core clock advantage though. Both hosts run a fixed all-core clock and both run 90% of all cores, so the throughput of the 32-thread 3950X is better of course. I am not the only one to notice the almost 2X speed improvement from moving from Ubuntu 18 to Ubuntu 20. Three other team members saw the exact same improvement, on both Intel and AMD hardware, and with the same 5.4.0-42-generic kernel on both OSes. A proud member of the OFA (Old Farts Association) |
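Since the kernel was reported as identical on both releases, anyone comparing hosts may want to record the userland side as well. The thread does not identify the cause of the difference, so treating the C library version as relevant is only an assumption here. A small sketch using Python's standard `platform` module (the function name `host_descriptor` is made up for illustration):

```python
import platform


def host_descriptor() -> str:
    """Return 'kernel release|C library' for the machine running this script.

    platform.libc_ver() may return empty strings on systems where it
    cannot detect the C library; that case is reported as 'unknown libc'.
    """
    kernel = platform.release()
    libc_name, libc_version = platform.libc_ver()
    libc = f"{libc_name} {libc_version}".strip() or "unknown libc"
    return f"{kernel}|{libc}"


if __name__ == "__main__":
    print(host_descriptor())
```

Running this on both hosts before and after an upgrade gives a quick record of what actually changed alongside the runtimes.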
Send message Joined: 19 Aug 18 Posts: 11 Credit: 2,255,279 RAC: 0 |
I confirm the same improvement from upgrading the guest OS from Ubuntu 18.04 to 20.04 on my host ID 544662: an AMD A10-7800 Radeon R7 with a Windows (version 2004) host running VirtualBox 5.2.32. The average CPU time decreased from 151 minutes to 71 minutes (doubled performance). The measured floating-point speed increased from 2.64 to 3.19 billion operations per second (according to the project estimate). To compare the situation with other crunchers, I tried to pick out the fastest computers on this project, looking only at the CPU times of the work units processed. Top 5: Host ID 522605: Intel(R) Core(TM) i5-6600K CPU @ 3.50GHz, Arch Linux [5.7.11-arch1-1|libc 2.31 (GNU libc)]; Host ID 549269: AMD Ryzen 9 3950X, Gentoo Base System release 2.7 [5.8.5-gentoo-x86_64|libc 2.32 (Gentoo 2.32-r1 p1)]; Host ID 569321: Intel Core Processor (Skylake, IBRS), Fedora 32 (Workstation Edition) [5.7.12-200.fc32.x86_64|libc 2.31 (GNU libc)]; Host ID 506492: AMD Ryzen 9 3950X, Linux Mint 20 [5.4.0-45-generic|libc 2.31 (Ubuntu GLIBC 2.31-0ubuntu9)]; Host ID 570669: Intel(R) Core(TM) i5-9400 CPU @ 2.90GHz, Ubuntu 20.04.1 LTS [5.4.0-44-generic|libc 2.31 (Ubuntu GLIBC 2.31-0ubuntu9)]. I don't know if there is overclocking... So, without trying to say which configuration is the best, I only advise Windows crunchers to at least run this project in a VM with a Linux guest OS if they don't want to waste their resources. Installing BOINC inside a VM is not so complicated: Ubuntu, Mint |
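The figures in this post can be checked with a few lines of arithmetic. This is only a sketch: the helper name `speedup` is made up for illustration, and the inputs are the numbers quoted in the post.

```python
def speedup(before: float, after: float) -> float:
    """How many times faster the 'after' runtime is than the 'before' runtime."""
    if before <= 0 or after <= 0:
        raise ValueError("runtimes must be positive")
    return before / after


# Average CPU time per task on host 544662, in minutes (from the post).
runtime_speedup = speedup(before=151, after=71)

# Project-estimated floating-point speed, in billions of operations per second.
benchmark_ratio = 3.19 / 2.64

print(f"Runtime speedup after upgrading the guest: {runtime_speedup:.2f}x")
print(f"Benchmark ratio after upgrading the guest:  {benchmark_ratio:.2f}x")
```

The runtime improved by a bit more than 2x while the benchmark figure rose by roughly a fifth, so the benchmark alone would understate the change seen on real tasks.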
Send message Joined: 1 Nov 17 Posts: 29 Credit: 291,940,933 RAC: 0 |
I advise against using Mint: I tried both Mint and Ubuntu, and Mint had some tasks running at Windows speed (so no performance increase at all, with a runtime of around 4 hours) and some at Ubuntu speed, with a runtime of around 1 hour. Maybe how often it works is also hardware dependent, with things like CPU cache. Or Mint got an update in the last 2-3 months introducing the changes Ubuntu 20.04 already had. But I have yet to encounter an instance where Ubuntu 20.04 DIDN'T work, so I think it would be the safer and easier route to just use that. Also, did you try setting the thread count for the VM (VirtualBox) near the total threads of the Windows machine? I had massive lag problems with the newest version when making 13 or more of 16 threads usable by the VM. Only Hyper-V works smoothly. |
Send message Joined: 19 Aug 18 Posts: 11 Credit: 2,255,279 RAC: 0 |
Sorry, rsNeutrino, but I don't have the appropriate hardware to do a bench test. Maybe another cruncher could answer you... But I know that VirtualBox performance decreases with the number of VMs running simultaneously. This bad effect appears mainly on computers with many processors. It's negligible for small computers with 4 threads, for instance. |