Prospectors, Miners, and 49er's: Dual GPU-CPU Mining on the ODROID-XU4/MC1/HC1/HC2

Many people use the XU4/MC1/HC1/HC2 for CPU crypto-mining, so what could be better than using the GPU for mining as well? The performance isn't viable for many popular coins, but in the right situation it might make sense, such as for new coins or coins with a low difficulty. If nothing else, it's another fun tool for your toolbag.

After looking around at the available options, work began on getting the Genesis Mining fork of SGMiner compiled. SGMiner-GM 5.5.5 is an OpenCL GPU crypto miner and is the most recently maintained version of SGMiner. It has been around a while, supports more crypto algorithms than earlier versions, and has no usage fee. It supports mining with Credits, Scrypt, NScrypt, X11, X13, X14, X15, Keccak, Quarkcoin, Twecoin, Fugue256, NIST, Fresh, Whirlcoin, Neoscrypt, WhirlpoolX, Lyra2RE, Lyra2REV2, Pluck, Yescrypt, Yescrypt-multi, Blakecoin, Blake, Vanilla, Ethash, Cryptonight, and Equihash.

The program source is available for download from https://goo.gl/Gp25ep, and the forum support thread can be viewed at https://goo.gl/hDVmbF.

Figure 1 - XU4 Dual pool mining Monacoin with CPUMiner-Multi and SGMiner-GM 5.5.5 using Lyra2REv2

It is possible, in conjunction with CPUMiner-Multi or a coin-specific miner like VeriumMiner, to mine on the CPU and GPU concurrently. Extensive testing has not been done, but a number of dual-mining configurations, including scrypt2, Lyra2REv2, and cryptonight (CPU only), have been run successfully in both solo and pool mining. It is possible to solo-mine on one and pool-mine on the other, with each running a different crypto algorithm. Dual-mining raises CPU temperatures enough that the large CPU cores need to be slowed down, so please pay attention to the temperatures if you try this!
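
One way to keep an eye on temperatures while dual-mining is a small polling helper. The sketch below is not from the original setup: it assumes the kernel exposes its sensors as /sys/class/thermal/thermal_zone*/temp in millidegrees Celsius (the usual Linux thermal sysfs layout; the number of zones and what each one measures depends on the kernel). The function names, the THERMAL_ROOT override, and the default limit of 75 are all illustrative choices.

```shell
#!/bin/sh
# Convert a millidegree-Celsius reading (e.g. 71000) to whole degrees.
millideg_to_c() {
    echo $(( $1 / 1000 ))
}

# report_temps [LIMIT_C]
# Print every readable thermal zone and flag any at or above LIMIT_C
# (default 75, an arbitrary example value). THERMAL_ROOT may be overridden
# for testing; the default path is an assumption about this kernel.
report_temps() {
    limit="${1:-75}"
    root="${THERMAL_ROOT:-/sys/class/thermal}"
    for f in "$root"/thermal_zone*/temp; do
        [ -r "$f" ] || continue
        c=$(millideg_to_c "$(cat "$f")")
        zone=$(basename "$(dirname "$f")")
        if [ "$c" -ge "$limit" ]; then
            echo "$zone ${c}C HOT"
        else
            echo "$zone ${c}C"
        fi
    done
}
```

Calling report_temps 80 in a loop with a sleep of a few seconds between calls gives a rough running view while you tune the CPU frequency.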

CPUMiner-Multi supports more than 45 crypto algorithms, making it quite useful for dual-mining multiple algorithms. If you’re not yet familiar with it, check it out at https://goo.gl/hUQG3F. Another helpful dual-mining program for those mining Verium (VRM) is a fork of VeriumMiner by fireworm71 at https://goo.gl/6ET7bj.

The miner can run 1-way and 3-way threads at the same time, which allows for better memory utilization. It appears that if the GPU miner (SGMiner) is started first, you end up with more memory to use for the CPU crypto algorithm while dual-mining. Below is the command line used for mining Verium (4 large cores 3-way and 1 small core 1-way) while also GPU mining Monacoin with Lyra2REv2:

$ ~/cpuminer -o stratum+tcp://yourpool.na:port -u username -p password --randomize --no-redirect -t 4 -1 1 --cpu-affinity-stride 1 --cpu-affinity-default-index 4 --cpu-affinity-oneway-index 0
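
Since starting the GPU miner first appears to leave more memory for the CPU miner, the launch order can be scripted. This is only a sketch under stated assumptions: gpu_script and cpu_script are hypothetical wrapper scripts of your own that hold the full SGMiner and CPU miner command lines, and the 60 second default head start is a guess intended to cover the time SGMiner needs to build its kernels.

```shell
#!/bin/sh
# start_dual GPU_SCRIPT CPU_SCRIPT [DELAY_SECONDS]
# Launch the GPU miner wrapper first, wait, then launch the CPU miner
# wrapper. Both scripts are hypothetical sh wrappers that you write yourself.
start_dual() {
    gpu_script="$1"
    cpu_script="$2"
    delay="${3:-60}"   # assumed head start for the GPU kernels to build

    sh "$gpu_script" &
    gpu_pid=$!
    sleep "$delay"
    sh "$cpu_script" &
    cpu_pid=$!
    echo "gpu=$gpu_pid cpu=$cpu_pid"
}
```

For example, start_dual ./gpu.sh ./cpu.sh 60 prints the two process IDs so that either miner can be stopped later with kill.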

Compile SGMiner-GM 5.5.5

The following instructions are typical for SGMiner, except for the source file edits. For general reference and configuration information, there is a good install Wiki for x86 Ubuntu 16.04 at https://goo.gl/qnFmb2. First, download the latest ARM Computer Vision and Machine Learning library from https://goo.gl/LdFvy5.

Please note that the uncompressed package will not fit on an 8GB SD card. You can delete the unnecessary libraries from ./arm_compute-v17.12-bin/lib to get it down to size. Keep the linux-armv7a libraries and delete the android-* and linux-arm8* ones. The default installation directory is /usr/lib/arm_compute-v17.12-bin.

$ cd /usr/lib
$ tar -xvzf ~/arm_compute-v17.12-bin.tar.gz
$ cd ~/
$ rm arm_compute-v17.12-bin.tar.gz
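
The library clean-up mentioned above can be previewed before anything is deleted. The helper below is a sketch, not part of the original procedure: list_prunable is a hypothetical name, and it assumes that every entry in the lib directory whose name does not start with linux-armv7a is unneeded on the XU4. It only prints candidates, so the list can be checked by eye first.

```shell
#!/bin/sh
# list_prunable LIB_DIR
# Print every entry under LIB_DIR whose name does not start with
# "linux-armv7a", i.e. the candidates for deletion. Nothing is removed.
list_prunable() {
    for entry in "$1"/*; do
        [ -e "$entry" ] || continue
        case "$(basename "$entry")" in
            linux-armv7a*) ;;          # keep: 32-bit ARMv7 Linux builds
            *) echo "$entry" ;;        # candidate for removal
        esac
    done
}
```

Run list_prunable /usr/lib/arm_compute-v17.12-bin/lib, review the output, and then remove the listed directories with rm -r as root.
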

Download the AMD APP SDK from https://goo.gl/cZeDJc. These steps are for a root installation from ~/. See the installation notes for a non-root installation at https://goo.gl/Hw7vkP. The default installation directory is /opt/AMDAPPSDK-3.0.

$ tar -xvjf AMD-APP-SDKInstaller-v3.0.130.136-GA-linux32.tar.bz2
$ ./AMD-APP-SDK-v3.0.130.136-GA-linux32.sh
$ rm AMD-APP-SDK-v3.0.130.136-GA-linux32.sh
$ rm AMD-APP-SDKInstaller-v3.0.130.136-GA-linux32.tar.bz2

Download the AMD Display Library (ADL) SDK from https://goo.gl/CqhZq1:

$ apt-get install unzip
$ unzip ADL_SDK_V10.2.zip -d /opt/ADL_SDK_V10.2
$ rm ADL_SDK_V10.2.zip

Install the dependencies with the following command:

$ apt-get install automake autoconf pkg-config libcurl4-openssl-dev libjansson-dev libssl-dev libgmp-dev make g++ git libncurses5-dev libtool mali-fbdev

Note that mali-fbdev is needed when using the Ubuntu minimal image; for the Debian minimal image, use Mali-T628-ODROID instead.

Clone the SGMiner-GM repository with Git and copy the ADL headers into it with the following commands:

$ git clone https://github.com/genesismining/sgminer-gm
$ cd sgminer-gm
$ cp /opt/ADL_SDK_V10.2/include/*.h ./ADL_SDK

Some of the versions of SGMiner I’ve looked at have similar compile issues; others have additional problems. Here is what to change in the SGMiner-GM 5.5.5 source code to get it to compile correctly. Make the following edits in four files:

Change line 32 of kernel/lyra2rev2.cl from:

#pragma OPENCL EXTENSION cl_amd_printf : enable
to:
#pragma OPENCL EXTENSION cl_amd_printf : disable
Change lines 49-59 of kernel/skein256.cl from:
__constant static const int ROT256[8][4] =
 {
 46, 36, 19, 37,
 33, 27, 14, 42,
 17, 49, 36, 39,
 44, 9, 54, 56,
 39, 30, 34, 24,
 13, 50, 10, 17,
 25, 29, 39, 43,
 8, 35, 56, 22,
 };
to:
__constant static const int ROT256[8][4] =
 {
 {46, 36, 19, 37},
 {33, 27, 14, 42},
 {17, 49, 36, 39},
 {44, 9, 54, 56},
 {39, 30, 34, 24},
 {13, 50, 10, 17},
 {25, 29, 39, 43},
 {8, 35, 56, 22}
 };
Change line 58 of ocl/build_kernel.c from:
sprintf(data->compiler_options, "-I \"%s\" -I \"%s/kernel\" -I \".\" -D WORKSIZE=%d",
to:
sprintf(data->compiler_options, "-I %s -I %s/kernel -I . -D WORKSIZE=%d",
Change line 66 of the same file from:
strcat(data->compiler_options, " -I \"");
to:
strcat(data->compiler_options, " -I ");
Change line 68 from:
strcat(data->compiler_options, "\"");
to:
strcat(data->compiler_options, "/");
Change algorithm/cryptonight.c starting on line 139 from:
__asm__("mul %%rdx":
"=a" (lo), "=d" (hi):
"a" (a), "d" (b));
to:
//__asm__("mul %%rdx":
//"=a" (lo), "=d" (hi):
//"a" (a), "d" (b));

Commenting out this assembly optimization, which is written for x86-64 registers and cannot be assembled for ARM, leaves Cryptonight non-functional. Do not use Cryptonight, WhirlpoolX, Ethash, or Equihash: with the edit above the source compiles, but there is another problem that lacks an easy fix. These OpenCL kernels appear to use AMD OpenCL extensions that aren't supported on the ARM platform, so they cannot be compiled and the GPU fails to initialize. The kernels may need to be rewritten in order to get them to function. This needs further exploration, as Cryptonight is used by more coins and may be economically viable for GPU and CPU mining on this device. I will continue working on this.

Issue the following commands in the base SGMiner-GM directory to finish the compilation:

$ git submodule init
$ git submodule update
$ autoreconf -fi
$ CFLAGS="-Os -Wall -march=native -std=gnu99 -mfpu=neon" LDFLAGS="-L/usr/lib/arm_compute-v17.12-bin/lib/linux-armv7a-neon-cl" ./configure --disable-git-version --disable-adl --disable-adl-checks --prefix=/opt/sgminer

In the configuration summary, you should see that OpenCL was found and that GPU mining was enabled. If not, OpenCL is not set up correctly and must be fixed before proceeding, so check your preceding steps for accuracy. The Hardkernel Ubuntu images come with OpenCL set up; this build was done successfully on ubuntu-16.04.3-4.14-minimal-odroid-xu4-20171213.img.

------------------------------------------------------------------------
sgminer 5.5.5-gm-a
------------------------------------------------------------------------

Configuration Options Summary:

Use git version......: no
libcurl(GBT+getwork).: Enabled: -lcurl
curses.TUI...........: FOUND: -lncurses
OpenCL...............: FOUND. GPU mining support enabled
ADL..................: Detection overrided. GPU monitoring support DISABLED

Compilation............: make (or gmake)
CPPFLAGS.............:
CFLAGS...............: -Os -Wall -march=native -std=gnu99 -I/opt/AMDAPPSDK-3.0/include
LDFLAGS..............: -L/usr/lib/arm_compute-v17.12-bin/lib/linux-armv7a-neon-cl -lpthread
LDADD................: -ldl -lcurl submodules/jansson/src/.libs/libjansson.a -lpthread -L/opt/AMDAPPSDK-3.0/lib/x86 -lOpenCL -lm -lrt

Installation...........: make install (as root if needed, with 'su' or 'sudo')
prefix...............: /opt/sgminer
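
The OpenCL check can be automated if the configure output is saved to a file. The following is a minimal sketch: opencl_enabled is a hypothetical helper, and it assumes the summary line has the same wording as the output shown above.

```shell
#!/bin/sh
# opencl_enabled SUMMARY_FILE
# Succeed only if the saved configure summary reports that OpenCL was found.
# The pattern matches "OpenCL", a run of dots, then ": FOUND".
opencl_enabled() {
    grep -q 'OpenCL\.*: FOUND' "$1"
}
```

For instance, pipe the configure command through tee configure.out (an assumed file name), then run opencl_enabled configure.out before going on to make.
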
Make and install the package:
$ make -j5
$ make install

Quick Tests

$ ./sgminer --version
sgminer 5.5.5-gm-a

$ ./sgminer -n
[20:41:54] CL Platform vendor: ARM
[20:41:54] CL Platform name: ARM Platform
[20:41:54] CL Platform version: OpenCL 1.2 v1.r12p0-04rel0.03af15950392f3702b248717f4938b82
[20:41:54] Platform devices: 2
[20:41:54] 0 Mali-T628
[20:41:54] 1 Mali-T628
[20:41:54] 2 GPU devices max detected
According to the install Wiki, "The first of these will fail if any libraries are missing, so if we get a version number then the compiled binary duly executes on our system. The second checks for OpenCL GPU devices on the default OpenCL platform. If both commands work without error and the latter indicates the correct OpenCL platform, you're well on the way to a working installation."
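
The device listing can also be turned into the value passed to SGMiner's -d option. This sketch assumes the output format matches the sample above (a timestamp, a device index, then a name starting with Mali); the function names are hypothetical, and the listing may be written to stderr, in which case it needs to be redirected with 2>&1 before piping.

```shell
#!/bin/sh
# Both helpers read `sgminer -n` output on stdin.

# Print how many listed devices have a name starting with "Mali".
count_mali_devices() {
    awk '$2 ~ /^[0-9]+$/ && $3 ~ /^Mali/ { n++ } END { print n + 0 }'
}

# Print the device indexes as a comma-separated list, e.g. for "-d".
device_list() {
    awk '$2 ~ /^[0-9]+$/ && $3 ~ /^Mali/ { s = s (s == "" ? "" : ",") $2 }
         END { print s }'
}
```

With the sample output shown earlier, count_mali_devices prints 2 and device_list prints 0,1.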

Assuming you have accounts set up at pools or are solo mining, a quick way to configure SGMiner is to use the command line instead of a configuration file. You can learn more about all of this at the install wiki and under ./sgminer/doc/configuration.md. Because some environment variables need to be set, a simple script is a quick and easy way to test.

#!/bin/bash

export GPU_FORCE_64BIT_PTR=0
export GPU_USE_SYNC_OBJECTS=1
export GPU_MAX_ALLOC_PERCENT=100
export GPU_SINGLE_ALLOC_PERCENT=100
export GPU_MAX_HEAP_SIZE=100
./sgminer -k algorithm -o stratum+tcp://pool.na:port -u user.worker -p password -I 14 -w 64 -d 0,1 --thread-concurrency 8192

The intensity (-I 14) and work size (-w 64) can be tuned for better (or worse) performance. Since the Mali-T628 appears as two devices, both are selected (-d 0,1). Device 0 has 4 cores and Device 1 has 2 cores. More information on GPU settings is located in ./sgminer/doc/gpu.md.

When you start SGMiner, there is a long delay of 30 to 40 seconds while the kernels for both GPU devices are created and loaded. The screen only shows a couple of lines and may look frozen. Be patient. It will then turn black for about 10-15 seconds, after which it will show the curses interface. For testing, you can add -T to the command line to disable the curses terminal interface and use simple text, which shows more information during the initialization process. Some hardware errors while running are normal. If you find you’re getting a lot of hardware errors, try adjusting the intensity, since each algorithm is different and needs to be tuned. This is where a configuration file is useful: you can use different settings for different algorithms and pools.

My XU4/MC1 cluster is divided into four thermal groups and runs at speeds chosen to sustain 24/7/365 operation in the 70-79℃ range. The MC1s run the coolest of all the ODROIDs. Figure 2 shows one of them dual-pool mining Verium with scrypt (CPUMiner) and Monacoin with Lyra2REv2 (SGMiner) for two hours for benchmark purposes. With this combination and frequency rate, the CPU hash rate decreased approximately 19% while GPU mining, and the GPU hash rate decreased approximately 4% during CPU mining. This will, of course, vary depending on the algorithm and other configuration factors. There has been one 24-hour test of thirty ODROIDs dual-mining with no issues.
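
To work out the same kind of percentage for your own rig, compare the hash rate of each miner running alone with its hash rate while dual-mining. The helper below is a small sketch; pct_drop is a hypothetical name, and the figures in the usage note are made-up round numbers, not measurements from this test.

```shell
#!/bin/sh
# pct_drop BASELINE DUAL
# Print the percentage decrease from BASELINE (miner running alone)
# to DUAL (same miner while dual-mining), to one decimal place.
pct_drop() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.1f\n", (a - b) / a * 100 }'
}
```

For example, pct_drop 200 150 prints 25.0. Both arguments must be in the same unit, such as kh/s.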

Figure 2 - Running SGMiner on the ODROID-XU4 cluster

Although a few of the OpenCL kernels are not working, this is still the best option that I'm aware of for GPU mining on the XU4/MC1/HC1/HC2. The good news is that SGMiner supports many other crypto algorithms, but be aware that only a few were tested. Let everyone know if you find more that have a problem. When more headway is made on getting the other kernels working, it will be posted on the forum at https://forum.odroid.com/viewtopic.php?f=98&t=29571.

Dual GPU-CPU Mining Test

The Dual GPU-CPU Mining Test studied the effect of CPU frequency changes on GPU operating temperature over 1 hour 50 minutes at an ambient air temperature of approximately 76°F (24.4°C). For the first ten minutes of the test, only the GPU mined, in order to establish the baseline GPU operating temperature, using Monacoin with Lyra2REv2 (SGMiner, pool) with the following options:

-I 14 -w 64 -d 0,1 --thread-concurrency 8192

For the remainder of the dual-mining test, CPU mining of Verium with scrypt (CPUMiner, 8 threads, no affinity, solo) ran alongside GPU mining of Monacoin with Lyra2REv2 (SGMiner, 1 thread, pool), using the same SGMiner options:

-I 14 -w 64 -d 0,1 --thread-concurrency 8192

The CPU frequency was decreased by 100 MHz every ten minutes to 1.2 GHz, then raised by 100 MHz every five minutes to 1.9 GHz. It was then set to 1.6 GHz for the remainder of the test.
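
A frequency ramp like this one can be generated and applied from a script. The sketch below makes several assumptions that are not stated in the test description: the starting frequency is passed in by the caller, the large cores are cpu4 through cpu7, and the frequency cap is set through the standard cpufreq file scaling_max_freq, which takes kHz and needs root. The function names and the CPU_ROOT override are illustrative.

```shell
#!/bin/sh
# ramp_schedule START_MHZ LOW_MHZ HIGH_MHZ FINAL_MHZ
# Print "MHz minutes" lines: step down 100 MHz every 10 minutes to LOW,
# step up 100 MHz every 5 minutes to HIGH, then settle at FINAL
# ("-" marks the open-ended final stage).
ramp_schedule() {
    f=$1
    while [ "$f" -gt "$2" ]; do
        f=$((f - 100))
        echo "$f 10"
    done
    while [ "$f" -lt "$3" ]; do
        f=$((f + 100))
        echo "$f 5"
    done
    echo "$4 -"
}

# apply_mhz MHZ
# Cap the large cores (assumed to be cpu4..cpu7) at MHZ by writing kHz
# to scaling_max_freq. CPU_ROOT may be overridden for testing.
apply_mhz() {
    root="${CPU_ROOT:-/sys/devices/system/cpu}"
    for n in 4 5 6 7; do
        echo $(( $1 * 1000 )) > "$root/cpu$n/cpufreq/scaling_max_freq"
    done
}
```

Running ramp_schedule 1900 1200 1900 1600 prints 15 steps, from 1800 10 down to 1200 10, back up to 1900 5, and finally 1600 -. A driver loop would read each line, call apply_mhz with the first field, and sleep for the second field in minutes.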

The GPU mined at 51℃ for the first ten minutes of the test, and then its temperature rose with that of the CPU cores, forming a plateau at each frequency change. The GPU never exceeded 72℃ except for a few brief spikes to 74℃. The temperature drops in the GPU during the test appear to be correlated with the frequency changes of the CPU cores. The GPU hash rate (71 kh/s) was steady during the whole test, while the CPU hash rate varied according to the frequency setting, as expected.

Figure 3 - Dual Mining GPU/CPU Core Temperatures vs. Frequency

A quick note about rejected shares for new miners: there are numerous reasons why you might get rejected or stale shares. While it could be an error, most are caused by network latency. There are two common scenarios. In the first, your rig is mining away on a block, finds a valid share, and submits it to the network. In the meantime, the block was solved and a new block and new work were issued. By the time your share arrives, it is stale and will be rejected. The ST indicator in SGMiner shows the number of stale shares you have submitted. This is not an error, and there is not much you can do about it. You can reduce the chance of it happening by not mining to a pool on the other side of the world, which adds latency for your miner. Find a server in your own country or as close to it as possible. Most pools offer multiple geographically dispersed servers for this reason.

The second scenario is that you get lucky and find a block, but by the time your solution is presented, someone else has already submitted a valid one. You now have an orphaned block. These are two of the most common causes, and unless you’re getting a lot of rejects, it shouldn't be a problem. If you’re getting a lot of rejects and have a lot of GPU HW errors, you’re probably pushing your GPU too hard and need to adjust the intensity, work size, or number of threads. As always, just because you can mine a coin does not mean you will find a coin if solo mining, or a valid share if pool mining. A good example would be trying to mine bitcoin with anything other than an ASIC (Application Specific Integrated Circuit) device. The hash rate and difficulty are beyond other hardware's capability, unless you get extremely lucky. If so, stop! You just won the lottery!

Most pools will not show a hash rate, or that you’re even mining, until you submit a valid share. When the block changes and no new shares have been submitted, you’re back to not showing up at the pool. If you’re mining a coin that is an economic mismatch for the device, don't be surprised when your miner is not seen by the pool. Find a coin you want to mine and match the appropriate HW device to its difficulty and hash rate. Alternatively, starting from the HW devices available to you, see which coins are possible to mine with their capability. Have fun with it, and good luck micro-mining!
