Using ODROIDs in High Performance Computing (HPC) – ARM: Head and Shoulders Above The Competition

A modern datacenter uses far too much electricity and air conditioning to run efficiently. An ARM-based Internet Service Provider can deliver web pages for substantially less power than conventional architectures (http://tinyurl.com/ApacheOnARM). At the same time, ARM cores are rapidly being adopted by the scientific Datacenter community because ARM devices can run floating-point-intensive computations at a fraction of the energy cost of conventional architectures, with an architectural roadmap promising even more performance per watt to come. There is a corresponding growth of interest in HPC (High Performance Computing), and in its uses across broader domains, within the ARM developer community. In academia, this interest is reflected at conferences such as IEEE High Performance Extreme Computing (http://www.ieee-hpec.org/) and IEEE Supercomputing, most recently SC13 in Denver, Colorado. Additionally, IEEE Cluster, held in Indiana in 2013, saw a marked rise in ARM-centric publications.

The majority of modern supercomputing centers have thousands to tens of thousands of cores dedicated to their particular processing needs. Any time a programmer can run an application at an improved performance ratio (per watt, per dollar, or per square meter), it is a win for Datacenter stakeholders as well as for the computational science customers they support. This is increasingly the case where applications must be run multiple times, and where multiple applications share HPC resources, much like the cloud and a throwback to old-style timesharing.

Why ARM?

The ARMv7 architecture has proven to be up to the challenge of HPC in a number of ways that previous ARM architectures were not. Even setting energy efficiency aside, the technology represents a growth path for fast embedded computing. When a Datacenter comprises over 10,000 cores, considerable advantages are realized through incremental improvements: small changes add up to significant savings in space, power, and cooling. With memory shared between the CPU and the GPU on ARM SoCs (System on a Chip), doubled SIMD (Single Instruction, Multiple Data) throughput from the Cortex-A15's NEON engine, and considerably larger memory addressability, benefits are realized where we need them most: where the application and the data sets meet on-die. With the growing acceptance of GP-GPU (General Purpose GPU) computing and the expansion of HPC-type applications driven by big data, the fast computing modes of ARM are relevant in more ways than ever, with a technology path toward an ever-expanding share of HPC.

Why ODROID?

Today's ODROID boards feature an Exynos-family processor with at least 4 ARM cores. The upcoming Exynos 5 series has 8 cores, 4 of which are ARM Cortex-A15. Hardkernel, along with the RunTime Computing Solutions research consortium, has demonstrated substantial power and performance improvements of the XU over other contemporary architectures. With Hardkernel's ambitious release schedule of new technologies, adopters of this platform can track Moore's law and ARM's rollout efficiently, joining the wave of newer, better, lower-cost, higher-performance systems as they emerge, with meaningful upward compatibility.

What’s more, the Cortex-A15 wins on most ARM-HPC benchmarks (for instance, the NAS Parallel Benchmarks at http://tinyurl.com/ODROID-HPC). The RunTime Computing Solutions team has recently demonstrated pivotal advantages of the A15 over the A9 on the HPC Challenge suite, the preferred benchmark for HPC (http://hpcchallenge.org/). This test uses only the A15 cores of the Exynos 5410 and keeps them at maximum capacity throughout the run (http://tinyurl.com/ODROID-LINPACK), which is not a completely equitable comparison. However, meaningful information can still be gleaned from it: XUJessie achieves twice the score of U2Whisper on G-HPLINPACK, the first test. The authors also demoed the SOX BOX at SC13, featuring a number of hardware modifications that improved performance; see our sites for more information.

Conclusion

Today we can run many HPC applications on ODROID, and as the upward pressures of energy efficiency cause industry professionals to rethink Datacenter design, progressively more centers will adopt these architectures. The future is bright for designers as the push towards exascale computing ushers in a new and exciting theme in embedded SoC technologies.
