In this article, Cooper Filby and Anthony Skjellum of Runtime Computing Solutions LLC (http://www.runtimecomputing.com) outline the setup and configuration of a basic “headless” cluster, with the end goal of running parallel programs based on message passing, in particular using the Message Passing Interface (MPI) parallel programming model.

There are a number of prebuilt Linux operating systems available for ODROID boards from the Hardkernel website. To get started, download the Ubuntu Server image for your ODROID model and extract the .IMG.XZ archive using an archiving tool such as 7-Zip on Windows, or the “xz” command on the Linux command line. Then copy the extracted image to the medium of your choice, such as an SD card or an eMMC module, using the “dd” command on Linux/OS X systems or Win32DiskImager.exe for ODROID on Windows. For more detailed instructions on copying over the OS, please refer to Bohdan Lechnowsky’s article titled “Installing an OS on an ODROID” from the January 2014 issue of ODROID Magazine. We recommend using the eMMC modules available from Hardkernel for better performance, but SD cards work well too.
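On a Linux host, the extract-and-copy steps might look roughly like the sketch below. The function names, the image filename, and the /dev/sdX device name are placeholders invented for illustration. Confirm the real device name (for example with lsblk) before writing, because dd overwrites its target without asking.

```shell
# Decompress an .img.xz archive in place, keeping the original archive.
extract_image() {
    xz --decompress --keep "$1"
}

# Copy a raw image onto a target with dd. The target is normally a block
# device such as /dev/sdX (hypothetical name), in which case the command
# needs root privileges. conv=fsync flushes the data before dd exits.
write_image() {
    image=$1
    target=$2
    dd if="$image" of="$target" bs=4M conv=fsync
}

# Example usage (not run here; both names are placeholders):
#   extract_image ubuntu-server.img.xz
#   sudo dd if=ubuntu-server.img of=/dev/sdX bs=4M conv=fsync
```

Wait for dd to finish and return to the prompt before removing the card or module.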
Connecting to your ODROID
Since we opted to use the Ubuntu Server image for our ODROIDs, we can connect to our XU+E systems (we’ll call them nodes from now on, for simplicity) via the SSH protocol using Terminal (or PuTTY if running Windows) in order to continue setting up our cluster. Because of potential initial hostname and MAC address conflicts, which we will resolve below, we will need to boot the first ODROID and change a few settings before starting the second.
[Editor’s Note: If one is available, a development machine running Linux or Windows is recommended to make it easier to set up and reboot the cluster, troubleshoot hardware problems, and do any other necessary debugging. An alternative to using a separate computer is to plug a USB keyboard and HDMI cable into the first ODROID and use it directly to bootstrap the cluster instead of via SSH as described in the next few paragraphs. Press Ctrl-Alt-F1 to use the framebuffer console if X11 is not running.]
In order to connect to your ODROID, you'll need to discover the hostname or IP address of the board. For the Ubuntu Server image we used on our XU+E cluster, the default hostname is “odroid-server”, while for other images we've used, it's been “odroid”. Most home networks should support DNS by default, which will allow you to connect simply by the hostname. If neither hostname resolves for you, connect using the IP address assigned to the ODROID by your router instead. You can find that address in your router's lease table, often labeled as the DHCP client table in the router’s admin panel.
Since we used identical copies of the same image on both nodes, they had a hostname conflict by default, which we resolved by bringing them online one at a time and then changing the individual network settings. If you don’t have access to the router’s admin panel, you can also use the nmap command to scan your network for the ODROIDs, provided you know your network's address range. For example: “nmap 192.168.1.0/24”. Look for a host that has port 22 open.
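As a sketch of a more targeted scan, you can ask nmap to probe only port 22 and print just the addresses that answered. The function names here are our own, and the subnet in the usage line is the example range from above, so adjust it to match your network.

```shell
# Pull the IP address out of each line of nmap's grepable output (-oG)
# that reports TCP port 22 as open.
ssh_hosts_from_grepable() {
    awk '/ 22\/open\/tcp/ { print $2 }'
}

# Scan a subnet for hosts with SSH open. "-p 22" limits the scan to
# port 22, "--open" hides hosts where that port is not open, and
# "-oG -" writes grepable output to standard output.
find_ssh_hosts() {
    nmap -p 22 --open -oG - "$1" | ssh_hosts_from_grepable
}

# Example usage (not run here):
#   find_ssh_hosts 192.168.1.0/24
```

Any SSH-enabled machine on the network will show up in the list, so you may still need to try more than one address.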
Power on one of the ODROIDs, then enter “ssh odroid@odroid-server” (or “ssh odroid@” followed by the board's IP address, if using the IP address) in the Terminal or PuTTY window of the host computer, which will establish a secure connection to the ODROID. To log in, type “odroid” as the password.
Once the command prompt appears, you may want to run "sudo apt-get update && sudo apt-get upgrade" to ensure that your OS is up to date. Furthermore, we recommend that you run the "passwd" command to change the password for the odroid user to something a little more secure, or create new user accounts with the "adduser" command, for example by running "sudo adduser kilroy". (Generally speaking, do three key things with your node passwords: make them long, make them hard to guess, and store them in a secure location.)
Before getting both ODROIDs online, we need to change a few settings to eliminate the hostname and MAC address conflicts that may occur on your home network with an ODROID cluster. To change the hostname, we will need to edit two files, /etc/hostname and /etc/hosts, changing "odroid-server" to the hostname of your choice and then rebooting the machine so the changes take effect. For the purposes of this article, we will use odroid-server0 and odroid-server1 to refer to the first and second ODROID respectively. You can use other names of your choice, as long as they are unique to each node. Alternatively, if your operating system supports it, you can also type "sudo odroid-config" to change the hostname.
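The two-file edit can also be scripted. The following is a minimal sketch, not the only way to do it: the function name and the optional directory argument are our own invention, the latter so that you can try the function on copies of the files before touching /etc.

```shell
# Replace the default hostname in <etc_dir>/hostname and <etc_dir>/hosts.
# etc_dir defaults to /etc. Run as root when editing /etc itself.
rename_host() {
    old_name=$1
    new_name=$2
    etc_dir=${3:-/etc}

    # The hostname file holds just the name, so overwrite it outright.
    printf '%s\n' "$new_name" > "$etc_dir/hostname"

    # Keep a backup of hosts, then rewrite every occurrence of the old name.
    cp "$etc_dir/hosts" "$etc_dir/hosts.bak"
    sed "s/$old_name/$new_name/g" "$etc_dir/hosts.bak" > "$etc_dir/hosts"
}

# Example usage as root on the first node (not run here), then reboot:
#   rename_host odroid-server odroid-server0
```

Run it only once per node. Because the new name contains the old one, a second run with the same arguments would append another digit in /etc/hosts.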
The MAC address conflict was a subtle issue that we encountered when we first set up multiple ODROID XU+Es. We found that, by default, the onboard Ethernet devices all shared the same MAC address, which made it impossible to work on a single ODROID if several were powered on and connected to the same network. If the two ODROIDs you’re working with have identical MAC addresses, there are two straightforward ways to resolve this: 1) configure one (or both) of the ODROIDs to use a different MAC address, or 2) set up USB Ethernet dongles, which should all have unique MAC addresses. The specific values you choose really don’t matter, as long as you keep them unique on your Local Area Network (LAN).
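If you need to invent a replacement address, one common convention (our suggestion, not something the boards require) is to use a locally administered unicast address. In the first octet of such an address, the 0x02 bit is set and the 0x01 bit is clear. Here is a minimal sketch that generates one from /dev/urandom; the function name is hypothetical.

```shell
# Print a random, locally administered, unicast MAC address.
random_mac() {
    # Read six random bytes as two-digit hex values into $1..$6.
    set -- $(od -An -N6 -tx1 /dev/urandom)
    # First octet: set the locally administered bit (0x02) and
    # clear the multicast bit (0x01).
    first=$(printf '%02x' $(( (0x$1 | 0x02) & 0xfe )))
    printf '%s:%s:%s:%s:%s:%s\n' "$first" "$2" "$3" "$4" "$5" "$6"
}

# Example usage: generate one address per node and note them down.
#   random_mac
```

A collision between two randomly generated addresses is extremely unlikely, but it costs nothing to compare the values before assigning them.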
To change the MAC address of the onboard device, edit /etc/network/interfaces with your text editor of choice, and add the line "hwaddress ether newmac", where newmac is an address in the format “b6:8d:67:7b:cb:e0” underneath the following labels:
    auto eth0
    iface eth0 inet dhcp

Then, reboot the ODROID so the changes take effect. Make sure to verify the new address using the ifconfig command. Alternatively, you can opt to plug your USB Ethernet adapters into the USB 3.0 slot, and then run "ifconfig -a | grep eth", which should yield a list similar to this:
    eth0 Link encap:Ethernet HWaddr b6:8d:67:7b:cb:e0
    eth2 Link encap:Ethernet HWaddr 00:13:3b:99:92:b1

By default, eth0 will be the onboard 10/100 ethernet connection, while the second ethernet device (in this case, eth2) will be the USB Ethernet Adapter. If only eth0 shows up, try reseating your USB Ethernet adapter and/or verifying that it works on another machine. To set up the adapter for using DHCP on boot to get an IP address, we will need to modify /etc/network/interfaces and add the following two lines between the entries for auto lo and auto eth0:
    auto eth2
    iface eth2 inet dhcp

Use the appropriate ethernet device ID previously found with ifconfig (in this case, eth2). Then, power down the ODROID, move the ethernet cable from the onboard port to the USB ethernet adapter, and power the ODROID back on. If, for some reason, you aren’t able to connect, try plugging the cable back into the onboard port and verifying that the USB ethernet adapter is still showing up using the “ifconfig -a” command. It’s also possible that the ethernet device ID itself has changed if the adapter was unseated, in which case you can update the /etc/network/interfaces file accordingly.
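To summarize the two variants described above, the following sketch prints the stanza for either case so you can compare it against your own /etc/network/interfaces. The function name is hypothetical, and the indentation of the hwaddress line is purely cosmetic.

```shell
# Print an /etc/network/interfaces stanza for a DHCP interface.
# $1 is the device name found with ifconfig; $2 is an optional MAC
# address to apply with "hwaddress ether".
dhcp_stanza() {
    printf 'auto %s\n' "$1"
    printf 'iface %s inet dhcp\n' "$1"
    if [ -n "${2:-}" ]; then
        printf '    hwaddress ether %s\n' "$2"
    fi
}

# Example usage:
#   dhcp_stanza eth0 b6:8d:67:7b:cb:e0   # onboard device, new MAC address
#   dhcp_stanza eth2                     # USB adapter, factory MAC address
```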
At this point, the ODROID should be configured and accessible on the network. Before heading on to the MPI section, configure the second ODROID using the same steps described above.
Message Passing Interface (MPI)
Now that we have two nodes configured appropriately, we can start looking at how to execute HPC jobs on our two-node cluster. To accomplish this, we will make use of MPI, the Message Passing Interface, a parallel programming environment. MPI takes care of starting up the processes that make up the parallel program, and provides a standardized application programming interface (API) that those cooperating, communicating sequential processes use to send and receive messages while processing jobs. A command called either mpirun or mpiexec will start all the processes needed across your ODROIDs under your control. There are two common open source MPI implementations available for download: MPICH and OpenMPI. For ODROID clustering purposes over Ethernet, both work equally well. Both of these MPI implementations are available through apt.
To install MPICH, run "sudo apt-get install mpich2", or run "sudo apt-get install openmpi-bin" to install OpenMPI as an alternative.
What you can do once you’ve loaded MPI:
- 1) Run example programs that use multiple cores on a single ODROID.
- 2) Run example programs that use both ODROIDs and a total of 8 cores.
- 3) Learn how to build your own MPI programs.
In this article, we’ve focused on the first two of these. You can read the example programs that come with OpenMPI and MPICH to learn more. There are also a number of excellent online tutorials and a few good books on programming MPI, such as “Using MPI” from MIT Press (one of us co-authored that book).
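As a rough sketch of the first two items, the snippet below builds an MPICH-style host file and shows the launch commands in its comments. It rests on several assumptions that this article does not cover: the same MPI implementation is installed on both nodes, the user can SSH from one node to the other without a password prompt, and each node contributes 4 cores. The function name and the host file name are hypothetical. The plain hostname command stands in for a real MPI program as a quick sanity check.

```shell
# Write an MPICH-style host file: one "hostname:processes" entry per line.
# Arguments: output path, processes per node, then the node hostnames.
write_hostfile() {
    out=$1
    per_node=$2
    shift 2
    : > "$out"
    for node in "$@"; do
        printf '%s:%s\n' "$node" "$per_node" >> "$out"
    done
}

# Example usage with MPICH (not run here):
#   mpiexec -n 4 hostname              # four processes on one ODROID
#   write_hostfile hosts 4 odroid-server0 odroid-server1
#   mpiexec -f hosts -n 8 hostname     # eight processes across both nodes
#
# With OpenMPI, each host file line reads "odroid-server0 slots=4"
# instead, and the option is "--hostfile hosts" rather than "-f hosts".
```

If the second command prints both hostnames four times each, the cluster is launching processes on both nodes.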
Building it Better
The content of this article represents just a fraction of what we will be able to do with our cluster down the line. While this setup is more than adequate for handling two nodes and only a few users, if we want to grow our cluster, we will want to use a dedicated head node to better handle a larger number of users and nodes. In addition to allowing us to hide cluster traffic from the rest of the network, this head node will also host services that will streamline cluster management, such as LDAP for user management, Puppet for configuration management, NFS for file sharing, and various networking services.
In part two of this series, we will begin to convert odroid-server0 into a proper head node.