The magazine of the Melbourne PC User Group

Linux Clusters - for the bookshelf
Major Keary
 
 

Major Keary tells of the incredible performance achieved by a stack of very ordinary desktop computers

"A cluster of computers, or simply a cluster, is a collection of computers that are connected together and used as a single computing resource" [Encyclopedia of Computer Science].

Linux lends itself to clusters because the operating system is highly reliable, the necessary software is open source, special hardware does not have to be acquired for the purpose, and there is literature on the subject. Hardware can be off-the-shelf equipment or even recycled computers. It is even possible to create a bootable cluster CD (BCCD) that uses multiple CD-ROMs on a single computer. BCCD is a solution in certain circumstances and lends itself to learning about clusters; further information about available systems can be found at http://bccd.cs.uni.edu/, or search the Net for "clusterknoppix".

There is a distinction between high performance and high availability clusters. High performance means just that. Supercomputers are high performance systems, and high performance can be achieved with a cluster, even one made up of recycled machines. For example, an American university built a high performance cluster from 1100 dual-processor Mac G5s; it achieved speeds of up to 10 teraflops, which put it up with the fastest supercomputers (further details can be found in High Performance Linux Clusters, reviewed below).

High availability clusters are designed to meet failover requirements. "Failover is the basic technique used to achieve high availability in clusters: computer failure is detected, and the work it was doing is 'failed over' to another in the cluster" [Encyclopedia of Computer Science]. The system requires constant monitoring of all machines in a cluster, which is achieved by regularly exchanging heartbeat messages between the machines. If one does not respond, it is assumed to have failed and its functions are re-allocated. Failover is used to ensure continued operation of mission-critical servers.
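The heartbeat-and-failover idea described above can be illustrated with a minimal Python sketch. It is not taken from either book reviewed here: the HeartbeatMonitor class, the node and service names, and the three-second timeout are all hypothetical, and time is passed in as a plain number so the example stays self-contained.

```python
from dataclasses import dataclass, field


@dataclass
class HeartbeatMonitor:
    """Hypothetical helper: records heartbeats and re-allocates work."""
    timeout: float                                 # seconds of silence before failure is assumed
    last_seen: dict = field(default_factory=dict)  # node -> time of its last heartbeat
    services: dict = field(default_factory=dict)   # service -> node currently running it

    def heartbeat(self, node, now):
        """Record that a heartbeat message arrived from node at time now."""
        self.last_seen[node] = now

    def assign(self, service, node):
        """Record that service is running on node."""
        self.services[service] = node

    def failed_nodes(self, now):
        """Nodes that have been silent for longer than the timeout."""
        return sorted(n for n, t in self.last_seen.items() if now - t > self.timeout)

    def fail_over(self, now):
        """Move services off silent nodes onto the least-loaded survivor."""
        failed = set(self.failed_nodes(now))
        alive = [n for n in sorted(self.last_seen) if n not in failed]
        moves = {}
        for service in sorted(self.services):
            if self.services[service] in failed and alive:
                def load(node):
                    # Number of services currently placed on this node.
                    return sum(1 for owner in self.services.values() if owner == node)
                target = min(alive, key=load)
                self.services[service] = target
                moves[service] = target
        return moves


monitor = HeartbeatMonitor(timeout=3.0)
for name in ("node-a", "node-b"):
    monitor.heartbeat(name, now=0.0)
monitor.assign("web", "node-a")
monitor.assign("db", "node-b")

monitor.heartbeat("node-b", now=5.0)   # node-a stays silent
moves = monitor.fail_over(now=5.0)
print(moves)
```

In this run node-a misses its heartbeat, so the "web" service is handed to node-b. A production system also has to cope with lost messages and with both machines believing the other has failed, which this sketch ignores.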

Another cluster type is called load-balancing and is designed to share the workload between machines in the cluster.
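As a rough illustration of sharing a workload, the Python sketch below hands incoming requests to machines in strict rotation (round-robin). The round_robin function and the node names are hypothetical, and round-robin is only one simple policy; the article does not say which scheduling method any particular cluster uses.

```python
from collections import Counter
from itertools import cycle


def round_robin(requests, nodes):
    """Pair each request with the next node in a repeating rotation."""
    rotation = cycle(nodes)
    return [(request, next(rotation)) for request in requests]


# Six hypothetical requests spread over three hypothetical machines.
assignments = round_robin(range(6), ["node-a", "node-b", "node-c"])
per_node = Counter(node for _, node in assignments)
print(per_node)
```

Each machine ends up with two of the six requests. Rotation takes no account of how busy a machine is, so policies that weigh current load are a natural refinement.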

If you want to learn about high performance and high availability clusters there are two excellent texts, "High Performance Linux Clusters" and "Linux Enterprise Cluster".

High Performance Linux Clusters

Described as "a comprehensive getting-started guide", the book has the full title High Performance Linux Clusters with OSCAR, Rocks, openMosix, and MPI. OSCAR, Rocks, openMosix, and MPI are open source software packages used for building a high performance cluster; the book describes each of them with details of installation and use.

Hardware is not discussed at length, but there is a brief chapter that walks the reader through design decisions and other practical matters.
Most of the content relates to software and installation. The first part of the book describes cluster architecture and discusses choice of Linux distribution, file system issues (NFS has its limitations), other services and configuration tasks, and cluster security. It addresses practical issues, offers advice on alternative approaches, and explains how the various programs work. Part 2 explains the principal software components and their installation.

Part 3 is about building clusters; it deals with the configuration of multiple machines, ways of automating the process, and tools for the task. There is further in-depth discussion of Rocks and OSCAR, and notes on scheduling software and parallel filesystems.

Part 4 is the largest section of the book and deals with cluster programming. It includes chapters on MPI, designing parallel systems, debugging parallel programs, and profiling parallel programs.

High Performance Linux Clusters is remarkably well presented; the text is supported with helpful diagrams and plenty of example code.

Joseph Sloan: High Performance Linux Clusters
ISBN 0-596-00570-9
Published by O'Reilly,
350 pp.,
RRP $74.95 incl. GST

Linux Enterprise Cluster

This book is about building a highly available cluster; it has the sub-title, "Build a Highly Available Cluster with Commodity Hardware and Free Software" and comes with a companion CD that contains source code for all the software required. The CD also contains all the diagrams used to support the text "to aid you in pitching a clustering solution to the decision-makers in your organisation".

As well as presenting a practical, detailed guide to building a high-availability cluster the author discusses theoretical issues and provides useful side notes and other explanatory material. The book is a well crafted manual on how to build a highly-available cluster and also provides a sound understanding of the technology.

The specifics of how to build a cluster don't come until almost halfway through the book. The first half provides the information required to undertake the task of building a cluster; it discusses tools, concepts, and issues that the cluster-builder has to use, understand, and appreciate.
Following the part on cluster theory and practice (which includes building a cluster, Linux virtual server, load balancing, and the network file system) the book turns to the important topic of maintenance and monitoring.

A remarkably thorough introduction to highly-available load-balancing clusters that offers detailed discussion of how to build one. The companion CD is a valuable resource, both for the necessary software and for presentations.
 

Karl Kopper: The Linux Enterprise Cluster
ISBN 1-59327-036-4
Published by No Starch Press
430 pp. + CD
RRP $89.95 incl. GST

Reprinted from the August 2005 issue of PC Update, the magazine of Melbourne PC User Group, Australia
