CFD and Transport on Beowulf Clusters

Computational Fluid Dynamics Lab
The University of Texas at Austin



Latest Results
Go here to see some of the latest results from the CFDLab Beowulf Cluster.


Key Personnel
Project Director:    Graham F. Carey
Research Faculty:    Robert McLay, Alexandre Ardelea
Research Assistants:    Bill Barth, Benjamin Kirk
Undergraduate Helper:    John Peterson
Acknowledgments:    Supported by Intel, Compaq, and the Texas Advanced Technology Program




Research Summary
This project investigates tightly coupled systems of PCs for distributed parallel computation.  Of particular interest are flow and transport simulations on moderate-size clusters of 8, 16 and 32 PCs.  Such systems are sometimes termed BEOWULF systems and have attracted recent interest because commodity-off-the-shelf (COTS) components promise good price-performance.  The approach has been facilitated by the development of the MPI message-passing library standard and by recent operating system enhancements.


BEOWULF Systems
Work on this type of distributed parallel commodity-off-the-shelf system was initiated at NASA Goddard in 1993 using Intel 486 processors.  These systems are also being investigated at several of the other National Laboratories and major Universities.  Some of the main attributes are: 



CFDLab Systems
The University of Texas CFDLab has had three different PC clusters over the years, made possible by various grants from Intel and Compaq, and research support through the Texas Advanced Technology Program.

The first CFDLab cluster was the original "bbrox" (see footnote 1), composed of 16 Pentium II 266MHz compute nodes and one control node. Each of these machines had 128MB of 66MHz EDO RAM and a 100Mbit FastEthernet card. As faster processors and newer hardware became readily available, these machines were converted to handle less computationally intensive tasks. Several of them now serve quite capably as workstations for visiting guests and as dedicated firewall installations.

The CFDLab's second cluster, known as "zaphod" (see footnote 2), is composed of 16 500MHz EV56 Alpha compute nodes with 512MB of 100MHz RAM and FastEthernet cards. This cluster was donated to the lab by Compaq. Again, an additional node was installed to control the cluster.

The most recently installed cluster in the lab is the rechristened "bbrox." It is currently composed of 16 machines with a total of 32 550MHz Pentium III Xeon processors. Each of the 16 machines has 1GB of PC100 SDRAM, FastEthernet, and Myrinet for low-latency, high-speed communications. An account of some of the hardware issues that were involved in the creation of this cluster can be found here.


Constructing the Systems
The main steps in setting up the systems were:
  1. Selecting a "head box" for each cluster to act as the control node and gateway to the rest of the lab. These machines have two Ethernet cards and hard drives, but are otherwise identical to the other cluster machines.


  2. Installing Red Hat Linux 6.2 on each head machine. We chose to do a full installation of the system software for maximum versatility.


  3. Customizing the head machine to increase security and eliminate unneeded services. We use the SSH protocol for lab communications, so an SSH package had to be installed. The head machines were configured to run the specific services needed by their cluster, including:
    • DHCP for node IP configuration and booting
    • TFTP for supplying kernels to compute nodes
    • NFS for supplying data to the diskless compute nodes
    A minimal kernel was then compiled for the compute nodes.


  4. Once the head machines were configured, the Etherboot package was installed. This package supplies scripts that can be used to create root directories for each of the compute nodes. It also provides software necessary for netbooting Intel-based machines. The Alpha machines can be directed to netboot via SRM.

    A test node was then brought up to debug the configuration. Once the head machine and single compute node were behaving properly, the configuration was copied for the rest of the compute nodes.


  5. To give each new machine its own identity, the following files must be modified:

      /tftpboot/HOSTNAME/etc/sysconfig/network
      /tftpboot/HOSTNAME/etc/sysconfig/network-scripts/ifcfg-eth0
      /tftpboot/HOSTNAME/etc/fstab
      /tftpboot/HOSTNAME/etc/exports

    and an entry must be made for the client in the head machine's DHCPD configuration file (/etc/dhcpd.conf). The entire cluster was then brought up.


  6. More details on this approach can be found on the Etherboot diskless Linux system page.
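
The per-node bookkeeping in step 5 can be sketched with a small helper. This is only an illustration, not the lab's actual tooling: the host name, MAC address, IP address and boot file name below are hypothetical, and the stanza layout assumes the ISC DHCP server's "host" declaration syntax. The /tftpboot/HOSTNAME paths are the ones listed in step 5.

```python
# Files under /tftpboot/HOSTNAME that step 5 says must be edited per node.
PER_NODE_FILES = [
    "etc/sysconfig/network",
    "etc/sysconfig/network-scripts/ifcfg-eth0",
    "etc/fstab",
    "etc/exports",
]


def node_files(hostname):
    """Return the full paths of the files to customize for one compute node."""
    return [f"/tftpboot/{hostname}/{path}" for path in PER_NODE_FILES]


def dhcpd_host_entry(hostname, mac, ip, boot_file):
    """Render an ISC dhcpd 'host' declaration for one netbooting node.

    All argument values are supplied by the caller; nothing here is taken
    from the lab's real configuration.
    """
    return (
        f"host {hostname} {{\n"
        f"    hardware ethernet {mac};\n"
        f"    fixed-address {ip};\n"
        f'    filename "{boot_file}";\n'
        f'    option root-path "/tftpboot/{hostname}";\n'
        f"}}\n"
    )


if __name__ == "__main__":
    # Hypothetical node: name, MAC, IP and kernel image name are made up.
    for path in node_files("node01"):
        print(path)
    print(dhcpd_host_entry("node01", "00:11:22:33:44:01",
                           "192.168.1.101", "vmlinuz.nb"))
```

Generating the stanzas from one table of nodes keeps the DHCP entries and the per-node root directories in step with each other when the test node's configuration is copied to the rest of the cluster.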


Domain Decomposition Algorithm
The parallel algorithm work will focus on domain decomposition techniques using gradient iterative solvers [3,7].  For example, local cell, element or nodal computations on subdomain grids can be made independently in parallel, with the subdomain contributions to global matrix-vector products, global dot products, etc., computed in parallel using MPI for interprocessor communication [5].  We have applied this approach successfully in parallel computations on large supercomputer systems [e.g., see 4] and on loosely coupled IBM RISC workstations [1].
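
As a rough illustration of why these two kernels decompose so naturally, the serial Python sketch below splits a vector into contiguous subdomains and forms a matrix-vector product and a dot product from per-subdomain contributions. It is an assumption-laden toy rather than the lab's solver: the 1D Laplacian stencil (-1, 2, -1), the function names and the contiguous partitioning are all chosen for illustration, and the loop over subdomains stands in for work that separate processors would do concurrently.

```python
def split_range(n, nparts):
    """Split indices 0..n-1 into nparts contiguous (start, stop) subdomains."""
    base, extra = divmod(n, nparts)
    bounds, start = [], 0
    for p in range(nparts):
        stop = start + base + (1 if p < extra else 0)
        bounds.append((start, stop))
        start = stop
    return bounds


def local_matvec(x, start, stop):
    """Rows start..stop-1 of y = A x for the 1D stencil (-1, 2, -1).

    x[start - 1] and x[stop] are the values a processor would have to
    receive from its neighbours; everything else is local data.
    """
    n = len(x)
    y = []
    for i in range(start, stop):
        left = x[i - 1] if i > 0 else 0.0
        right = x[i + 1] if i < n - 1 else 0.0
        y.append(2.0 * x[i] - left - right)
    return y


def local_dot(u, v, start, stop):
    """Partial dot product over one subdomain."""
    return sum(u[i] * v[i] for i in range(start, stop))


def decomposed_matvec(x, nparts):
    """Assemble y = A x from independent subdomain products."""
    y = []
    for start, stop in split_range(len(x), nparts):
        y.extend(local_matvec(x, start, stop))
    return y


def decomposed_dot(u, v, nparts):
    """Global dot product as a sum of subdomain partial sums (a reduction)."""
    return sum(local_dot(u, v, s, e) for s, e in split_range(len(u), nparts))


if __name__ == "__main__":
    x = [1.0, 2.0, 3.0, 4.0, 5.0]
    print(decomposed_matvec(x, 3))
    print(decomposed_dot(x, x, 2))
```

In a message-passing setting the final sum in decomposed_dot corresponds to a global reduction, while the neighbour values read by local_matvec correspond to the data exchanged between adjacent subdomains.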


MPI Communications
Some examples illustrating the use of MPI in our software for the subdomain communications follow.  Here we refer to the inner region of a subdomain as the picture and the interface elements as being part of the frame.
  1. Post asynchronous sends of on-workstation inner frame data to the adjacent (east, west, north, and south) workstations.  The conn(4) array holds the MPI id numbers of the adjacent processors:
    call MPI_ISEND(eastbuffer_snd, ny, MPI_DOUBLE_PRECISION, conn(east), 1, MPI_COMM_WORLD, requests1, ierr)


  2. Post asynchronous receives for off-workstation outer frame data from adjacent workstations:
    call MPI_IRECV(eastbuffer_rcv, ny, MPI_DOUBLE_PRECISION, conn(east), 2, MPI_COMM_WORLD, requestr1, ierr)


  3. Compute matrix-vector multiply on the picture.


  4. Check whether all information has been received; if not, wait:
    call MPI_WAIT(requestr1, stat, ierr)
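
The ordering of steps 1 to 4 can be illustrated without MPI by the serial Python sketch below. It is a toy under stated assumptions, not the lab's code: the 5-point stencil, the single halo layer standing in for the outer frame, and the fill_halo callback standing in for the completed receives are all hypothetical choices. The point it shows is that the picture can be computed while messages are still in flight, and only the inner frame has to wait for neighbour data.

```python
def stencil(u, i, j):
    """5-point Laplacian at (i, j) of a grid stored with one halo layer."""
    return (4.0 * u[i][j] - u[i - 1][j] - u[i + 1][j]
            - u[i][j - 1] - u[i][j + 1])


def picture_points(ny, nx):
    """Owned points (indices 1..ny, 1..nx) whose stencil reads no halo value."""
    return [(i, j) for i in range(2, ny) for j in range(2, nx)]


def frame_points(ny, nx):
    """Owned points on the subdomain edge; their stencil reads the halo."""
    return [(i, j) for i in range(1, ny + 1) for j in range(1, nx + 1)
            if i in (1, ny) or j in (1, nx)]


def fill_halo(u, value=1.0):
    """Stand-in for the completed receives: overwrite the outer frame."""
    ny, nx = len(u) - 2, len(u[0]) - 2
    for j in range(nx + 2):
        u[0][j] = value
        u[ny + 1][j] = value
    for i in range(ny + 2):
        u[i][0] = value
        u[i][nx + 1] = value


def overlapped_matvec(u, ny, nx, receive_halo):
    """Apply the stencil to every owned point, picture first, frame last."""
    y = {}
    # Step 3: the picture needs only local data, so it can be computed
    # while the sends and receives posted in steps 1 and 2 are pending.
    for i, j in picture_points(ny, nx):
        y[(i, j)] = stencil(u, i, j)
    # Step 4: wait for neighbour data (here, a plain function call).
    receive_halo(u)
    # The inner frame is computed only after the halo is up to date.
    for i, j in frame_points(ny, nx):
        y[(i, j)] = stencil(u, i, j)
    return y


if __name__ == "__main__":
    ny = nx = 3
    # Owned values are 1.0; the halo starts out stale (0.0).
    u = [[1.0 if 1 <= i <= ny and 1 <= j <= nx else 0.0
          for j in range(nx + 2)] for i in range(ny + 2)]
    print(overlapped_matvec(u, ny, nx, fill_halo))
```

With a constant field the stencil result is zero everywhere once the halo holds the neighbours' values, which makes it easy to see that the frame points would be wrong if they were computed before the wait.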


Applications
  1. Some of the application studies are directed to the solution of Rayleigh-Benard-Marangoni problems for coupled flow and heat transfer.


  2. Generic aspects of pattern formation are studied on the basis of an abstract reaction-diffusion PDE system, the Brusselator model.  When the system is perturbed with a time-periodic forcing term, a spatial reorganization is induced, leading to various patterns.  Depending on the forcing frequency and amplitude, locked regimes, i.e., standing-wave patterns in the form of labyrinths and phase fronts, are observed.  The aim is to understand the mechanism of selection and the stability of these patterns through a systematic numerical investigation of the parameter space.  The interest is focused on subjects such as phase locking in multi-arm spirals, pi/4 front instabilities, and Nonequilibrium Ising-Bloch (NIB) bifurcations.
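
For readers unfamiliar with the model, the sketch below evaluates the reaction terms of the standard, unforced Brusselator in its usual textbook form, f = a - (b + 1)u + u^2 v and g = bu - u^2 v. It is an assumption that this is the form underlying the study; the diffusion terms and the time-periodic forcing described above are deliberately left out because their exact form is not given here, and the parameter names a and b and the explicit Euler step are illustrative only.

```python
def brusselator_reaction(u, v, a, b):
    """Reaction terms (f, g) of the standard, unforced Brusselator."""
    f = a - (b + 1.0) * u + u * u * v
    g = b * u - u * u * v
    return f, g


def euler_step(u, v, a, b, dt):
    """One explicit Euler step of the reaction terms only (no diffusion)."""
    f, g = brusselator_reaction(u, v, a, b)
    return u + dt * f, v + dt * g


if __name__ == "__main__":
    a, b = 2.0, 3.0
    # The homogeneous steady state of the unforced model is u = a, v = b / a.
    print(brusselator_reaction(a, b / a, a, b))
    # A small perturbation away from the steady state evolves in time.
    print(euler_step(a + 0.1, b / a, a, b, dt=0.01))
```

Patterns arise when this homogeneous steady state loses stability once diffusion (and, in the study described above, forcing) is added; the sketch only confirms that the state u = a, v = b/a is a fixed point of the reaction terms.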





Footnotes:
  1. The name bbrox is a shortening of "Beeblebrox", the last name of the most charismatic (and the only two-headed) character in the Hitchhiker's Guide to the Galaxy series of books by Douglas Adams.
  2. The name zaphod is the first name of "Zaphod Beeblebrox", the same character mentioned in footnote 1, above.


References:
  1. Berner, A. and G. F. Carey, Parallel Workstation Clusters and MPI for Sparse Systems in Computational Science, in Proceedings of Parallel CFD '97, Manchester, England, 1997.


  2. Bova, S. W. and G. F. Carey, An Entropy Variable Formulation and Applications for the Two-Dimensional Shallow Water Equations, IJNMF, Vol 23, 29-46, 1996.


  3. Carey, G. F., Parallel Supercomputing, Wiley, U. K., 1989.


  4. Carey, G. F., C. Harle, R. McLay and S. Swift, MPP Solution of Rayleigh-Benard-Marangoni Flows, Proceedings of Supercomputing '97, San Jose, CA, 1997.


  5. Gropp, W., E. Lusk and A. Skjellum, Using MPI:  Portable Parallel Programming with the Message Passing Interface, MIT Press, Cambridge, MA, 1994.


  6. Karypis, G. and V. Kumar, METIS:  Unstructured Graph Partitioning and Sparse Matrix Ordering System, Tech. Rept., Dept. Computer Science, U. Minnesota, 1995.


  7. Le Tallec, P., Domain Decomposition Methods in Computational Mechanics, Computational Mechanics Advances, North Holland, pp 121-220, 1994.


  8. Warren, M. S., J. K. Salmon, D. J. Becker, M. P. Goda, T. Sterling and G. S. Winckelmans, Pentium Pro Inside:  1. A Treecode at 430 Gflops on ASCI Red; 2. Price/Performance on Loki and Hyglac, Proceedings of Supercomputing '97, IEEE Comp. Soc., Los Alamitos, 1997.



Site Created By: CFDLab Web Team
Last modified: December 17 2004 10:41:54.