VirtualPlanetBuilder on a Single System Image (SSI) Cluster Example

This page briefly describes the setup and procedure used to generate a large terrain database on a cluster of 8 nodes. This is definitely not the only way to use VirtualPlanetBuilder, but it serves as a specific example.

I used OpenSceneGraph (rev 8413) and VirtualPlanetBuilder (rev 914) checked out from svn around June 2008.

Prerequisites

This section describes some aspects of the cluster setup that are related to running VirtualPlanetBuilder.

Passwordless SSH

VirtualPlanetBuilder by default uses ssh to execute osgdem commands on the compute nodes. You should therefore set up ssh so that the user who will run the vpbmaster command can log in to the other machines without a password. I used public/private key pairs to do this. Consult Google for detailed instructions.
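
A minimal sketch of the key setup, assuming OpenSSH and the same username on every node (the node names below are the ones from my cluster):

# Generate a key pair on the master node (accept the defaults; leave the passphrase empty)
ssh-keygen -t rsa

# Copy the public key to each compute node (repeat for node02 .. node08)
ssh-copy-id jpd@node01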

If everything is set up correctly, you should be able to log in from the node where you will run vpbmaster to any other node without being prompted for a password, e.g.:

jpd@rootnode:~$ ssh node01
Linux node01 2.6.24 #1 SMP Wed Jul 9 16:58:57 SAST 2008 i686
Last login: Sat Jul 19 11:52:59 2008 from rootnode
jpd@node01:~$

X Server on Nodes

When osgdem executes on the compute nodes, it tries to open a window on display :0.0. An X server must therefore be running on each node (the first server on most Linux distros defaults to :0.0), and a user must be logged in to that X session so the display accepts connections. All the nodes in this example had an NVidia graphics card with the 169.12 driver installed.
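
If osgdem cannot connect to the display even though X is running, one blunt workaround (only acceptable if you trust all local users on the nodes) is to relax the X access control for local connections from the console session on each node:

# Run once in the console X session on each node; allows local (non-network) clients to connect
xhost +local: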

To test if everything is working correctly, do something like the following:

jpd@rootnode:~$ ssh node01 "export DISPLAY=:0.0 ; xeyes"

The xeyes application should run and display its window on node01's X server.
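
To check all eight nodes in one go, I would use something along these lines (assuming glxinfo from the mesa-utils package is installed on the nodes):

# Confirm that each node can open a GL context on its local display
for n in node01 node02 node03 node04 node05 node06 node07 node08; do
    echo "=== $n ==="
    ssh $n "export DISPLAY=:0.0 ; glxinfo | grep 'direct rendering'"
done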

Data Directories

All nodes that will participate in the terrain building need read access to the input data as well as write access to a directory for the output files. In this example I will assume that a directory called "/glusterfs" is visible from all machines and is writable.

For example, the following two commands should display the same listing:

jpd@rootnode:~$ ls /glusterfs
jpd@rootnode:~$ ssh node05 "ls /glusterfs"
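
To also confirm write access from every node, a quick test along these lines works (the write_test file name is just an arbitrary choice):

# Each node creates and then removes a file on the shared filesystem
for n in node01 node02 node03 node04 node05 node06 node07 node08; do
    ssh $n "touch /glusterfs/write_test_$n && rm /glusterfs/write_test_$n && echo $n ok"
done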

VPB Setup

Data Reprojection and Translation

VirtualPlanetBuilder seems happiest when all input data uses the same projection and datum. It is also a good idea to reproject the input data so that it can be used in future VirtualPlanetBuilder runs (yes, you will run it more than once ;).

For this example I used latlong and WGS84 for all the files and also converted everything to GeoTiff format. The following shows how I converted the stunning Blue Marble Next Generation (BMNG) data from NASA. Since the BMNG JPEG tiles carry no georeferencing, gdal_translate assigns both the projection and the corner coordinates of each 90x90 degree tile.

gdal_translate -of GTiff -a_srs "+proj=latlong +datum=WGS84" -a_ullr -180 90 -90   0 world.topo.bathy.200407.3x21600x21600.A1.jpg A1.tif
gdal_translate -of GTiff -a_srs "+proj=latlong +datum=WGS84" -a_ullr -90  90   0   0 world.topo.bathy.200407.3x21600x21600.B1.jpg B1.tif
gdal_translate -of GTiff -a_srs "+proj=latlong +datum=WGS84" -a_ullr   0  90  90   0 world.topo.bathy.200407.3x21600x21600.C1.jpg C1.tif
gdal_translate -of GTiff -a_srs "+proj=latlong +datum=WGS84" -a_ullr  90  90 180   0 world.topo.bathy.200407.3x21600x21600.D1.jpg D1.tif
gdal_translate -of GTiff -a_srs "+proj=latlong +datum=WGS84" -a_ullr -180  0 -90 -90 world.topo.bathy.200407.3x21600x21600.A2.jpg A2.tif
gdal_translate -of GTiff -a_srs "+proj=latlong +datum=WGS84" -a_ullr  -90  0   0 -90 world.topo.bathy.200407.3x21600x21600.B2.jpg B2.tif
gdal_translate -of GTiff -a_srs "+proj=latlong +datum=WGS84" -a_ullr    0  0  90 -90 world.topo.bathy.200407.3x21600x21600.C2.jpg C2.tif
gdal_translate -of GTiff -a_srs "+proj=latlong +datum=WGS84" -a_ullr   90  0 180 -90 world.topo.bathy.200407.3x21600x21600.D2.jpg D2.tif

For most data that already includes projection information, the following command worked:

gdalwarp -t_srs "+proj=latlong +datum=WGS84" -r bilinear $name ../reprojected/$newname

I usually place the reprojected data into separate directories, grouped by layer. For example, the converted BMNG tif files would be placed in a directory called "/glusterfs/BMNG".
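
For larger batches, a small wrapper script saves typing. The sketch below is a hypothetical helper (the script name, arguments and paths are my own choices, not part of VirtualPlanetBuilder):

#!/bin/sh
# reproject.sh -- reproject one input file to latlong/WGS84 GeoTiff
# Usage: reproject.sh <input file> <output directory>
name=$1
outdir=$2
newname=$(basename "$name" | sed 's/\.[^.]*$/.tif/')
gdalwarp -t_srs "+proj=latlong +datum=WGS84" -r bilinear -of GTiff "$name" "$outdir/$newname"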

Machine Pool File

VirtualPlanetBuilder uses an input file to describe the list of machines that will be used during the build. Below is the simple one I used, called machinepool.txt.

Machine {
	hostname node01
	processes 1
}
Machine {
	hostname node02
	processes 1
}
Machine {
	hostname node03
	processes 1
}
Machine {
	hostname node04
	processes 1
}
Machine {
	hostname node05
	processes 1
}
Machine {
	hostname node06
	processes 1
}
Machine {
	hostname node07
	processes 1
}
Machine {
	hostname node08
	processes 1
}

The hostnames specified in the file should be reachable using ssh. If you have multiple cores per machine and enough memory, you can try increasing the processes value.
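
For example, on a dual-core node with sufficient RAM the entry could become (this is just an illustration; I ran with one process per node):

Machine {
    hostname node01
    processes 2
}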

Command Line

Below is the command line that I used (the // comments are only annotations; when executing, remove them and put everything on one line):

vpbmaster --machines machinepool.txt                // Use the machine pool file described earlier
          --geocentric                              // I want a geocentric database
          -d /glusterfs/DTED                        // My elevation data can be found here (the directory contains only tifs)
          --layer 0 -t /glusterfs/BMNG              // On texture layer 0 I want the BMNG data
          --layer 0 -t /glusterfs/SPOT_reproj       // as well as some nice satellite photos
          --layer 1 -t /glusterfs/Maps_reproj       // On Layer 1 I want some 1:50000 scale maps
          --terrain --compressed                    // Use the new terrain format and compress the texture data
          -o spot_maps/terrain.ive                  // Put the output files under this directory
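
The same command with the comments stripped, using backslash continuations so it can be pasted directly into a shell:

vpbmaster --machines machinepool.txt --geocentric \
          -d /glusterfs/DTED \
          --layer 0 -t /glusterfs/BMNG \
          --layer 0 -t /glusterfs/SPOT_reproj \
          --layer 1 -t /glusterfs/Maps_reproj \
          --terrain --compressed \
          -o spot_maps/terrain.ive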

The command was executed from the master node in a directory that is visible to all nodes:

jpd@rootnode:~$ cd /glusterfs/generate

That's it. The vpbmaster run created 473 tasks and, after roughly 50 hours of processing, produced 1.5 million files with a total size of 487GB.

Below is a snippet of the output:

End of run: tasksPending=0 taskCompleted=473 taskRunning=0 tasksFailed=0
MachinePool::reportTimingStats()
    Machine : node01
        Task::type=''   minTime=616.267390      maxTime=5068.351865     averageTime=2751.044383 totalComputeTime=181568.929248  numTasks=66
    Machine : node02
        Task::type=''   minTime=761.555931      maxTime=25294.378808    averageTime=4027.837882 totalComputeTime=181252.704694  numTasks=45
    Machine : node03
        Task::type=''   minTime=702.828047      maxTime=4933.171209     averageTime=2936.182243 totalComputeTime=179107.116819  numTasks=61
    Machine : node04
        Task::type=''   minTime=605.509313      maxTime=5256.497770     averageTime=2794.609616 totalComputeTime=178855.015440  numTasks=64
    Machine : node05
        Task::type=''   minTime=704.703147      maxTime=5562.438701     averageTime=3005.987053 totalComputeTime=180359.223195  numTasks=60
    Machine : node06
        Task::type=''   minTime=658.961472      maxTime=10080.521703    averageTime=3329.892155 totalComputeTime=179814.176365  numTasks=54
    Machine : node07
        Task::type=''   minTime=702.050721      maxTime=5662.709409     averageTime=3052.297707 totalComputeTime=180085.564685  numTasks=59
    Machine : node08
        Task::type=''   minTime=703.251535      maxTime=5755.713908     averageTime=2803.070356 totalComputeTime=179396.502756  numTasks=64
Finished run successfully.
Total elapsed time = 181923.496816

Cluster Overview

The cluster used in this example consists of 9 machines connected with Gigabit Ethernet. The master (root) node contains all applications on its filesystem. The 8 other nodes are identical machines that boot over the network and use NFS for their root filesystems. All nodes have NVidia 6 series PCI-Express graphics cards.

All nodes run Debian Lenny. By virtue of using the same NFS mount, all client nodes have identical software installed.

The 8 client nodes have a 500GB disk each. The disks are pooled into one large 4TB filesystem using the excellent GlusterFS. By mounting the GlusterFS filesystem from all nodes, they all see the exact same data.

Misc Tips

 Use GNU screen to start the vpbmaster command. You can then monitor the progress over the network and save the command line output to review later.
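
A minimal example (the session name is arbitrary): start a logged screen session, run the vpbmaster command inside it, detach with Ctrl-a d and reattach later with screen -r vpbbuild.

# -L writes all terminal output to screenlog.0 in the current directory
screen -L -S vpbbuild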

 Sun Grid Engine makes reprojecting hundreds of files in parallel easier. You just submit the whole batch at once using find -exec and a simple script.
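
For example, combined with the reproject.sh helper sketched earlier (again with hypothetical paths, and assuming SGE's qsub is available on every node):

# Submit one reprojection job per input file to the grid engine
find /glusterfs/SPOT_raw -name "*.tif" -exec qsub -b y /glusterfs/scripts/reproject.sh {} /glusterfs/SPOT_reproj \;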

 Since ssh to all nodes is set up already, install dsh (distributed shell) on the root node. Running a single command on all nodes then becomes as easy as:

dsh -a "ls /glusterfs"

Using GNU/Linux and OpenSceneGraph is fun.

Enjoy.