Virtual Planet Builder
VirtualPlanetBuilder on a Single System Image (SSI) Cluster Example
- Category: Virtual Planet Builder
- Published: 14 February 2013
- Written by J.P. Delport
This page briefly describes the setup and procedure used to generate a large terrain database on a cluster of 8 nodes. This is definitely not the only way to use VirtualPlanetBuilder, but it serves as a specific example.
I used OpenSceneGraph (rev 8413) and VirtualPlanetBuilder (rev 914) checked out from svn around June 2008.
Prerequisites
This section describes some aspects of the cluster setup that are related to running VirtualPlanetBuilder.
Passwordless SSH
VirtualPlanetBuilder by default uses ssh to execute osgdem commands on the compute nodes. You should therefore set up ssh so that the user who will run the vpbmaster command can log in to the other machines without a password. I used public/private key pairs to do this. Consult Google for detailed instructions.
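As a minimal sketch, assuming OpenSSH and the same username on every node, the key setup boils down to:

# Generate a key pair; accept the defaults and leave the passphrase empty for fully passwordless logins.
ssh-keygen -t rsa
# Copy the public key to every compute node (you will be asked for each node's password once).
for n in node0{1..8}; do ssh-copy-id "$n"; done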
If everything is set up correctly, you should be able to log in from the node where you will run vpbmaster to any other node without a password, e.g.:
jpd@rootnode:~$ ssh node01
Linux node01 2.6.24 #1 SMP Wed Jul 9 16:58:57 SAST 2008 i686
Last login: Sat Jul 19 11:52:59 2008 from rootnode
jpd@node01:~$
X Server on Nodes
When osgdem executes on the compute nodes, it tries to open a window on display :0.0. An X server must therefore be running on each node (the first X server on most Linux distros defaults to :0.0), and it is important that a user is logged in on it. All the nodes in this example had an NVidia graphics card with the 169.12 driver installed.
To test if everything is working correctly, do something like the following:
jpd@rootnode:~$ ssh node01 "export DISPLAY=:0.0 ; xeyes"
The xeyes application should run and display its window on node01's X server.
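If you also want to confirm that hardware accelerated OpenGL is working on a node, a quick check (assuming glxinfo from the mesa-utils package is installed on the nodes) is:

jpd@rootnode:~$ ssh node01 "export DISPLAY=:0.0 ; glxinfo | grep -i 'direct rendering'"

It should report "direct rendering: Yes".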
Data Directories
All nodes that will participate in the terrain building need read access to the input data as well as write access to a directory in which to store the output files. In this example I will assume that a directory called "/glusterfs" is visible from all machines and is writable.
For example, the following two commands should display the same listing:
jpd@rootnode:~$ ls /glusterfs
jpd@rootnode:~$ ssh node05 "ls /glusterfs"
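To also check that each node can write there, a quick loop like this (using the node names from the machine pool file shown later) does the job:

jpd@rootnode:~$ for n in node0{1..8}; do ssh $n "touch /glusterfs/write_test_$n && echo $n ok"; done
jpd@rootnode:~$ rm /glusterfs/write_test_*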
VPB Setup
Data Reprojection and Translation
VirtualPlanetBuilder seems happiest when all input data uses the same projection and datum. It is also a good idea to reproject the input data up front so that the reprojected copies can be reused in future VirtualPlanetBuilder runs (yes, you will run it more than once ;).
For this example I used latlong and WGS84 for all the files and also converted everything to GeoTiff format. The following shows how I converted the stunning Blue Marble Next Generation (BMNG) data I downloaded from here.
gdal_translate -of GTiff -a_srs "+proj=latlong +datum=WGS84" -a_ullr -180 90 -90 0 world.topo.bathy.200407.3x21600x21600.A1.jpg A1.tif
gdal_translate -of GTiff -a_srs "+proj=latlong +datum=WGS84" -a_ullr -90 90 0 0 world.topo.bathy.200407.3x21600x21600.B1.jpg B1.tif
gdal_translate -of GTiff -a_srs "+proj=latlong +datum=WGS84" -a_ullr 0 90 90 0 world.topo.bathy.200407.3x21600x21600.C1.jpg C1.tif
gdal_translate -of GTiff -a_srs "+proj=latlong +datum=WGS84" -a_ullr 90 90 180 0 world.topo.bathy.200407.3x21600x21600.D1.jpg D1.tif
gdal_translate -of GTiff -a_srs "+proj=latlong +datum=WGS84" -a_ullr -180 0 -90 -90 world.topo.bathy.200407.3x21600x21600.A2.jpg A2.tif
gdal_translate -of GTiff -a_srs "+proj=latlong +datum=WGS84" -a_ullr -90 0 0 -90 world.topo.bathy.200407.3x21600x21600.B2.jpg B2.tif
gdal_translate -of GTiff -a_srs "+proj=latlong +datum=WGS84" -a_ullr 0 0 90 -90 world.topo.bathy.200407.3x21600x21600.C2.jpg C2.tif
gdal_translate -of GTiff -a_srs "+proj=latlong +datum=WGS84" -a_ullr 90 0 180 -90 world.topo.bathy.200407.3x21600x21600.D2.jpg D2.tif
For most data that already includes projection information, the following command seemed to work:
gdalwarp -t_srs "+proj=latlong +datum=WGS84" -r bilinear $name ../reprojected/$newname
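To run that over a whole directory of input files, I would wrap it in a small helper script. The sketch below is illustrative only: the script name reproject.sh, the output directory ../reprojected and the loop are my own conventions, not anything VirtualPlanetBuilder requires.

#!/bin/bash
# reproject.sh - reproject a single input file to latlong/WGS84 GeoTiff (illustrative helper).
name="$1"
# Keep the base file name, swap the extension for .tif.
newname="$(basename "${name%.*}").tif"
gdalwarp -t_srs "+proj=latlong +datum=WGS84" -r bilinear "$name" ../reprojected/"$newname"

You can then run it serially with something like: for f in *.img; do ./reproject.sh "$f"; done, or in parallel via the grid engine tip further down.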
I usually place all reprojected data into separate directories that group it by layer. For example, the converted BMNG tif files would be placed in a directory called "/glusterfs/BMNG".
Machine Pool File
VirtualPlanetBuilder uses an input file to describe the list of machines that will be used during the build. Below is the simple one I used, called machinepool.txt.
Machine {
    hostname node01
    processes 1
}
Machine {
    hostname node02
    processes 1
}
Machine {
    hostname node03
    processes 1
}
Machine {
    hostname node04
    processes 1
}
Machine {
    hostname node05
    processes 1
}
Machine {
    hostname node06
    processes 1
}
Machine {
    hostname node07
    processes 1
}
Machine {
    hostname node08
    processes 1
}
The names specified in the file should be hostnames that are reachable using ssh. If you have multiple cores per machine and enough memory, you could try increasing the processes value.
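For example, on a dual-core node with sufficient memory the entry might look like this (the value 2 is just an illustration):

Machine {
    hostname node01
    processes 2
}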
Command Line
Below is the command line that I used (when executing it, everything should be on one line, without the // comments):
vpbmaster --machines machinepool.txt     // Use the machine pool file described earlier
  --geocentric                           // I want a geocentric database
  -d /glusterfs/DTED                     // My elevation data can be found here (the directory contains only tifs)
  --layer 0 -t /glusterfs/BMNG           // On texture layer 0 I want the BMNG data
  --layer 0 -t /glusterfs/SPOT_reproj    // as well as some nice satellite photos
  --layer 1 -t /glusterfs/Maps_reproj    // On Layer 1 I want some 1:50000 scale maps
  --terrain --compressed                 // Use the new terrain format and compress the texture data
  -o spot_maps/terrain.ive               // Put the output files under this directory
The command was executed from the master node in a directory that is visible to all nodes:
jpd@rootnode:~$ cd /glusterfs/generate
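For reference, this is the same command with the comments stripped and everything on one line:

jpd@rootnode:/glusterfs/generate$ vpbmaster --machines machinepool.txt --geocentric -d /glusterfs/DTED --layer 0 -t /glusterfs/BMNG --layer 0 -t /glusterfs/SPOT_reproj --layer 1 -t /glusterfs/Maps_reproj --terrain --compressed -o spot_maps/terrain.ive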
That's it. The vpbmaster command created 473 tasks and, after roughly 50 hours of processing, produced 1.5 million files with a total size of 487GB.
Below is a snippet of the output:
End of run: tasksPending=0 taskCompleted=473 taskRunning=0 tasksFailed=0
MachinePool::reportTimingStats()
Machine : node01
  Task::type='' minTime=616.267390 maxTime=5068.351865 averageTime=2751.044383 totalComputeTime=181568.929248 numTasks=66
Machine : node02
  Task::type='' minTime=761.555931 maxTime=25294.378808 averageTime=4027.837882 totalComputeTime=181252.704694 numTasks=45
Machine : node03
  Task::type='' minTime=702.828047 maxTime=4933.171209 averageTime=2936.182243 totalComputeTime=179107.116819 numTasks=61
Machine : node04
  Task::type='' minTime=605.509313 maxTime=5256.497770 averageTime=2794.609616 totalComputeTime=178855.015440 numTasks=64
Machine : node05
  Task::type='' minTime=704.703147 maxTime=5562.438701 averageTime=3005.987053 totalComputeTime=180359.223195 numTasks=60
Machine : node06
  Task::type='' minTime=658.961472 maxTime=10080.521703 averageTime=3329.892155 totalComputeTime=179814.176365 numTasks=54
Machine : node07
  Task::type='' minTime=702.050721 maxTime=5662.709409 averageTime=3052.297707 totalComputeTime=180085.564685 numTasks=59
Machine : node08
  Task::type='' minTime=703.251535 maxTime=5755.713908 averageTime=2803.070356 totalComputeTime=179396.502756 numTasks=64
Finished run successfully. Total elapsed time = 181923.496816
Cluster Overview
The cluster used in this example consists of 9 machines connected using Gigabit Ethernet. The master (root) node contains all applications on its filesystem. The 8 other nodes are identical machines that boot over the network using NFS as their root filesystems. All nodes have NVidia 6 series PCI-Express graphics cards.
All nodes run Debian Lenny. By virtue of using the same NFS mount, all client nodes have identical software installed.
The 8 client nodes have a 500GB disk each. The disks are pooled into one large 4TB filesystem using the excellent GlusterFS. By mounting the GlusterFS filesystem from all nodes, they all see the exact same data.
Misc Tips
Use GNU screen to start the vpbmaster command. You can then monitor the progress over the network and save the command line output to review later.
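A minimal way of doing that, assuming GNU screen is installed and logging to a file name of your choosing (vpbmaster.log here is just an example; the ... stands for the rest of the vpbmaster options shown earlier):

jpd@rootnode:~$ screen -S vpb
jpd@rootnode:~$ cd /glusterfs/generate
jpd@rootnode:/glusterfs/generate$ vpbmaster --machines machinepool.txt ... 2>&1 | tee vpbmaster.log

Detach with Ctrl-a d and reattach later from any ssh session with screen -r vpb.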
Sun Grid Engine makes reprojecting hundreds of files in parallel easier. You just submit the whole batch at once using find -exec and a simple script.
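As a sketch, assuming qsub is available on the root node and reusing the illustrative reproject.sh from the reprojection section (the /glusterfs/raw path and the *.img pattern are placeholders), submitting one job per file could look like this:

jpd@rootnode:/glusterfs/generate$ find /glusterfs/raw -name "*.img" -exec qsub -cwd ./reproject.sh {} \;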
Since ssh to all nodes is already set up, install dsh (distributed shell) on the root node. Running a single command on all nodes then becomes as easy as:
dsh -a "ls /glusterfs"
Using GNU/Linux and OpenSceneGraph is fun.
Enjoy.