Parallel programming in MC#: how to build a high-performance POVRAY application for one day…

Technical Report
Program Systems Institute of Pereslavl-Zalessky, July 9, 2005

Table of contents

Parallel programming in MC#: how to build a high-performance POVRAY application for one day
1. Introduction
2. The basics of MC# language
3. Povray Ray Tracer (Persistence of Vision Ray Tracer)
4. Implementation of “simplest” distributed Povray in MC# language
4.1. Movable methods as virtual processors
4.2. Channels as a tool for interaction between Main method and movable methods (virtual processors)
5. Comparison of parallel Povray implementation in MC# language with MPI version
6. Native implementation of Povray in MC#
7. Conclusions
Supplements
References

1. Introduction

This report describes the “simplest” implementation of a distributed version of the Povray package (Persistence of Vision Ray Tracer, http://www.povray.org/) in the parallel programming language MC# (http://u.pereslavl.ru/~vadim/MCSharp). We call our approach the “simplest” because it requires no changes to the original code of the Povray tracer, yet it delivers an acceptable level of performance and can be used with any version of Povray.

The main purpose of the MC# project is to estimate whether complex computational programs (which are usually implemented in C or FORTRAN together with a communication library such as MPI) can be implemented in byte-code languages like C# within managed runtime environments. In this project we used the Mono platform, a free implementation of the .Net platform for Unix-like systems (http://www.mono-project.com/).

2. The basics of MC# language

The MC# language is a universal high-level programming language based on C# and designed for use on cluster architectures. In practice this means that parallel programs in MC# can easily reuse the vast collection of existing .Net libraries, experience, patterns and best practices accumulated by Microsoft and other vendors in recent years. A specific feature of the MC# language is the transfer of the
asynchronous parallel programming model of Cω (formerly known as Polyphonic C#) to distributed computation. Asynchronous methods of Cω can, in the MC# language, be scheduled for execution on a remote machine (a node of the cluster or a computer in a GRID network); in MC# they are known as movable methods. Movable methods executing on different machines interact through channels, a special syntactic class in the MC# language. For synchronization, MC# uses chords, as in Cω. Although channels are essentially one-way entities (as in the join-calculus; Fournet C., Gonthier G., 1998), so-called “bi-directional channels” were introduced in MC# through the special BDChannel class. With this kind of channel you can both send and receive values in the usual way.

More information about the MC# language, including articles and examples of implemented applications, is available on the MC# project’s site: http://u.pereslavl.ru/~vadim/MCSharp/

3. Povray Ray Tracer (Persistence of Vision Ray Tracer)

The ray tracer POVRAY is a high-quality, totally free tool for creating stunning three-dimensional graphics. It is developed by Persistence of Vision Raytracer Pty. Ltd., and new versions of the software appear every few months. Unfortunately, the latest available parallel version of Povray for MPI is 3.50c, while the current stable non-parallel version is 3.6. Moreover, the parallel versions of Povray are not supported by the owners; they are made by volunteers, who create patches to the original source code that are tightly tied to a particular version. So, first of all, we had to find our own way to create a parallel program that could work with any version of this ray tracer. Ideally, this requires that no changes be made to the original source code from Persistence of Vision Raytracer Pty. Ltd. when moving from one version to another.

It is worth noting that this package supports many sophisticated features and much built-in functionality, such as radiosity, interior, media and atmospheric effects.
Porting all of this code to C# would require many months of coding. In this report we describe the “simplest” implementation of a parallel Povray algorithm that can be used with any version of the original Povray package.

4. Implementation of “simplest” distributed Povray in MC# language

The original Povray package is implemented in C++, and from the beginning it was not designed around any communication library. Later, contributors created patches that enabled Povray to run on top of the MPI communication library on different platforms. As mentioned above, these patches are tightly tied to particular versions of Povray, and migrating to a new version means creating new patches, which is a nontrivial task.

Fortunately, all versions of Povray have the options “+SC” and “+EC”, which specify the start and end columns of the part of the image to render. Knowing the width and height of the whole image and the number of processors in the cluster, we can easily calculate the number of fragments (and the start and end columns of each fragment) and render each fragment on its own node. When the fragments are ready, they are stored in temporary files. Then we join them on the frontend of the cluster to obtain the resulting image. That is the “simplest” algorithm we could think of.

4.1. Movable methods as virtual processors

One of the key features of the MC# language is the notion of movable methods. Such methods can be scheduled for execution on any node of a cluster, GRID network, etc., depending on the current configuration and the runtime system. MC# is fully abstracted from the computational network actually used. In fact, the best practice when writing MC# programs is that the programmer should not even need to know the configuration of the cluster. In our case the declaration of such a movable method looks like this:
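As an illustration only, in Python rather than MC#, the work done by such a method can be modelled as below. The names `Frame`, `build_command` and `render_fragment` are hypothetical, a `queue.Queue` stands in for the MC# channel, and the `run` parameter is an assumption added so the sketch can be exercised without Povray installed. The `+I`, `+O`, `+W`, `+H`, `+SC` and `+EC` switches are standard Povray command-line options.

```python
import queue
import subprocess
from dataclasses import dataclass


@dataclass
class Frame:
    """One rendered fragment (hypothetical stand-in for the report's Frame class)."""
    filename: str
    start_col: int
    end_col: int


def build_command(povray_path, scene, width, height, start_col, end_col, out_file):
    """Build the Povray command line for one vertical strip of the image."""
    return [
        povray_path,
        f"+I{scene}",        # input scene file
        f"+O{out_file}",     # temporary output file for this fragment
        f"+W{width}",        # full image width
        f"+H{height}",       # full image height
        f"+SC{start_col}",   # first column to render
        f"+EC{end_col}",     # last column to render
    ]


def render_fragment(povray_path, scene, width, height, start_col, end_col,
                    results, run=subprocess.run):
    """Render one fragment, wait for it to finish, then report it on `results`."""
    out_file = f"fragment_{start_col}_{end_col}.out"   # assumed naming scheme
    cmd = build_command(povray_path, scene, width, height,
                        start_col, end_col, out_file)
    run(cmd, check=True)                               # blocks until the renderer exits
    results.put(Frame(out_file, start_col, end_col))   # hand the result to the main program
```

The worker never touches Povray's source code; it only drives the unmodified executable, which is what lets the approach work with any Povray version.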
In this fragment we start a new process that renders the fragment and wait until it completes. The movable method runs on one of the nodes of the cluster; the Runtime system decides which node to use depending on the current workload of the cluster. Then we send the resulting frame to the main node. The movable method takes as arguments the location of the original Povray executable, the inputs to be passed to the original Povray, the start and end columns, and a special channel used to send the results to the Main program. We discuss channels in the next section; for now, note that this channel accepts only objects of type Frame, our own class whose fields describe a fragment (file name, start column and end column). Of course, you can pass any .Net Framework objects from one node to another, even objects of your own types; the MC# Runtime does all the work needed to serialize, deliver and deserialize such complex objects automatically. You can call movable methods in the same manner as conventional methods:
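To illustrate the call site, here is a hedged Python sketch (not MC#) that splits the image into column ranges and starts one worker per range. `split_columns`, `launch` and the `worker` callback are hypothetical names, threads stand in for cluster nodes, and column numbers are assumed to be 1-based and inclusive.

```python
import threading


def split_columns(width, parts):
    """Split columns 1..width into `parts` contiguous, inclusive (start, end) ranges."""
    if parts < 1 or parts > width:
        raise ValueError("parts must be between 1 and width")
    base, extra = divmod(width, parts)
    ranges, start = [], 1
    for i in range(parts):
        size = base + (1 if i < extra else 0)   # spread the remainder over the first ranges
        ranges.append((start, start + size - 1))
        start += size
    return ranges


def launch(worker, width, parts):
    """Start one thread per column range; each start returns immediately."""
    threads = []
    for start_col, end_col in split_columns(width, parts):
        t = threading.Thread(target=worker, args=(start_col, end_col))
        t.start()
        threads.append(t)
    return threads
```

For a 1024-pixel-wide image on four nodes this gives the ranges 1-256, 257-512, 513-768 and 769-1024.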
4.2. Channels as a tool for interaction between Main method and movable methods (virtual processors)

Movable methods by their nature cannot return values to other methods. To return computed values from movable methods we use channels. Typically, channels are declared by chords, as in Cω (see [3] for more information). In our case we declare one chord that accepts Frame objects from the cluster nodes:
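The chord itself is MC# syntax. As a rough analogy in Python, assuming a single channel joined with a single handler, similar behaviour can be sketched with a thread-safe queue. The class `FrameChannel` and its method names are hypothetical: `send` plays the role of the asynchronous channel, and `get` plays the role of the synchronous handler that blocks until a message is available.

```python
import queue


class FrameChannel:
    """Queue-backed analogue of a one-channel chord (illustrative only)."""

    def __init__(self):
        self._pending = queue.Queue()   # messages sent but not yet received

    def send(self, frame):
        """Asynchronous side: enqueue the frame and return at once."""
        self._pending.put(frame)

    def get(self, timeout=None):
        """Synchronous side: block until a frame has been sent, then return it."""
        return self._pending.get(timeout=timeout)
```

The sender never waits for the receiver, which mirrors the one-way nature of channels described above.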
The main loop that accepts Frames from the cluster nodes looks like this:
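A hedged Python sketch of such a loop, again not the original MC# code: it receives exactly one frame per fragment from a queue and orders the frames by start column so that they can be joined from left to right. `collect_frames`, `covers_image` and the tuple layout `(filename, start_col, end_col)` are assumptions made for illustration; reading the temporary files and writing the final image are omitted.

```python
import queue


def collect_frames(results, expected, timeout=None):
    """Receive `expected` frames from `results` and return them ordered by start column."""
    frames = []
    for _ in range(expected):
        # Blocks until some worker has delivered a frame; arrival order is arbitrary.
        frames.append(results.get(timeout=timeout))
    frames.sort(key=lambda frame: frame[1])     # frame = (filename, start_col, end_col)
    return frames


def covers_image(frames, width):
    """Check that the ordered frames tile columns 1..width with no gaps or overlaps."""
    next_col = 1
    for _, start_col, end_col in frames:
        if start_col != next_col or end_col < start_col:
            return False
        next_col = end_col + 1
    return next_col == width + 1
```

Checking the tiling before joining is an extra safeguard added in this sketch; the report does not say whether the original program performs such a check.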
And that is almost all we need for this program (apart from the routine that parses the program parameters; the whole program is given in the supplement). The resulting code is easy to read, and most C++/C# programmers will already be acquainted with most of its syntax, so the language is easy to learn if you are already familiar with C# or C++.

5. Comparison of parallel Povray implementation in MC# language with MPI version

5.1. Test results

We tested the proposed program on the standard file benchmark.pov at a resolution of 1024x768 pixels using Povray 3.50c and 3.6. The correctness of the results was checked by byte-by-byte comparison of the output files. Of course, we did not expect this version to be faster than its MPI analogue (mainly because MC# is a much higher-level language than C and because the Mono platform is still under development and has not yet been properly optimized), but we still obtained comparable results.

Remark. The Povray 3.50c parallel implementation has an important peculiarity. Starting it as

>mpirun -np N povray

in fact runs its rendering clients on N - 1 processors. So, for a correct comparison of the MC# implementation with the original parallel Povray, we compare each MC# run on N processors with an original Povray run on N + 1 processors.

Cluster “SKIF FIRST-BORN M” (16 SMP nodes with Athlon MP 1800+ processors)
Cluster “SKIF K-1000” (98th place in the November 2004 Top500; up to 288 nodes with Opteron processors).

N proc    MC# (3.50c)    LAM (3.50c)
1         2464           2459
2         1471           1245
4         753            642
8         405            340
12        301            239
16        237            189
20        195            159
24        172            139
28        151            123
32        136            114
36        125            105
40        116            98
44        109            92
48        103            89
52        97             85
56        94             81
60        90             78
64        87             76

6. Native implementation of Povray in MC#

Currently we are trying to port a small fragment of the original Povray package to C# (and then to the MC# parallel programming language). This fragment will be able to render scenes described in a subset of the Povray input language. The subset allows one to describe scenes built from basic objects such as spheres, boxes, cones, cylinders, etc., as well as light sources. The purpose of this experiment is to estimate the feasibility of implementing complex computational algorithms in byte-code languages within managed runtime environments, and also the efficiency of their parallel versions coded in MC#. The results of this experiment will be available soon on the MC# site.

7. Conclusions

Nowadays it is not so important how fast your parallel program is: if it is scalable, you can always add processors to speed it up. The more important questions in the business world are how much time and effort it takes to write the parallel program in a particular language, and how comfortable you are using that language. MC# offers the most abstract model of parallel computation available today together with an acceptable level of performance.
Building and configuring Povray, finding the right MPI version of it, fixing bugs in the original MPI version (it was almost impossible to compile MPI Povray under Scali without changes to the source code) and benchmarking took quite qualified personnel more than a week, while we spent less than one day writing the proposed program in MC# and running all the necessary tests for it…

References

[1] MC# Official Site - http://u.pereslavl.ru/~vadim/MCSharp/
[2] Povray Official Site - http://www.povray.org/
[3] Cω Official Site - http://research.microsoft.com/comega/
[4] Mono Official Site - http://www.mono-project.com/
Attachments

Full source code of the program