                                              _                 _ 
                        _ __   __ _ _ __ __ _| | ___   __ _  __| |
                       | '_ \ / _` | '__/ _` | |/ _ \ / _` |/ _` |
                       | |_) | (_| | | | (_| | | (_) | (_| | (_| |
                       | .__/ \__,_|_|  \__,_|_|\___/ \__,_|\__,_|
                       |_|                                        


WHAT FOR?

Paraload is a large-scale load balancer for independent calculation tasks.
It distributes independent calculations over a very large number of
processors, whether they are on the same machine or not.
This tool can significantly speed up any calculation that is
parallelizable by the data.
Paraload is a client/server application that uses TCP/IP connections
to distribute the data and the command to execute in parallel on multiple machines.


REQUIREMENTS:

Paraload only runs on LINUX systems (kernel 2.5.44 or later).
You need the C standard library, a C++ standard library supporting C++11 (at least), GCC, G++ and GNU Make.
AND...

That's all!


INSTALL:

Untar and unzip the tar.gz:
$tar zxvf paraload.tar.gz

Go into the src directory:
$cd src

Compile the sources:
$make

Then you can use it locally:
$./paraload .......

Or copy it to /usr/bin as root.


GENERAL USE:

$paraload --help


By default the configuration is read from paraload.conf.
You can select another file with --conf (-C) [your conf].

paraload.conf:

The best way to understand it is through an example:

Suppose you want to cut the input file at every line beginning with ">" (as for blast, for instance). We call that marker the sentinel:
SENTINEL=		>
All the data between two ">" is called an atomic job.
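
To see how the sentinel cuts an input file, the short sketch below builds a tiny sample file and counts its atomic jobs by counting the lines that begin with ">". The sample records and the temporary file are made up for illustration; this only mimics the rule described above, it is not paraload's own code.

```shell
# Hypothetical sample input: three records, each starting with the ">" sentinel.
input=$(mktemp)
printf '>seq1\nMKV\n>seq2\nMLA\nGGT\n>seq3\nMTT\n' > "$input"

# One atomic job per line beginning with the sentinel.
n_jobs=$(grep -c '^>' "$input")
echo "atomic jobs: $n_jobs"   # prints "atomic jobs: 3"
```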

You want to send 10 atomic jobs at a time to each client (we call that a chunk):
POLICY_LIMIT=		10
Here we have a chunk size of 10.
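
As a rough guide, the number of chunks the server has to hand out is the number of atomic jobs divided by the chunk size, rounded up. The sketch below assumes that the last chunk simply holds the remaining jobs; the job count of 25 is a made-up value.

```shell
n_jobs=25          # hypothetical number of atomic jobs in the input file
policy_limit=10    # value of POLICY_LIMIT

# Integer division rounded up.
n_chunks=$(( (n_jobs + policy_limit - 1) / policy_limit ))
echo "chunks: $n_chunks"   # prints "chunks: 3"
```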

Here we only want to copy the input to the output (PRGM is the command line that the client executes):
PRGM=			cat  #chunkin# > #chunkout#

But you can also ask for some protein blast computation:
PRGM=			blastp -query #chunkin# -out #chunkout# -db nr.fas
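
The #chunkin# and #chunkout# markers stand for the input and output of one chunk. A minimal sketch of the idea, assuming the client replaces each marker with a file name before running the command (the file names below are hypothetical, the real ones are chosen by paraload):

```shell
# PRGM value from paraload.conf (single quotes keep the markers literal).
prgm='cat #chunkin# > #chunkout#'

# Hypothetical file names for one chunk.
chunk_in=/tmp/chunk.in
chunk_out=/tmp/chunk.out

# Replace each marker with the matching file name.
cmd=$(printf '%s\n' "$prgm" | sed -e "s|#chunkin#|$chunk_in|g" -e "s|#chunkout#|$chunk_out|g")
echo "$cmd"   # prints "cat /tmp/chunk.in > /tmp/chunk.out"
```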

Retry the computation up to 10 times if it fails:
ROUND=			10
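
The retry behaviour can be pictured with the loop below. It is only a sketch under the assumption that ROUND is the total number of attempts for a chunk; run_chunk is a hypothetical stand-in for the PRGM command and is made to fail on its first two attempts.

```shell
round=10      # value of ROUND
attempt=0
status=1      # stays 1 unless one attempt succeeds

# Hypothetical stand-in for the PRGM command: succeeds from the third attempt on.
run_chunk() { [ "$attempt" -ge 3 ]; }

while [ "$attempt" -lt "$round" ]; do
    attempt=$((attempt + 1))
    if run_chunk; then
        status=0
        break
    fi
done
echo "attempts: $attempt, status: $status"   # prints "attempts: 3, status: 0"
```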

Here we do not want authentication (it is necessary if your user id is not the same on the client and on the server):
AUTHENTICATION=		NO


Normally you don't have to change these parameters (they are for expert users and configure the system and network stack):
LISTEN_BACKLOG=		1024
EPOLL_TIME_WAIT=	10000
EPOLL_MAX_EVENTS=	64
MAX_SOCKETS=		2048
TCP_KEEPIDLE=		600
TCP_KEEPINTVL=		10
TCP_KEEPCNT=		5
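
Put together, the settings of this example give a complete configuration file. The sketch below writes it to a temporary file so that you can inspect it; the temporary path is only for illustration, and the values are the ones used above.

```shell
# Write the example configuration to a temporary file (hypothetical location).
conf=$(mktemp)
cat > "$conf" <<'EOF'
SENTINEL=		>
POLICY_LIMIT=		10
PRGM=			cat #chunkin# > #chunkout#
ROUND=			10
AUTHENTICATION=		NO
LISTEN_BACKLOG=		1024
EPOLL_TIME_WAIT=	10000
EPOLL_MAX_EVENTS=	64
MAX_SOCKETS=		2048
TCP_KEEPIDLE=		600
TCP_KEEPINTVL=		10
TCP_KEEPCNT=		5
EOF

# Show how many settings were written.
echo "settings: $(grep -c '=' "$conf")"   # prints "settings: 12"
```

You can then give this file to paraload with --conf (-C).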

THE PROGRAM YOU WANT TO USE IN THE PRGM LINE MUST BE INSTALLED
ON EVERY COMPUTER WHERE YOU WANT TO LAUNCH THE CLIENTS.




After this configuration you can launch the server (anything inside {} is optional):

$paraload -s -p [a port number between 10000 and 65535] -i [the file containing the sentinel] -o [as you want, it is the output]
 -l [name of the log file] {-b (runs the server in the background) -r [name of the report file]}
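
As an illustration, here is the server command line filled in with made-up values. The port and all file names are hypothetical; the sketch only prints the command (a dry run) so that you can check it before running it yourself.

```shell
# Hypothetical values, adapt them to your setup.
port=12345               # must be between 10000 and 65535
input=sequences.fas      # file containing the sentinel
output=results.out
log=server.log
report=server.report

server_cmd="paraload -s -p $port -i $input -o $output -l $log -b -r $report"
echo "$server_cmd"       # dry run: prints the command instead of executing it
```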

The server then waits for clients.

Now you can launch a pool of clients, on the host of the server OR NOT:

$paraload -c -p [the same as the server] -h [the name of the host of the server or its IP address]
$paraload -c -p [the same as the server] -h [the name of the host of the server or its IP address]
$paraload -c -p [the same as the server] -h [the name of the host of the server or its IP address]
....
$paraload -c -p [the same as the server] -h [the name of the host of the server or its IP address]
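
Rather than typing the same line many times, you can generate it in a loop. The sketch below is a dry run that only prints one client command per line; the port, the host name and the number of clients are made-up values, and print_client_cmds is a helper invented for this example.

```shell
print_client_cmds() {
    # $1: port, $2: host of the server, $3: number of clients
    i=1
    while [ "$i" -le "$3" ]; do
        # The trailing "&" would start each client in the background.
        echo "paraload -c -p $1 -h $2 &"
        i=$((i + 1))
    done
}

print_client_cmds 12345 server.example.org 4
```

To really start the clients, replace the echo with the command itself.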


The more clients you have, the faster the computation will be.


Wait for the end of the computation, either by watching the end of the report file or,
if the server is not in the background, by waiting for the server command to return.