Commit 08e4c630 authored by Maude Le Jeune

readme in progress + debug PBS log problems

parent 662743a7
@@ -204,11 +204,14 @@ final output set of segment "melt" will be:
[('Lancelot the Brave'), ('Lancelot the Pure'), ('Galahad the Brave'), ('Galahad the Pure')].
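To make the default behavior concrete, the cross product above can be
reproduced in plain Python (an illustrative sketch, not pipelet code):
from itertools import product
knights = ['Lancelot', 'Galahad']
epithets = ['the Brave', 'the Pure']
# default multiplexing: one task per element of the cross product
print [k + ' ' + e for k, e in product(knights, epithets)]
# -> ['Lancelot the Brave', 'Lancelot the Pure', 'Galahad the Brave', 'Galahad the Pure']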
TODO : describe input data type : dictionary, ... ?
*** Multiplex directive
This default behavior can be altered by specifying a #multiplex
directive in the comments of the segment code. If several multiplex
directives are present in the segment code, the last one is retained.
- #multiplex: activates the default behavior
@@ -247,7 +250,7 @@ The storage is organized as follows:
/prefix/
- all segment meta data are stored below a root whose name corresponds
to a unique match of the segment code.
/prefix/segname_YFLJ65/
- Segment's meta data are:
- a copy of the segment python script
- a copy of all segment hook scripts
@@ -255,12 +258,12 @@ The storage is organized as follows:
- a meta data file (.meta) which contains some extra meta data
- all segment instances' data and meta data are stored in a specific
subdirectory whose name corresponds to a string representation of its input
/prefix/segname_YFLJ65/data/1/
- if there is a single segment instance, data are stored directly in
/prefix/segname_YFLJ65/data/
- If a segment has at least one parent, its root will be located below
one of its parents' roots:
/prefix/segname_YFLJ65/segname2_PLMBH9/
- etc...
*** The segment environment
@@ -294,7 +297,7 @@ The segment code is executed in a specific environment that provides:
5. Hooking support
Pipelet enables you to write reusable generic
segments by providing a hooking system via the hook function.
hook(hookname, globals()): executes the Python script 'segname_hookname.py' and updates the namespace.
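For example, a segment whose plotting code lives in a separate hook
script could simply call (the hook name here is arbitrary):
hook('plot', globals())  # executes segname_plot.py in the current namespace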
*** The example pipelines
@@ -302,6 +305,78 @@ The segment code is executed in a specific environment that provides:
**** cmb
** Running Pipes
*** The sample main file
A sample main file is made available when creating a new pipelet
framework. It is copied from the reference file:
pipelet/pipelet/static/main.py
This script illustrates various ways of running pipes. It describes
the different parameters, and shows how to write a main Python script
that can be used like any binary from the command line (including
option parsing).
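A stripped-down version of such a script might look like the following
(a minimal sketch, not the reference file: the pipeline scheme and the
option names are hypothetical):
#!/usr/bin/env python
import logging
import optparse
from pipelet.pipeline import Pipeline
from pipelet.launchers import launch_interactive, launch_process

def main():
    parser = optparse.OptionParser(usage="%prog [-d] [-N njobs]")
    parser.add_option('-d', '--debug', action='store_true',
                      help='run in interactive (sequential) mode')
    parser.add_option('-N', type='int', default=4,
                      help='number of simultaneous worker processes')
    (options, args) = parser.parse_args()
    P = Pipeline("first->second;", code_dir='./', prefix='./data')  # hypothetical scheme
    if options.debug:
        w, t = launch_interactive(P)  # sequential, easy to debug
        w.run()
    else:
        launch_process(P, options.N, log_level=logging.INFO)

if __name__ == "__main__":
    main()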
*** Common options
Some options are common to all running modes.
**** log level
The logging system is handled by the standard Python logging module.
This module defines the following log levels:
+ DEBUG
+ INFO
+ WARNING
+ ERROR
+ CRITICAL
All logging messages are saved in the different pipelet log files,
available from the web interface (rotating file logging). It is also
possible to print those messages on the standard output (stream
logging) by setting the desired log level in the launcher options.
For example:
import logging
launch_process(P, N, log_level=logging.DEBUG)
If set to 0, stream logging will be disabled.
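Conversely, to keep only the rotating file logs and silence the
standard output:
launch_process(P, N, log_level=0)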
**** matplotlib
The matplotlib documentation says:
"Many users report initial problems trying to use maptlotlib in web
application servers, because by default matplotlib ships configured to
work with a graphical user interface which may require an X11
connection. Since many barebones application servers do not have X11
enabled, you may get errors if you don’t configure matplotlib for use
in these environments. Most importantly, you need to decide what kinds
of images you want to generate (PNG, PDF, SVG) and configure the
appropriate default backend. For 99% of users, this will be the Agg
backend, which uses the C++ antigrain rendering engine to make nice
PNGs. The Agg backend is also configured to recognize requests to
generate other output formats (PDF, PS, EPS, SVG). The easiest way to
configure matplotlib to use Agg is to call:
matplotlib.use('Agg')
"
The matplotlib and matplotlib_interactive options switch the
matplotlib backend to Agg in order to allow execution in a
non-interactive environment.
These two options are set to True by default in the sample main
script.
TODO : explain why.
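As an illustration, a segment saving a PNG without any X11 connection
could follow this pattern (a sketch; data and file name are made up):
import matplotlib
matplotlib.use('Agg')           # select the backend before importing pyplot
import matplotlib.pyplot as plt
plt.plot([1, 2, 3], [4, 2, 5])  # hypothetical data
plt.savefig('result.png')       # Agg renders straight to file, no display needed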
*** The interactive mode
This mode has been designed to ease debugging. If P is an instance of
the pipeline object, the syntax reads:
@@ -313,6 +388,8 @@ w.run()
In this mode, each task is computed sequentially.
Do not hesitate to invoke the Python debugger from IPython: %pdb
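For reference, the full interactive pattern reads (a sketch assuming
launch_interactive returns the worker w used above together with a
helper thread):
from pipelet.launchers import launch_interactive
w, t = launch_interactive(P)
w.run()  # compute the tasks one by one, in the current process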
*** The process mode
In this mode, one can run several tasks simultaneously (if the pipe
scheme allows it), as in the sketch below.
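The corresponding call mirrors the logging example above:
from pipelet.launchers import launch_process
launch_process(P, N)  # run up to N tasks simultaneously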
@@ -328,6 +405,18 @@ The number of jobs is set by the N parameter:
from pipelet.launchers import launch_pbs
launch_pbs(P, N, address=(os.environ['HOST'], 50000))
It is possible to specify some job submission options, as shown in the
sketch after this list:
+ job name
+ job header: this string is prepended to the PBS job scripts. You may
want to add some environment-specific paths. Log and error files are
automatically handled by the pipelet engine, and made available from
the web interface.
+ cpu time: the syntax is "hh:mm:ss"
The 'server' option can be disabled to add some workers to an existing
scheduler.
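Put together, a submission using these options might read as follows
(a sketch: the keyword names job_name, job_header, cpu_time and server
are assumptions inferred from the option list above, not a verified
signature):
launch_pbs(P, N, address=(os.environ['HOST'], 50000),
           job_name='mypipe',                # assumed keyword: job name
           job_header='source /etc/profile', # assumed keyword: prepended to the PBS scripts
           cpu_time='02:00:00',              # assumed keyword: "hh:mm:ss"
           server=False)                     # assumed keyword: attach workers to an existing scheduler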
** Browsing Pipes
*** The pipelet webserver and ACL
@@ -337,26 +426,38 @@ Each pipeline has to be registered using:
pipeweb track <shortname> sqlfile
To remove a pipeline from the tracked list, use:
pipeweb untrack <shortname>
As pipeline browsing implies parsing the disk, some basic security has
to be set up as well. All users have to be registered with a specific
access level (1 for read-only access, 2 for write access).
pipeutils -a <username> -l 2 sqlfile
To remove a user from the user list:
pipeutils -d <username> sqlfile
Start the web server using:
pipeweb start
The web application will then be available at http://localhost:8080
To stop the web server:
pipeweb stop
*** The web application
In order to ease the comparison of different processing runs, the web
interface displays various views of the pipeline data:
**** The index page
The index page displays a tree view of all pipeline instances. Each
segment may be expanded or collapsed via the +/- buttons.
The parameters used in each segment are summarized and displayed with
@@ -412,6 +513,27 @@ pipelet.utils.rebuild_db_from_disk(prefix, sqlfile)
All information will be retrieved, but with new identifiers.
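That is, from Python:
from pipelet.utils import rebuild_db_from_disk
rebuild_db_from_disk(prefix, sqlfile)  # prefix: data root; sqlfile: database file to rebuild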
** The hooking system
As described in the 'segment environment' section, pipelet supports
a hooking system which allows the use of generic processing code, and
code sectioning.
Let's consider a set of instructions that have to be systematically
applied at the end of a segment (post processing). One can put those
instructions in a separate script file named, for example,
'segname_postproc.py' and call the hook function:
hook('postproc', globals())
A specific dictionary can be passed to the hook script to avoid
confusion.
The hook scripts are included in the hash key computation (see the
advanced usage section).
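As an illustration, a hypothetical 'segname_postproc.py' could read
(a sketch; the result variable is made up for the example):
# segname_postproc.py -- executed in the namespace passed to hook()
print 'post processing of', result
To avoid confusion, the segment can pass such a dedicated dictionary
instead of globals():
hook('postproc', {'result': result})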
** Writing custom environments
The pipelet software provides a set of default utilities available
@@ -501,3 +623,4 @@ This section documents the code for developers.
** The Tracker object
** The Worker object
** The Environment object
@@ -245,7 +245,7 @@ echo $PYTHONPATH
f.write ("python -m pipelet.launchers -H %s -p %s -s %s -l %s"%(address[0],address[1],authkey,jobfile.replace('job','worker')))
f.close()
subprocess.Popen(['qsub', '-o', jobfile, '-e', errfile, jobfile]).communicate()[0]
if server:
print 'launching the scheduler'
@@ -2,7 +2,7 @@
def main():
import optparse
parser = optparse.OptionParser(usage="\nTo create a new pipeline:\n %prog -c <pipename> [-p <prefix>]\nTo activate acl and setup a new user:\n %prog -a <username> [-l <access_level>] <sql_file>\nTo suppress an existing user:\n %prog -d <username> <sql_file>\nTo change the data root directory:\n %prog -r old_dir new_dir <sql_file>")
parser.add_option('-c', '--create-pipeline',
help='Create a new pipeline',)
parser.add_option('-p', '--prefix',