Pipelet: commit 0db292e4
Authored Nov 25, 2010 by Marc Betoule
org Formatting in the doc
Parent: 82c98ffb
Showing 1 changed file with 35 additions and 26 deletions
README.org
The Pipelet Readme
Pipelet is a free framework allowing for the creation, execution and
browsing of scientific data processing pipelines. It provides:
...
...
@@ -56,7 +58,7 @@ You may find useful to install some generic scientific tools that nicely interac
There is not any published stable release of Pipelet right now.
git clone git://gitorious.org/pipelet/pipelet.git
=git clone git://gitorious.org/pipelet/pipelet.git=
**** Installing Pipelet
...
...
@@ -66,17 +68,19 @@ sudo python setup.py install
1. Run the test pipeline
cd test/first_test
python main.py
=cd test/first_test=
=python main.py=
2. Add this pipeline to the web interface
pipeweb track test ./.sqlstatus
=pipeweb track test ./.sqlstatus=
3. Set the access control and launch the web server
pipeutils -a username -l 2 .sqlstatus
pipeweb start
=pipeutils -a username -l 2 .sqlstatus=
=pipeweb start=
4. You should be able to browse the result on the web page
http://localhost:8080
...
...
@@ -85,7 +89,7 @@ pipeweb start
To get a new pipeline framework, with example main and segment scripts :
pipeutils -c pipename
=pipeutils -c pipename=
This command creates a directory named pipename which contains:
+ a main script (named main.py) providing functionalities to execute
...
...
@@ -109,12 +113,16 @@ The dependencies between segments must form a directed acyclic
graph. This graph is described by a char string using a subset of the
graphviz dot language (http://www.graphviz.org). For example the string:
"""
a -> b -> d;
c -> d;
c -> e;
=a -> b -> d;=
=c -> d;=
=c -> e;=
"""
defines a pipeline with 5 segments {"a", "b", "c", "d", "e"}. The
relation "a->b" ensures that the processing of the segment "a" will be
done before the processing of its child segment "b". Also the output
...
...
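As an illustration of the dot-string description in the hunk above, here is a minimal sketch of how such a string could be turned into edges and one valid execution order. It is not Pipelet's own parser: the helper names `parse_pipedot` and `execution_order` are hypothetical, and only the simple `a -> b -> d;` chain syntax shown in the example is handled.

```python
from collections import defaultdict, deque

def parse_pipedot(pipedot):
    """Parse chains like 'a -> b -> d;' into node and (parent, child) edge sets.

    Hypothetical helper: handles only the chain syntax used in the example.
    """
    nodes, edges = set(), set()
    for statement in pipedot.split(";"):
        names = [n.strip() for n in statement.split("->") if n.strip()]
        nodes.update(names)
        edges.update(zip(names, names[1:]))
    return nodes, edges

def execution_order(nodes, edges):
    """Return one valid order (Kahn's algorithm); raise if there is a cycle."""
    children = defaultdict(list)
    indegree = {n: 0 for n in nodes}
    for parent, child in edges:
        children[parent].append(child)
        indegree[child] += 1
    ready = deque(sorted(n for n in nodes if indegree[n] == 0))
    order = []
    while ready:
        node = ready.popleft()
        order.append(node)
        for child in sorted(children[node]):
            indegree[child] -= 1
            if indegree[child] == 0:
                ready.append(child)
    if len(order) != len(nodes):
        raise ValueError("dependency graph contains a cycle")
    return order

pipedot = """
a -> b -> d;
c -> d;
c -> e;
"""
nodes, edges = parse_pipedot(pipedot)
order = execution_order(nodes, edges)
print(order)
```

Any order in which each parent precedes its children is acceptable; the sketch simply returns one of them.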
@@ -129,8 +137,8 @@ named "se.py" and "s.py". This way, different segments of the pipeline
can share the same code, if they are given a name with a common root
(this mechanism is useful to write generic segment and is completed by
the hooking system, described in the advanced usage section). The code
is then executed in a specific namespace (see below The execution
environment).
is then executed in a specific namespace (see below [[*The%20segment%20environment][The execution
environment]]).
*** The Pipeline object
...
...
@@ -140,7 +148,7 @@ P = Pipeline(pipedot, codedir=, prefix=)
- pipedot is the string description of the pipeline
- codedir is the path where the segment scripts can be found
- prefix is the path to the data repository (see below Hierarchical data storage)
- prefix is the path to the data repository (see below [[*Hierarchical%20data%20storage][Hierarchical data storage]])
It is possible to output the graphviz representation of the pipeline
(needs graphviz installed). First, save the graph string into a .dot
...
...
@@ -206,8 +214,8 @@ final output set of segment "melt" will be:
[('Lancelot the Brave'), ('Lancelot the Pure'), ('Galahad the Brave'), ('Galahad the Pure')].
This default behavior can be altered by specifying a #multiplex
directive in the commentary of the segment code. See section Multiplex
directive for more details.
directive in the commentary of the segment code. See section [[*Multiplex%20directive][Multiplex
directive]] for more details.
As the segment execution order is not uniquely determined by the pipe
scheme (several paths may exist), it is not possible to retrieve an
...
...
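The four results listed for "melt" in the hunk above are consistent with a Cartesian product of the two parents' outputs. The sketch below illustrates that reading in plain Python, outside of Pipelet. The parent output lists are assumptions chosen to reproduce the quoted result, and the dictionaries only mimic the shape of seg_input.

```python
from itertools import product

# Assumed parent outputs, chosen to match the result quoted above.
knights_output = ["Lancelot", "Galahad"]
quality_output = ["the Brave", "the Pure"]

# One "melt" task per element of the Cartesian product of the parents' outputs.
# Each dictionary mimics seg_input, keyed by parent segment name.
melt_inputs = [
    {"knights": k, "quality": q}
    for k, q in product(knights_output, quality_output)
]

# What each "melt" task might compute from its input dictionary.
melt_output = [task["knights"] + " " + task["quality"] for task in melt_inputs]
print(melt_output)
```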
@@ -218,7 +226,7 @@ above example, one can read "melt" inputs using:
k = seg_input["knights"]
q = seg_input["quality"]
See section 'The segment environment' for more details.
See section [[*The%20segment%20environment]['The segment environment']] for more details.
*** Orphan segments
...
...
@@ -236,7 +244,7 @@ id = seg_input['segnamephantom']
or
id = seg_input.values()[0]
See section 'The segment environment' for more details.
See section [[*The%20segment%20environment][The segment environment]] for more details.
*** Hierarchical data storage
...
...
@@ -311,7 +319,7 @@ The segment code is executed in a specific environment that provides:
Pipelet enables you to write reusable generic
segments by providing a hooking system via the hook function.
hook (hookname, globals()): execute Python script ‘segname_hookname.py’ and update the namespace.
See the Hooking system for more details.
See the section [[*the%20Hooking%20system][Hooking system]] for more details.
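To make the hook mechanism described above concrete, here is a minimal sketch of a function that executes a script named after the segment and the hook, and updates a namespace with whatever the script defines. It is not Pipelet's implementation: the extra `segname` and `codedir` parameters are assumptions added so the example is self-contained, whereas the call described above takes only the hook name and `globals()`.

```python
import os
import tempfile

def hook(hookname, glo, segname="seg", codedir="."):
    """Run '<segname>_<hookname>.py' inside the namespace `glo`.

    Illustrative only: segname and codedir are hypothetical parameters.
    """
    path = os.path.join(codedir, "%s_%s.py" % (segname, hookname))
    with open(path) as f:
        source = f.read()
    # Names defined by the hook script end up in `glo`.
    exec(compile(source, path, "exec"), glo)

# Demo: write a tiny hook script to a temporary directory and run it.
codedir = tempfile.mkdtemp()
with open(os.path.join(codedir, "seg_setup.py"), "w") as f:
    f.write("threshold = 3\n")

namespace = {}
hook("setup", namespace, segname="seg", codedir=codedir)
print(namespace["threshold"])
```

Because the hook script is looked up by segment name, two segments sharing the same generic code can each supply their own hook file.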
*** The example pipelines
...
...
@@ -680,6 +688,7 @@ Logs are ordered by date.
* Advanced usage
** Multiplex directive
The default behavior can be altered by specifying a #multiplex
directive in the commentary of the segment code. If several multiplex
...
...
@@ -729,7 +738,7 @@ Another caution on the use of group: segment input data type is no
longer a dictionary in those cases as the original tuple is
transformed but simply the result of the class function.
See section 'The segment environment' for more details.
See section [[*The%20segment%20environment][The segment environment]] for more details.
** Depend directive
...
...
@@ -816,8 +825,8 @@ File : myenvironment.py
The Pipelet engine objects (segments, tasks, pipeline) are available
from the worker attribut self._worker. See section "The Pipelet
actors" for more details about the Pipelet machinery.
from the worker attribut self._worker. See section [[*the%20pipelet%20actors][The Pipelet
actors]] for more details about the Pipelet machinery.
*** Writing new environment
...
...
@@ -850,8 +859,8 @@ The segment output argument has to be returned by the _close(self, glo)
method.
The pipelet engine objects (segments, tasks, pipeline) are available
from the worker attribut self._worker. See section "The Pipelet
actors" for more details about the Pipelet machinery.
from the worker attribut self._worker. See section [[*the%20pipelet%20actors][The Pipelet
actors]] for more details about the Pipelet machinery.
*** Loading another environment
...
...
@@ -869,11 +878,11 @@ Pipeweb use the cherrypy web framework server and can be run behind an
apache webserver which brings essentially two advantages:
- https support.
- faster static files serving.
See the cherrypy documentation for hints about this.
See the [[http://www.cherrypy.org/wiki/TableOfContents][cherrypy]] documentation for hints about this.
* The pipelet actors
* The Pipelet actors
This section document the code for developpers.
This section document the code for developers.
The code documentation can be built using the doxygen configuration
file
...
...