limbra
Commit 5668b350
authored Jul 06, 2017 by LE GAC Renaud
Update the API documentation.
parent 1f65e815
Changes 54
Showing 14 changed files with 237 additions and 145 deletions
docs/api/generated/recordpubli/invenio_tools.recordpubli.RecordPubli.year.rst +0 -6
docs/api/generated/reportthesis/invenio_tools.recordthesis.RecordThesis.these_town.rst +0 -6
docs/api/harvester.rst +10 -5
docs/api/invenio.rst +6 -3
modules/harvest_tools/automaton.py +10 -9
modules/harvest_tools/checkandfix.py +10 -21
modules/invenio_tools/__init__.py +2 -1
modules/invenio_tools/base.py +3 -3
modules/invenio_tools/inveniostore.py +5 -3
modules/invenio_tools/record.py +46 -3
modules/invenio_tools/recordconf.py +5 -4
modules/invenio_tools/recordinst.py +20 -3
modules/invenio_tools/recordpubli.py +115 -73
modules/invenio_tools/recordthesis.py +5 -5
docs/api/generated/recordpubli/invenio_tools.recordpubli.RecordPubli.year.rst
deleted
100644 → 0
invenio_tools.recordpubli.RecordPubli.year
==========================================
.. currentmodule:: invenio_tools.recordpubli
.. automethod:: RecordPubli.year
\ No newline at end of file
docs/api/generated/reportthesis/invenio_tools.recordthesis.RecordThesis.these_town.rst
deleted
100644 → 0
invenio_tools.recordthesis.RecordThesis.these_town
==================================================
.. currentmodule:: invenio_tools.recordthesis
.. automethod:: RecordThesis.these_town
\ No newline at end of file
docs/api/harvester.rst
...
...
@@ -2,14 +2,19 @@
harvest_tools
-------------
The *harvest_tools* package contains all classes to harvester the invenio
The *harvest_tools* package contains all classes to harvest the invenio
store and to load the publication in the database.
The base class is :class:`.Automaton`.
All the other classes inherited from it.
Inherited classes specialise the work for :class:`.Articles`, :class:`.Notes`,
:class:`.Preprints`, :class:`.Proceedings`, :class:`.Reports`,
:class:`.Talks` and :class:`.Thesis`.
All the specialised classes inherited from it:
* :class:`.Articles`
* :class:`.Notes`
* :class:`.Preprints`
* :class:`.Proceedings`
* :class:`.Reports`
* :class:`.Talks`
* :class:`.Thesis`
The automaton is instantiated by the factory :func:`.build_harvester_tool`,
for a given category of publication.
...
...
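The factory described in the documentation above can be pictured with a minimal sketch. Only the names `Automaton`, `Articles`, `Talks`, `Thesis` and `build_harvester_tool` come from `harvester.rst`; the class bodies, the `CATEGORIES` mapping, its keys and the function signature are assumptions made for illustration and may differ from the limbra code.

```python
# Hypothetical sketch of a category-to-class factory (Python 3).
# Only the class names and the factory name come from harvester.rst;
# the mapping keys, constructor and signature are assumptions.


class Automaton(object):
    """Stand-in for the base class of all harvesters."""

    def __init__(self, **options):
        self.options = options


class Articles(Automaton):
    pass


class Talks(Automaton):
    pass


class Thesis(Automaton):
    pass


# Assumed category labels, the real ones may differ.
CATEGORIES = {"articles": Articles, "talks": Talks, "thesis": Thesis}


def build_harvester_tool(category, **options):
    """Return the automaton specialised for the given category."""
    try:
        cls = CATEGORIES[category]
    except KeyError:
        raise ValueError("unknown category: %r" % (category,))
    return cls(**options)


tool = build_harvester_tool("talks", dbg=True)
```

Keeping the mapping in one dictionary means that adding a specialised class only requires one new entry, while callers keep using the factory.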
docs/api/invenio.rst
...
...
@@ -8,7 +8,6 @@ store and to retrieve the publications according to user search criteria.
The main classes are:
* :class:`.InvenioStore` to search publications in the store,
* :class:`.Marc12` to instantiate the record associated to a publication.
* :class:`.RecordPubli`, :class:`.RecordConf`, :class:`.RecordThesis`
Constants
...
...
@@ -20,14 +19,20 @@ Constants
~base.ARXIV
~base.ARXIV_PDF
~base.MSG_INV_CONF
~base.MSG_INV_CONF_KEY
~base.MSG_NO_CONF
~base.MSG_NO_CONF_ID_KEY
~base.MSG_NO_COUNTRY
~base.MSG_NO_PUBLISHER
~base.MSG_NO_THESIS
~base.MSG_WELL_FORMED_COLLABORATION
~base.OAI
~base.OAI_URL
~base.REG_ARXIV_NUMBER
~base.REG_AUTHOR
~base.REG_DATE
~base.REG_CONF
~base.REG_OAI
~base.REG_YEAR
~base.THESIS_DIR
...
...
@@ -53,8 +58,6 @@ Classes
:toctree: generated/
~inveniostore.InvenioStore
~iterrecord.IterRecord
~marc12.Marc12
~record.Record
~recordconf.RecordConf
~recordinst.RecordInst
...
...
modules/harvest_tools/automaton.py
...
...
@@ -491,7 +491,7 @@ class Automaton(object):
return 0
def process_collection(self, collection):
""""Retrieve JSON objects from the invenio store and for the given
"""Retrieve JSON objects from the invenio store and for the given
collection. Corresponding records are inserted in the database.
Args:
...
...
@@ -500,8 +500,8 @@ class Automaton(object):
Note:
* Design to never stop although exceptions are raised
* Have a look to the collection_logs and logs in order to
understand what happen.
* Have a look to the attributes ``collection_logs`` and ``logs``
in order to understand what happen.
"""
if self.dbg:
...
...
@@ -605,15 +605,16 @@ class Automaton(object):
"""Process the publication identified by its record identifier:
* get the publication data from the store using its identifier
* instantiate the record (RecordPubli, REcordConf, RecordThesis)
* instantiate the record: ``RecordPubli``, ``RecordConf``
or ``RecordThesis``
* process OAI data
* check the record
* insert new record in the database
Note:
* Design to never stop although exception are raised
* Have a look to the collection_logs and logs in order to
understand what happen.
* Have a look to the attribute ``collection_logs`` and ``logs`` in
order to understand what happen.
Args:
rec_id (int):
...
...
@@ -647,8 +648,8 @@ class Automaton(object):
Note:
* Design to never stop although exceptions are raised
* Have a look to the collection_logs and logs in order to
understand what happen.
* Have a look to the attributes ``collection_logs`` and ``logs``
in order to understand what happen.
Args:
host (unicode):
...
...
@@ -683,7 +684,7 @@ class Automaton(object):
dict:
* ``collection_logs`` list of :class:`MsgCollection`
* ``controller`` unicode
* ``logs`` list of :class:Msg
* ``logs`` list of :class:`Msg`
* ``selector`` :class:`plugin_dbui.Selector`
"""
...
...
modules/harvest_tools/checkandfix.py
...
...
@@ -329,6 +329,7 @@ class CheckAndFix(object):
def authors(self, record):
"""Check that:
* author fields are defined.
* first author is not like ATLAS Collaboration
...
...
@@ -359,10 +360,8 @@ class CheckAndFix(object):
Raises:
CheckException:
* the collaboration is unknown
(neither collaboration nor synonym)
* the collaboration is unknown (neither collaborationnor synonym)
* more than one synonym found.
"""
if self.dbg:
print "\t\tCheck collaboration"
...
...
@@ -406,18 +405,6 @@ class CheckAndFix(object):
* the country is unknown (neither country nor synonym)
* more than one synonym found.
"""
"""Check conference country.
Have a look to the synonyms when the country does not exist.
Args:
record (RecordConf):
record describing a talk or a proceeding.
Raises:
CheckException:
the country is not defined nor entered as a synonym.
"""
if self.dbg:
print "\t\tCheck country"
...
...
@@ -455,7 +442,7 @@ class CheckAndFix(object):
raise CheckException(*e.args)
def conference_date(self, record):
"""Check conference date and format it properly.
"""Check conference date exists and well formatted.
Args:
record (RecordConf):
...
...
@@ -562,8 +549,8 @@ class CheckAndFix(object):
fmt (str):
define the format for author names.
Possible values are "First, Last", "F. Last", "Last",
"Last, First" and "Last F."
Possible values are ``First, Last``, ``F. Last``, ``Last``,
``Last, First`` and ``Last F.``
"""
if self.dbg:
...
...
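The five values accepted by ``fmt`` can be illustrated with a small sketch. The helper `format_author`, its signature and the rule used to build the initial (first letter of the first name followed by a dot) are assumptions for illustration and are not taken from `checkandfix.py`.

```python
# Hypothetical helper (Python 3) illustrating the five values of ``fmt``.
# The function name and its signature are assumptions, not limbra code.


def format_author(first, last, fmt="Last, First"):
    """Format one author name according to ``fmt``."""
    # Simplification: compound first names are reduced to one initial.
    initial = first[:1] + "." if first else ""
    formats = {
        "First, Last": "%s, %s" % (first, last),
        "F. Last": "%s %s" % (initial, last),
        "Last": last,
        "Last, First": "%s, %s" % (last, first),
        "Last F.": "%s %s" % (last, initial),
    }
    if fmt not in formats:
        raise ValueError("unknown format: %r" % (fmt,))
    return formats[fmt].strip()
```

For example, `format_author("Albert", "Einstein", "F. Last")` gives `A. Einstein` under these assumptions.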
@@ -778,7 +765,7 @@ class CheckAndFix(object):
sort authors by family name when true otherwise use the
order of authors at the creation of the record
Return
Returns:
str:
* the found affiliation
* an empty string when the rescue list is used.
...
...
@@ -900,7 +887,8 @@ class CheckAndFix(object):
record describing a publication.
Raises:
CheckException:
CheckException::
* the publisher is unknown (neither abbreviation nor synonym)
* more than one synonym found.
...
...
@@ -939,7 +927,8 @@ class CheckAndFix(object):
record describing a publication.
Raises:
CheckException:
CheckException::
* the date is not well formed
* more than one date are found.
...
...
modules/invenio_tools/__init__.py
...
...
@@ -50,7 +50,8 @@ def load_record(host, record_id):
either RecordPubli, RecordInst, RecordConf of RecordThesis.
Raises:
CdsException:
CdsException::
* the server return an HTTP error.
* no JSON object could be decoded.
...
...
modules/invenio_tools/base.py
...
...
@@ -114,11 +114,11 @@ def is_thesis(recjson):
"""True when the record describes a thesis.
Args:
record (Record): MARC12 record associated to a publication
or to and institute.
recjson (dict):
record associated to a publication or to and institute.
Return:
bool: ``True`` when the MARC record describes a thesis.
bool: ``True`` when the record describes a thesis.
"""
# THESIS in collection
...
...
modules/invenio_tools/inveniostore.py
...
...
@@ -371,7 +371,8 @@ class InvenioStore(object):
* The list is empty when the request failed on the server.
Raises:
CdsException:
CdsException::
* keyword argument is invalid;
* the server return an HTTP error;
* JSON object can't be decoded
...
...
@@ -430,10 +431,11 @@ class InvenioStore(object):
Returns:
dict:
the record data (MarcJSON).
the record data (recjson).
Raises:
CdsException:
CdsException::
* the server return an HTTP error.
* no JSON object could be decoded.
...
...
modules/invenio_tools/record.py
...
...
@@ -16,53 +16,95 @@ class Record(dict):
record[field] = [dict1(subfield1=..., subfield2=...),
dict2(subfield1=..., subfield2=...), ...]
for an article, typical field ares (cds 1951625, ins 1319638)::
For an article, typical field ares (cds 1951625, ins 1319638, *etc.*):
+-----------------------------+-----------------------------+
| field (cds) | field (inspirehep) |
+-----------------------------+-----------------------------+
+=============================+=============================+
| | FIXME_OAI |
+-----------------------------+-----------------------------+
| abstract | abstract |
+-----------------------------+-----------------------------+
| accelerator_experiment | accelerator_experiment |
+-----------------------------+-----------------------------+
| agency_code | |
+-----------------------------+-----------------------------+
| authors | authors |
+-----------------------------+-----------------------------+
| base | |
+-----------------------------+-----------------------------+
| collection | collection |
+-----------------------------+-----------------------------+
| comment | comment |
+-----------------------------+-----------------------------+
| copyright_status | |
+-----------------------------+-----------------------------+
| corporate_name | corporate_name |
+-----------------------------+-----------------------------+
| creation_date | creation_date |
+-----------------------------+-----------------------------+
| doi | doi |
+-----------------------------+-----------------------------+
| email_message | |
+-----------------------------+-----------------------------+
| filenames | filenames |
+-----------------------------+-----------------------------+
| files | files |
+-----------------------------+-----------------------------+
| filetypes | filetypes |
+-----------------------------+-----------------------------+
| imprint | imprint |
+-----------------------------+-----------------------------+
| keywords | keywords |
+-----------------------------+-----------------------------+
| language | |
+-----------------------------+-----------------------------+
| license | license |
+-----------------------------+-----------------------------+
| number_of_authors | number_of_authors |
+-----------------------------+-----------------------------+
| number_of_citations | number_of_citations |
+-----------------------------+-----------------------------+
| number_of_comments | number_of_comments |
+-----------------------------+-----------------------------+
| number_of_reviews | number_of_reviews |
+-----------------------------+-----------------------------+
| oai | |
+-----------------------------+-----------------------------+
| other_report_number | |
+-----------------------------+-----------------------------+
| persistent_identifiers_keys | persistent_identifiers_keys |
+-----------------------------+-----------------------------+
| physical_description | physical_description |
+-----------------------------+-----------------------------+
| prepublication | prepublication |
+-----------------------------+-----------------------------+
| primary_report_number | primary_report_number |
+-----------------------------+-----------------------------+
| publication_info | publication_info |
+-----------------------------+-----------------------------+
| recid | recid |
+-----------------------------+-----------------------------+
| | reference |
+-----------------------------+-----------------------------+
| report_number | |
+-----------------------------+-----------------------------+
| | source_of_acquisition |
+-----------------------------+-----------------------------+
| status_week | |
+-----------------------------+-----------------------------+
| subject | subject |
+-----------------------------+-----------------------------+
| system_control_number | system_control_number |
+-----------------------------+-----------------------------+
| thesaurus_terms | thesaurus_terms |
+-----------------------------+-----------------------------+
| title | title |
+-----------------------------+-----------------------------+
| | title_additional |
+-----------------------------+-----------------------------+
| url | |
+-----------------------------+-----------------------------+
| version_id | version_id |
+-----------------------------+-----------------------------+
...
...
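The layout stated at the top of the ``Record`` docstring, ``record[field] = [dict1(subfield1=..., subfield2=...), ...]``, can be pictured with a miniature example. The class `MiniRecord`, the helper `subfield_values` and the sample values are hypothetical; only the field and subfield names (`authors`, `first_name`, `last_name`, `title`) appear in the tables of this commit.

```python
# Hypothetical miniature of the layout described in the docstring:
# record[field] is a list of dictionaries mapping subfield -> value.
# The class name and helper are assumptions, not the limbra Record.


class MiniRecord(dict):

    def subfield_values(self, field, subfield):
        """Return the values of ``subfield`` found under ``field``."""
        items = self.get(field, [])
        return [item[subfield] for item in items if subfield in item]


record = MiniRecord(
    authors=[
        dict(first_name="Marie", last_name="Curie"),
        dict(first_name="Pierre", last_name="Curie"),
    ],
    title=[dict(title="An example title")],
)
```

A missing field simply yields an empty list here, so callers do not need to test for the key first.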
@@ -207,7 +249,8 @@ class Record(dict):
"""The Open Archive Initiative identifier URL(s).
Returns:
str: the primary and secondary URLs are separated by a comma.
unicode:
the primary and secondary URLs are separated by a comma.
The pattern of the URL is ``http://host/record/id`` or
an empty string when it is not defined or when the OAI is
not well formed.
...
...
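The URL pattern ``http://host/record/id`` documented above can be sketched as follows. The shape assumed for an OAI identifier (``oai:host:id`` with a numeric id), the regular expression, the function names and the ``", "`` separator are assumptions for illustration; limbra defines its own ``REG_OAI`` constant, which is not shown in this commit. The host names are examples, and the record numbers are the ones quoted in the ``Record`` docstring.

```python
import re

# Assumed shape of an OAI identifier: ``oai:host:id`` with a numeric id.
# limbra has its own REG_OAI constant; this pattern is only a stand-in.
REG_OAI_SKETCH = re.compile(r"^oai:([a-zA-Z0-9.\-]+):(\d+)$")


def oai_to_url(oai):
    """Return ``http://host/record/id`` or an empty string."""
    match = REG_OAI_SKETCH.match(oai or "")
    if match is None:
        # Not defined or not well formed.
        return ""
    host, rec_id = match.groups()
    return "http://%s/record/%s" % (host, rec_id)


def oai_urls(oais):
    """Join the URLs of several identifiers with a comma."""
    urls = [oai_to_url(oai) for oai in oais]
    return ", ".join(url for url in urls if url)
```

Returning an empty string instead of raising matches the documented behaviour for identifiers that are missing or not well formed.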
modules/invenio_tools/recordconf.py
...
...
@@ -8,19 +8,20 @@ from .recordpubli import RecordPubli
class RecordConf(RecordPubli):
"""The record describing a conference talk or a proceeding.
Additional field describing the conference data are::
Additional field describing the conference data are:
+----------------+-----------------------------------------------+
| field | subfield |
+----------------+-----------------------------------------------+
+================+===============================================+
| meeting_name | closing_date, coference_code, country, date, |
| | location, opening_date, year |
+----------------+-----------------------------------------------+
One field is added by limbra:
+----------------+-----------------------------------------------+
| field (limbra) | subfield |
+----------------+-----------------------------------------------+
+================+===============================================+
| meeting_note | recid, url |
+----------------+-----------------------------------------------+
...
...
@@ -77,7 +78,7 @@ class RecordConf(RecordPubli):
Returns:
unicode:
- empty string when not defined
empty string when not defined
"""
# algorithm depends on the store
...
...
modules/invenio_tools/recordinst.py
...
...
@@ -11,35 +11,52 @@ MSG_INVALID_RECORD = "Invalid record, it is not describing an institute"
class RecordInst(Record):
"""The record describing an institute.
Fields are::
Fields are:
+-----------------------------+----------------------------------+
| field (inspirehep) | subfield |
+-----------------------------+----------------------------------+
+=============================+==================================+
| FIXME_OAI | id, set |
+-----------------------------+----------------------------------+
| administrative_history | |
+-----------------------------+----------------------------------+
| authority_institution | institution |
+-----------------------------+----------------------------------+
| cataloguer_info | creation_date, modification_date |
+-----------------------------+----------------------------------+
| collection | primary, secondary |
+-----------------------------+----------------------------------+
| corporate_name | name |
+-----------------------------+----------------------------------+
| creation_date | |
+-----------------------------+----------------------------------+
| files | |
+-----------------------------+----------------------------------+
| filetypes | |
+-----------------------------+----------------------------------+
| number_of_citations | |
+-----------------------------+----------------------------------+
| number_of_comments | |
+-----------------------------+----------------------------------+
| number_of_reviews | |
+-----------------------------+----------------------------------+
| persistent_identifiers_keys | |
+-----------------------------+----------------------------------+
| recid | |
+-----------------------------+----------------------------------+
| source_of_description | note |
+-----------------------------+----------------------------------+
| system_control_number | institute, value |
+-----------------------------+----------------------------------+
| url | |
+-----------------------------+----------------------------------+
| version_id | |
+-----------------------------+----------------------------------+
One field is added by limbra:
+-----------------------------+----------------------------------+
| field (limbra) | subfield |
+-----------------------------+----------------------------------+
+=============================+==================================+
| corporate_note | identifier, futur_identifier, |
| | name |
+-----------------------------+----------------------------------+
...
...
modules/invenio_tools/recordpubli.py
...
...
@@ -69,62 +69,104 @@ def to_str(x):
class RecordPubli(Record):
"""The record describes an article, preprint, proceeding, report and talk.
The main ``field`` and ``subfield`` are::
+---------------------------------+----------------------------------+
| field | subfield |
+---------------------------------+----------------------------------+
| FIXME_OAI (inspire) | id |
| abstract | |
| accelerator_experiment | |
| agency_code (cds) | |
| authors | INSPIRE_number, affiliation, |
| | control_number, first_name, |
| | full_name, last_name, |
| | relator_name (phd director) |
| base (cds) | |
| collection | |
| comment | |
| copyright_status (cds) | |
| corporate_name | collaboration |
| creation_date | |
| doi | |
| email_message (cds) | |
| filenames | |
| files | comment, description, eformat, |
| | full_name, full_path, magic, |
| | name, path, size, status, |
| | subformat, superformat, type, |
| | url, version |
| filetypes | |
| imprint | |
| keywords | |
| language (cds) | |
| license | |
| number_of_authors | |
| number_of_citations | |
| number_of_comments | |
| number_of_reviews | |
| oai (cds) | value |
| other_report_number (cds) | |
| persistent_identifiers_keys | |
| physical_description | |
| prepublication | date, publisher_name, place |
| primary_report_number | |
| publication_info | pagination, title, volume, year |
| recid | none |
| reference (inspire) | |
| report_number (cds) | internal, report_number |
| source_of_acquisition (inspire) | |
| status_week (cds) | |
| subject | |
| system_control_number | institute, value or canceled |
| thesaurus_terms | |
| title | title |
| title_additional (inspire) | |
| url (cds) | description, url |
| version_id | |
+---------------------------------+----------------------------------+
The main ``field`` and ``subfield`` are:
+---------------------------------+----------------------------------+
| field | subfield |
+=================================+==================================+
| FIXME_OAI (inspire) | id |
+---------------------------------+----------------------------------+
| abstract | |
+---------------------------------+----------------------------------+
| accelerator_experiment | |
+---------------------------------+----------------------------------+
| agency_code (cds) | |
+---------------------------------+----------------------------------+
| authors | INSPIRE_number, affiliation, |
| | control_number, first_name, |
| | full_name, last_name, |
| | relator_name (phd director) |
+---------------------------------+----------------------------------+
| base (cds) | |
+---------------------------------+----------------------------------+
| collection | |
+---------------------------------+----------------------------------+
| comment | |
+---------------------------------+----------------------------------+
| copyright_status (cds) | |
+---------------------------------+----------------------------------+
| corporate_name | collaboration |
+---------------------------------+----------------------------------+
| creation_date | |
+---------------------------------+----------------------------------+
| doi | |
+---------------------------------+----------------------------------+
| email_message (cds) | |
+---------------------------------+----------------------------------+
| filenames | |
+---------------------------------+----------------------------------+
| files | comment, description, eformat, |
| | full_name, full_path, magic, |
| | name, path, size, status, |
| | subformat, superformat, type, |
| | url, version |
+---------------------------------+----------------------------------+
| filetypes | |
+---------------------------------+----------------------------------+
| imprint | |
+---------------------------------+----------------------------------+
| keywords | |
+---------------------------------+----------------------------------+
| language (cds) | |
+---------------------------------+----------------------------------+
| license | |
+---------------------------------+----------------------------------+
| number_of_authors | |
+---------------------------------+----------------------------------+
| number_of_citations | |
+---------------------------------+----------------------------------+
| number_of_comments | |
+---------------------------------+----------------------------------+
| number_of_reviews | |
+---------------------------------+----------------------------------+
| oai (cds) | value |
+---------------------------------+----------------------------------+
| other_report_number (cds) | |
+---------------------------------+----------------------------------+
| persistent_identifiers_keys | |
+---------------------------------+----------------------------------+
| physical_description | |
+---------------------------------+----------------------------------+
| prepublication | date, publisher_name, place |
+---------------------------------+----------------------------------+
| primary_report_number | |
+---------------------------------+----------------------------------+
| publication_info | pagination, title, volume, year |
+---------------------------------+----------------------------------+
| recid | none |
+---------------------------------+----------------------------------+
| reference (inspire) | |
+---------------------------------+----------------------------------+
| report_number (cds) | internal, report_number |
+---------------------------------+----------------------------------+
| source_of_acquisition (inspire) | |
+---------------------------------+----------------------------------+
| status_week (cds) | |
+---------------------------------+----------------------------------+
| subject | |
+---------------------------------+----------------------------------+
| system_control_number | institute, value or canceled |
+---------------------------------+----------------------------------+
| thesaurus_terms | |
+---------------------------------+----------------------------------+
| title | title |
+---------------------------------+----------------------------------+
| title_additional (inspire) | |
+---------------------------------+----------------------------------+
| url (cds) | description, url |
+---------------------------------+----------------------------------+
| version_id | |
+---------------------------------+----------------------------------+
"""
def __init__(self, *args):
...
...