# limbra issues
https://gitlab.in2p3.fr/limbra/limbra/-/issues (2018-04-27T15:02:17+02:00)

## Improve filter on authors for publications
https://gitlab.in2p3.fr/limbra/limbra/-/issues/80 (2018-04-27T15:02:17+02:00, LE GAC Renaud)

* In the `publications` table, the filter for the grid can be run on the `author` field.
* Currently the operator is `contains`.
* Replace it with the `like` operator, so that a search on two authors can be executed, *e.g.* `Dupont%Durant`.
* For the time being the search is based on an exact match, which is painful for accented letters (é, è, ...). Modify the search pattern in such a way that `Hélène` is found when the user requests `helene`.

## Optimize python code
https://gitlab.in2p3.fr/limbra/limbra/-/issues/76 (2018-04-27T15:02:17+02:00, LE GAC Renaud)

Improve speed and memory footprint. Systematic use of:
* decorator `@staticmethod`
* PyDAL iterator `db(query).iterselect()`
* `pandas.DataFrame`
* itertools: `chain`, `imap`, `ifilter`, `izip`, ...

## Add a combobox for the automate filter of the harvester table
https://gitlab.in2p3.fr/limbra/limbra/-/issues/75 (2018-04-27T15:02:17+02:00, LE GAC Renaud)

* Filter of the Harvester table
* The automate filter exists, but it is a text field.
* Replace it with a ComboBox containing the possible values. The ComboBox can be reset.

## Review the label for the table controller
https://gitlab.in2p3.fr/limbra/limbra/-/issues/54 (2018-04-27T15:02:17+02:00, LE GAC Renaud)

* The table controller associates an automate with a publication category.
* For historical reasons, the table and its fields are named `controller(s)`.
* In version 0.8.14, the term `controller` was replaced by `automate`.
* This change has to be propagated to that table and the related actions.

## Add language in the application preference
https://gitlab.in2p3.fr/limbra/limbra/-/issues/53 (2018-04-27T15:02:17+02:00, LE GAC Renaud)

* Allow choosing the language from the UI.
* Possible values are FR and UK.

## Create sanity check wizard
https://gitlab.in2p3.fr/limbra/limbra/-/issues/49 (2018-04-27T15:02:17+02:00, LE GAC Renaud)

* A wizard similar to `Check And Validate`
* Its aim is to check the configuration / setup of the database:
    1. The relation between `team` and `project`
    2. The configuration of the harvester
    3. The relation between automaton and category
    4. ...
* When a problem is reported, first ask to run the sanity check.
* As soon as a new problem is solved, add more tests to the sanity check.
* When there is a problem with the PDF creation, require running the `check and validate` wizard, then exporting the `LaTeX` file, ...

## Use combobox with multiple selection
https://gitlab.in2p3.fr/limbra/limbra/-/issues/44 (2018-04-27T15:02:17+02:00, LE GAC Renaud)

* ComboBox can be configured to allow *multiple selection* (via multiSelector).
* Use this feature in *metrics* and *graphs* when selecting publication categories.
* MultiSelectors are available in ExtJS 6 and can be used in different ways.
* http://examples.sencha.com/extjs/6.0.1/examples/classic/multiselect/multiselect-demo.html
* http://examples.sencha.com/extjs/6.0.1/examples/kitchensink/#multi-selector

## Harvest private collections
https://gitlab.in2p3.fr/limbra/limbra/-/issues/39 (2018-04-27T15:02:17+02:00, LE GAC Renaud)

* Private collections are private notes for Atlas, LHCb, ...
* Add a flag to identify a *private collection* in the harvester configuration.
* Request login and password when a harvester runs on a private collection.
* Exclude the scan of *private collections* when running all harvesters.
* Remove the wizard insert Marc XML.

Road to explore (I):
* `cern-get-sso-cookie` is the Linux command which might help.
* The packages can be found at http://linuxsoft.cern.ch/cern/centos/7/cern/x86_64/Packages/
* There is a Python wrapper: https://github.com/sashabaranov/cernsso
* It might require registering your application at CERN via https://sso-management.web.cern.ch/
* It might be a good idea to understand the difference between `requests` and `urllib`.
* It might be a good idea to start with a small Python script.
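
As a possible starting point for such a small script, here is a minimal sketch. It assumes that `cern-get-sso-cookie` has already been run by hand and has written its cookies to a file in the Netscape cookie format (an assumption to verify against the tool's documentation). The file path and the function names `load_sso_cookies` and `make_opener` are hypothetical, and only the standard library `urllib` is used; nothing here is part of limbra.

```python
import http.cookiejar
import urllib.request


def load_sso_cookies(cookie_file):
    """Load a Netscape-format cookie file into a cookie jar.

    cookie_file is a hypothetical path, for instance the file written by
    cern-get-sso-cookie (assumption: Netscape/Mozilla cookie format).
    """
    jar = http.cookiejar.MozillaCookieJar(cookie_file)
    # Keep session cookies and do not drop entries that look expired.
    jar.load(ignore_discard=True, ignore_expires=True)
    return jar


def make_opener(jar):
    """Return a urllib opener that sends the cookies with each request."""
    return urllib.request.build_opener(
        urllib.request.HTTPCookieProcessor(jar))
```

A harvester could then call `make_opener(load_sso_cookies(path)).open(url)` on the URL of a private collection; whether the server accepts the cookies has to be checked against a real protected page.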

Road to explore (II):
* The solution may be in https://media.readthedocs.org/pdf/flask-sso/latest/flask-sso.pdf since `flask` is not so different from `web2py`.
* ...

## Automatize the harvesters
https://gitlab.in2p3.fr/limbra/limbra/-/issues/6 (2018-04-27T15:02:17+02:00, LE GAC Renaud)

Currently, each group runs its harvesters manually. This development will run the harvesters for each group periodically.
* Periodicity is once a week.
* The logs will be stored in the database and kept for one month.
* The logs can be viewed using the current *harvester views*.
* The automated process can be switched off.
* Each harvester can be activated or deactivated in the automated process.
* This development will rely on the web2py task scheduler.
### Roadmap
* [x] Refactor harvester
* [x] Add automated harvester application parameter
* [x] Setup Scheduler with a skeleton automated harvesting task function
* Phase 1: Create a scheduler task for automated harvesting
    * [x] If the global automated harvester parameter is not *yes* or *true*, return from the task
    * [x] Iterate over all harvester group entries
    * [x] If harvest is False, continue
    * [x] Harvest the group using process_url
    * [x] Convert logs and collection_logs to JSON
    * [x] Use the logging system for debug information
    * [x] Add an application parameter to define the execution scheduling
    * [x] Queue or dequeue the automatic harvesting task according to the application parameter values
    * [x] Requeue the automatic harvesting task with the new start time if the scheduling is modified
* Phase 2: Create DB tables
    * [x] Create a table to hold automatic harvesting logs
    * [x] Write JSON logs and info into the table
    * [x] Erase logs older than one month
    * [x] Update the DB schema graphic
* Phase 3: Create view for the logs
    * [x] Create Selector for harvesting logs display
    * [x] Create Controller function for harvesting logs
    * [x] Add menu command to display harvesting logs
    * [x] Get logs from the database
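
The control flow of Phase 1 and the log clean-up of Phase 2 can be sketched in plain Python. This is only an illustration under stated assumptions: the real task runs inside the web2py scheduler and stores its logs in database tables, whereas this sketch uses lists and dicts. The keys `group`, `url` and `harvest`, the helper `drop_old_logs`, the 30-day approximation of one month, and the return value assumed for `process_url` (a pair of logs and collection logs) are all hypothetical.

```python
import json
from datetime import datetime, timedelta


def automated_harvest(enabled, harvesters, process_url, now=None):
    """Run one automated harvesting pass and return the log records.

    enabled     -- value of the global application parameter
    harvesters  -- iterable of dicts with the hypothetical keys
                   'group', 'url' and 'harvest'
    process_url -- callable standing in for the real process_url;
                   assumed here to return (logs, collection_logs)
    """
    # Return immediately unless the global parameter is "yes" or "true".
    if str(enabled).lower() not in ("yes", "true"):
        return []

    now = now or datetime.now()
    records = []
    for harvester in harvesters:
        # Skip the harvesters that are deactivated.
        if not harvester["harvest"]:
            continue
        logs, collection_logs = process_url(harvester["url"])
        # Store both kinds of logs as JSON strings.
        records.append({
            "group": harvester["group"],
            "date": now,
            "logs": json.dumps(logs),
            "collection_logs": json.dumps(collection_logs),
        })
    return records


def drop_old_logs(records, now, max_age=timedelta(days=30)):
    """Keep only the records that are at most max_age old."""
    return [rec for rec in records if now - rec["date"] <= max_age]
```

Passing `process_url` as an argument keeps the sketch independent of the harvester code, so the flow can be exercised with a stub before it is wired to the scheduler.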
### Conclusions
From that prototype, we identified all the pieces required to run the harvesters periodically:
* task scheduler
* scheduler tables
* task modules
* additional controller to manipulate the task and to give access to the log

It also appears that we have to simplify the interface exposed to the user.

A possible evolution is to create a separate application, SCAN, connected to the task scheduler:
* Give access to the scheduler tables
* Contain the logic to authorize the running of the harvester for a given track_publications_xxx database
* Contain the logic to balance the load between the different track_publications_xxx applications

For each track_publication application, the user will have access to:
* a switch to allow or not the periodic scan
* a switch for each harvester
* an action to consult the logs. It will give access to the date and the harvester log for each team. The layout is a grid where rows are grouped per team. Each row contains the date and a hyperlink pointing to the harvester log.
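
The grouping behind such a grid can be sketched as follows. The row layout `(team, date, log_url)`, the ISO date strings and the function name `group_logs_by_team` are assumptions made for the illustration; the actual grid would be fed from the harvesting log table.

```python
from collections import OrderedDict


def group_logs_by_team(rows):
    """Group (team, date, log_url) rows per team, newest log first.

    Dates are assumed to be ISO strings (YYYY-MM-DD), which sort
    correctly as plain text.
    """
    grid = OrderedDict()
    # Teams appear in alphabetical order.
    for team, date, url in sorted(rows, key=lambda row: row[0]):
        grid.setdefault(team, []).append({"date": date, "link": url})
    # Within a team, show the most recent log first.
    for entries in grid.values():
        entries.sort(key=lambda entry: entry["date"], reverse=True)
    return grid
```

Each key of the result corresponds to one group header of the grid, and each entry to one row with its date and link.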