Closed
Created May 05, 2016 by LE GAC Renaud@legacOwner

Review the affiliation mechanism

  • Since version 0.8.14, the affiliation is based on the short name defined by inspirehep.net for each institute (110__t and 110__u).
  • The mechanism looks for authors with a given affiliation using the institute short name. Here, we assumed that both inspirehep.net and cds.cern.ch use the short name to define an author's affiliation.
  • Several exceptions have been found in inspirehep.net:
    • https://inspirehep.net/record/1409292
    • https://inspirehep.net/record/1391152
    • https://inspirehep.net/record/1421141
    • https://inspirehep.net/record/1421133
    • https://inspirehep.net/record/1318575
    • https://inspirehep.net/record/1326994
    • http://inspirehep.net/record/1318882
    • http://inspirehep.net/record/1420148
  • as well as in cds.cern.ch:
    • https://cds.cern.ch/record/2050561
  • In addition, we learned that cds.cern.ch will switch to a CERN identifier tied to its own database of institutes.
  • Finally, there is the case in which authors come from different institutes, e.g. LPC CAEN and ENSICAE, but belong to the same entity.
  • In the author MARC fields, an author's affiliation is defined by the field (100)700__u. Another field exists, (100)700__v, which is left free to the cataloguer but in most cases contains the full name (address) of the institute.
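
As an illustration of the two subfields described above, here is a minimal sketch that reads the `u` and `v` subfields of the 100 and 700 fields from a MARCXML record. The sample record, the author names and the function name `author_affiliations` are hypothetical. The sketch also assumes a record without an XML namespace, whereas records served by inspirehep.net or cds.cern.ch may declare one.

```python
import xml.etree.ElementTree as ET

# Hypothetical MARCXML fragment: all names and values are invented.
SAMPLE = """<record>
  <datafield tag="100" ind1=" " ind2=" ">
    <subfield code="a">Doe, J.</subfield>
    <subfield code="u">Example Lab</subfield>
    <subfield code="v">Example Laboratory, 1 Example Street, Exampleville</subfield>
  </datafield>
  <datafield tag="700" ind1=" " ind2=" ">
    <subfield code="a">Roe, R.</subfield>
    <subfield code="u">Other Inst</subfield>
  </datafield>
</record>"""


def author_affiliations(marcxml):
    """Return one (author, u_values, v_values) tuple per 100/700 field."""
    root = ET.fromstring(marcxml)
    rows = []
    for field in root.iter("datafield"):
        if field.get("tag") not in ("100", "700"):
            continue
        name, shorts, fulls = "", [], []
        for sub in field.iter("subfield"):
            code = sub.get("code")
            text = (sub.text or "").strip()
            if code == "a":
                name = text
            elif code == "u":      # short name of the institute
                shorts.append(text)
            elif code == "v":      # free text, usually the full name
                fulls.append(text)
        rows.append((name, shorts, fulls))
    return rows


for row in author_affiliations(SAMPLE):
    print(row)
```

An author may carry several `u` or `v` subfields, which is why the sketch keeps lists rather than single values.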

PROPOSAL

  1. Add a new database table affiliations containing two columns, short name and full name. The first would match the content of the field 700__u, while the second would match 700__v. Either one can be undefined, but not both.

  2. For each author, build the affiliation as 700__u + 700__v.

  3. Build a regular expression with the content of the affiliations table. Something like:

        ^short_name1$|^full_name2$|^short_name3full_name3$|...
  4. Scan authors list to find a match.

  5. To facilitate the construction of the affiliations table, a wizard has to be developed. It will rely on:

    • the inspirehep.net record
    • a list of authors belonging to the lab
  6. The wizard will propose a list of (short name, full name) pairs by querying cds.cern.ch and inspirehep.net. The meaningful pairs will be selected by the user.

  7. The wizard can be called at any time to add more values to the affiliations table.

  8. In principle, this approach is very general and should cover all cases.
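
Steps 1 to 4 of the proposal could be sketched as follows. This is only an illustration under several assumptions that the issue does not fix: an SQLite table with columns named `short_name` and `full_name`, an affiliation built by joining 700__u and 700__v with a space, and an unanchored substring search. The function names and the full name "Example Laboratory, Caen" are hypothetical, while LPC CAEN and ENSICAE are the short names quoted above.

```python
import re
import sqlite3


def create_table(conn):
    """Step 1: two columns, at least one of which must be filled."""
    conn.execute(
        "CREATE TABLE affiliations ("
        " short_name TEXT,"
        " full_name TEXT,"
        " CHECK (COALESCE(short_name, '') <> ''"
        "     OR COALESCE(full_name, '') <> ''))"
    )


def build_pattern(conn):
    """Step 3: one alternative per table row, joined with '|'."""
    alternatives = []
    query = "SELECT short_name, full_name FROM affiliations ORDER BY rowid"
    for short, full in conn.execute(query):
        if short and full:
            # Both defined: short name followed later by the full name.
            alternatives.append(re.escape(short) + ".*" + re.escape(full))
        else:
            alternatives.append(re.escape(short or full))
    if not alternatives:
        return re.compile(r"(?!)")  # empty table: never matches
    return re.compile("|".join(alternatives))


def affiliated_authors(authors, pattern):
    """Steps 2 and 4: build each affiliation and keep the matching authors.

    authors is an iterable of (name, u_value, v_value) tuples.
    """
    found = []
    for name, short, full in authors:
        affiliation = " ".join(part for part in (short, full) if part)
        if pattern.search(affiliation):
            found.append(name)
    return found


conn = sqlite3.connect(":memory:")
create_table(conn)
conn.executemany(
    "INSERT INTO affiliations VALUES (?, ?)",
    [("LPC CAEN", None), ("ENSICAE", None), (None, "Example Laboratory, Caen")],
)
authors = [
    ("Author A", "LPC CAEN", ""),
    ("Author B", "ENSICAE", "some free text"),
    ("Author C", "Other Inst", "Example Laboratory, Caen"),
    ("Author D", "Other Inst", ""),
]
print(affiliated_authors(authors, build_pattern(conn)))
```

Escaping each table value with `re.escape` keeps punctuation in institute names from being read as regular expression syntax. The substring search is deliberately loose; anchoring each alternative would be stricter but requires agreeing first on exactly how 700__u and 700__v are concatenated.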
