The Affiliation Skill Cartridge® extracts and normalizes the names of Organizations and Locations appearing in the Affiliation text zone of scientific literature sources such as PubMed, Embase and Scopus. Organizations are subdivided in
To achieve this, Affiliation embeds a variety of predictive natural language processing methods, including morphological and syntactic heuristics, lexicon analysis and statistical methods. For example, "UNIFESP" may be extracted as an Academia (from the lexicon) and normalized as Fed. Univ. São Paulo, while "Universidade Federal de São Paulo" may be extracted as an Academia because of the headword "Universidade Federal" and likewise normalized as Fed. Univ. São Paulo.
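The two extraction paths in the example above (lexicon lookup and headword matching) can be illustrated with a minimal Python sketch. This is not the Skill Cartridge® implementation: the tables LEXICON, HEADWORDS and NORMALIZED and the function extract_organization are hypothetical names introduced only for illustration, and the statistical methods are left out entirely.

```python
import re

# Hypothetical lexicon: known surface form -> (entity type, normalized name).
LEXICON = {
    "UNIFESP": ("Academia", "Fed. Univ. São Paulo"),
}

# Hypothetical headword rules: pattern for a headword -> entity type.
HEADWORDS = [
    (re.compile(r"\bUniversidade Federal\b", re.IGNORECASE), "Academia"),
]

# Hypothetical normalization table for names found through a headword.
NORMALIZED = {
    "universidade federal de são paulo": "Fed. Univ. São Paulo",
}


def extract_organization(segment):
    """Return a dict describing the organization in `segment`, or None."""
    text = segment.strip()

    # Path 1: the exact surface form is listed in the lexicon.
    if text in LEXICON:
        entity_type, normalized = LEXICON[text]
        return {"text": text, "type": entity_type,
                "normalized": normalized, "source": "lexicon"}

    # Path 2: a headword reveals the entity type.
    for pattern, entity_type in HEADWORDS:
        if pattern.search(text):
            # Fall back to the raw text when no normalized form is known.
            normalized = NORMALIZED.get(text.lower(), text)
            return {"text": text, "type": entity_type,
                    "normalized": normalized, "source": "headword"}

    return None
```

With these assumed tables, both "UNIFESP" and "Universidade Federal de São Paulo" resolve to the same normalized name, the first through the lexicon and the second through the headword rule.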
Two procedures are included in the Affiliation SC®. The default procedure (Affiliation) extracts all Organizations (Academia, Hospital and Medical center, Institute, Company, Strong Guessed Company and Sub-Entity), Locations (Continent, Country, Admindiv1, Admindiv2 and City) and Potential entities (Guessed Company, Potential Affiliation and Other Location). A secondary procedure (Affiliation-restrictive) emphasizes precision and does not extract Potential entities.
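The difference between the two procedures can be pictured as a filter on entity types. The Python sketch below only mirrors the type lists given above; the function names allowed_types and filter_entities and the dict-based entity format are assumptions made for illustration, not part of the product's interface.

```python
# Entity types as listed in the description of the two procedures.
ORGANIZATION_TYPES = {
    "Academia", "Hospital and Medical center", "Institute",
    "Company", "Strong Guessed Company", "Sub-Entity",
}
LOCATION_TYPES = {"Continent", "Country", "Admindiv1", "Admindiv2", "City"}
POTENTIAL_TYPES = {"Guessed Company", "Potential Affiliation", "Other Location"}


def allowed_types(procedure):
    """Return the set of entity types a procedure extracts."""
    if procedure == "Affiliation":
        return ORGANIZATION_TYPES | LOCATION_TYPES | POTENTIAL_TYPES
    if procedure == "Affiliation-restrictive":
        # Precision-oriented: Potential entities are not extracted.
        return ORGANIZATION_TYPES | LOCATION_TYPES
    raise ValueError(f"unknown procedure: {procedure!r}")


def filter_entities(entities, procedure):
    """Keep only the entities whose type the procedure extracts."""
    keep = allowed_types(procedure)
    return [entity for entity in entities if entity["type"] in keep]
```

Given a list containing one Academia entity and one Guessed Company entity, filter_entities keeps both under "Affiliation" and only the Academia entity under "Affiliation-restrictive".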
Customization and Extension
Affiliation can be customized through Luxid® Knowledge Editor to add specific Organization names, Locations and headwords. The headwords or expressions for the different languages are used to extract Organization entities when they appear in combination with what the Skill Cartridge® identifies as a potential organization name.
Especially suited to extracting information from scientific literature, Affiliation may be used in situations where the identities of researchers and their institutions are of specific interest:
- analysis of citation paths in scientific literature
- analysis of collaboration networks
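As an illustration of the collaboration-network use case, the following Python sketch counts how often two normalized organizations appear on the same paper. It assumes the normalized names have already been extracted; the function collaboration_edges and its input format (one list of normalized organization names per paper) are hypothetical.

```python
from collections import Counter
from itertools import combinations


def collaboration_edges(papers):
    """Count co-occurrences of normalized organizations across papers.

    `papers` is an iterable of lists, one list of normalized
    organization names per paper (an assumed input format).
    """
    edges = Counter()
    for organizations in papers:
        # Deduplicate and sort so each pair has one canonical order.
        unique = sorted(set(organizations))
        for first, second in combinations(unique, 2):
            edges[(first, second)] += 1
    return edges
```

Because normalization maps variants such as "UNIFESP" and "Universidade Federal de São Paulo" to a single name, each institution becomes a single node in the resulting network rather than several.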