Text Mining 360°
Skill Cartridge®
Extracts and normalizes 20+ types of common entities among which the names of People, Companies, Organizations, Locations, as well as Measurements, Money, Time expressions and Contact information.
By TEMIS

Scope

TM360 extracts and normalizes more than 20 types of common entities, including the names of People, Companies, Organizations and Locations, as well as Measurements, Money, Time expressions and Contact information.

Internals

To achieve these tasks, TM360 embeds a variety of predictive natural language processing methods, including morphological and syntactic heuristics, lexicon analysis and statistics. The main procedure (NER) offered by TM360-Port is entirely dedicated to entity extraction. Two secondary procedures (RF360 and RF360 Plus) extract basic relationships; this extraction is triggered when a verb separates major entities (People, Companies, Locations, Organizations) or when unspecified proper nouns are used.
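The trigger described above, a verb separating two major entities, can be illustrated with a minimal Python sketch. This is not the TM360 or RF360 implementation or API: the `Token` class, the `basic_relations` function, the part-of-speech tags and the entity type labels are all hypothetical, and the input is assumed to be already tokenized, tagged and annotated with entities.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# Hypothetical labels for the "major" entity types named in the text.
MAJOR_TYPES = {"Person", "Company", "Location", "Organization"}


@dataclass
class Token:
    text: str
    pos: str                           # coarse part-of-speech tag, e.g. "VERB"
    entity_type: Optional[str] = None  # set when the token is an entity mention


def basic_relations(tokens: List[Token]) -> List[Tuple[str, str, str]]:
    """Emit (left entity, verbs, right entity) when at least one verb
    occurs between two consecutive major entities."""
    relations = []
    left = None   # most recent major entity seen so far
    verbs = []    # verbs seen since that entity
    for tok in tokens:
        if tok.entity_type in MAJOR_TYPES:
            if left is not None and verbs:
                relations.append((left.text, " ".join(verbs), tok.text))
            left, verbs = tok, []
        elif tok.pos == "VERB" and left is not None:
            verbs.append(tok.text)
    return relations


sentence = [
    Token("TEMIS", "PROPN", "Company"),
    Token("opened", "VERB"),
    Token("an", "DET"),
    Token("office", "NOUN"),
    Token("in", "ADP"),
    Token("Paris", "PROPN", "Location"),
]
print(basic_relations(sentence))  # [('TEMIS', 'opened', 'Paris')]
```

The sketch covers only the verb-between-entities case; the handling of unspecified proper nouns is not described in enough detail here to illustrate.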

Customization and Extension

TM360 can be customized through Luxid® Knowledge Editor, for example to add specific vocabularies or patterns of interest as well as new entity types (common examples include the names of products or technologies, as well as contract, customer, area or project identification references). TM360 also offers several parameters that can be modified to enable extra modules, for example personal pronoun or acronym extraction, or to modify statistical thresholds (scores below which an entity will not be extracted). Common extensions include expanding the granularity of coverage of geographical entities to cities and administrative divisions with fewer than 15,000 inhabitants, or retraining the statistical model on your specific corpus.
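The effect of a statistical threshold can be pictured with a short Python sketch. It is only an illustration of the idea, not TM360's parameter format or scoring scheme: the `Candidate` class, the `apply_threshold` function, the 0-to-1 score range and the default value of 0.5 are all assumptions.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Candidate:
    text: str
    entity_type: str
    score: float  # assumed confidence score between 0 and 1


def apply_threshold(candidates: List[Candidate], threshold: float = 0.5) -> List[Candidate]:
    """Keep only candidates whose score is not below the threshold."""
    return [c for c in candidates if c.score >= threshold]


candidates = [
    Candidate("Paris", "Location", 0.92),
    Candidate("Apple", "Company", 0.31),
]
kept = apply_threshold(candidates, threshold=0.5)
print([c.text for c in kept])  # ['Paris']
```

Raising the threshold trades recall for precision: fewer entities are extracted, but those that remain carry higher scores.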

Typical Applications

Especially suited to extracting information from well-formed texts, TM360 can be used standalone or in combination with other components in a wide variety of use cases where the entity categories it covers are of particular interest, for example:

Enriching articles in newspapers, magazines or B2B trade publications, or proprietary corporate content, with metadata about the key entities mentioned, for example to improve navigation or retrievability

Extracting structured numerical information mentioned in documents to conduct deep analytics

For earlier versions of this Skill Cartridge®, please contact TEMIS.

Skill cartridge
Generic
Language(s): FR, EN, DE, ES, IT, NL
Compatibility: Luxid® 6.2
Posting date: September 2013
Version: 6.2.64
Business model: Turnkey
Related links
TM360 Documentation