Text Mining 360° Arabic
Skill Cartridge®
Extracts and normalizes more than 20 types of entities from Arabic text, including the names of People, Organizations and Locations, as well as Measurements, Time expressions and Contact information.
By TEMIS

Text Mining 360° Arabic (TM360-Arab) extracts and normalizes more than 20 types of entities from Arabic text, including the names of People, Organizations and Locations, as well as Measurements, Time expressions and Contact information. To achieve this, TM360-Arab embeds a variety of predictive natural language processing methods, including morphological and syntactic heuristics, lexicon analysis and statistical methods.

TM360-Arab can be customized through Luxid® Knowledge Editor, for example to add vocabularies or patterns of interest, as well as new entity types specific to the Arabic language. It is especially suited to extracting information from well-formed texts, but is also used on other kinds of corpora, such as enterprise content.

Three procedures are available:
- NER: The main and default procedure, dedicated entirely to entity extraction.
- RF360: In addition to entity extraction, this procedure extracts relationships between major entities (People, Locations, Organizations) when they are separated by a verb. The relationship takes the name of the verb.
- RF360Plus: Same as RF360, with unspecified proper nouns added as relationship triggers.

Limitations on the scope of major entities:
- Results have been validated for Precision only, not Recall.
- People names must include a family name. Single first names and references to people names are not extracted.
- Locations consist of continents, countries, cities and administrative divisions for cities having more than 15,000 inhabitants.
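To make the RF360 rule more concrete (two major entities separated by a verb, with the relationship named after that verb), here is a minimal Python sketch. It is not TEMIS code and does not use any Luxid® API: the `Token` type, the tag names and the `extract_relations` function are all hypothetical, and the input is assumed to be already tokenized and tagged. English glosses are used in place of Arabic tokens for readability.

```python
from typing import List, NamedTuple, Optional, Tuple

# Hypothetical tag set; the cartridge's real entity types are not shown here.
MAJOR_ENTITY_TAGS = {"PERSON", "LOCATION", "ORGANIZATION"}


class Token(NamedTuple):
    text: str
    tag: str  # e.g. "PERSON", "VERB", "OTHER" (assumed labels)


def extract_relations(tokens: List[Token]) -> List[Tuple[str, str, str]]:
    """Return (entity, verb, entity) triples for major entities separated by a verb."""
    relations: List[Tuple[str, str, str]] = []
    left: Optional[Token] = None   # most recent major entity seen
    verb: Optional[Token] = None   # first verb seen since that entity
    for tok in tokens:
        if tok.tag in MAJOR_ENTITY_TAGS:
            if left is not None and verb is not None:
                # The relation is named after the verb between the two entities.
                relations.append((left.text, verb.text, tok.text))
            left, verb = tok, None
        elif tok.tag == "VERB" and left is not None and verb is None:
            verb = tok
    return relations


sentence = [
    Token("Naguib Mahfouz", "PERSON"),
    Token("visited", "VERB"),
    Token("Alexandria", "LOCATION"),
]
print(extract_relations(sentence))
```

In this sketch, two entities with no verb between them yield no relation, which mirrors the "when separated by a verb" condition. How the actual cartridge handles several verbs, negation or longer spans is not described in this listing.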

Skill cartridge
Generic
Language(s): AR
Compatibility: Luxid® 6.0
Posting date: January 2013
Version: 1.0
Business model: Turnkey