Anonymization
Skill Cartridge®
Locates personally identifying information so that it can be removed before documents are distributed. Originally developed to anonymize case law.
By TEMIS

Scope

Originally developed to anonymize case law for online publication, the Anonymization SC® identifies personal information within documents so that it can be removed.

Internals

Anonymization is implemented as a two-step process. First, the SC® recognizes and annotates all family names, company names, postal addresses, e-mail addresses, phone numbers and fax numbers in the document. Each of these items is then tagged as either 'to anonymize' (for example, where a name can be replaced by a single letter) or 'to exclude' from anonymization, according to the following rules:

  • For person names, the SC® uses titles as triggers to exclude the names of attorneys, magistrates and experts from anonymization. In the current version of the SC®, first names are also excluded from anonymization.
  • For addresses, only those associated with a party to the case are tagged 'to anonymize'.
  • Company names are automatically excluded, unless they contain the family name of a party cited in the document.
  • All phone numbers, fax numbers and e-mail addresses are tagged 'to anonymize'.
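The tagging rules above can be sketched as a small decision function. This is only an illustration of the logic as described, not the SC®'s actual implementation: the `Entity` class, its fields, the `classify` function and the `EXCLUDED_TITLES` set are all hypothetical names introduced for this example, and the list of trigger titles is an assumption.

```python
from dataclasses import dataclass

# Hypothetical trigger titles; the real SC® trigger list is not documented here.
EXCLUDED_TITLES = {"maître", "me", "juge", "président", "expert"}


@dataclass
class Entity:
    kind: str                       # "person", "company", "address", "phone", "fax" or "email"
    text: str                       # surface form found in the document
    title: str = ""                 # title preceding a person name, if any
    is_first_name: bool = False     # True when the item is a first name
    belongs_to_party: bool = False  # for addresses: linked to a party to the case


def classify(entity, party_family_names):
    """Return 'to anonymize' or 'to exclude' for one recognized entity."""
    if entity.kind in {"phone", "fax", "email"}:
        # Contact details are always anonymized.
        return "to anonymize"
    if entity.kind == "person":
        # First names and names introduced by a trigger title are kept.
        title = entity.title.lower().rstrip(".")
        if entity.is_first_name or title in EXCLUDED_TITLES:
            return "to exclude"
        return "to anonymize"
    if entity.kind == "address":
        return "to anonymize" if entity.belongs_to_party else "to exclude"
    if entity.kind == "company":
        # Companies are kept unless their name contains a party's family name.
        words = {word.lower() for word in entity.text.split()}
        parties = {name.lower() for name in party_family_names}
        return "to anonymize" if words & parties else "to exclude"
    raise ValueError(f"unknown entity kind: {entity.kind}")
```

For instance, with `{"Dupont"}` as the set of party family names, a company named "Dupont SARL" would be tagged 'to anonymize', while a person introduced by "Maître" would be tagged 'to exclude'.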

The SC® provides two annotation procedures. The first (Anonymization) extracts all the entities and tags them as 'to anonymize' or 'to exclude'; the second (AnoSansExclu) extracts only the 'to anonymize' entities.
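The difference between the two procedures, and how their output might feed a downstream removal step, can be sketched as follows. The function names, the `(surface, kind, tag)` tuple layout and the replacement scheme in `redact` are assumptions made for illustration; they do not reflect the actual output format of the SC® or of Luxid®.

```python
TO_ANONYMIZE = "to anonymize"
TO_EXCLUDE = "to exclude"


def anonymization(annotations):
    """Mimic the 'Anonymization' procedure: return every entity with its tag."""
    return list(annotations)


def ano_sans_exclu(annotations):
    """Mimic the 'AnoSansExclu' procedure: keep only 'to anonymize' entities."""
    return [a for a in annotations if a[2] == TO_ANONYMIZE]


def redact(text, annotations):
    """Hypothetical removal step: person names become an initial,
    other entities become a bracketed placeholder."""
    for surface, kind, _tag in ano_sans_exclu(annotations):
        if kind == "person":
            replacement = surface[0] + "."
        else:
            replacement = "[" + kind + "]"
        text = text.replace(surface, replacement)
    return text


# Example annotations, written by hand for this sketch.
annotations = [
    ("Dupont", "person", TO_ANONYMIZE),
    ("Maître Martin", "person", TO_EXCLUDE),
    ("01 23 45 67 89", "phone", TO_ANONYMIZE),
]
document = "Mr Dupont, represented by Maître Martin, tel. 01 23 45 67 89."

print(redact(document, annotations))
# prints: Mr D., represented by Maître Martin, tel. [phone].
```

Keeping the excluded entities in the first procedure's output makes it possible to review why an item was left untouched, whereas the second procedure returns only what a removal step needs.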

Typical Applications

This SC® was originally developed to anonymize legal decisions for online publication in conformity with national regulations. It may also be adapted to anonymize any large-scale corpus containing personal information (for example, in healthcare-related applications).

Skill cartridge
Legal
Language(s): FR
Compatibility: Luxid® 6.0
Posting date: January 2013
Version: 2.2
Business model: Project