01_corpus:02_preprocessing:01_anonymization

Differences

This shows you the differences between two versions of the page.

01_corpus:02_preprocessing:01_anonymization [2020/04/14 17:44] simone
01_corpus:02_preprocessing:01_anonymization [2022/06/27 09:21] (current) – external edit 127.0.0.1
Line 17:
  
 ===== Last names =====
-Only very few last names can in fact be found in the data. Because of this limitation, the same procedure as with first names could not be applied, because additionally some of the last names used are very rare if not unique. It was therefore decided to replace all last names with [LastName] instead. In a combined effort of manual analysis and computational-linguistic methods, more than 95% of all last names were removed.
+Only very few last names can in fact be found in the data. It was decided to replace all last names with [LastName].
  
 ===== Numbers =====
-In an effort to remove information about phone numbers, bank accounts etc., all numbers with three or more digits were removed and each digit was replaced with one N. The phone number 079 987 65 43 would thus become NNN NNN 65 43, while 0799876543 would be NNNNNNNNNN. Reliability here lies at 100%.
+In order to remove information about phone numbers, bank accounts etc., all numbers with three or more digits were removed and each digit was replaced with one N. Reliability here lies at 100%.
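
The masking rule for numbers can be illustrated with a short Python sketch. It is not the script used for the corpus; the function name `mask_numbers` and the regular expression are assumptions chosen to reproduce the examples from the earlier revision (079 987 65 43 and 0799876543).

```python
import re

# Assumption: a "number" is any unbroken run of three or more digits.
DIGIT_RUN_RE = re.compile(r"\d{3,}")


def mask_numbers(text: str) -> str:
    """Replace every digit in runs of three or more digits with one N."""
    return DIGIT_RUN_RE.sub(lambda m: "N" * len(m.group()), text)


print(mask_numbers("079 987 65 43"))
print(mask_numbers("0799876543"))
```

Under this reading, groups of one or two digits separated by spaces stay untouched, which is why the last two groups of a spaced phone number remain visible.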
  
 ===== E-Mail addresses =====
-All email adresses were removed and replaced with xxx@yyy.ch, while keeping the number of characters. info@uzh.ch would therefore become xxxx@yyy.ch, while admin@google.com would become xxxxx@yyyyyy.com.
+All email addresses were removed and replaced with xxx@yyy.ch, while keeping the number of characters. info@uzh.ch would therefore become xxxx@yyy.ch, while admin@google.com would become xxxxx@yyyyyy.com.
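
A length-preserving replacement of this kind might look as follows in Python. This is only a sketch under stated assumptions: the pattern `EMAIL_RE` is a simplified matcher rather than a full address grammar, the function names are hypothetical, and keeping the top-level domain and any dots inside the domain is inferred from the two examples, not confirmed by the page.

```python
import re

# Assumption: simplified email pattern (local part, domain, top-level domain).
EMAIL_RE = re.compile(r"([A-Za-z0-9._%+-]+)@([A-Za-z0-9.-]+)\.([A-Za-z]{2,})")


def _mask_email(match: re.Match) -> str:
    local, domain, tld = match.groups()
    # One x per character of the local part.
    masked_local = "x" * len(local)
    # One y per character of the domain; dots are kept (assumption).
    masked_domain = re.sub(r"[^.]", "y", domain)
    return f"{masked_local}@{masked_domain}.{tld}"


def anonymize_emails(text: str) -> str:
    """Mask every email address in text while keeping its length."""
    return EMAIL_RE.sub(_mask_email, text)


print(anonymize_emails("info@uzh.ch"))
print(anonymize_emails("admin@google.com"))
```

Punctuation that directly follows an address, such as a sentence-final period, is left in place because the pattern requires at least two letters after the last dot.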
  
 ===== Street addresses =====
01_corpus/02_preprocessing/01_anonymization.txt · Last modified: 2022/06/27 09:21 by 127.0.0.1