01_corpus:02_preprocessing:01_anonymization
Differences
01_corpus:02_preprocessing:01_anonymization [2020/04/14 17:44] – simone | 01_corpus:02_preprocessing:01_anonymization [2020/04/16 13:38] – simone | ||
---|---|---|---|
Line 17: | Line 17: | ||
===== Last names ===== | ===== Last names ===== | ||
- | Only very few last names can in fact be found in the data. Because of this limitation, the same procedure as with first names could not be applied, because additionally some of the last names used are very rare if not unique. It was therefor | + | Only very few last names can in fact be found in the data. It was decided to replace all last names with [LastName]. |
===== Numbers ===== | ===== Numbers ===== | ||
- | In an effort | + | In order to remove information about phone numbers, bank accounts, etc., all numbers with three or more digits were masked, with each digit replaced by one N. Reliability here is 100%. |
===== E-Mail addresses ===== | ===== E-Mail addresses ===== | ||
- | All email adresses | + | All email addresses |
===== Street addresses ===== | ===== Street addresses ===== |
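
The last-name and number rules in the newer revision can be illustrated with a minimal Python sketch. It is not the project's actual tooling: the function names `mask_numbers` and `mask_last_names`, and the idea of passing in a list of known last names, are assumptions made for illustration. The sketch also assumes that "numbers" means unbroken runs of digits, so digit groups separated by spaces or hyphens are checked individually.

```python
import re
from typing import Iterable

# Unbroken runs of three or more digits; shorter numbers are left untouched.
# Assumption: separators (spaces, hyphens) split a number into separate runs.
LONG_NUMBER = re.compile(r"\d{3,}")


def mask_numbers(text: str) -> str:
    """Replace each digit of every 3+ digit run with one 'N'."""
    return LONG_NUMBER.sub(lambda m: "N" * len(m.group()), text)


def mask_last_names(text: str, last_names: Iterable[str]) -> str:
    """Replace every listed last name with the placeholder [LastName].

    The name list is hypothetical; the source does not say how last
    names were identified in the data.
    """
    # Longest names first so a longer name wins over a shorter prefix.
    names = sorted(set(last_names), key=len, reverse=True)
    if not names:
        return text
    pattern = re.compile(r"\b(?:" + "|".join(map(re.escape, names)) + r")\b")
    return pattern.sub("[LastName]", text)


if __name__ == "__main__":
    sample = "Frau Müller, Tel. 0171 2345678, Zimmer 12"
    print(mask_last_names(mask_numbers(sample), ["Müller"]))
```

Replacing each digit with its own N keeps the length of the original number visible, while two-digit numbers such as ages or room numbers pass through unchanged.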
01_corpus/02_preprocessing/01_anonymization.txt · Last modified: 2022/06/27 09:21 by 127.0.0.1