01_corpus:02_preprocessing:07_normalization

01_corpus:04_annotations:03_normalization [2019/10/30 13:40] – ↷ Page moved from corpus:04_annotations:03_normalization to 01_corpus:04_annotations:03_normalization simone
01_corpus:02_preprocessing:07_normalization [2022/06/27 09:21] (current) – external edit 127.0.0.1
====== 1.2.7 Normalization ======

Normalization is the task of "translating" non-standard language data into standard language. It can be performed manually or automatically with computational linguistics tools.

For our corpus, we manually normalized some of the Swiss German dialect data, resulting in the corpus WUS_DIALOG_GSW (5 chats, 34,683 tokens).
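
To illustrate what an automatic approach might look like, here is a minimal sketch of a lookup-based normalizer in Python. It is not the procedure used for WUS_DIALOG_GSW, which was normalized manually. The lexicon `NORMALIZATION_LEXICON`, the function `normalize_tokens`, and the example word pairs are assumptions made for illustration and are not taken from the corpus.

```python
# Hypothetical lookup table mapping dialect forms to standard German.
# The entries are illustrative only; they are not drawn from WUS_DIALOG_GSW.
NORMALIZATION_LEXICON = {
    "isch": "ist",
    "nöd": "nicht",
    "hüt": "heute",
    "guet": "gut",
}


def normalize_tokens(tokens, lexicon=NORMALIZATION_LEXICON):
    """Return (original, normalized) pairs for a list of tokens.

    Lookup is case-insensitive; tokens missing from the lexicon
    are kept unchanged as their own normalization.
    """
    return [(tok, lexicon.get(tok.lower(), tok)) for tok in tokens]


# Each original token stays aligned with its normalized form.
pairs = normalize_tokens(["Es", "isch", "hüt", "nöd", "guet"])
for original, normalized in pairs:
    print(f"{original}\t{normalized}")
```

Keeping the original token next to its normalized form preserves the dialect spelling for later analysis. A plain lookup like this cannot resolve forms whose normalization depends on context, which is one reason manual normalization may be preferred.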
  
01_corpus/02_preprocessing/07_normalization.1572439205.txt.gz · Last modified: 2022/06/27 09:21 (external edit)
