01_corpus:01_subcorpora
Differences
This shows you the differences between two versions of the page.
| 01_corpus:01_subcorpora [2020/04/15 06:37] – ↷ Page name changed from 01_corpus:subcorpora to 01_corpus:01_subcorpora simone | 01_corpus:01_subcorpora [2022/06/27 07:21] (current) – external edit 127.0.0.1 | ||
|---|---|---|---|
| Line 6: | Line 6: | ||
| ===== Definitions for sub-corpora ===== | ===== Definitions for sub-corpora ===== | ||
| - | * Each chat was to be assigned to only one language-sub-corpus. As mentioned in the section [[01_corpus: | + | * Each chat was to be assigned to only one language-sub-corpus. |
| * Additionally, | * Additionally, | ||
| - | * Where additional tasks were performed on individual chats (e.g. normalization or part-of-speech tagging), we created additional sub-corpora | + | * Where additional tasks were performed on individual chats (e.g. normalization or part-of-speech tagging), we created additional sub-corpora per language. |
| Line 25: | Line 25: | ||
| * WUS_ROH_DEMOG: | * WUS_ROH_DEMOG: | ||
| + | In addition to these corpora, you will also see corpora with lowercase names in the browser (e.g. deu-rftagged, | ||
| ===== Smaller corpora ===== | ===== Smaller corpora ===== | ||
| Line 32: | Line 32: | ||
| * WUS_SMALL_DEMOG: | * WUS_SMALL_DEMOG: | ||
| * WUSdemographics: | * WUSdemographics: | ||
| - | * WUS_ARGDROP and WUS_ARGDROP_language: | + | * WUS_ARGDROP and WUS_ARGDROP_language: |
| + | ===== Other corpora in the browsing tool ===== | ||
| + | In addition to these corpora, you will also see corpora with lowercase names in the browser (e.g. deu-rftagged, | ||
| ===== More information about the subcorpora ===== | ===== More information about the subcorpora ===== | ||
01_corpus/01_subcorpora.1586932624.txt.gz · Last modified: (external edit)
