01_corpus:start
Differences
This shows you the differences between two versions of the page.
01_corpus:start [2020/04/14 15:22] – ↷ Links adapted because of a move operation simone | 01_corpus:start [2022/06/27 09:21] (current) – external edit 127.0.0.1 | ||
---|---|---|---|
Line 1: | Line 1: | ||
====== 1. THE CORPUS ====== | ====== 1. THE CORPUS ====== | ||
- | The corpus consists of 617 chats that were sent in by the Swiss population in 2014 through a fixed procedure that was communicated in the press in order to get people interested. The individual chats were checked for their [[01_corpus: | + | The corpus consists of 617 chats that were sent in by the Swiss population in 2014 through a fixed procedure that was communicated in the press in order to get people interested. The individual chats were checked for their [[01_corpus: |
- | In a first step the most basic processing | + | Next processing |
- | In a later step, more [[01_corpus: | + | Our authentic WhatsApp chats were gathered in summer 2014. Not all made it into the corpus |
+ | |||
+ | * Number | ||
+ | * Number of messages (with permission to be used): 763'644 | ||
+ | * Number of informants (who gave their permission): | ||
+ | * Number of tokens: 5' | ||
+ | * Number of emojis: 382' | ||
+ | |||
+ | The corpus is made up of chats in all four national languages of Switzerland, | ||
+ | |||
+ | Available languages: | ||
+ | * fra: French | ||
+ | * ita: Italian | ||
+ | * roh: Any variety of Romansh | ||
+ | * gsw: dialectal German as used in Switzerland | ||
+ | * deu: non-dialectal German | ||
+ | * eng: English | ||
+ | * spa: Spanish | ||
+ | * sla: Any Slavic language | ||
+ | |||
+ | Romansh varieties: | ||
+ | |||
+ | * roh-ja: Jauer Romansh | ||
+ | * roh-sr: romontsch sursilvan | ||
+ | * roh-st: rumàntsch sutsilvan | ||
+ | * roh-sm: rumantsch surmiran | ||
+ | * roh-pt: rumauntsch puter | ||
+ | * roh-vl: rumantsch vallader | ||
+ | * roh-gr: rumantsch grischun | ||
+ | |||
+ | The tool used to browse is [[https:// | ||
+ | |||
+ | Krause, Thomas & Zeldes, Amir (2016): ANNIS3: A new architecture for generic corpus query and visualization. In: Digital Scholarship in the Humanities 2016 (31). [[http:// | ||
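
The language tags listed above can be read as a small two-level scheme: a three-letter code for the language and, for Romansh only, a two-letter variety suffix. The following is a minimal sketch of how such tags might be looked up in a script. The dictionaries copy the tags and labels from the lists above; the function names `describe_tag` and `base_language` are hypothetical, and how the tags are actually stored in the corpus annotations is not stated on this page.

```python
# Language tags and labels copied from the lists above.
LANGUAGES = {
    "fra": "French",
    "ita": "Italian",
    "roh": "Any variety of Romansh",
    "gsw": "dialectal German as used in Switzerland",
    "deu": "non-dialectal German",
    "eng": "English",
    "spa": "Spanish",
    "sla": "Any Slavic language",
}

# Romansh variety tags: the base code "roh" plus a two-letter suffix.
ROMANSH_VARIETIES = {
    "roh-ja": "Jauer Romansh",
    "roh-sr": "romontsch sursilvan",
    "roh-st": "rumàntsch sutsilvan",
    "roh-sm": "rumantsch surmiran",
    "roh-pt": "rumauntsch puter",
    "roh-vl": "rumantsch vallader",
    "roh-gr": "rumantsch grischun",
}


def base_language(tag: str) -> str:
    """Drop a variety suffix if present, e.g. 'roh-sr' becomes 'roh'."""
    return tag.split("-", 1)[0]


def describe_tag(tag: str) -> str:
    """Return the label for a tag; raises KeyError for unknown tags."""
    if tag in ROMANSH_VARIETIES:
        return ROMANSH_VARIETIES[tag]
    return LANGUAGES[tag]
```

With this layout, a script that only distinguishes languages can call `base_language` first and treat all Romansh varieties as `roh`, while a script that needs the variety can pass the full tag to `describe_tag`.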
01_corpus/start.1586870548.txt.gz · Last modified: 2022/06/27 09:21 (external edit)