01_corpus:start
Differences
This shows you the differences between two versions of the page.
01_corpus:start [2019/10/30 13:37] – ↷ Page moved from corpus:start to 01_corpus:start simone | 01_corpus:start [2022/06/27 09:21] (current) – external edit 127.0.0.1 | ||
---|---|---|---|
Line 1: | Line 1: | ||
- | ====== | + | ====== |
+ | The corpus consists of 617 chats that were sent in by the Swiss population in 2014 through a fixed procedure that was communicated in the press in order to get people interested. The individual chats were checked for their [[01_corpus: | ||
- | The corpus consists of 617 chats that were sent in by the Swiss population in 2014 through | + | Next processing steps comprised [[01_corpus: |
- | In a first step the most basic processing of the data took place such as to allow the project members to work with the data. This included | + | Our authentic WhatsApp chats were gathered in summer 2014. Not all made it into the corpus |
- | In a later step, more [[corpus:04_annotations|annotations]] were applied | + | * Number of chats: 617 |
+ | * Number of messages (with permission | ||
+ | * Number | ||
+ | * Number of tokens: 5' | ||
+ | * Number of emojis: 382' | ||
+ | |||
+ | The corpus consists of chats in all four national languages of Switzerland, | ||
+ | |||
+ | Available languages: | ||
+ | * fra: French | ||
+ | * ita: Italian | ||
+ | * roh: Any variety of Romansh | ||
+ | * gsw: dialectal German as used in Switzerland | ||
+ | * deu: non-dialectal German | ||
+ | * eng: English | ||
+ | * spa: Spanish | ||
+ | * sla: Any Slavic language | ||
+ | |||
+ | Romansh varieties: | ||
+ | |||
+ | * roh-ja: Jauer Romansh | ||
+ | * roh-sr: romontsch sursilvan | ||
+ | * roh-st: rumàntsch sutsilvan | ||
+ | * roh-sm: rumantsch surmiran | ||
+ | * roh-pt: rumauntsch puter | ||
+ | * roh-vl: rumantsch vallader | ||
+ | * roh-gr: rumantsch grischun | ||
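The language and variety codes listed above can be kept in a small lookup table. The following is a minimal sketch in Python, assuming only the codes and labels given in the two lists. The names `LANGUAGES`, `ROMANSH_VARIETIES`, `describe_language` and `base_language` are hypothetical and not part of the corpus tooling, and how the codes are actually stored in the corpus annotations is not stated here.

```python
# Language codes and labels exactly as listed on this page.
LANGUAGES = {
    "fra": "French",
    "ita": "Italian",
    "roh": "Any variety of Romansh",
    "gsw": "dialectal German as used in Switzerland",
    "deu": "non-dialectal German",
    "eng": "English",
    "spa": "Spanish",
    "sla": "Any Slavic language",
}

# Romansh variety codes and labels exactly as listed on this page.
ROMANSH_VARIETIES = {
    "roh-ja": "Jauer Romansh",
    "roh-sr": "romontsch sursilvan",
    "roh-st": "rumàntsch sutsilvan",
    "roh-sm": "rumantsch surmiran",
    "roh-pt": "rumauntsch puter",
    "roh-vl": "rumantsch vallader",
    "roh-gr": "rumantsch grischun",
}


def describe_language(code: str) -> str:
    """Return the label for a language or Romansh variety code.

    Raises KeyError for a code that appears in neither list.
    """
    if code in ROMANSH_VARIETIES:
        return ROMANSH_VARIETIES[code]
    return LANGUAGES[code]


def base_language(code: str) -> str:
    """Return the part before the first hyphen, e.g. 'roh' for 'roh-sr'.

    Assumption: a variety code is the base code plus a hyphenated suffix,
    as in the Romansh list above.
    """
    return code.split("-", 1)[0]
```

With these tables, `describe_language("roh-sr")` returns the variety label and `base_language("roh-sr")` gives `roh`, so a variety code can be grouped under the general Romansh code.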
+ | |||
+ | The tool used to browse is [[https:// | ||
+ | |||
+ | Krause, Thomas & Zeldes, Amir (2016): ANNIS3: A new architecture for generic corpus query and visualization. In: Digital Scholarship in the Humanities 2016 (31). [[http:// | ||
01_corpus/start.1572439058.txt.gz · Last modified: 2022/06/27 09:21 (external edit)