"What's up, Switzerland?"
The data underlying the corpus were collected in 2014 to form the database of the research project "What's up, Switzerland?", led by Prof. Elisabeth Stark (University of Zurich). The project was funded by the Swiss National Science Foundation (Sinergia: CRSII1_160714) with CHF 1'832'647 and ran from 2016 to 2020. More about the project ...
Using the corpus
This corpus is freely available for academic, non-commercial research. When using the corpus, please make sure to cite it correctly.
The authentic WhatsApp chats were gathered in summer 2014. Not all of them made it into the corpus (duplicates and chats or messages without permission, for example, were excluded). In its present form, the corpus comprises:
- Number of chats: 617
- Number of messages (with permission to be used): 763'644
- Number of informants (who gave their permission): 944
- Number of tokens: 5'155'476 (without redactedQ.* (cf. Messages without permission))
- Number of emojis: 382'116
The corpus consists of chats in all four national languages of Switzerland: German (both Swiss German dialect and non-dialectal German), French, Italian and Romansh (in several varieties). In more detail, the following languages and varieties can be found in the corpus:
- fra: French
- ita: Italian
- roh: any variety of Romansh
- gsw: dialectal German as used in Switzerland
- deu: non-dialectal German
- eng: English
- spa: Spanish
- sla: any Slavic language
- roh-ja: Jauer Romansh
- roh-sr: Romontsch Sursilvan
- roh-st: Rumàntsch Sutsilvan
- roh-sm: Rumantsch Surmiran
- roh-pt: Rumauntsch Puter
- roh-vl: Rumantsch Vallader
- roh-gr: Rumantsch Grischun
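For anyone filtering chats or messages by language, the tags listed above can be held in a simple lookup table. The following is only a minimal sketch in Python: the tag-to-name mapping is copied from the list, while the names LANGUAGE_TAGS, base_language and describe are hypothetical and not part of the corpus tooling. It also assumes that a variety tag such as roh-sr can be reduced to its general tag (roh) by dropping the suffix after the hyphen.

```python
# Language tags exactly as listed above. The helper names below are
# hypothetical and only illustrate one way to work with the tags.
LANGUAGE_TAGS = {
    "fra": "French",
    "ita": "Italian",
    "roh": "any variety of Romansh",
    "gsw": "dialectal German as used in Switzerland",
    "deu": "non-dialectal German",
    "eng": "English",
    "spa": "Spanish",
    "sla": "any Slavic language",
    "roh-ja": "Jauer Romansh",
    "roh-sr": "Romontsch Sursilvan",
    "roh-st": "Rumàntsch Sutsilvan",
    "roh-sm": "Rumantsch Surmiran",
    "roh-pt": "Rumauntsch Puter",
    "roh-vl": "Rumantsch Vallader",
    "roh-gr": "Rumantsch Grischun",
}


def base_language(tag: str) -> str:
    """Return the general tag for a known tag, e.g. 'roh-sr' -> 'roh'.

    Assumption: the part before the hyphen names the general language.
    """
    if tag not in LANGUAGE_TAGS:
        raise KeyError(f"unknown language tag: {tag}")
    return tag.split("-", 1)[0]


def describe(tag: str) -> str:
    """Return the description of a known tag as given in the list above."""
    return LANGUAGE_TAGS[tag]
```

With this table, describe("roh-vl") yields the name of the variety, and base_language("roh-vl") groups it with the other Romansh tags under roh.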
More information about the corpus can be found in the section corpus and in the following publication:
Ueberwasser, Simone/Stark, Elisabeth (2017). "What’s up, Switzerland? A corpus-based research project in a multilingual country". Linguistik online 84/5, 105-126. DOI: https://doi.org/10.13092/lo.84.3849
When using the corpus, please cite it as follows:
Stark, Elisabeth; Ueberwasser, Simone; Göhring, Anne (2014-2020). Corpus "What’s up, Switzerland?". University of Zurich. www.whatsup-switzerland.ch.
Stark, Elisabeth; Ueberwasser, Simone (2020): The corpus "What's up, Switzerland?". Documentation, facts and figures. www.whatsup-switzerland.ch.
Creation of the corpus
Ueberwasser, Simone; Stark, Elisabeth (2017): "What’s up, Switzerland? A corpus-based research project in a multilingual country". In: Linguistik online, 84/5, 105-126. https://bop.unibe.ch/linguistik-online/article/view/3849/5834
Stark, Elisabeth (2016-2020). SNSF project "What’s up, Switzerland?" (Sinergia: CRSII1_160714). University of Zurich. www.whatsup-switzerland.ch.
If you want to use our raw data for computational linguistic projects, please contact Prof. Elisabeth Stark to see whether your project complies with our requirements. If we make the data available, it is provided under a CC BY-NC-ND license.