
The corpus

The corpus consists of 617 chats submitted by members of the Swiss public in 2014. Submission followed a fixed procedure that was publicized in the press to attract contributors. Each chat was checked to confirm that permission to use it had been given, and chats that had to be removed were excluded. Where contributors provided demographic data, these data were linked to their chats.

In a first step, the data underwent basic processing so that project members could work with them. This included anonymization and the annotation of a main language for each chat, which in turn defined the subcorpora.
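
The way a per-chat main-language annotation yields subcorpora can be illustrated with a minimal Python sketch. It is not the project's actual tooling: the record layout, the field names `chat_id` and `main_language`, the function `build_subcorpora`, and the placeholder language labels are all assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical records: IDs, field names, and language labels are
# placeholders, not values taken from the actual corpus.
chats = [
    {"chat_id": "chat_001", "main_language": "lang_a"},
    {"chat_id": "chat_002", "main_language": "lang_b"},
    {"chat_id": "chat_003", "main_language": "lang_a"},
]


def build_subcorpora(chats):
    """Group chat IDs by their annotated main language."""
    subcorpora = defaultdict(list)
    for chat in chats:
        subcorpora[chat["main_language"]].append(chat["chat_id"])
    return dict(subcorpora)


subcorpora = build_subcorpora(chats)
```

Under these assumptions, each chat belongs to exactly one subcorpus, because it carries exactly one main-language label.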

corpus/00_corpus.1569141279.txt.gz · Last modified: 2022/06/27 09:21 (external edit)
