1.2 Pre-processing
After collecting the data, we had around 650 chats in different languages, but no record of which chat was in which language. We had also promised to anonymize the data, and we had no tool for browsing it in its available format. The data therefore had to be pre-processed before it could be made available to the research team.
01_corpus/02_preprocessing.1572449093.txt.gz · Last modified: 2022/06/27 09:21 (external edit)