Layers of information
WhatsApp chats are built up as a hierarchy: a chat contains messages, which contain tokens, which in turn contain characters. A corpus of WhatsApp chats should allow all of these layers to be queried. In addition, there is metadata about the chats (e.g. the number of messages), the messages (e.g. the timestamp when a message was written), the informants (e.g. their age) and the tokens (e.g. the part of speech). This makes our corpus a rather challenging and complex endeavor.
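The hierarchy described above can be pictured as nested records. The following is only a minimal sketch in Python and does not show how the corpus is actually stored: all class names, field names and sample values (Informant, Token, Message, Chat, tokens_with_pos, the example chat) are hypothetical and chosen for illustration.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List, Optional


@dataclass
class Informant:
    """Metadata about the person who wrote a message (hypothetical fields)."""
    informant_id: str
    age: Optional[int] = None


@dataclass
class Token:
    """A token with token-level metadata such as the part of speech."""
    text: str
    pos: Optional[str] = None

    @property
    def characters(self) -> List[str]:
        # The lowest layer: the characters that make up the token.
        return list(self.text)


@dataclass
class Message:
    """A message with message-level metadata and its tokens."""
    timestamp: datetime
    informant: Informant
    tokens: List[Token] = field(default_factory=list)


@dataclass
class Chat:
    """The top layer: a chat that contains messages."""
    chat_id: str
    messages: List[Message] = field(default_factory=list)

    @property
    def message_count(self) -> int:
        # Chat-level metadata derived from the contained messages.
        return len(self.messages)

    def tokens_with_pos(self, pos: str) -> List[Token]:
        # Query the token layer across all messages of the chat.
        return [t for m in self.messages for t in m.tokens if t.pos == pos]


def build_example_chat() -> Chat:
    """Build a tiny invented chat to show how the layers nest."""
    informant = Informant(informant_id="inf1", age=27)
    first = Message(
        timestamp=datetime(2019, 11, 6, 10, 15),
        informant=informant,
        tokens=[Token("Hello", "INTJ"), Token("world", "NOUN")],
    )
    second = Message(
        timestamp=datetime(2019, 11, 6, 10, 16),
        informant=informant,
        tokens=[Token("See", "VERB"), Token("you", "PRON")],
    )
    return Chat(chat_id="chat1", messages=[first, second])
```

With such a structure, a query can address any layer: build_example_chat().tokens_with_pos("NOUN") works on the token layer, while message_count and informant.age are metadata at the chat and informant level.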
These layers can be seen nicely when browsing the results of a query:
02_browsing/02_layers.1573035569.txt.gz · Last modified: 2022/06/27 09:21 (external edit)