Differences

This shows you the differences between two versions of the page.

--- 02_browsing:02_layers [2019/11/06 12:12] – [Labels] simone
+++ 02_browsing:02_layers [2020/01/06 16:59] – simone
@@ Line 2: / Line 2: @@
 WhatsApp messages are organized in a hierarchy: a chat contains messages, which contain tokens, which contain characters. A corpus of WhatsApp chats should allow all of these layers to be queried. Additionally, there is meta data about the chats (e.g. the number of messages), the messages (e.g. the timestamp at which each was written), the informants (e.g. their age), and the tokens (e.g. part of speech). This makes our corpus a rather challenging and complex endeavor.
-These layers can nicely seen when browsing results from a query:
+These layers can nicely be seen when browsing results from a query:
 {{ :02_browsing:layers.png?direct&600 |}}
@@ Line 21: / Line 21: @@
 The individual tokens are annotated in green in the above example, and they are aligned to the message to which they belong.
-Tokens, too, (can) have meta data that is assigned to them. In the example shown above, you have the following meta data that was created by our team or by our computational linguists:
+Tokens, too, (can) have annotations that are assigned to them. In the example shown above, you have the following meta data that was created by our team or by our computational linguists:
   * Gloss: a normalization, i.e. a "translation" into standard spelling. A good example here is //xo//, which was normalized as <però>.
   * tt_pos: a part-of-speech annotation generated with the tagger [[https://www.cis.uni-muenchen.de/~schmid/tools/TreeTagger/|TreeTagger]].
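
The layer hierarchy described on this page (chat, message, token, character, each with its own meta data or annotations) can be pictured as nested records. The following is only a minimal sketch of that idea, not the corpus's actual storage format or query interface: the class names, the ''find_tokens'' helper, and all example values other than //xo// and its gloss are assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Token:
    """A token with its annotation layers (e.g. Gloss, tt_pos)."""
    text: str
    annotations: dict[str, str] = field(default_factory=dict)

    @property
    def characters(self) -> list[str]:
        # The lowest layer: the characters that make up the token.
        return list(self.text)


@dataclass
class Message:
    """A message with its tokens and message-level meta data."""
    tokens: list[Token]
    meta: dict[str, str] = field(default_factory=dict)


@dataclass
class Chat:
    """A chat with its messages and chat-level meta data."""
    messages: list[Message]
    meta: dict[str, str] = field(default_factory=dict)


def find_tokens(chat: Chat, layer: str, value: str) -> list[tuple[int, Token]]:
    """Return (message index, token) pairs whose annotation `layer` equals `value`."""
    return [
        (index, token)
        for index, message in enumerate(chat.messages)
        for token in message.tokens
        if token.annotations.get(layer) == value
    ]


# Hypothetical example data; only "xo" and its gloss come from the page itself.
chat = Chat(
    messages=[
        Message(
            tokens=[Token("xo", {"Gloss": "però"}), Token("ok")],
            meta={"timestamp": "hypothetical-timestamp"},
        )
    ],
    meta={"number_of_messages": "1"},
)

hits = find_tokens(chat, "Gloss", "però")
```

Here ''hits'' holds every token glossed as //però// together with the index of the message it belongs to, which mirrors how tokens are aligned to their message in the browsing view.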