01_corpus:02_preprocessing:03_emojis

Differences

This shows you the differences between two versions of the page.

01_corpus:02_preprocessing:03_emojis [2020/04/14 17:48] – ↷ Page moved from 01_corpus:03_preprocessing:03_emojis to 01_corpus:02_preprocessing:03_emojis simone
01_corpus:02_preprocessing:03_emojis [2022/06/27 09:21] (current) – external edit 127.0.0.1
Line 1:
-====== Emojis ======
+====== 1.2.3. Emojis ======
-Emojis are characters in Unicode. The application WhatsApp uses special fonts such as to have the same appearance of emojis on all operation systems. In our corpus browsers, emojis can be displayed, but they are represented in the font that is used by the user, thus, it cannot be guarantied that an emoji in the original text looked as it does on your screen.
+Emojis are characters in Unicode. The application WhatsApp uses special fonts such as to have the same appearance of emojis on all operation systems. In our corpus browsers, emojis can be displayed, but they are represented in the font that is used by the user, thus, it cannot be guaranteed that an emoji in the original text looked as it does on your screen.
+
+Querying emojis is not an easy task. We decided to encode them in the messages, e.g. as ''emojiQsmilingCatFaceWithOpenMouth''. This encoding system allows for easily finding individual or groups of emojis using [[02_browsing:04_queries:03_regex|Regular Expressions]], e.g.:
+  * ''emojiQ.*'' finds all emojis
+  * ''emojiQcat.*'' finds all cats
+  * ''emojiQ.*[Ff]ace.*'' finds all faces, both human and cats (and maybe others).
  
-Querying emojis might not always be easy. We therefor decided to encode them in texts. This emoji 😺 would e.g. become //emojiQsmilingCatFaceWithOpenMouth//. This encoding system allows for easily finding individual or groups of emojis using Regular Expressions, e.g.: 
-  * //emojiQ.*//: finds all emojis 
-  * //emojiQcat.*//: finds all cats 
-  * //emojiQ.*[Ff]ace.*//: finds all faces, both human and cats (and maybe others). 
  
  
-You can thus query for individual emojis or for their encodings. 
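
The encoding and the three example queries in this revision can be tried out with a short Python sketch. It is only an illustration under stated assumptions: the page gives a single encoded example (''emojiQsmilingCatFaceWithOpenMouth''), so deriving the encoding from the camel-cased Unicode character name is a guess that happens to reproduce that example, not the corpus's documented procedure. The names `encode_emoji`, `query` and `TOKENS` are hypothetical, and the sketch assumes that a query must match a whole token and is case-sensitive, which may differ from how the corpus browsers behave.

```python
import re
import unicodedata

PREFIX = "emojiQ"  # prefix taken from the example on the page


def encode_emoji(char):
    """Encode one emoji as PREFIX + camel-cased Unicode name (assumed scheme)."""
    words = re.split(r"[ -]+", unicodedata.name(char).lower())
    return PREFIX + words[0] + "".join(w.capitalize() for w in words[1:])


# Hypothetical token list: one ordinary word plus four encoded emojis
# (U+1F63A, U+1F431, U+1F600, U+2764).
TOKENS = ["hello"] + [encode_emoji(c) for c in "\U0001F63A\U0001F431\U0001F600\u2764"]


def query(pattern, tokens=TOKENS):
    """Return the tokens matched in full by the pattern (assumed semantics)."""
    compiled = re.compile(pattern)
    return [t for t in tokens if compiled.fullmatch(t)]


all_emojis = query(r"emojiQ.*")           # every encoded emoji
cats = query(r"emojiQcat.*")              # names that start with "cat"
faces = query(r"emojiQ.*[Ff]ace.*")       # names containing "face" or "Face"
any_cat = query(r"emojiQ.*[Cc]at.*")      # "cat" anywhere in the name
```

Under these assumptions, ''emojiQcat.*'' only matches names that begin with a lowercase "cat" directly after the prefix, so it would return ''emojiQcatFace'' but not ''emojiQsmilingCatFaceWithOpenMouth''. A pattern such as ''emojiQ.*[Cc]at.*'' catches both, at the cost of also matching any other name that happens to contain "cat".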
  
  
01_corpus/02_preprocessing/03_emojis.1586879307.txt.gz · Last modified: 2022/06/27 09:21 (external edit)
