02_browsing:04_queries:03_regex
Differences
This shows you the differences between two versions of the page.
02_browsing:04_queries:02_regex [2020/04/21 11:09] – [Separators] simone | 02_browsing:04_queries:03_regex [2020/05/04 14:07] – simone | ||
---|---|---|---|
Line 1: | Line 1: | ||
- | ====== 2.4.2 Regular Expressions ====== | + | ====== 2.4.3 Regular Expressions ====== |
In order to search for spelling variants, different forms of a lemma, and the like, you need to formulate regular expressions (RegEx) in ANNIS. To do so, put your query between slashes. | In order to search for spelling variants, different forms of a lemma, and the like, you need to formulate regular expressions (RegEx) in ANNIS. To do so, put your query between slashes. | ||
Line 10: | Line 10: | ||
===== A (very) short introduction to RegEx ===== | ===== A (very) short introduction to RegEx ===== | ||
- | "In computing, regular expressions, | + | RegEx takes a pattern of characters you enter into the search field and looks for matches of these characters in the database. Let us assume that the database to be queried is a string of characters like "the man manually attached the tube in Manchester" |
- | + | ||
- | As Wikipedia tells us, RegEx takes a pattern of characters you enter into the search field and looks for matches of these characters in the database. Let us assume that the database to be queried is a string of characters like "the man manually attached the tube in Manchester" | + | |
However, RegEx also allows you to search for such things as alternatives (//man// or //men//), word boundaries, etc. RegEx is a syntax widely used in programming languages. In what follows, we try to offer an easy overview of the functions you might use most often in this corpus. | However, RegEx also allows you to search for such things as alternatives (//man// or //men//), word boundaries, etc. RegEx is a syntax widely used in programming languages. In what follows, we try to offer an easy overview of the functions you might use most often in this corpus. | ||
Line 43: | Line 41: | ||
==Variable letters== | ==Variable letters== | ||
- | If you are looking for any letter, you can use '' | + | If you are looking for any letter, you can use '' |
Line 85: | Line 83: | ||
== Diacritica== | == Diacritica== | ||
- | This corpus is set up so as to recognize umlauts and letters with accents as individuals (Keep in mind that this is not the case in many other uses of RegEx. Especially in programs that were developed in the US, a <ü> is not considered a letter but rather a boundary). In our corpus, searching for ''/ | + | This corpus is set up so as to recognize umlauts and letters with accents as individuals (keep in mind that this is not the case in many other uses of RegEx. Especially in programs that were developed in the US, a <ü> is not considered a letter but rather a boundary). In our corpus, searching for ''/ |
=== Digits=== | === Digits=== |
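The ideas from the short introduction and the note on diacritics can be tried out outside ANNIS. The following is a minimal sketch that uses Python's ''re'' module as a stand-in; it is an assumption that this illustrates the general RegEx behaviour described on the page, and it does not show how ANNIS itself evaluates a query between slashes. The sample sentence is the one used in the introduction; the variable names and the second sample string are only illustrative.

```python
import re

# Sample "database" taken from the introduction on this page.
text = "the man manually attached the tube in Manchester"

# A plain character sequence is found wherever it occurs,
# including inside longer words such as "manually".
plain = re.findall(r"man", text)

# Alternatives: "man" or "men" (illustrative sample string).
alternatives = re.findall(r"m(?:a|e)n", "one man, two men")

# Word boundaries: only the free-standing word "man".
whole_word = re.findall(r"\bman\b", text)

# Diacritics: in ASCII-only mode, "ü" is not a word character,
# so it is skipped and acts like a boundary.
ascii_words = re.findall(r"\w+", "über", flags=re.ASCII)

# In Python's default Unicode mode, "ü" counts as a letter.
unicode_words = re.findall(r"\w+", "über")

print(plain)          # ['man', 'man']
print(alternatives)   # ['man', 'men']
print(whole_word)     # ['man']
print(ascii_words)    # ['ber']
print(unicode_words)  # ['über']
```

Note that the plain search is case-sensitive here, so the capitalised "Man" in "Manchester" is not counted. Whether a given tool treats <ü> as a letter depends on its settings, which is why the result of the last two searches differs.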
02_browsing/04_queries/03_regex.txt · Last modified: 2022/06/27 09:21 (external edit)