| type | tutorial |
|---|---|
| layout | tutorial |
| title | Searching with Regular Expressions |
| description | This tutorial demonstrates how to write patterns with regular expressions and use these patterns for searching in a text. |
| authors | Andrey Vityuk |
| date | 2021-02-03 |
| showAuthorInfo | false |
You will create search patterns with a regular expression and write search functions using these patterns.
This tutorial consists of two parts:
- In the first part, you will create a search pattern with a regular expression.
- In the second part, you will learn about five search functions:

| Function | Description |
|---|---|
| `containsMatchIn` | Finds out if the text contains any matches. Returns a Boolean value. |
| `find` | Finds the first match. Returns a `MatchResult` object, or `null` if there is no match. |
| `findAll` | Finds all matches. Returns a sequence with a `MatchResult` object for each match. |
| `matchEntire` | Finds out if the pattern matches the whole text. Returns a `MatchResult` object, or `null` if there is no match. |
| `matches` | Finds out if the pattern matches the whole text. Returns a Boolean value. |
To get started, create an application in IntelliJ IDEA or another IDE with Kotlin support.
There may be times when you need to use regular expressions.
A regular expression, or regex, is a sequence of characters that forms a pattern. You can use this pattern to search for specific data in a text.
Basic regular expressions consist of a single literal character or a sequence of literal characters.
- `a` matches the first occurrence of the `a` character in your text.
- `actor` matches the first occurrence of the `actor` character sequence in your text.
More complex regular expressions include special characters.
- `\s` matches the first occurrence of a whitespace character in your text.
- `\b[1-9][0-9]{2}\b` matches the first occurrence of a number between 100 and 999 in your text.
You can combine literal and special characters to create complex patterns. For complete information about the available syntax, refer to the Pattern syntax reference for JVM.
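The patterns listed above can be tried out with a short sketch. It borrows the `Regex` type, the `find` function, and the doubled backslashes that are explained later in this tutorial. The `firstMatch` helper and the `sample` text are hypothetical names used only for illustration.

```kotlin
// Hypothetical helper: returns the first substring of `text` matched by `pattern`,
// or null if the pattern has no match.
fun firstMatch(pattern: String, text: String): String? =
    Regex(pattern).find(text)?.value

fun main() {
    val sample = "An actor, aged 123"

    // Literal patterns
    println(firstMatch("a", sample))     // a
    println(firstMatch("actor", sample)) // actor

    // Patterns with special characters (backslashes are doubled inside the string literal)
    println(firstMatch("\\s", sample) == " ")          // true
    println(firstMatch("\\b[1-9][0-9]{2}\\b", sample)) // 123
}
```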
- Introduce a local variable `text` with the keyword `val`.
- Assign the following value to the `text` variable: Actomyosin, 99, actor, 123, actress, 808, actual, 5005, actually, actuary.

  ```kotlin
  val text = "Actomyosin, 99, actor, 123, actress, 808, actual, 5005, actually, actuary"
  ```
In this tutorial, we will use a list of words and numbers as mock data. You can also use a text of your choice. In that case, you will need to adapt the search pattern to your text, but the described functions still apply.
- Introduce a local variable `pattern` with the keyword `val`.
- Assign the `\b[1-9][0-9]{2}\b` value to this variable.

  ```kotlin
  val pattern = "\b[1-9][0-9]{2}\b"
  ```
With this regular expression, you can search for numbers between 100 and 999.
- Add an extra backslash before each `\b` part of the regular expression.

  `\b` is a metacharacter with a special meaning: it matches a word boundary, which lets you search for whole words or whole numbers. In a regular Kotlin string literal, however, `\b` is the escape sequence for the backspace character. To pass a literal backslash to the regex engine, escape it with another backslash (`\\b`). Otherwise, the pattern contains backspace characters instead of word boundaries and does not match the numbers.

  ```kotlin
  val pattern = "\\b[1-9][0-9]{2}\\b"
  ```
- For now, your pattern is just a text string. To turn it into a regular expression, pass the string to the `Regex` constructor.

  ```kotlin
  val pattern = Regex("\\b[1-9][0-9]{2}\\b")
  ```
In this example, we passed the string to the `Regex` constructor. Alternatively, you can use the `toRegex` function.

```kotlin
val pattern = "\\b[1-9][0-9]{2}\\b".toRegex()
```

In this tutorial, we will search for numbers between 100 and 999. You can also create a search pattern of your choice. In that case, you will need a text that is relevant to your pattern, but the described functions still apply.
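To see why the extra backslash matters, the following sketch compares three ways of writing the same pattern. The raw string form (triple quotes) is not used elsewhere in this tutorial; it is shown here as an alternative that avoids doubling backslashes. The variable names are illustrative only.

```kotlin
fun main() {
    val text = "Actomyosin, 99, actor, 123, actress, 808, actual, 5005, actually, actuary"

    // "\b" in a regular string literal is the backspace character,
    // so this pattern looks for backspaces around the digits.
    val unescaped = Regex("\b[1-9][0-9]{2}\b")

    // "\\b" puts a backslash followed by b into the string,
    // which the regex engine reads as a word boundary.
    val escaped = Regex("\\b[1-9][0-9]{2}\\b")

    // A raw string keeps backslashes as they are, so no doubling is needed.
    val raw = Regex("""\b[1-9][0-9]{2}\b""")

    println(unescaped.containsMatchIn(text)) // false
    println(escaped.containsMatchIn(text))   // true
    println(raw.pattern == escaped.pattern)  // true
}
```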
In this part, you will learn how to:
- Find out if the text contains any matches of your pattern (`containsMatchIn`)
- Find pattern matches in the text; depending on the applied function, you can get the first match parameter (`find`) or a list of parameters of all matches in the text (`findAll`)
- Find out if your pattern matches the whole text; depending on the applied function, you can get a match parameter (`matchEntire`) or a Boolean value (`matches`)
You can find the full list of functions provided by the Regex type in the reference.
The `containsMatchIn` function attempts to find a match of the pattern in your text. It returns a Boolean value.
Determine whether the pattern has a match in the text.
- Define your text and search pattern as local variables.

  ```kotlin
  val text = "Actomyosin, 99, actor, 123, actress, 808, actual, 5005, actually, actuary"
  val pattern = Regex("\\b[1-9][0-9]{2}\\b")
  ```
- Introduce a local variable `result1` with the keyword `val`: `val result1`
- Assign the following value to the `result1` variable: `pattern.containsMatchIn(text)`, where:
  - `pattern` is a variable for the search pattern
  - `containsMatchIn` is a search function
  - `text` is a variable for the text where we are searching for matches
- Use the `println` function to display the result of the `result1` variable.

  ```kotlin
  fun main() {
      val text = "Actomyosin, 99, actor, 123, actress, 808, actual, 5005, actually, actuary"
      val pattern = Regex("\\b[1-9][0-9]{2}\\b")
      val result1 = pattern.containsMatchIn(text)
      println("The containsMatchIn function returns $result1.") // true
  }
  ```
- To run your application, click the green Run icon in the gutter and select Run 'MainKt'.
The pattern has matches in the text, so the containsMatchIn function returns the true value.
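For contrast, here is a small sketch that also covers a text without matches. The `hasThreeDigitNumber` helper and the sample strings are hypothetical and only wrap the call shown above.

```kotlin
// Hypothetical helper: reports whether `text` contains a whole number between 100 and 999.
fun hasThreeDigitNumber(text: String): Boolean {
    val pattern = Regex("\\b[1-9][0-9]{2}\\b")
    return pattern.containsMatchIn(text)
}

fun main() {
    println(hasThreeDigitNumber("actor, 123, actress")) // true
    // 99 has only two digits and 5005 has four, so neither is a match.
    println(hasThreeDigitNumber("actor, 99, 5005"))     // false
}
```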
The `find` function returns the first match of the pattern in the text as a `MatchResult` object, or `null` if there is no match.
Find the first match of the pattern in the text.
- Define your text and search pattern as local variables.

  ```kotlin
  val text = "Actomyosin, 99, actor, 123, actress, 808, actual, 5005, actually, actuary"
  val pattern = Regex("\\b[1-9][0-9]{2}\\b")
  ```
- Introduce a local variable `result2` with the keyword `val`: `val result2`
- Assign the following value to the `result2` variable: `pattern.find(text)`, where:
  - `pattern` is a variable for the search pattern
  - `find` is a search function
  - `text` is a variable for the text where we are searching for matches
- Use the `println` function to display the result of the `result2` variable and the `value` property to get the first number that matches the regular expression.

  If the pattern does not have matches in the text, the `find` function returns `null`. To handle this case, add the safe call operator `?.` after the `result2` variable name, so that `result2?.value` evaluates to `null` when there is no match. For information about the danger of null references, refer to Null Safety.

  ```kotlin
  fun main() {
      val text = "Actomyosin, 99, actor, 123, actress, 808, actual, 5005, actually, actuary"
      val pattern = Regex("\\b[1-9][0-9]{2}\\b")
      val result2 = pattern.find(text)
      println("The find function returns ${result2?.value}.") // 123
  }
  ```
- To run your application, click the green Run icon in the gutter and select Run 'MainKt'.
The function returns the first match value: 123.
For information about function parameters and exceptions, refer to the find function reference.
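As a possible extension, the sketch below uses the optional `startIndex` parameter of `find` to continue the search after a previous match, and the Elvis operator `?:` to supply a fallback when nothing is found. The `nextMatchAfter` helper is a hypothetical name introduced for this example.

```kotlin
// Hypothetical helper: searches for the next match that starts after `previous` ends.
fun nextMatchAfter(pattern: Regex, text: String, previous: MatchResult): MatchResult? =
    pattern.find(text, previous.range.last + 1)

fun main() {
    val text = "Actomyosin, 99, actor, 123, actress, 808, actual, 5005, actually, actuary"
    val pattern = Regex("\\b[1-9][0-9]{2}\\b")

    val first = pattern.find(text)
    // `range` holds the indexes of the match in the text.
    println("${first?.value} at ${first?.range}") // 123 at 23..25

    // Continue the search right after the first match.
    val second = first?.let { nextMatchAfter(pattern, text, it) }
    println(second?.value) // 808

    // No match: fall back to a default value with the Elvis operator.
    println(pattern.find("actor, 99")?.value ?: "no match") // no match
}
```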
The `findAll` function returns all matches of the pattern in your text as a sequence of `MatchResult` objects.
Find all pattern matches in the text.
- Define your text and search pattern as local variables.

  ```kotlin
  val text = "Actomyosin, 99, actor, 123, actress, 808, actual, 5005, actually, actuary"
  val pattern = Regex("\\b[1-9][0-9]{2}\\b")
  ```
- Introduce a local variable `result3` with the keyword `val`: `val result3`
- Assign the following value to the `result3` variable: `pattern.findAll(text)`, where:
  - `pattern` is a variable for the search pattern
  - `findAll` is a search function
  - `text` is a variable for the text where we are searching for matches
- Use the `println` function to display the text that introduces the results of the `result3` variable: `println("The findAll function returns")`
- The result of the `findAll` function is a sequence of matches. Use the `forEach` function to go through the matches and the `value` property to get each number that matches the regular expression, so that the result is displayed as a column of values. For information about this function, refer to the `forEach` function reference.

  ```kotlin
  fun main() {
      val text = "Actomyosin, 99, actor, 123, actress, 808, actual, 5005, actually, actuary"
      val pattern = Regex("\\b[1-9][0-9]{2}\\b")
      val result3 = pattern.findAll(text)
      println("The findAll function returns")
      result3.forEach { match ->
          val value = match.value
          println(value) // 123, then 808
      }
  }
  ```
- To run your application, click the green Run icon in the gutter and select Run 'MainKt'.
The pattern has two matches in the text: 123 and 808, so the findAll function returns 2 values.
For information about function parameters and exceptions, refer to the findAll function reference.
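If you need the matches as data rather than printed lines, one option is to map the sequence to a list. The sketch below assumes you want the numbers as `Int` values; the `threeDigitNumbers` helper is a hypothetical name.

```kotlin
// Hypothetical helper: collects every whole number between 100 and 999 found in `text`.
fun threeDigitNumbers(text: String): List<Int> =
    Regex("\\b[1-9][0-9]{2}\\b")
        .findAll(text)            // lazy sequence of matches
        .map { it.value.toInt() } // matched text converted to a number
        .toList()

fun main() {
    val text = "Actomyosin, 99, actor, 123, actress, 808, actual, 5005, actually, actuary"
    val numbers = threeDigitNumbers(text)
    println(numbers)       // [123, 808]
    println(numbers.sum()) // 931
}
```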
The `matchEntire` function attempts to match the pattern against the whole text string. It returns a `MatchResult` object, or `null` if the pattern does not match the whole text.
Determine whether the pattern matches the entire text.
- Define your text and search pattern as local variables.

  ```kotlin
  val text = "Actomyosin, 99, actor, 123, actress, 808, actual, 5005, actually, actuary"
  val pattern = Regex("\\b[1-9][0-9]{2}\\b")
  ```
- Introduce a local variable `result4` with the keyword `val`: `val result4`
- Assign the following value to the `result4` variable: `pattern.matchEntire(text)`, where:
  - `pattern` is a variable for the search pattern
  - `matchEntire` is a search function
  - `text` is a variable for the text where we are searching for matches
- Use the `println` function to display the result of the `result4` variable. Use the `range` property to get the range of indexes of the match in the text string.

  If the pattern does not match the whole text, the `matchEntire` function returns `null`. To handle this case, add the safe call operator `?.` after the `result4` variable name, so that `result4?.range` evaluates to `null` when there is no match. For information about the danger of null references, refer to Null Safety.

  ```kotlin
  fun main() {
      val text = "Actomyosin, 99, actor, 123, actress, 808, actual, 5005, actually, actuary"
      val pattern = Regex("\\b[1-9][0-9]{2}\\b")
      val result4 = pattern.matchEntire(text)
      println("The matchEntire function returns ${result4?.range}") // null
  }
  ```
- To run your application, click the green Run icon in the gutter and select Run 'MainKt'.
The pattern does not match the text, so the function returns null.
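To see a successful match, the input has to consist of nothing but a number between 100 and 999. The following sketch illustrates this with a hypothetical `parseThreeDigit` helper, which is one possible use of `matchEntire` for validating input.

```kotlin
// Hypothetical helper: returns the number if the whole input is a number
// between 100 and 999, or null otherwise.
fun parseThreeDigit(input: String): Int? =
    Regex("\\b[1-9][0-9]{2}\\b").matchEntire(input)?.value?.toInt()

fun main() {
    println(parseThreeDigit("123"))        // 123
    println(parseThreeDigit("actor, 123")) // null
    println(parseThreeDigit("5005"))       // null
}
```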
The `matches` function indicates whether the pattern matches the whole text string. It returns a Boolean value.
Determine whether the pattern matches the entire text.
- Define your text and search pattern as local variables.

  ```kotlin
  val text = "Actomyosin, 99, actor, 123, actress, 808, actual, 5005, actually, actuary"
  val pattern = Regex("\\b[1-9][0-9]{2}\\b")
  ```
- Introduce a local variable `result5` with the keyword `val`: `val result5`
- Assign the following value to the `result5` variable: `pattern.matches(text)`, where:
  - `pattern` is a variable for the search pattern
  - `matches` is a search function
  - `text` is a variable for the text where we are searching for matches
- Use the `println` function to display the result of the `result5` variable.

  ```kotlin
  fun main() {
      val text = "Actomyosin, 99, actor, 123, actress, 808, actual, 5005, actually, actuary"
      val pattern = Regex("\\b[1-9][0-9]{2}\\b")
      val result5 = pattern.matches(text)
      println("The matches function returns $result5") // false
  }
  ```
- To run your application, click the green Run icon in the gutter and select Run 'MainKt'.
The pattern does not match the text, so the function returns the false value.
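The difference between `matches` and `containsMatchIn` is easiest to see on the same input. The sketch below uses a hypothetical `isThreeDigitNumber` helper and short sample strings chosen for illustration.

```kotlin
// Hypothetical helper: true only if the whole input is a number between 100 and 999.
fun isThreeDigitNumber(input: String): Boolean =
    Regex("\\b[1-9][0-9]{2}\\b").matches(input)

fun main() {
    val pattern = Regex("\\b[1-9][0-9]{2}\\b")

    println(isThreeDigitNumber("808"))               // true
    // The text contains a match, but the pattern does not cover the whole text.
    println(isThreeDigitNumber("actress, 808"))      // false
    println(pattern.containsMatchIn("actress, 808")) // true
}
```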
In this tutorial, we explained how to write patterns with regular expressions and use these patterns for searching data in a text. To extend this knowledge, you can:
- Add options to your regular expression using the `RegexOption` type reference
- Replace data in the text using other functions of the `Regex` type
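As a starting point for both suggestions, here is a minimal sketch that uses `RegexOption.IGNORE_CASE` and the `replace` function of the `Regex` type. The `\bact` pattern, the `###` replacement string, and the `maskThreeDigitNumbers` helper are assumptions made for this example.

```kotlin
// Hypothetical helper: replaces every whole number between 100 and 999 with "###".
fun maskThreeDigitNumbers(text: String): String =
    Regex("\\b[1-9][0-9]{2}\\b").replace(text, "###")

fun main() {
    val text = "Actomyosin, 99, actor, 123, actress, 808, actual, 5005, actually, actuary"

    // Without options, "act" at the start of a word does not match the capitalized "Act".
    val caseSensitive = Regex("\\bact")
    val caseInsensitive = Regex("\\bact", RegexOption.IGNORE_CASE)
    println(caseSensitive.findAll(text).count())   // 5
    println(caseInsensitive.findAll(text).count()) // 6

    println(maskThreeDigitNumbers(text))
    // Actomyosin, 99, actor, ###, actress, ###, actual, 5005, actually, actuary
}
```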