Regular expressions are a pattern-matching tool that can be used to process text. Java provides the java.util.regex
package, which contains the classes and methods needed to use regular expressions in your Java applications.
Here’s an example of using regular expressions with a Car class:
```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class Car {
    private String model;
    private int year;

    public Car(String model, int year) {
        this.model = model;
        this.year = year;
    }

    @Override
    public String toString() {
        return model + " " + year;
    }
}

// main must live inside a class in Java; the wrapper class name is arbitrary.
public class CarRegexExample {
    public static void main(String[] args) {
        Car car = new Car("Sedan", 2020);
        String carString = car.toString(); // "Sedan 2020"

        // Define a regular expression pattern that matches one or more digits
        Pattern pattern = Pattern.compile("\\d+");

        // Create a matcher object for the car's string form
        Matcher matcher = pattern.matcher(carString);

        // Find and print all matches
        while (matcher.find()) {
            System.out.println(matcher.group());
        }
    }
}
```
In this example, the Car class has a model and a year attribute. We convert the Car object to its string representation using the toString method.
We use the Pattern class to compile a regular expression that matches one or more digits (\d+, written as "\\d+" in a Java string literal because the backslash itself must be escaped). We then create a Matcher object from the pattern and carString. Each call to matcher.find advances to the next occurrence of the pattern in carString, so the loop visits every match, and matcher.group returns the text of the current match. Here the only match is the year of the car, so the program prints 2020.
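
The difference between searching with find and testing the whole input with matches is easy to overlook, so here is a minimal sketch of both. The class name MatcherBasics and its two helper methods are hypothetical names chosen for illustration; only Pattern, Matcher, and their methods come from java.util.regex.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, used only for illustration.
class MatcherBasics {
    // Compiled once and reused; Pattern instances are immutable and thread-safe.
    private static final Pattern DIGITS = Pattern.compile("\\d+");

    // Collects every run of digits found anywhere in the input.
    static List<String> findAllNumbers(String input) {
        List<String> results = new ArrayList<>();
        Matcher matcher = DIGITS.matcher(input);
        while (matcher.find()) {          // each call moves to the next match
            results.add(matcher.group()); // text of the current match
        }
        return results;
    }

    // True only when the entire input is a run of digits.
    static boolean isAllDigits(String input) {
        return DIGITS.matcher(input).matches();
    }

    public static void main(String[] args) {
        System.out.println(findAllNumbers("Sedan 2020")); // [2020]
        System.out.println(isAllDigits("Sedan 2020"));    // false
        System.out.println(isAllDigits("2020"));          // true
    }
}
```

In short, find looks for the pattern anywhere in the text, while matches succeeds only if the pattern covers the text from start to end.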
Here’s a table of the most commonly used regular expression types in Java and their syntax:
Type of Regular Expression | Description | Syntax |
---|---|---|
Character classes | Matches any single character within the specified range or set | [abc] – Matches any one of the characters a, b, or c<br>[a-zA-Z] – Matches any letter, upper or lowercase |
Predefined character classes | Matches any single character belonging to a specific set | \d – Matches a digit (equivalent to [0-9])<br>\w – Matches a word character (equivalent to [a-zA-Z_0-9])<br>\s – Matches a whitespace character (including space, tab, and line break) |
Repetition | Specifies how many times the preceding expression may repeat | * – Matches zero or more occurrences of the preceding expression<br>+ – Matches one or more occurrences of the preceding expression<br>? – Matches zero or one occurrence of the preceding expression |
Grouping | Groups multiple subexpressions into a single unit | (expression) – Captures the matched text, which can be retrieved later using the Matcher.group(int) method |
Alternation | Matches either the expression before or the expression after the \| symbol | a\|b – Matches either a or b |
Anchors | Matches a position rather than a character | ^ – Matches the start of the input (the start of each line when Pattern.MULTILINE is set)<br>$ – Matches the end of the input (the end of each line when Pattern.MULTILINE is set)<br>\b – Matches a word boundary |
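
The constructs in the table above can be tried out with a few small checks. The following is a sketch only: the class BasicConstructs, its method names, and the list of body styles are hypothetical and chosen for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, used only for illustration.
class BasicConstructs {
    // Character classes plus repetition: one uppercase letter, then zero or more lowercase letters.
    static boolean isCapitalizedWord(String s) {
        return Pattern.matches("[A-Z][a-z]*", s);
    }

    // Optional element: the "u" may appear zero or one time.
    static boolean isColorSpelling(String s) {
        return Pattern.matches("colou?r", s);
    }

    // Anchor, alternation, and word boundary: the text must start with one of the
    // listed body styles as a whole word. The style names are made up for this sketch.
    static boolean startsWithBodyStyle(String s) {
        return Pattern.compile("^(Sedan|Coupe)\\b").matcher(s).find();
    }

    // Grouping with predefined classes: capture a word and a number separated by whitespace.
    // Returns {model, year}, or an empty array when the input has a different shape.
    static String[] splitModelAndYear(String s) {
        Matcher m = Pattern.compile("(\\w+)\\s+(\\d+)").matcher(s);
        if (!m.matches()) {
            return new String[0];
        }
        return new String[] { m.group(1), m.group(2) };
    }

    public static void main(String[] args) {
        System.out.println(isCapitalizedWord("Sedan"));         // true
        System.out.println(isColorSpelling("colour"));          // true
        System.out.println(startsWithBodyStyle("Sedan 2020"));  // true
        System.out.println(startsWithBodyStyle("Sedans 2020")); // false
        System.out.println(String.join(",", splitModelAndYear("Sedan 2020"))); // Sedan,2020
    }
}
```

Note that Pattern.matches and Matcher.matches already require the whole input to match, so the ^ anchor only matters in the method that uses find.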
Here are some additional regular expression types in Java:
Type of Regular Expression | Description | Syntax |
---|---|---|
Quantifiers | Specify the number of occurrences of the preceding expression | {n} – Matches exactly n occurrences of the preceding expression<br>{n,} – Matches at least n occurrences of the preceding expression<br>{n,m} – Matches between n and m occurrences of the preceding expression |
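Backreferences | Matches the same text as the specified capturing group | \n, where n is the capturing group number (for example, \1 refers to the first group)<br>\k&lt;name&gt; – Matches the same text as the named capturing group name |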
Boundary matchers | Matches specific boundaries in the input text | \b – Matches a word boundary<br>\B – Matches a non-word boundary<br>\A – Matches the start of the input<br>\G – Matches the end of the previous match<br>\Z – Matches the end of the input, except for a final line terminator, if any<br>\z – Matches the end of the input |
Lookarounds | Specifies a condition for a match, but does not consume any input | (?=expression) – Positive lookahead<br>(?!expression) – Negative lookahead<br>(?<=expression) – Positive lookbehind<br>(?<!expression) – Negative lookbehind |
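
To see how these constructs behave in practice, here is a minimal sketch covering a quantifier, a backreference, lookarounds, and the difference between \Z and \z. The class AdvancedConstructs and all of its method names are hypothetical, and the sample inputs are invented for illustration.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, used only for illustration.
class AdvancedConstructs {
    // Quantifier {n}: the whole input must be exactly four digits.
    static boolean isFourDigitYear(String s) {
        return Pattern.matches("\\d{4}", s);
    }

    // Backreference: \1 must repeat exactly what group 1 captured,
    // so this finds a word that is immediately repeated.
    static Optional<String> firstRepeatedWord(String s) {
        Matcher m = Pattern.compile("\\b(\\w+)\\s+\\1\\b").matcher(s);
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }

    // Positive lookbehind: digits that come right after a dollar sign;
    // the dollar sign itself is not part of the match.
    static Optional<String> amountAfterDollar(String s) {
        Matcher m = Pattern.compile("(?<=\\$)\\d+").matcher(s);
        return m.find() ? Optional.of(m.group()) : Optional.empty();
    }

    // Two positive lookaheads: the input must contain a digit and a letter.
    // The lookaheads consume nothing; the trailing .+ does the actual matching.
    static boolean hasLetterAndDigit(String s) {
        return Pattern.matches("(?=.*\\d)(?=.*[a-zA-Z]).+", s);
    }

    // \Z tolerates a final line terminator after the digits, \z does not.
    static boolean endsWithDigits(String s, boolean allowFinalTerminator) {
        String regex = allowFinalTerminator ? "\\d+\\Z" : "\\d+\\z";
        return Pattern.compile(regex).matcher(s).find();
    }

    public static void main(String[] args) {
        System.out.println(isFourDigitYear("2020"));                        // true
        System.out.println(firstRepeatedWord("the the car").orElse("none")); // the
        System.out.println(amountAfterDollar("Price: $25000").orElse("none")); // 25000
        System.out.println(hasLetterAndDigit("Sedan2020"));                 // true
        System.out.println(endsWithDigits("2020\n", true));                 // true
        System.out.println(endsWithDigits("2020\n", false));                // false
    }
}
```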