Java Regular Expressions Tutorial with Examples
1. Regular expression
A regular expression defines a search pattern for strings. Regular expressions can be used to search, edit and manipulate text. The pattern defined by the regular expression may match one or several times or not at all for a given string.
The abbreviation for regular expression is regex.
The abbreviation for regular expression is regex.
Regular expressions are supported by most programming languages, e.g., Java, C#, C/C++, etc. Unfortunately each language supports regular expressions slightly different.
You may be interested:
2. Rule writing regular expressions
No ADS
No | Regular Expression | Description |
1 | . | Matches any character |
2 | ^regex | Finds regex that must match at the beginning of the line. |
3 | regex$ | Finds regex that must match at the end of the line. |
4 | [abc] | Set definition, can match the letter a or b or c. |
5 | [abc][vz] | Set definition, can match a or b or c followed by either v or z. |
6 | [^abc] | When a caret appears as the first character inside square brackets, it negates the pattern. This can match any character except a or b or c. |
7 | [a-d1-7] | Ranges: matches a letter between a and d and figures from 1 to 7. |
8 | X|Z | Finds X or Z. |
9 | XZ | Finds X directly followed by Z. |
10 | $ | Checks if a line end follows. |
11 | \d | Any digit, short for [0-9] |
12 | \D | A non-digit, short for [^0-9] |
13 | \s | A whitespace character, short for [ \t\n\x0b\r\f] |
14 | \S | A non-whitespace character, short for [^\s] |
15 | \w | A word character, short for [a-zA-Z_0-9] |
16 | \W | A non-word character [^\w] |
17 | \S+ | Several non-whitespace characters |
18 | \b | Matches a word boundary where a word character is [a-zA-Z0-9_] . |
19 | * | Occurs zero or more times, is short for {0,} |
20 | + | Occurs one or more times, is short for {1,} |
21 | ? | Occurs no or one times, ? is short for {0,1} . |
22 | {X} | Occurs X number of times, {} describes the order of the preceding liberal |
23 | {X,Y} | Occurs between X and Y times, |
24 | *? | ? after a quantifier makes it a reluctant quantifier. It tries to find the smallest match. |
3. Special characters in Java Regex
Special characters in Java Regex:
\.[{(*+?^$|
The characters listed above are special characters. In Java regex you want it understood that character in the normal way you should add a \ in front.
Example dot character . java regex is interpreted as any character, if you want it interpreted as a dot character normally required mark \ ahead.
Example dot character . java regex is interpreted as any character, if you want it interpreted as a dot character normally required mark \ ahead.
// Regex pattern describe any character.
String regex = ".";
// Regex pattern describe a dot character.
String regex = "\\.";
4. Using String.matches(String)
No ADS
- Class String
...
// Check the entire String object matches the regex or not.
public boolean matches(String regex)
..
Using the method String.matches (String regex) allows you to check the entire string matches the regex or not. This is the most common way. Consider these examples:
StringMatches.java
package org.o7planning.tutorial.regex.stringmatches;
public class StringMatches {
public static void main(String[] args) {
String s1 = "a";
System.out.println("s1=" + s1);
// Check the entire s1
// Match any character
// Rule .
// ==> true
boolean match = s1.matches(".");
System.out.println("-Match . " + match);
s1 = "abc";
System.out.println("s1=" + s1);
// Check the entire s1
// Match any character
// Rule .
// ==> false (Because s1 has three characters)
match = s1.matches(".");
System.out.println("-Match . " + match);
// Check the entire s1
// Match with any character 0 or more times
// Combine the rules . and *
// ==> true
match = s1.matches(".*");
System.out.println("-Match .* " + match);
String s2 = "m";
System.out.println("s2=" + s2);
// Check the entire s2
// Start by m
// Rule ^
// ==> true
match = s2.matches("^m");
System.out.println("-Match ^m " + match);
s2 = "mnp";
System.out.println("s2=" + s2);
// Check the entire s2
// Start by m
// Rule ^
// ==> false (Because s2 has three characters)
match = s2.matches("^m");
System.out.println("-Match ^m " + match);
// Start by m
// Next any character, appearing one or more times.
// Rule ^ and. and +
// ==> true
match = s2.matches("^m.+");
System.out.println("-Match ^m.+ " + match);
String s3 = "p";
System.out.println("s3=" + s3);
// Check s3 ending with p
// Rule $
// ==> true
match = s3.matches("p$");
System.out.println("-Match p$ " + match);
s3 = "2nnp";
System.out.println("s3=" + s3);
// Check the entire s3
// End of p
// ==> false (Because s3 has 4 characters)
match = s3.matches("p$");
System.out.println("-Match p$ " + match);
// Check out the entire s3
// Any character appearing once.
// Followed by n, appear one or up to three times.
// End by p: p $
// Combine the rules: . , {X, y}, $
// ==> true
match = s3.matches(".n{1,3}p$");
System.out.println("-Match .n{1,3}p$ " + match);
String s4 = "2ybcd";
System.out.println("s4=" + s4);
// Start by 2
// Next x or y or z
// Followed by any one or more times.
// Combine the rules: [abc]. , +
// ==> true
match = s4.matches("2[xyz].+");
System.out.println("-Match 2[xyz].+ " + match);
String s5 = "2bkbv";
// Start any one or more times
// Followed by a or b, or c: [abc]
// Next z or v: [zv]
// Followed by any
// ==> true
match = s5.matches(".+[abc][zv].*");
System.out.println("-Match .+[abc][zv].* " + match);
}
}
Results of running the example:
s1=a
-Match . true
s1=abc
-Match . false
-Match .* true
s2=m
-Match ^m true
s2=mnp
-Match ^m false
-Match ^m.+ true
s3=p
-Match p$ true
s3=2nnp
-Match p$ false
-Match .n{1,3}p$ true
s4=2ybcd
-Match 2[xyz].+ true
-Match .+[abc][zv].* true
SplitWithRegex.java
package org.o7planning.tutorial.regex.stringmatches;
public class SplitWithRegex {
public static final String TEXT = "This is my text";
public static void main(String[] args) {
System.out.println("TEXT=" + TEXT);
// White space appears one or more times.
// The whitespace characters: \t \n \x0b \r \f
// Combining rules: \ s and +
String regex = "\\s+";
String[] splitString = TEXT.split(regex);
// 4
System.out.println(splitString.length);
for (String string : splitString) {
System.out.println(string);
}
// Replace all whitespace with tabs
String newText = TEXT.replaceAll("\\s+", "\t");
System.out.println("New text=" + newText);
}
}
Results of running the example:
TEXT=This is my text
4
This
is
my
text
New text=This is my text
EitherOrCheck.java
package org.o7planning.tutorial.regex.stringmatches;
public class EitherOrCheck {
public static void main(String[] args) {
String s = "The film Tom and Jerry!";
// Check the whole s
// Begin by any characters appear 0 or more times
// Next Tom or Jerry
// End with any characters appear 0 or more times
// Combine the rules:., *, X | Z
// ==> true
boolean match = s.matches(".*(Tom|Jerry).*");
System.out.println("s=" + s);
System.out.println("-Match .*(Tom|Jerry).* " + match);
s = "The cat";
// ==> false
match = s.matches(".*(Tom|Jerry).*");
System.out.println("s=" + s);
System.out.println("-Match .*(Tom|Jerry).* " + match);
s = "The Tom cat";
// ==> true
match = s.matches(".*(Tom|Jerry).*");
System.out.println("s=" + s);
System.out.println("-Match .*(Tom|Jerry).* " + match);
}
}
Results of running the example:
s=The film Tom and Jerry!
-Match .*(Tom|Jerry).* true
s=The cat
-Match .*(Tom|Jerry).* false
s=The Tom cat
-Match .*(Tom|Jerry).* true
5. Using Pattern and Matcher
No ADS
1. Pattern object is the compiled version of the regular expression. It doesn’t have any public constructor and we use it’s public static method compile(String) to create the pattern object by passing regular expression argument.
2. Matcher is the regex engine object that matches the input String pattern with the pattern object created. This class doesn’t have any public construtor and we get a Matcher object using pattern object matcher method that takes the input String as argument. We then use matches method that returns boolean result based on input String matches the regex pattern or not.
3. PatternSyntaxException is thrown if the regular expression syntax is not correct.
2. Matcher is the regex engine object that matches the input String pattern with the pattern object created. This class doesn’t have any public construtor and we get a Matcher object using pattern object matcher method that takes the input String as argument. We then use matches method that returns boolean result based on input String matches the regex pattern or not.
3. PatternSyntaxException is thrown if the regular expression syntax is not correct.
String regex= ".xx.";
// Create a Pattern object through a static method.
Pattern pattern = Pattern.compile(regex);
// Get a Matcher object
Matcher matcher = pattern.matcher("MxxY");
boolean match = matcher.matches();
System.out.println("Match "+ match);
- Class Pattern:
public static Pattern compile(String regex, int flags) ;
public static Pattern compile(String regex);
public Matcher matcher(CharSequence input);
public static boolean matches(String regex, CharSequence input);
- Class Matcher:
public int start()
public int start(int group)
public int end()
public int end(int group)
public String group()
public String group(int group)
public String group(String name)
public int groupCount()
public boolean matches()
public boolean lookingAt()
public boolean find()
Here is an example using Matcher and method find() to search for the substring matching the regular expression.
MatcherFind.java
package org.o7planning.tutorial.regex;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class MatcherFind {
public static void main(String[] args) {
final String TEXT = "This \t is a \t\t\t String";
// Spaces appears one or more time.
String regex = "\\s+";
Pattern pattern = Pattern.compile(regex);
Matcher matcher = pattern.matcher(TEXT);
int i = 0;
while (matcher.find()) {
System.out.print("start" + i + " = " + matcher.start());
System.out.print(" end" + i + " = " + matcher.end());
System.out.println(" group" + i + " = " + matcher.group());
i++;
}
}
}
Result of running example:
start = 4 end = 7 group =
start = 9 end = 10 group =
start = 11 end = 16 group =
Method Matcher.lookingAt()
MatcherLookingAt.java
package org.o7planning.tutorial.regex;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class MatcherLookingAt {
public static void main(String[] args) {
String country1 = "iran";
String country2 = "Iraq";
// Start by I followed by any character.
// Following is the letter a or e.
String regex = "^I.[ae]";
Pattern pattern = Pattern.compile(regex, Pattern.CASE_INSENSITIVE);
Matcher matcher = pattern.matcher(country1);
// lookingAt () searches that match the first part.
System.out.println("lookingAt = " + matcher.lookingAt());
// matches() must be matching the entire
System.out.println("matches = " + matcher.matches());
// Reset matcher with new text: country2
matcher.reset(country2);
System.out.println("lookingAt = " + matcher.lookingAt());
System.out.println("matches = " + matcher.matches());
}
}
6. Group
No ADS
A regular expression you can split into groups:
// A regular expression
String regex = "\\s+=\\d+";
// Writing as three group, by marking ()
String regex2 = "(\\s+)(=)(\\d+)";
// Two group
String regex3 = "(\\s+)(=\\d+)";
The group can be nested, and so need a rule indexing the group. The entire pattern is defined as the group 0. The remaining group described similar illustration below:
Note: Use (?:Pattern) to inform Java does not see this as a group (None-capturing group)
From Java 7, you can define a named capturing group (?<name>pattern), and you can access the content matched with Matcher.group(String name). The regex is longer, but the code is more meaningful, since it indicates what you are trying to match or extract with the regex.
Named capturing group can also be access via Matcher.group(int group) with the same numbering scheme.
Internally, Java's implementation just maps from the name to the group number. Therefore, you cannot use the same name for 2 different capturing groups.
Named capturing group can also be access via Matcher.group(int group) with the same numbering scheme.
Internally, Java's implementation just maps from the name to the group number. Therefore, you cannot use the same name for 2 different capturing groups.
-
Let's look at an example using the named for the group (Java> = 7)
NamedGroup.java
package org.o7planning.tutorial.regex;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class NamedGroup {
public static void main(String[] args) {
final String TEXT = " int a = 100;float b= 130;float c= 110 ; ";
// Use (?<groupName>pattern) to define a group named: groupName
// Defined group named declare: using (?<declare>...)
// And a group named value: use: (?<value>..)
String regex = "(?<declare>\\s*(int|float)\\s+[a-z]\\s*)=(?<value>\\s*\\d+\\s*);";
Pattern pattern = Pattern.compile(regex);
Matcher matcher = pattern.matcher(TEXT);
while (matcher.find()) {
String group = matcher.group();
System.out.println(group);
System.out.println("declare: " + matcher.group("declare"));
System.out.println("value: " + matcher.group("value"));
System.out.println("------------------------------");
}
}
}
Results of running the example:
int a = 100;
declare: int a
value: 100
------------------------------
float b= 130;
declare: float b
value: 130
------------------------------
float c= 110 ;
declare: float c
value: 110
------------------------------
Just to clarify you can see the illustration below:
7. Using Pattern, Matcher, Group and *?
No ADS
In some situations *? very important, take a look at the following example:
// This is a regex
// any characters appear 0 or more times,
// followed by ' and >
String regex = ".*'>";
// TEXT1 match the regex.
String TEXT1 = "FILE1'>";
// And TEXT2 match the regex
String TEXT2 = "FILE1'> <a href='http://HOST/file/FILE2'>";
*? will find the smallest match. We consider the following example:
NamedGroup2.java
package org.o7planning.tutorial.regex;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class NamedGroup2 {
public static void main(String[] args) {
String TEXT = "<a href='http://HOST/file/FILE1'>File 1</a>"
+ "<a href='http://HOST/file/FILE2'>File 2</a>";
// Java >= 7.
// Define group named fileName.
// *? ==> ? after a quantifier makes it a reluctant quantifier.
// It tries to find the smallest match.
String regex = "/file/(?<fileName>.*?)'>";
Pattern pattern = Pattern.compile(regex);
Matcher matcher = pattern.matcher(TEXT);
while (matcher.find()) {
System.out.println("File Name = " + matcher.group("fileName"));
}
}
}
Results of running the example:
File Name = FILE1
File Name = FILE2
No ADS
Java Basic
- Data Types in java
- Java PhantomReference Tutorial with Examples
- JDK Javadoc in CHM format
- Java Stream Tutorial with Examples
- Java Predicate Tutorial with Examples
- Java BiConsumer Tutorial with Examples
- Arrays in Java
- JDBC Driver Libraries for different types of database in Java
- Abstract class and Interface in Java
- Java Commons Email Tutorial with Examples
- Install Eclipse
- Bitwise Operations
- Install Eclipse on Ubuntu
- Configuring Eclipse to use the JDK instead of JRE
- Java Commons Logging Tutorial with Examples
- Java Enums Tutorial with Examples
- Loops in Java
- Java Regular Expressions Tutorial with Examples
- Install Java on Ubuntu
- Quick Learning Java for beginners
- Install Java on Windows
- Comparing and Sorting in Java
- Inheritance and polymorphism in Java
- Java Consumer Tutorial with Examples
- Java String, StringBuffer and StringBuilder Tutorial with Examples
- Java Exception Handling Tutorial with Examples
- Example of Java encoding and decoding using Apache Base64
- if else statement in java
- Switch Statement in Java
- Java Supplier Tutorial with Examples
- Java Programming for team using Eclipse and SVN
- Java JDBC Tutorial with Examples
- Java remote method invocation - Java RMI Tutorial with Examples
- Java Multithreading Programming Tutorial with Examples
- Customize java compiler processing your Annotation (Annotation Processing Tool)
- What is needed to get started with Java?
- Java Aspect Oriented Programming with AspectJ (AOP)
- Understanding Java System.identityHashCode, Object.hashCode and Object.equals
- Java Compression and Decompression Tutorial with Examples
- Java Reflection Tutorial with Examples
- Install OpenJDK on Ubuntu
- Java String.format() and printf() methods
- History of Java and the difference between Oracle JDK and OpenJDK
- Introduction to the Raspberry Pi
- Java Socket Programming Tutorial with Examples
- Java Generics Tutorial with Examples
- Manipulating files and directories in Java
- Java WeakReference Tutorial with Examples
- Java Commons IO Tutorial with Examples
- History of bits and bytes in computer science
- Which Platform Should You Choose for Developing Java Desktop Applications?
- Java SoftReference Tutorial with Examples
- Syntax and new features in Java 8
- Java Annotations Tutorial with Examples
- Java Function Tutorial with Examples
- Access modifiers in Java
- Java BiFunction Tutorial with Examples
- Get the values of the columns automatically increment when Insert a record using JDBC
- Java Functional Interface Tutorial with Examples
- Java BiPredicate Tutorial with Examples
Show More
- Java Servlet/Jsp Tutorials
- Java Collections Framework Tutorials
- Java API for HTML & XML
- Java IO Tutorials
- Java Date Time Tutorials
- Spring Boot Tutorials
- Maven Tutorials
- Gradle Tutorials
- Java Web Services Tutorials
- Java SWT Tutorials
- JavaFX Tutorials
- Java Oracle ADF Tutorials
- Struts2 Framework Tutorials
- Spring Cloud Tutorials