A regex (regular expression) is a pattern that describes a set of strings. It is used to check whether a given string matches that pattern.
In Ruby, regex patterns are defined between two forward slashes (/pattern/). For example,
# A regex pattern
/^m.t$/
The above pattern indicates a three-letter string where,
^ - marks the start of a string (here, the string starts with m)
. - indicates any one letter or character
$ - marks the end of a string (here, the string ends with t)
For example, strings like "mat" and "mit" match the above regex pattern. However, strings like "mom" and "magnet" don't match because they are not three-letter strings that start with m and end with t.
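The four strings mentioned above can be checked with a short script. This is a minimal sketch; the helper name m_dot_t? is made up for illustration.

```ruby
# Hypothetical helper: true when the word fits the pattern /^m.t$/
def m_dot_t?(word)
  word.match?(/^m.t$/)
end

# Prints each word followed by true or false
%w[mat mit mom magnet].each do |word|
  puts "#{word}: #{m_dot_t?(word)}"
end
```

Here, "mat" and "mit" print true, while "mom" and "magnet" print false.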
We will learn about regular expressions in detail below.
Ruby Regexp Class
Ruby provides the Regexp class to work with regular expressions. You can create a regex pattern using either the literal syntax or the Regexp.new method.
Syntax
# Using literal syntax
/pattern/
# Using Regexp.new
Regexp.new("pattern")
Here, pattern is the regular expression pattern you want to match.
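The two forms produce the same kind of object. The sketch below compares them and shows one case where Regexp.new is convenient: building a pattern from values known only at runtime. The helper five_letter_pattern is a made-up name, and it assumes its arguments are plain letters (arbitrary text would first need Regexp.escape).

```ruby
literal     = /^a...e$/
from_string = Regexp.new("^a...e$")

# Both objects hold the same pattern text
puts literal.source == from_string.source

# Hypothetical helper: a five-character pattern with a given first and last letter
def five_letter_pattern(first, last)
  Regexp.new("^#{first}...#{last}$")
end

puts five_letter_pattern("a", "e").match?("apple")
puts five_letter_pattern("s", "e").match?("apple")
```

The first two lines print true. The last prints false because "apple" does not start with s.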
Example: Ruby RegEx
# A regular expression pattern for a five-letter word
# that starts with "a" and ends with "e"
pattern = /^a...e$/
# test string
word = "apple"
# Check if the word matches the pattern
if word.match?(pattern)
  puts "String matches the pattern"
else
  puts "String doesn't match the pattern"
end
Output
String matches the pattern
In the above example, we have checked whether the string "apple" matches the defined regular expression pattern.
The pattern /^a...e$/ indicates any five-letter string starting with a and ending with e. Here, the match? method returns true if the string it is called on matches the regex pattern.
If we use another string, for example "apache", it doesn't match the pattern because "apache" has more than five characters.
Regex Methods in Ruby
Ruby provides several methods to work with regex:
| Method | Description |
|---|---|
| =~ | Returns index of first match or nil |
| match | Returns MatchData object or nil |
| match? (>= 2.4) | Returns true/false if match found |
| scan | Returns array of all matching substrings |
| gsub | Substitutes all matches |
Example: Using Regex Methods
text = "I have 2 cats and 3 dogs"
# Find first match
puts text =~ /\d/ # Output: 7
# Get all digits
puts text.scan(/\d/).join(", ") # Output: 2, 3
# Replace digits with '#'
puts text.gsub(/\d/, '#') # Output: I have # cats and # dogs
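The example above does not use match or match?. The following sketch fills that gap with the same sentence. The helper first_number is a hypothetical name introduced only for this illustration.

```ruby
# Hypothetical helper: the first run of digits in text, or nil if there is none
def first_number(text)
  m = text.match(/\d+/)   # MatchData on success, nil otherwise
  m && m[0]
end

text = "I have 2 cats and 3 dogs"
m = text.match(/\d/)

puts m[0]               # the matched text
puts m.begin(0)         # index where the match starts
puts m.pre_match        # everything before the match
puts text.match?(/\d/)  # only true or false, no MatchData is built

p first_number("no digits here")
```

match is the better choice when you need details about the match. match? is enough when you only need a yes or no answer.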
Metacharacters
To specify regular expressions, metacharacters are used. Metacharacters are characters that are interpreted in a special way by a regex engine.
Some of the basic metacharacters are:
[] - square brackets
. - period
^ - caret
$ - dollar
* - star
+ - plus
? - question mark
{} - braces
() - parentheses
| - alternation
[]- Square brackets
[] specifies a set of characters you wish to match.
For example,
/[abc]/
This pattern matches any string that contains a, b, or c.
Let's check if the following string examples match the regex pattern [abc].
| String | Matched? | Reason |
|---|---|---|
| a | 1 Match | string contains a |
| ac | 2 Matches | string contains a and c |
| jim | 0 Match | string doesn't contain any of a, b or c |
| abc | 3 Matches | string contains all three - a, b and c |
Note: You can also specify a range of characters using - inside square brackets. For example,
[a-e] is the same as [abcde]
[0-3] is the same as [0123]
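The match counts in the table, and the range syntax from the note, can be reproduced with scan. This is a small sketch; matches is a made-up helper name, and "apple3" is a sample string chosen for the demonstration.

```ruby
# Hypothetical helper: every substring of text that matches pattern
def matches(text, pattern)
  text.scan(pattern)
end

%w[a ac jim abc].each do |s|
  puts "#{s}: #{matches(s, /[abc]/).length} match(es)"
end

# Ranges inside square brackets
p matches("apple3", /[a-e]/)  # letters a through e
p matches("apple3", /[0-3]/)  # digits 0 through 3
```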
. - Period
A period specifies any single character (except newline '\n').
For example,
/.../
This pattern matches any three-character substring (except newline).
Let's check if the following string examples match the regex pattern ... (three periods).
| String | Matched? | Reason |
|---|---|---|
| abs | 1 Match | string contains three letters (a, b, s) |
| ac | 0 Match | string doesn't contain three letters |
| jim | 1 Match | string contains three letters |
| abcd | 1 Match | string contains three letters |
| abcjkl | 2 Matches | string contains 6 letters (3+3) |
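A quick way to see these results is to split each string into three-character matches with scan. The helper three_char_chunks is a hypothetical name, and the last call uses an extra sample string to show the newline exception.

```ruby
# Hypothetical helper: non-overlapping three-character matches in text
def three_char_chunks(text)
  text.scan(/.../)
end

%w[abs ac jim abcd abcjkl].each do |s|
  p three_char_chunks(s)
end

# The newline cannot be matched by a period, so no run of three fits
p three_char_chunks("ab\ncd")
```

Note that "abcd" gives a single match ("abc"); the leftover "d" is too short to match again.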
^ - Caret
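The caret symbol ^ checks whether a string starts with a certain character. In Ruby, ^ anchors at the start of each line; to anchor only at the start of the whole string, use \A.
For example,
/^m/
This pattern matches strings starting with the letter "m".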
Let's check if the following string examples match the regex pattern ^m.
| String | Matched? | Reason |
|---|---|---|
| man | 1 Match | man starts with "m" |
| m | 1 Match | m starts with "m" |
| Man | 0 Match | Man doesn't start with "m" |
| sms | 0 Match | sms doesn't start with "m" |
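The rows above can be verified as follows. The sketch also shows how ^ behaves on a string with more than one line, compared with the \A anchor. The helper starts_with_m? and the two-line sample string are assumptions made for this illustration.

```ruby
# Hypothetical helper: true when some line of text starts with "m"
def starts_with_m?(text)
  text.match?(/^m/)
end

%w[man m Man sms].each do |s|
  puts "#{s}: #{starts_with_m?(s)}"
end

# ^ matches at the start of every line, \A only at the start of the string
multi = "sms\nman"
puts multi.match?(/^m/)   # true: the second line starts with "m"
puts multi.match?(/\Am/)  # false: the string itself starts with "s"
```

For single-line input the two anchors behave the same, so the difference matters mainly when validating text that may contain newlines.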
$ - Dollar
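The dollar symbol $ checks whether a string ends with a certain character. In Ruby, $ anchors at the end of each line; to anchor only at the end of the whole string, use \z.
For example,
/y$/
This pattern matches strings ending with the letter "y".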
Let's check if the following string examples match the regex pattern y$.
| String | Matched? | Reason |
|---|---|---|
| monday | 1 Match | monday ends with "y" |
| say | 1 Match | say ends with "y" |
| myname | 0 Match | myname doesn't end with "y" |
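The same kind of check works for the end anchor. In this sketch, ends_with_y? is a made-up helper and the two-line string is a sample added to contrast $ with \z.

```ruby
# Hypothetical helper: true when some line of text ends with "y"
def ends_with_y?(text)
  text.match?(/y$/)
end

%w[monday say myname].each do |s|
  puts "#{s}: #{ends_with_y?(s)}"
end

# $ matches at the end of every line, \z only at the very end of the string
multi = "monday\nmyname"
puts multi.match?(/y$/)   # true: the first line ends with "y"
puts multi.match?(/y\z/)  # false: the string itself ends with "e"
```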
* - Star
The star symbol * matches zero or more occurrences of the pattern to its left.
For example,
/ca*t/
This pattern matches strings that have any number (including zero) of a between c and t.
Let's check if the following string examples match the regex pattern ca*t.
| String | Matched? | Reason |
|---|---|---|
| cat | 1 Match | cat has one a between c and t |
| ct | 1 Match | ct has zero a between c and t |
| caaaat | 1 Match | caaaat has four a between c and t |
| crt | 0 Match | crt has letter r (not a) between c and t |
| caatcaaat | 2 Matches | caatcaaat has a in two places (caat and caaat) |
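To see exactly which substrings the star pattern picks out, the strings from the table can be passed through scan. The helper name ca_star_t_matches is hypothetical.

```ruby
# Hypothetical helper: every substring of text that matches /ca*t/
def ca_star_t_matches(text)
  text.scan(/ca*t/)
end

%w[cat ct caaaat crt caatcaaat].each do |s|
  puts "#{s}: #{ca_star_t_matches(s).inspect}"
end
```

The last string yields two separate matches, "caat" and "caaat", because each match must start with c and end at the next t.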
+ - Plus
The plus symbol + matches one or more occurrences of the pattern to its left.
For example,
/ma+t/
This pattern matches strings that have one or more occurrences of a between m and t.
Let's check if the following string examples match the regex pattern ma+t.
| String | Matched? | Reason |
|---|---|---|
| mat | 1 Match | mat has one a between m and t |
| mt | 0 Match | mt doesn't have a between m and t |
| matemaat | 2 Matches | matemaat has two matching substrings (mat and maat) |
| mart | 0 Match | a is not followed by t in mart |
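The difference between + and * shows up only when the repeated character is absent. The sketch below runs both quantifiers over the strings from the table; plus_vs_star is a made-up helper name.

```ruby
# Hypothetical helper: matches found with + and with * for the same text
def plus_vs_star(text)
  { plus: text.scan(/ma+t/), star: text.scan(/ma*t/) }
end

%w[mat mt matemaat mart].each do |s|
  result = plus_vs_star(s)
  puts "#{s}: + finds #{result[:plus].inspect}, * finds #{result[:star].inspect}"
end
```

Only "mt" gives different results: + finds nothing because at least one a is required, while * accepts zero a characters.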
? - Question Mark
The question mark symbol ? matches zero or one occurrence of the pattern to its left.
For example,
/ma?n/
This pattern matches strings that have zero or one a between m and n.
Let's check if the following string examples match the regex pattern ma?n.
| String | Matched? | Reason |
|---|---|---|
| man | 1 Match | man has one a between m and n |
| mn | 1 Match | mn has zero number of a between m and n |
| maaaaan | 0 Match | maaaaan has more than one a character between m and n |
| woman | 1 Match | woman has one a between m and n |
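The following sketch uses match to show which part of each string is matched, which makes the "woman" row easier to follow. The helper optional_a_match is a hypothetical name.

```ruby
# Hypothetical helper: the substring matched by /ma?n/, or nil when nothing matches
def optional_a_match(text)
  m = text.match(/ma?n/)
  m && m[0]
end

%w[man mn maaaaan woman].each do |s|
  puts "#{s}: #{optional_a_match(s).inspect}"
end
```

For "woman" the matched part is "man", since the pattern is free to start anywhere in the string.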
{} - Braces
The braces symbol {} specifies how many times the pattern to its left repeats.
For example,
/a{2}/
This pattern matches two a characters in a row. A longer run such as aaa still contains a match, because the pattern is not anchored.
Braces {} can be used as:
# Exactly n times
{n}
# At least n times
{n,}
# Between n and m times
{n,m}
Let's check if the following string examples match the regex pattern /a{2}/.
| String | Matched? | Reason |
|---|---|---|
| aa | 1 Match | aa is two a in a row |
| abcdaat | 1 Match | abcdaat contains aa (between d and t) |
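The three brace forms listed earlier can be compared side by side. This is an illustrative sketch: runs_of_a is a made-up helper, and the strings beyond the two table rows are sample inputs chosen for the demonstration.

```ruby
# Hypothetical helper: every substring of text that matches pattern
def runs_of_a(text, pattern)
  text.scan(pattern)
end

p runs_of_a("aa", /a{2}/)
p runs_of_a("abcdaat", /a{2}/)
p runs_of_a("aaaa", /a{2}/)                 # exactly two: found twice in a run of four
p runs_of_a("caaaat", /a{2,}/)              # at least two: takes the whole run
p runs_of_a("a aa aaa aaaa", /\ba{2,3}\b/)  # whole words made of two or three a's
```

The \b in the last pattern is a word boundary. Without it, {2,3} would also match inside the longer run "aaaa".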
| - Alternation
The vertical bar | acts as an or operator.
For example,
/a|b/
This pattern matches strings that contain either a or b.
Let's check if the following string examples match the regex pattern a|b.
| String | Matched? | Reason |
|---|---|---|
| cde | 0 Match | string doesn't have either a or b |
| ade | 1 Match | string contains a |
| acdbea | 3 Matches | matches each individual a or b in the string (a, b, a) |
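Alternation is not limited to single characters. The sketch below checks the table strings and then applies the same idea to whole words, reusing the sample sentence from earlier in this tutorial. The helper name either is hypothetical.

```ruby
# Hypothetical helper: every substring of text that matches pattern
def either(text, pattern)
  text.scan(pattern)
end

%w[cde ade acdbea].each do |s|
  puts "#{s}: #{either(s, /a|b/).inspect}"
end

# Alternatives can be longer than one character
p either("I have 2 cats and 3 dogs", /cat|dog/)
```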
() - Parenthesis
Parentheses () are used to group sub-patterns in a regular expression. Grouping allows you to apply quantifiers (like +, *, or {}) to the entire group.
For example,
/(ab)+/
This pattern matches one or more occurrences of the substring ab.
Let's check if the following string examples match the regex pattern (ab)+.
| String | Matched? | Reason |
|---|---|---|
| ab | 1 Match | contains one occurrence of "ab" |
| abab | 1 Match | One contiguous match of repeating "ab" group |
| ababab | 1 Match | One long match of three "ab" sequences together |
| aabb | 1 Match | the pattern is not anchored, so it finds "ab" in the middle of aabb |
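One detail to keep in mind: when a pattern contains a capturing group, scan returns the captured text instead of the whole match. The sketch below uses a non-capturing group, written (?:...), to get whole matches, and then shows what a capturing group records. The helper repeated_ab is a made-up name.

```ruby
# Hypothetical helper: whole matches of one or more consecutive "ab"
def repeated_ab(text)
  text.scan(/(?:ab)+/)   # (?:...) groups without capturing
end

%w[ab abab ababab].each do |s|
  p repeated_ab(s)
end

m = "ababab".match(/(ab)+/)
puts m[0]  # the whole match
puts m[1]  # the text captured by the group on its last repetition

# With a capturing group, scan returns the captures rather than the matches
p "ababab".scan(/(ab)+/)
```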
Special Sequences
Ruby regex also supports special sequences for common character sets.
| Pattern | Description |
|---|---|
| \d | Any digit (same as [0-9]) |
| \D | Any non-digit |
| \w | Word character (letters, digits, underscore) |
| \W | Non-word character |
| \s | Whitespace character |
| \S | Non-whitespace character |
\d - Matches any single digit. Followed by +, it matches a run of one or more digits.
For example,
/\d+/
This pattern matches one or more digits.
Let's check if the following string examples match the regex pattern \d+.
| String | Matched? | Reason |
|---|---|---|
| 123 | 1 Match | matches all digits |
| a1b2c3 | 3 Matches | matches 1, 2, and 3 |
| abc | 0 Match | no digits present |
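The other sequences in the table work the same way as \d. The following sketch tries several of them on the sample sentence used earlier; tokens is a hypothetical helper name.

```ruby
# Hypothetical helper: every substring of text that matches pattern
def tokens(text, pattern)
  text.scan(pattern)
end

text = "I have 2 cats and 3 dogs"

p tokens(text, /\d+/)         # runs of digits
p tokens(text, /\w+/)         # runs of word characters, i.e. the words
p tokens(text, /\s/).length   # number of whitespace characters
p tokens("a1b2c3", /\D/)      # each non-digit
p tokens("a1b2c3", /\S+/)     # runs of non-whitespace
```

Each uppercase sequence is the opposite of its lowercase counterpart, so \D picks out exactly the characters that \d skips.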
Escape Sequences
If you want to match special characters literally (like . or *), you need to escape them with a backslash \. For example,
/\./
This matches a literal period (.).
Let's check if the following string examples match the regex pattern \. (an escaped period).
| String | Matched? | Reason |
|---|---|---|
| hello. | 1 Match | matches the dot character |
| hello | 0 Match | no dot present |
| hi.. | 2 Matches | two literal dots matched |
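When the text to match comes from a variable, escaping by hand is error-prone. Ruby's Regexp.escape adds the backslashes for you. The sketch below compares an unescaped and an escaped pattern; literal_pattern is a made-up helper and "3.14" is a sample value.

```ruby
# Hypothetical helper: a pattern that matches text literally, character for character
def literal_pattern(text)
  Regexp.new(Regexp.escape(text))
end

p "hi..".scan(/\./)           # both literal dots

puts Regexp.escape("3.14")    # the dot gains a backslash

puts(/3.14/.match?("3x14"))                  # true: an unescaped dot matches any character
puts literal_pattern("3.14").match?("3x14")  # false: the dot is now literal
puts literal_pattern("3.14").match?("pi is 3.14")
```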