A regular expression, or RegEx, is a string of characters that defines the pattern or patterns you are searching for. The syntax of regular expressions in Perl is very similar to what you will find in other programs that support regular expressions, such as sed, grep, and awk.
In this post, Perl regex is illustrated with examples.
Previous posts in this Perl series:
- PERL - Tutorial Part 1 - ElecDude
- PERL TUTORIAL PART 2
- PERL TUTORIAL PART 3 – ELECDUDE
- PERL TUTORIAL PART 4 - Working with files
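The basic method for applying a regular expression is to use the pattern binding operators =~ and !~. The first operator, =~, binds a string to a match, substitution, or transliteration and yields the result of that operation. The second, !~, does the same but negates the result.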
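As a minimal sketch of the two binding operators, the snippet below tests one string with each of them. The variable $text and its contents are made up for illustration only.

```perl
use strict;
use warnings;

# $text is a hypothetical sample string used only for illustration.
my $text = 'Perl regular expressions';

# =~ is true when the pattern is found in the string.
my $has_perl = ($text =~ /Perl/) ? 'yes' : 'no';

# !~ is true when the pattern is NOT found in the string.
my $lacks_python = ($text !~ /Python/) ? 'yes' : 'no';

print "Contains Perl: $has_perl\n";
print "Lacks Python: $lacks_python\n";
```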
There are three regular expression operators within Perl:
· Match Regular Expression - m//<modifiers>
· Substitute Regular Expression - s///<modifiers>
· Transliterate Regular Expression - tr///<modifiers>
The forward slashes in each case act as delimiters for the regular expression (regex) that you are specifying. If you are more comfortable with another delimiter, you can use it in place of the forward slash.
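The choice of delimiter matters most when the pattern itself contains slashes. The following sketch, which uses a hypothetical $path value, shows the same match written with slashes and with braces, plus a substitution delimited by #.

```perl
use strict;
use warnings;

# $path is a hypothetical sample value.
my $path = '/usr/local/bin/perl';

# With slashes as delimiters, every / in the pattern must be escaped.
my $with_slashes = ($path =~ m/^\/usr\/local\//) ? 1 : 0;

# With braces as delimiters, the slashes need no escaping.
my $with_braces = ($path =~ m{^/usr/local/}) ? 1 : 0;

# The substitution operator accepts other delimiters as well.
(my $short = $path) =~ s#^/usr/local##;

print "$with_slashes $with_braces $short\n";
```

Note that the leading m is required whenever the delimiter is not a forward slash.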
THE MATCH OPERATOR
The match operator, m//, is used to match a string or statement to a regular expression. For example, to match the character sequence "foo" against the scalar $bar, you might use a statement like this:
if ($bar =~ /foo/)
Note that the entire match expression, that is, the expression on the left of =~ or !~ together with the match operator, returns true (in a scalar context) if the expression matches. Therefore the statement:
$true = ($foo =~ m/foo/);
will set $true to 1 if $foo matches the regex, or to an empty string (which is false) if the match fails.
In a list context, the match returns the contents of any grouped expressions. For example, when extracting the hours, minutes, and seconds from a time string, we can use:
my ($hours, $minutes, $seconds) = ($time =~ m/(\d+):(\d+):(\d+)/);
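The difference between scalar and list context can be sketched as follows. The value of $time and the $no_* variable names are assumptions chosen for the example.

```perl
use strict;
use warnings;

# $time is a hypothetical sample value.
my $time = '12:05:30';

# Scalar context: the result is simply true or false.
my $matched = ($time =~ m/(\d+):(\d+):(\d+)/) ? 1 : 0;

# List context: the result is the list of captured groups.
my ($hours, $minutes, $seconds) = ($time =~ m/(\d+):(\d+):(\d+)/);

# A failed match in list context returns an empty list,
# so the variables on the left stay undefined.
my ($no_h, $no_m, $no_s) = ('no time here' =~ m/(\d+):(\d+):(\d+)/);

print "Matched: $matched\n";
print "Hours: $hours, Minutes: $minutes, Seconds: $seconds\n";
print "Failed match: ", (defined $no_h ? 'defined' : 'undefined'), "\n";
```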
Match Operator Modifiers
The match operator supports its own set of modifiers. The /g modifier allows for global matching. The /i modifier will make the match case insensitive. Here is a list of the commonly used modifiers:
Modifier | Description
i | Makes the match case insensitive
m | Specifies that if the string has newline or carriage return characters, the ^ and $ operators will now match against a newline boundary, instead of a string boundary
o | Evaluates the expression only once
s | Allows use of . to match a newline character
x | Allows you to use white space in the expression for clarity
g | Globally finds all matches
cg | Allows the search to continue even after a global match fails
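A short sketch of several of these modifiers in action is given below. The sample text in $text is an assumption made for the example.

```perl
use strict;
use warnings;

# $text is a hypothetical sample string with three lines.
my $text = "Foo one\nfoo two\nFOO three";

# /i ignores case; /g in list context returns every match.
my @all = ($text =~ /foo/gi);

# /m lets ^ match at the start of each line, not only of the string.
my @line_starts = ($text =~ /^foo/gim);

# /s lets . match a newline, so the match can span lines.
my ($spanning) = ($text =~ /(one.*two)/s);

# /x allows white space (and comments) inside the pattern.
my ($word) = ($text =~ / (\w+) \s+ three /x);

print scalar(@all), " matches, ", scalar(@line_starts), " at line starts\n";
print "Word before 'three': $word\n";
```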
The following codes can be used inside a pattern:
Code | Description
\w | Alphanumeric Characters (including the underscore)
\W | Non-Alphanumeric Characters
\s | White Space
\S | Non-White Space
\d | Digits
\D | Non-Digits
\b | Word Boundary
\B | Non-Word Boundary
\A or ^ | At the Beginning of a String
\Z or $ | At the End of a String
. | Match Any Single Character
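To illustrate a few of these codes, here is a small sketch that pulls fields out of a log-style line. The content of $line is hypothetical.

```perl
use strict;
use warnings;

# $line is a hypothetical log-style string.
my $line = 'user_42 logged in at 09:15';

# \w+ matches a run of word characters (letters, digits, underscore).
my ($user) = ($line =~ /^(\w+)/);

# \d matches a digit; \b anchors the match at a word boundary.
my ($hour) = ($line =~ /\b(\d\d):\d\d$/);

# \s+ matches the white space between words, so split yields the words.
my @words = split /\s+/, $line;

print "User: $user, hour: $hour, words: ", scalar(@words), "\n";
```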
The codes that control the number of occurrences, together with sets and grouping, are listed below:
Code | Description
* | Zero or More Occurrences
? | Zero or One Occurrence
+ | One or More Occurrences
{N} | Exactly N Occurrences
{N,M} | Between N and M Occurrences
.*<thingy> | Greedy Match, up to the last thingy
.*?<thingy> | Non-Greedy Match, up to the first thingy
[set_of_things] | Match Any Item in the Set
[^set_of_things] | Match Any Item Not in the Set
(some_expression) | Tag (Capture) an Expression
$1..$N | Tagged Expressions used in Substitutions
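The difference between greedy and non-greedy matching is easiest to see side by side. The sketch below uses a hypothetical $html string and also exercises a counted quantifier and a negated set.

```perl
use strict;
use warnings;

# $html is a hypothetical sample string.
my $html = '<b>bold</b> and <i>italic</i>';

# Greedy: .* runs to the LAST > in the string.
my ($greedy) = ($html =~ /(<.*>)/);

# Non-greedy: .*? stops at the FIRST > it can.
my ($lazy) = ($html =~ /(<.*?>)/);

# {N} with a character set: exactly four characters from a-z.
my ($four) = ($html =~ /([a-z]{4})/);

# A negated set: one or more characters that are not < or >.
my ($inner) = ($html =~ />([^<>]+)</);

print "Greedy: $greedy\n";
print "Lazy: $lazy\n";
print "Four letters: $four, inner text: $inner\n";
```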
Matching Only Once
There is also a variant of the match operator, the ?PATTERN? operator. This is basically identical to the m// operator except that it matches only once between calls to reset. Since Perl 5.22 it must be written with the leading m, as m?PATTERN?.
For example, you can use this to get the first and last matching elements within a list:
@list = qw/food foosball subeo footnote terfoot canic footbridge/;
foreach (@list)
{
    $first = $1 if m?(foo.*)?;
    $last = $1 if /(foo.*)/;
}
print "First: $first, Last: $last\n";
This will produce the following result:
First: food, Last: footbridge
THE SUBSTITUTION OPERATOR
The substitution operator, s///, is really just an extension of the match operator that allows you to replace the text matched with some new text. The basic form of the operator is:
s/PATTERN/REPLACEMENT/;
The PATTERN is the regular expression for the text that we are looking for. The REPLACEMENT is the text that replaces the found text; it is not a regular expression, but it may refer to captured groups such as $1.
For example, we can replace the first occurrence of "dog" with "cat" (add the /g modifier to replace all occurrences) using
$string =~ s/dog/cat/;
Another example:
$string = 'The cat sat on the mat';
$string =~ s/cat/dog/;
print "Final Result is $string\n";
This will produce the following result:
Final Result is The dog sat on the mat
Substitution Operator Modifiers
Here is a list of the modifiers used with the substitution operator:
Modifier | Description
i | Makes the match case insensitive
m | Specifies that if the string has newline or carriage return characters, the ^ and $ operators will now match against a newline boundary, instead of a string boundary
o | Evaluates the expression only once
s | Allows use of . to match a newline character
x | Allows you to use white space in the expression for clarity
g | Replaces all occurrences of the found expression with the replacement text
e | Evaluates the replacement as if it were a Perl statement, and uses its return value as the replacement text
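The effect of the /g, /i, and /e modifiers can be sketched as follows. The sample sentence in $string is an assumption; each substitution works on a copy so that the results can be compared.

```perl
use strict;
use warnings;

# $string is a hypothetical sample value.
my $string = 'The Dog chased the dog 3 times';

# Without /g only the first match is replaced.
(my $first_only = $string) =~ s/dog/cat/i;

# /g replaces every match; /i ignores case.
(my $all = $string) =~ s/dog/cat/gi;

# /e evaluates the replacement as Perl code; here the number is doubled.
(my $doubled = $string) =~ s/(\d+)/$1 * 2/e;

# In scalar context s/// returns the number of substitutions made.
my $count = ($string =~ s/dog/cat/gi);

print "$first_only\n$all\n$doubled\nReplaced $count times\n";
```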
TRANSLATION
Translation is similar, but not identical, to the principles of substitution. Unlike substitution, translation (or transliteration) does not use regular expressions for its search or replacement values. The translation operators are:
tr/SEARCHLIST/REPLACEMENTLIST/cds
y/SEARCHLIST/REPLACEMENTLIST/cds
The translation replaces all occurrences of the characters in SEARCHLIST with the corresponding characters in REPLACEMENTLIST. For example, using the "The cat sat on the mat." string we have been using in this post:
$string = 'The cat sat on the mat.';
$string =~ tr/a/o/;
print "$string\n";
This will produce the following result:
The cot sot on the mot.
Standard Perl ranges can also be used, allowing you to specify ranges of characters either by letter or numerical value. To convert the string to upper case, you might use the following syntax in place of the uc function.
$string =~ tr/a-z/A-Z/;
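As a small sketch of ranges in practice, the snippet below upper-cases a copy of a string with tr and checks it against uc. It also uses the fact that tr returns the number of characters it matched, which makes it handy for counting. The sample string is an assumption.

```perl
use strict;
use warnings;

# $string is a hypothetical sample value.
my $string = 'The cat sat on the mat';

# tr returns the number of characters it matched; with an empty
# replacement list the string itself is left unchanged.
my $a_count = ($string =~ tr/a//);

# Upper-case a copy with tr and compare it with the built-in uc.
(my $upper = $string) =~ tr/a-z/A-Z/;
my $same = ($upper eq uc $string) ? 'same' : 'different';

print "Letter a appears $a_count times\n";
print "$upper ($same as uc)\n";
```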
Translation Operator Modifiers
Following is the list of modifiers related to translation:
Modifier | Description
c | Complement SEARCHLIST.
d | Delete found but unreplaced characters.
s | Squash duplicate replaced characters.
The /d modifier deletes the characters matching SEARCHLIST that do not have a corresponding entry in REPLACEMENTLIST. For example:
$string = 'the cat sat on the mat.';
$string =~ tr/a-z/b/d;
print "$string\n";
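This will produce the following result (the spaces and the period are not in SEARCHLIST, so they are kept, including the leading space):
 b b   b.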
The last modifier, /s, removes the duplicate sequences of characters that were replaced, so:
$string = 'food';
$string =~ tr/a-z/a-z/s;
print $string;
This will produce the following result:
fod
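The /c modifier has no example above, so here is a hedged sketch of how it might be combined with /d and /s. The sample string is hypothetical.

```perl
use strict;
use warnings;

# $string is a hypothetical sample value.
my $string = 'Phone: 555-1234 (home)';

# /c complements the search list: here it means "everything that is
# not a digit". Combined with /d, those characters are deleted.
(my $digits = $string) =~ tr/0-9//cd;

# /c with /s: replace each run of non-letters with a single space.
(my $letters = $string) =~ tr/a-zA-Z/ /cs;

print "$digits\n$letters\n";
```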
For queries, please post a comment or write to admin@elecdude.com