Basic Regular Expressions for Beginners

Regular expressions are patterns made up of characters and symbols. They are a powerful way of finding and extracting values from long text strings or paragraphs. I will start with the basics.

What symbols and characters are used in Regular Expressions?

You have probably seen characters such as the asterisk (*) and the dot (.) in regular expressions.

  • ^ : ^ marks the start of the string.

^insect : This pattern will match all the strings which begin with insect.

  • $ : $ marks the end of the string.
  • + : + (plus) matches one or more occurrences of the preceding character.
  • * : * matches zero or more occurrences of the preceding character.
  • ? : ? matches zero or one occurrence of the preceding character.
  • . : . (dot) matches exactly one character of any kind (by default, any character except a newline).
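
As a rough illustration of the anchors and the dot, here is a minimal Python sketch using the standard re module. The variable names and sample strings are made up for this example.

```python
import re

# ^ anchors the match at the start of the string.
starts_with_insect = re.compile(r"^insect")
# $ anchors the match at the end of the string.
ends_with_insect = re.compile(r"insect$")
# . stands for exactly one character between c and d.
c_any_d = re.compile(r"^c.d$")

print(bool(starts_with_insect.search("insects are small")))  # True
print(bool(starts_with_insect.search("an insect")))          # False
print(bool(ends_with_insect.search("an insect")))            # True
print(bool(c_any_d.search("cod")))                           # True
print(bool(c_any_d.search("cd")))                            # False
```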

So if I have an expression such as c?o+d*, it will match all of the following strings:

co, ooddd, cod, coddd and so on.

In the above results, c may or may not be present, while at least one o will always be there. Likewise, d may or may not be present.
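
To check this behaviour yourself, a small Python sketch along these lines should work. It uses re.fullmatch so that the whole string has to fit the pattern. The list of candidate strings is my own choice for illustration.

```python
import re

# c? : optional c, o+ : one or more o, d* : zero or more d
pattern = re.compile(r"c?o+d*")

candidates = ["co", "ooddd", "cod", "coddd", "cd", "d"]

# fullmatch succeeds only if the entire string fits the pattern.
results = {text: bool(pattern.fullmatch(text)) for text in candidates}

for text, matched in results.items():
    print(text, matched)
```

The last two candidates fail because the pattern needs at least one o.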

So let's carry on to some other symbols of regular expressions. It's getting easier and more interesting now.

Code{3} : This expression will match Cod followed by exactly 3 e’s.

e.g. Codeee

If we write Code{3,}, it will match Cod followed by at least 3 e’s.

Code{1,3} will match any string that contains Cod followed by 1 to 3 e’s.
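
The three counted forms can be compared side by side. This is only a sketch in Python, with test strings I picked myself. Note that the braces apply to the e alone, not to the whole word.

```python
import re

exactly_three = re.compile(r"Code{3}")    # Cod + exactly 3 e's
at_least_three = re.compile(r"Code{3,}")  # Cod + 3 or more e's
one_to_three = re.compile(r"Code{1,3}")   # Cod + 1 to 3 e's

print(bool(exactly_three.fullmatch("Codeee")))     # True
print(bool(exactly_three.fullmatch("Codee")))      # False
print(bool(at_least_three.fullmatch("Codeeeee")))  # True
print(bool(one_to_three.fullmatch("Code")))        # True
print(bool(one_to_three.fullmatch("Codeeee")))     # False
```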

If you want any string that starts with a letter, you can use:
^[a-zA-Z] : This will match any string that begins with an uppercase or lowercase letter.
For alphanumeric characters you can use:
N[a-zA-Z0-9]$ : This will match N followed by a single alphanumeric character at the end of the string.
If you don’t want a letter, you can negate the character class by putting ^ right after the opening bracket:
C[^a-zA-Z]% : This will match C, followed by any single character other than a letter, followed by %.
^.{3}$ : This expression will match a string with exactly 3 characters.
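
Here is a short Python sketch of these four character-class patterns. The variable names and sample inputs are assumptions made for the example only.

```python
import re

starts_with_letter = re.compile(r"^[a-zA-Z]")
n_then_alnum_at_end = re.compile(r"N[a-zA-Z0-9]$")
non_letter_between = re.compile(r"C[^a-zA-Z]%")
exactly_three_chars = re.compile(r"^.{3}$")

print(bool(starts_with_letter.search("abc123")))  # True
print(bool(starts_with_letter.search("9lives")))  # False
print(bool(n_then_alnum_at_end.search("PIN7")))   # True
print(bool(n_then_alnum_at_end.search("N77")))    # False, two characters follow N
print(bool(non_letter_between.search("C5%")))     # True
print(bool(non_letter_between.search("Ca%")))     # False, a is a letter
print(bool(exactly_three_chars.search("abc")))    # True
print(bool(exactly_three_chars.search("abcd")))   # False
```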

Regular expression to validate an e-mail address:

$validEmail = "^[a-z0-9_\+-]+(\.[a-z0-9_\+-]+)*@[a-z0-9-]+(\.[a-z0-9-]+)*\.([a-z]{2,4})$";

I will explain in detail how it validates an email address.

Let's take the first part, i.e. : ^[a-z0-9_\+-]+

^ marks the beginning of the string. After that, the character class means the address starts with any of the letters a-z, the digits 0-9, or the characters ‘_’, ‘+’ and ‘-’. The last + indicates one or more occurrences of these characters.

So strings like code_insects and code+insects09 will match this expression.

Now move to the second part, i.e. : (\.[a-z0-9_\+-]+)*

The \. at the start matches a literal dot (.) that follows the characters matched so far. After that the pattern continues as in the first part: any letters, numbers, and the symbols +, - and _. The + after the brackets indicates one or more occurrences of these characters. The final * indicates zero or more occurrences of the whole group.

So fragments like .code87+ and .09_sa will match the second part.
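
The part of the address before the @ is just these two pieces joined together. As a sketch, it can be tested on its own in Python. The name local_part and the sample strings are mine, not part of the original expression.

```python
import re

# First part followed by zero or more ".something" groups.
local_part = re.compile(r"[a-z0-9_\+-]+(\.[a-z0-9_\+-]+)*")

samples = [
    "code_insects",
    "code+insects09",
    "code_insects.code87+",
    "code..insects",  # two dots in a row
    ".code",          # starts with a dot
    "code.",          # ends with a dot
]

local_results = {s: bool(local_part.fullmatch(s)) for s in samples}

for text, matched in local_results.items():
    print(text, matched)
```

The last three samples are rejected because every dot has to be followed by at least one allowed character, and the string cannot begin with a dot.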

After these characters there must be a single “@” character. It must be followed by a domain label that consists of letters, numbers and hyphens. There can be one or more domain labels separated with a period.

@[a-z0-9-]+(\.[a-z0-9-]+)*\.([a-z]{2,4})$

The first part is defined by [a-z0-9-]+.

After this there can be 0 or more similar sequences starting with a period. This is defined as (\.[a-z0-9-]+)*.

The last part checks that the email address ends with a period followed by 2 to 4 letters (for example .com or .info). The expression \.([a-z]{2,4})$ matches this.
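
Putting the pieces together, the whole expression could be tried out roughly like this in Python. The function name is_valid_email and the sample addresses are hypothetical, chosen only to show which inputs pass and which fail.

```python
import re

# Same pattern as $validEmail above, written as a Python raw string.
VALID_EMAIL = re.compile(
    r"^[a-z0-9_\+-]+(\.[a-z0-9_\+-]+)*"      # part before the @
    r"@[a-z0-9-]+(\.[a-z0-9-]+)*"            # @ and the domain labels
    r"\.([a-z]{2,4})$"                       # final period plus 2 to 4 letters
)


def is_valid_email(address: str) -> bool:
    """Return True if the whole address fits the pattern (hypothetical helper)."""
    return VALID_EMAIL.fullmatch(address) is not None


print(is_valid_email("code_insects@example.com"))          # True
print(is_valid_email("first.last+tag@mail.example.info"))  # True
print(is_valid_email("user@example"))                      # False, no final .xx part
print(is_valid_email("user@@example.com"))                 # False, two @ characters
print(is_valid_email("user@example.museum"))               # False, ending longer than 4 letters
print(is_valid_email("User@example.com"))                  # False, uppercase letter
```

As written, the pattern only accepts lowercase letters. One option is to lowercase the input first, or to compile the pattern with re.IGNORECASE, if mixed-case addresses should pass.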

So that's all about the basics of regular expressions. I hope you have found this post useful. I will come back with more advanced regular expression examples and their usage. Till then, happy pattern matching!
