Saturday, October 30, 2010

REGULAR EXPRESSION – Cont.

Some Definitions:

Literal:

Any character used in a search or matching expression that stands for itself.

Metacharacter:

One or more special characters that have a unique meaning and are not used as literals.

Target String:

The string that we will be searching.

Search Expression:

The expression that we will be using to search our target string.

Metacharacters:

  • “[ ]” – Matches any one of the characters inside the square brackets, for exactly one character position. Eg: [12] matches 1 or 2.
  • “-” – Range separator. Inside square brackets it defines a range. Eg: [0-9] matches any one digit from 0 to 9.
  • “^” – As the first character inside square brackets, it negates the set. Eg: [^Ff] matches any character except upper or lower case F. Outside square brackets it anchors the match to the beginning of the target string. Eg: “^Hai” matches “Hai” only at the beginning.
  • “$” – Anchors the match to the end of the target string. Eg: “bye$” matches “bye” only at the end.
  • “.” – Matches any single character in this position. Eg: “ton.” finds “tons” and “tonneau” but not “wanton”, because nothing follows “ton” there.
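
The metacharacters above can be tried out in a few lines. This is only a minimal sketch using Python's standard re module (the choice of Python and the sample strings other than “tons”, “tonneau” and “wanton” are assumptions for illustration; other regex engines behave much the same for these basics).

```python
import re

# Character class: one character that is either 1 or 2.
class_hits = re.findall(r"[12]", "a1b2c3")

# Range inside a class: any single digit.
digit_hits = re.findall(r"[0-9]", "room 42")

# Negated class: any single character except F or f.
non_f_hits = re.findall(r"[^Ff]", "Fluffy")

# Anchors: ^ ties the match to the start, $ to the end of the target string.
starts_with_hai = re.search(r"^Hai", "Hai there") is not None
hai_not_at_start = re.search(r"^Hai", "Oh Hai") is not None
ends_with_bye = re.search(r"bye$", "good bye") is not None

# Dot: "ton." needs one more character after "ton".
dot_results = {
    word: re.search(r"ton.", word) is not None
    for word in ["tons", "tonneau", "wanton"]
}

print(class_hits, digit_hits, non_f_hits)
print(starts_with_hai, hai_not_at_start, ends_with_bye)
print(dot_results)
```

Note that re.search looks for the pattern anywhere in the target string, which is why the anchors are needed to pin a match to the beginning or the end.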

Iteration Metacharacters:

Quantifiers that control how many times the preceding character or group must occur for a match.

  • “?” – Matches the preceding character 0 or 1 time only. Eg: “Colou?r” will match in both “Colors” and “Colours”.
  • “*” – Matches the preceding character 0 or more times. Eg: “tre*” will find “tree” and “tread”, and also “trough”, since zero occurrences of “e” are allowed.
  • “+” – Matches the preceding character 1 or more times. Eg: “tre+” will find “tree” and “tread” but not “trough”.
  • “{n}” – Matches the preceding character exactly n times. Eg: To search for a phone no. we can use “[0-9]{3}-[0-9]{4}”, which will find 123-4567.
  • “{n,m}” – Matches the preceding character at least n times but not more than m times. Eg: “ba{2,3}b” will find “baab” and “baaab” but not “bab” or “baaaab”.
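
The quantifiers can be checked the same way. The sketch below again assumes Python's re module; the helper name matches and the extra test words (“Colr”, “trough”, “toe”, and the hyphen-separated phone numbers) are made up for illustration.

```python
import re

def matches(pattern, words):
    """Return the words in which the pattern is found anywhere."""
    return [w for w in words if re.search(pattern, w)]

# ? : the "u" may appear zero or one time.
optional_u = matches(r"Colou?r", ["Colors", "Colours", "Colr"])

# * : "tr" followed by zero or more "e".
zero_or_more_e = matches(r"tre*", ["tree", "tread", "trough", "toe"])

# + : "tr" followed by at least one "e".
one_or_more_e = matches(r"tre+", ["tree", "tread", "trough"])

# {n} : exactly three digits, a hyphen, exactly four digits.
exactly_n = matches(r"[0-9]{3}-[0-9]{4}", ["123-4567", "12-4567", "123-456"])

# {n,m} : between two and three "a" characters between the two "b"s.
bounded = matches(r"ba{2,3}b", ["bab", "baab", "baaab", "baaaab"])

for name, result in [
    ("?", optional_u),
    ("*", zero_or_more_e),
    ("+", one_or_more_e),
    ("{n}", exactly_n),
    ("{n,m}", bounded),
]:
    print(name, result)
```

A quantifier applies only to the single character (or bracket expression) just before it, so in “[0-9]{4}” the “{4}” must follow the closing bracket directly, with no space in between.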

Other Metacharacters:

  • “()” – Used to group parts of our search expression together.
  • “|” – Alternation. In techspeak it means find either the left hand or the right hand value. Eg: “gr(a|e)y” will find “gray” or “grey”.
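
Grouping and alternation are easiest to see side by side. The following is a small sketch, assuming Python's re module; the pattern “gr(a|e)y”, the word “groy” and the “(ab){2}” example are illustrative choices.

```python
import re

# Alternation inside a group: "gr", then "a" or "e", then "y".
gray_or_grey = re.compile(r"gr(a|e)y")
found = [w for w in ["gray", "grey", "groy"] if gray_or_grey.search(w)]

# The group also captures which alternative was used.
captured = gray_or_grey.search("grey").group(1)

# Grouping lets a quantifier apply to a whole sequence, not just one character.
grouped_repeat = re.fullmatch(r"(ab){2}", "abab") is not None
ungrouped_repeat = re.fullmatch(r"ab{2}", "abab") is not None  # ab{2} means "abb"

print(found, captured, grouped_repeat, ungrouped_repeat)
```

Without the parentheses, “gra|ey” would mean “gra” or “ey”, because “|” splits the whole expression unless a group limits its reach.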
