class Matcher

Mutable object used on instances of a Pattern class

Public Fields

static const int MATCH_ENTIRE_STRING
Used internally by match to signify we want the entire string matched

Public Methods

std::vector<std::string> findAll ()
Returns a vector of every substring in order which matches the given pattern.
bool findFirstMatch ()
Scans the string for the first substring matching the pattern.
bool findNextMatch ()
Scans the string for the next substring matching the pattern.
int getEndingIndex (const int groupNum = 0) const
Returns the ending index of the specified group.
unsigned long getFlags () const
The flags currently being used by the matcher.
std::string getGroup (const int groupNum = 0) const
Returns the specified group.
std::vector<std::string> getGroups (const bool includeGroupZero = 0) const
Returns every capture group in a vector.
int getStartingIndex (const int groupNum = 0) const
Returns the starting index of the specified group.
inline std::string getString () const
Same as getText.
std::string getText () const
The text being searched by the matcher.
bool matches ()
Scans the string from start to finish for a match.
std::string replaceWithGroups (const std::string & str)
Replaces the contents of str with the appropriate captured text.
void reset ()
Resets the internal state of the matcher.
inline void setString (const std::string & newStr)
Sets the string to scan.
~Matcher ()
Cleans up the dynamic memory used by this matcher.

Protected Fields

int* ends
An array of the ending positions for each group
unsigned long flags
The flags with which we were made
int gc
The number of capturing groups we have
int* groupIndeces
An array of private data used by NFANodes during matching
int* groupPos
An array of private data used by NFANodes during matching
int* groups
An array of private data used by NFANodes during matching
int lm
The ending index of the last match
int matchedSomething
Whether or not we have matched something (used only by findFirstMatch and findNextMatch)
int ncgc
The number of non-capturing groups we have
Pattern* pat
The pattern we use to match
int start
The starting point of our match
int* starts
An array of the starting positions for each group
std::string str
The string in which we are matching

Protected Methods

void clearGroups ()
Called by reset to clear the group arrays


Documentation

A Matcher is a non-thread-safe object used to scan strings using a given Pattern object. Using a Matcher is the preferred method for scanning strings. Because Matchers require very little dynamic memory, you are encouraged to create a separate instance wherever one is needed (for example, one per thread) rather than sharing a single instance.

The most commonly used Matcher methods are matches, findNextMatch, and getGroup. matches and findNextMatch both return success or failure; see their documentation for further details.

Unlike Java's Matcher, this class allows you to change the string you are matching against. This provides a small optimization, since you no longer need multiple matchers for a single pattern in a single thread.

This class also provides an extremely handy method for replacing text with captured data via the replaceWithGroups method. A typical invocation looks like:

  char buf[10000];
  // \1 through \7 refer to the seven colon-separated fields of a passwd entry
  std::string str = "\\5 (user name \\1) uses \\7 for his/her shell and \\6 is their home directory";
  Pattern::registerPattern("entry", "[^:]+");
  Pattern * p = Pattern::compile("^({entry}):({entry}):({entry}):({entry}):({entry}):({entry}):({entry})$",
                                 Pattern::MULTILINE_MATCHING | Pattern::UNIX_LINE_MODE);
  Matcher * m = p->createMatcher("");
  FILE * fp = fopen("/etc/passwd", "r");
  if (fp != NULL)  // skip the loop if the file could not be opened
  {
    while (fgets(buf, sizeof(buf), fp))
    {
      m->setString(buf);  // reuse the same matcher for every line
      if (m->matches())
      {
        printf("%s\n", m->replaceWithGroups(str).c_str());
      }
    }
    fclose(fp);
  }

Calling any of the functions that query match results (for example getGroup, getGroups, getStartingIndex, or getEndingIndex) before first calling matches, findFirstMatch, or findNextMatch results in undefined behavior and may cause your program to crash.

The function findFirstMatch will attempt to find the first match in the input string. The same results can be obtained by first calling reset followed by findNextMatch.

To eliminate the necessity of looping through a string to find all the matching substrings, findAll was created. The function will find all matching substrings and return them in a vector. If you need to examine specific capture groups within the substrings, then this method should not be used.
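The trade-off between findAll and an explicit loop can be sketched with the standard library's std::regex, which is used here only as a stand-in so that the example compiles without this library. The helper names allMatches and keyValuePairs are hypothetical and are not part of Matcher. The first function plays the role of findAll (whole substrings only); the second plays the role of a findNextMatch loop in which each match's capture groups are still available.

```cpp
#include <regex>
#include <string>
#include <utility>
#include <vector>

// Stand-in for findAll: every matching substring, in order.
// The capture groups of each match are discarded.
std::vector<std::string> allMatches(const std::string& text, const std::regex& re) {
    std::vector<std::string> out;
    for (std::sregex_iterator it(text.begin(), text.end(), re), end; it != end; ++it) {
        out.push_back(it->str());
    }
    return out;
}

// Stand-in for a findNextMatch loop: groups 1 and 2 of every match are kept.
std::vector<std::pair<std::string, std::string>> keyValuePairs(const std::string& text) {
    static const std::regex re("(\\w+)=(\\w+)");
    std::vector<std::pair<std::string, std::string>> out;
    for (std::sregex_iterator it(text.begin(), text.end(), re), end; it != end; ++it) {
        out.emplace_back((*it)[1].str(), (*it)[2].str());
    }
    return out;
}
```

With a Matcher, the second form would presumably be a loop on findNextMatch that calls getGroup(1) and getGroup(2) inside the loop body.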

Pattern* pat
The pattern we use to match

std::string str
The string in which we are matching

int start
The starting point of our match

int* starts
An array of the starting positions for each group

int* ends
An array of the ending positions for each group

int* groups
An array of private data used by NFANodes during matching

int* groupIndeces
An array of private data used by NFANodes during matching

int* groupPos
An array of private data used by NFANodes during matching

int lm
The ending index of the last match

int gc
The number of capturing groups we have

int ncgc
The number of non-capturing groups we have

int matchedSomething
Whether or not we have matched something (used only by findFirstMatch and findNextMatch)

unsigned long flags
The flags with which we were made

void clearGroups()
Called by reset to clear the group arrays

static const int MATCH_ENTIRE_STRING
Used internally by match to signify we want the entire string matched

~Matcher()
Cleans up the dynamic memory used by this matcher

std::string replaceWithGroups(const std::string & str)
Replaces the back references in str with the appropriate captured text. str should contain at least one back reference; otherwise this function does nothing.
Parameters:
str - The string in which to replace text
Returns:
A string with all backreferences appropriately replaced
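As a rough illustration of what back-reference expansion involves, here is a minimal sketch built on std::regex rather than on this library. The names expandBackrefs and describeEntry are hypothetical, and the sketch assumes that a back reference is a backslash followed by a single decimal digit; it is not the library's implementation and may differ from it in edge cases.

```cpp
#include <cctype>
#include <regex>
#include <string>

// Hypothetical helper: expands \N (assumed: one decimal digit) in `tmpl`
// using the groups of a std::smatch that has already matched.
std::string expandBackrefs(const std::smatch& m, const std::string& tmpl) {
    std::string out;
    for (std::size_t i = 0; i < tmpl.size(); ++i) {
        const bool isRef = tmpl[i] == '\\' && i + 1 < tmpl.size() &&
                           std::isdigit(static_cast<unsigned char>(tmpl[i + 1]));
        if (!isRef) {
            out += tmpl[i];  // ordinary character, copied through
            continue;
        }
        const std::size_t group = static_cast<std::size_t>(tmpl[i + 1] - '0');
        if (group < m.size()) {
            out += m[group].str();  // an unmatched group contributes ""
        }
        ++i;  // skip the digit that followed the backslash
    }
    return out;
}

// Matches a three-field, colon-separated line and reorders two of its fields.
std::string describeEntry(const std::string& line) {
    static const std::regex re("([^:]+):([^:]+):([^:]+)");
    std::smatch m;
    if (!std::regex_match(line, m, re)) {
        return "";  // nothing captured, so nothing to expand
    }
    return expandBackrefs(m, "\\3 belongs to \\1");
}
```

The template string is left untouched and a new string is returned, which mirrors the by-value return type of replaceWithGroups.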

unsigned long getFlags() const
The flags currently being used by the matcher.
Returns:
The flags currently being used by the matcher

std::string getText() const
The text being searched by the matcher.
Returns:
The text being searched by the matcher

bool matches()
Scans the string from start to finish for a match. The entire string must match for this function to return success. Group variables are appropriately set and can be queried after this function returns.

Returns:
Success if and only if the entire string matches the pattern

bool findFirstMatch()
Scans the string for the first substring matching the pattern. The entire string does not necessarily have to match for this function to return success. Group variables are appropriately set and can be queried after this function returns.

Returns:
Success if any substring matches the specified pattern
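The difference between matches and findFirstMatch is the familiar distinction between a whole-string match and a substring search. It can be sketched with std::regex_match and std::regex_search from the standard library, used here as stand-ins so the example compiles on its own; matchesWhole and containsMatch are hypothetical names.

```cpp
#include <regex>
#include <string>

// Comparable to matches(): the pattern must cover the entire string.
bool matchesWhole(const std::string& text, const std::regex& re) {
    return std::regex_match(text, re);
}

// Comparable to findFirstMatch(): any substring of the text may match.
bool containsMatch(const std::string& text, const std::regex& re) {
    return std::regex_search(text, re);
}
```

For the pattern [0-9]+ and the text "abc 123", the first function reports failure and the second reports success.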

bool findNextMatch()
Scans the string for the next substring matching the pattern. If no calls have been made to findFirstMatch or findNextMatch since the last call to reset, matches, or setString, then this function behaves like findFirstMatch.

Returns:
Success if another substring can be found that matches the pattern

std::vector<std::string> findAll()
Returns a vector of every substring in order which matches the given pattern.

Returns:
Every substring in order which matches the given pattern

void reset()
Resets the internal state of the matcher.

inline std::string getString() const
Same as getText. Left in for backward compatibility with old source code.
Returns:
The string that is currently being used for matching

inline void setString(const std::string & newStr)
Sets the string to scan.
Parameters:
newStr - The string to scan for subsequent matches

int getStartingIndex(const int groupNum = 0) const
Returns the starting index of the specified group.
Parameters:
groupNum - The group to query
Returns:
The starting index of the group if it was matched, -1 for an invalid group or if the group was not matched

int getEndingIndex(const int groupNum = 0) const
Returns the ending index of the specified group.
Parameters:
groupNum - The group to query
Returns:
The ending index of the group if it was matched, -1 for an invalid group or if the group was not matched

std::string getGroup(const int groupNum = 0) const
Returns the specified group. An empty string ("") does not necessarily mean the group was not matched: a group such as (a*b?) can match a zero-length string. If an empty string is returned, call getStartingIndex to determine whether the group was actually matched.
Parameters:
groupNum - The group to query
Returns:
The text of the group
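The "empty but matched" case is easy to confuse with "not matched at all". The sketch below reproduces it with std::regex as a stand-in for this library; groupStart, GroupInfo, and inspectGroup are hypothetical names. groupStart follows the -1 convention documented for getStartingIndex, so an empty group that took part in the match reports a real offset while an unmatched group reports -1.

```cpp
#include <regex>
#include <string>

// Start offset of group `n`, or -1 when the group is out of range or did
// not participate in the match.
int groupStart(const std::smatch& m, std::size_t n) {
    if (n >= m.size() || !m[n].matched) {
        return -1;
    }
    return static_cast<int>(m.position(n));
}

struct GroupInfo {
    std::string text;  // captured text (may be empty)
    int start;         // offset of the group, or -1 if it was not matched
};

// Matches `text` against `re` in full and reports on group `n`.
GroupInfo inspectGroup(const std::string& text, const std::regex& re, std::size_t n) {
    std::smatch m;
    if (!std::regex_match(text, m, re)) {
        return {"", -1};
    }
    return {m[n].str(), groupStart(m, n)};
}
```

For the pattern (a*b?)(x)?c and the text "c", both groups yield an empty string, but only the first one has a valid starting offset.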

std::vector<std::string> getGroups(const bool includeGroupZero = 0) const
Returns every capture group in a vector.

Parameters:
includeGroupZero - Whether or not to include capture group zero
Returns:
Every capture group
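The effect of includeGroupZero can be illustrated with a small helper over std::smatch, again using the standard library only as a stand-in. collectGroups is a hypothetical name, and the sketch assumes that group zero denotes the entire matched text, as it does in std::regex.

```cpp
#include <regex>
#include <string>
#include <vector>

// Collects the capture groups of a successful match into a vector.
// Group zero (assumed: the entire match) is included only on request.
std::vector<std::string> collectGroups(const std::smatch& m, bool includeGroupZero = false) {
    std::vector<std::string> out;
    for (std::size_t i = includeGroupZero ? 0 : 1; i < m.size(); ++i) {
        out.push_back(m[i].str());
    }
    return out;
}
```

With the default argument, the vector's index 0 holds capture group 1, so indices are shifted by one relative to group numbers.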


This class has no child classes.
Friends:
class NFANode
class NFAStartNode
class NFAEndNode
class NFAGroupHeadNode
class NFAGroupLoopNode
class NFAGroupLoopPrologueNode
class NFAGroupTailNode
class NFALookBehindNode
class NFAStartOfLineNode
class NFAEndOfLineNode
class NFAEndOfMatchNode
class NFAReferenceNode
class Pattern
Author:
Jeffery Stuart
Version:
0.01a
Since:
March 2003, Stable Since November 2004
