Here are a couple of wrapper VBScript functions that make using VBScript's built-in RegExp class similar to PHP's preg_xxx family of functions.
First, a private helper function to initialize an instance of RegExp:
    Private Function preg_init(find_re)
        Set preg_init = New RegExp
        With preg_init
            .Global = True
            If Left(find_re, 1) = "/" Then
                Dim pos: pos = InStrRev(find_re, "/")
                .Pattern = Mid(find_re, 2, pos - 2)
                .IgnoreCase = (InStr(pos, find_re, "i") > 0)
                .Multiline = (InStr(pos, find_re, "m") > 0)
            Else
                .Pattern = find_re
            End If
        End With
    End Function
This makes it possible to set the object's flags in the search string after the trailing slash, e.g. "/test/mi" will search for "test" ignoring case and in multi-line mode. A search string that does not start with a slash is used as the pattern unchanged, with no flags set. This helper function is not meant to be used by client scripts; it only helps implement the actual search-and-replace functions.
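To make the flag handling concrete, here is a minimal sketch of the RegExp settings that "/test/mi" should end up with, written out by hand without the helper. The sample input string is made up for illustration.

```vbscript
' Sketch: the settings preg_init is expected to produce for "/test/mi".
Dim re: Set re = New RegExp
re.Global = True        ' always switched on by preg_init
re.Pattern = "test"     ' the text between the first and the last slash
re.IgnoreCase = True    ' because of the "i" after the trailing slash
re.Multiline = True     ' because of the "m" after the trailing slash

' The test succeeds although the case differs, because IgnoreCase is set.
If re.Test("This is a TEST") Then
    WScript.Echo "matched"
End If
```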
Next, the implementation of preg_match, preg_replace and preg_split becomes very simple:
    Function preg_match(find_re, text)
        preg_match = preg_init(find_re).Test(text)
    End Function

    Function preg_replace(find_re, replace_arg, text)
        preg_replace = preg_init(find_re).Replace(text, replace_arg)
    End Function

    Function preg_split(find_re, text)
        Dim esc: esc = ChrW(&HE1B6) '-- U+E000 to U+F8FF - Private Use Area (PUA)
        preg_split = Split(preg_init(find_re).Replace(text, esc), esc)
    End Function
These are fairly straightforward and are about enough to complete 99% of any search-and-replace task at hand. For preg_replace one can use replace placeholders in VBScript notation ($1, $2, etc.) to refer to the matched search groups.
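A short usage sketch of the three wrappers follows. It assumes that preg_init, preg_match, preg_replace and preg_split from this article are defined in the same script; the sample strings and patterns are made up for illustration.

```vbscript
' Assumes preg_init, preg_match, preg_replace and preg_split from this
' article are defined in the same script. Sample data is hypothetical.
Dim parts, part

' "i" after the trailing slash makes the test case-insensitive
If preg_match("/^create\s+table/i", "CREATE TABLE foo (id int)") Then
    WScript.Echo "looks like a table definition"
End If

' $1 and $2 refer to the first and second capture group
WScript.Echo preg_replace("/(\w+)@(\w+)/", "$2 at $1", "user@host")   ' host at user

' Split on commas with optional surrounding whitespace
parts = preg_split("/\s*,\s*/", "a, b ,c")
For Each part In parts
    WScript.Echo part   ' a, then b, then c
Next
```

Note that preg_split relies on the private-use character never appearing in the input text, since it serves as the temporary separator.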
The troublesome 1% with this implementation of preg_replace is that it cannot handle callback functions for replace_arg. So here is a manual implementation, preg_replace_callback, that can handle both strings and callback objects for replace_arg. Note that a string replace_arg is inserted literally here; $1-style placeholders are not expanded.
    Function preg_replace_callback(find_re, replace_arg, text)
        Dim matches, match, count, offset, retval
        Set matches = preg_init(find_re).Execute(text)
        If matches.Count = 0 Then
            preg_replace_callback = text
            Exit Function
        End If
        ReDim retval(matches.Count * (1 - IsObject(replace_arg)))
        For Each match In matches
            With match
                retval(count) = Mid(text, 1 + offset, .FirstIndex - offset)
                count = count + 1
                If IsObject(replace_arg) Then
                    retval(count) = replace_arg(match)
                    count = count + 1
                End If
                offset = .FirstIndex + .Length
            End With
        Next
        retval(count) = Mid(text, 1 + offset)
        If IsObject(replace_arg) Then
            preg_replace_callback = Join(retval, vbNullString)
        Else
            preg_replace_callback = Join(retval, replace_arg)
        End If
    End Function
For a callback object one has to pass an instance of a class with a default method. Here is a sample class that implements PHP-style notation for replace placeholders (\1, \2, etc. or \{1}, \{2}, etc.):
    Function preg_substitute(replace_arg)
        Set preg_substitute = New preg_substitute_class.init(replace_arg)
    End Function

    Class preg_substitute_class
        Private m_esc
        Private m_replace

        Public Function init(replace_arg)
            m_esc = ChrW(&HE1B6) '-- U+E000 to U+F8FF - Private Use Area (PUA)
            m_replace = Replace(replace_arg, "\", m_esc)
            Set init = Me
        End Function

        Public Default Function callback(match)
            Dim idx, replace_str
            replace_str = match.Value
            callback = Replace(Replace(m_replace, m_esc & "{0}", replace_str), m_esc & "0", replace_str)
            With match.SubMatches
                For idx = .Count To 1 Step -1
                    replace_str = .Item(idx - 1)
                    callback = Replace(Replace(callback, m_esc & "{" & idx & "}", replace_str), m_esc & idx, replace_str)
                Next
            End With
            callback = Replace(callback, m_esc, "\")
        End Function
    End Class
This can be used like preg_replace_callback("/(test)\s+(this)/mi", preg_substitute("\{2} \{1}"), text), where text is the string to search in.
The most interesting method of the callback class is the default one. Here it's named callback, but the actual name can be arbitrary. The default method receives a match argument, which is the current entry in the RegExp's matches collection, and returns a string to be used as the replacement.
The Match object exposes the currently matched substring in its Value property, its position in the FirstIndex property and all the matched subgroups in the SubMatches collection. This allows a much more sophisticated replacement implementation, for instance lower- or upper-casing entries in the SubMatches collection.
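As an illustration of such a custom callback, here is a minimal sketch that upper-cases the first captured group. The class name upper_first_group_class and the sample SQL string are hypothetical and not part of the original script; the sketch also assumes that preg_init and preg_replace_callback from this article are defined in the same script.

```vbscript
' Hypothetical callback class (not part of the original script):
' returns the first captured group in upper case, or the whole match
' in upper case when the pattern has no groups.
Class upper_first_group_class
    Public Default Function callback(match)
        With match
            If .SubMatches.Count > 0 Then
                callback = UCase(.SubMatches(0))
            Else
                callback = UCase(.Value)
            End If
        End With
    End Function
End Class

' Assumes preg_replace_callback from this article is in scope.
Dim cb: Set cb = New upper_first_group_class
WScript.Echo preg_replace_callback("/\b(select|from|where)\b/i", cb, "select a from t where b = 1")
' Expected result: SELECT a FROM t WHERE b = 1
```

Because only the default method is called, any per-replacement state (counters, lookup tables and so on) can live in private fields of the class.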
The performance of preg_replace_callback is about 20% worse than that of the RegExp's built-in Replace method, which the simple preg_replace uses directly, even for thousands of occurrences of the search regular expression.
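One rough way to compare the two on your own machine is sketched below. It assumes that preg_replace, preg_replace_callback and preg_substitute from this article are defined in the same script; the sample text, the pattern and the repeat counts are arbitrary choices, and the measured ratio will vary with the input.

```vbscript
' Rough timing sketch, not a rigorous benchmark. Assumes the functions
' from this article are in scope. Timer returns seconds since midnight,
' so do not run this across midnight.
Dim i, text, started, result

' Build a sample text with 5000 occurrences of the pattern
text = ""
For i = 1 To 5000
    text = text & "test this" & vbCrLf
Next

started = Timer
For i = 1 To 20
    result = preg_replace("/(test)\s+(this)/i", "$2 $1", text)
Next
WScript.Echo "preg_replace:          " & FormatNumber(Timer - started, 3) & " s"

started = Timer
For i = 1 To 20
    result = preg_replace_callback("/(test)\s+(this)/i", preg_substitute("\{2} \{1}"), text)
Next
WScript.Echo "preg_replace_callback: " & FormatNumber(Timer - started, 3) & " s"
```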
Full source code of these functions, including sample usage, is available in our pg_conv.vbs conversion script, which we used to convert a script of Microsoft SQL Server table definitions to the PostgreSQL dialect.