Regular Expressions in Order Rules

This article gives a short overview of regular expressions, their use in PackageBee, and a convenient "cheat sheet".

Overview

Regular expressions are text patterns in which a whole class of characters is represented by a special symbol, or "metacharacter". A short sequence of such symbols can stand for many actual text strings, which speeds up text search, comparison, replacement and deletion because you do not have to spell out every string individually.

For example, using the symbol \d to represent any single digit, we can find all orders where the phone number is in the US 212 area code, with the regular expression 212-\d\d\d\d\d\d\d, without having to type in all possible combinations:

  • 212-0000000
  • 212-0000001
  • 212-0000002
  • ...
  • 212-9999999
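
As a rough illustration outside PackageBee, the same pattern can be tried with Python's built-in re module. This is only a sketch: it assumes Python's regex dialect behaves like PackageBee's for this pattern, and the phone numbers are made up for the example.

```python
import re

# The pattern from the text: "212-" followed by exactly seven digits.
AREA_212 = re.compile(r"212-\d\d\d\d\d\d\d")

# Hypothetical phone numbers, invented for this example.
phones = ["212-5550123", "646-5550123", "212-555012", "212-9999999"]

# fullmatch() succeeds only when the whole string fits the pattern.
matches = [p for p in phones if AREA_212.fullmatch(p)]
print(matches)  # ['212-5550123', '212-9999999']
```

The second number has a different area code and the third has only six digits after the dash, so neither is matched.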

Many different symbolic representations of text strings can be devised, but regular expressions have become a popular formal standard. They have been in use since the 1950s and are included, with slight variations, in all major programming languages.

PackageBee conventions

In PackageBee, regular expressions are used in order rules with the matches comparison operator to compare attribute values, and to assign attribute values to textual attributes.

In the PackageBee Advanced rule editor, regular expressions are placed between forward slashes (/.../), e.g. /212-\d\d\d\d\d\d\d/. In the GUI Editor, forward slashes are added automatically, so you should not include them.

Characters that are used as symbols for other characters (see the table below) must be preceded by a backslash (\) as an "escape character" when they are meant as literal characters rather than as symbols, so that the processing engine knows which sense you have in mind. For example, to find strings that contain a question mark, write the question mark as \?.
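
The difference between an escaped and an unescaped question mark can be seen in a small Python sketch. Python's re module is used here only as a stand-in, on the assumption that it treats \? the same way PackageBee does. The sample text is invented.

```python
import re

text = "Is this order a gift?"  # hypothetical attribute value

# Escaped: \? stands for a literal question mark.
literal = re.search(r"gift\?", text)

# Unescaped: ? makes the preceding "t" optional, so "gif" alone matches.
optional = re.fullmatch(r"gift?", "gif")

# Unescaped ? never consumes a "?" character, so the full string "gift?" fails.
no_match = re.fullmatch(r"gift?", "gift?")

print(literal.group())   # gift?
print(optional.group())  # gif
print(no_match)          # None
```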

Forward slashes (/) cannot be used inside PackageBee regular expressions, even with escape characters; including them will trigger an error.

Regex cheat sheet

The table below lists the symbols that are used most frequently in regular expressions. All symbols can be combined with other symbols to create more complex patterns.

Symbol  Description                                           Example regex        Matched string
\d      One digit from 0 to 9                                 \d\d\d               212
\D      One character that is not a digit                     \D\D\D               aBc
\w      One letter, digit or underscore                       \w\w\w               a_1
\W      One character that is not a letter, digit or          \W\W\W               +-)
        underscore
\s      One whitespace character (space, tab, newline)        Hello\sworld         Hello world
\S      One character that is not whitespace                  \S\S\S\S\S\S         Hello!
.       Any character, except line break                      a.c                  abc
+       The preceding symbol 1 or more times                  \d+                  1234
?       The preceding symbol 0 or 1 time                      Ahaa?                Ahaa
*       The preceding symbol 0 or more times                  Oo.*                 Ooo123+h
{m,n}   The preceding symbol at least m and at most n times   \w{2,4}              abcd
        (m can be 0; n can be omitted)
{m}     The preceding symbol exactly m times                  \D{3}                ABC
{m,}    The preceding symbol at least m times                 a{2,}h               aaah
{0,n}   The preceding symbol at most n times                  a{0,2}h              ah
|       Separates alternatives (OR)                           2|3|4                4
()      Groups symbols into a single element. The text        A(nt|pple)           Apple
        matched inside the parentheses can be referenced      He(l)\1(o) w\2r\1d   Hello world
        later by positional variables \1, \2, etc.
[]      A set of possible characters. A dash (-) between      [aeiou]              e
        two characters represents all characters in that      [A-Z]                E
        range, inclusive. A caret (^) right after the         [^AEIOU]             Z
        opening bracket negates the set (matches any
        character except those inside the brackets).
^       The beginning of a string or a line                   ^@[a-z]+             @jane
$       The end of a string or a line                         ^.* the end$         This is the end
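
A number of the cheat sheet rows can be checked mechanically. The sketch below uses Python's re module as a stand-in engine, which is an assumption: PackageBee's engine may differ in details, so treat this as a way to experiment rather than as a description of PackageBee's behavior. The pattern and sample pairs are copied from the table above.

```python
import re

# (pattern, sample string) pairs taken from the cheat sheet.
examples = [
    (r"\d\d\d", "212"),
    (r"\D\D\D", "aBc"),
    (r"\w\w\w", "a_1"),
    (r"\W\W\W", "+-)"),
    (r"Hello\sworld", "Hello world"),
    (r"a.c", "abc"),
    (r"\d+", "1234"),
    (r"Oo.*", "Ooo123+h"),
    (r"\w{2,4}", "abcd"),
    (r"\D{3}", "ABC"),
    (r"a{2,}h", "aaah"),
    (r"a{0,2}h", "ah"),
    (r"2|3|4", "4"),
    (r"A(nt|pple)", "Apple"),
    (r"He(l)\1(o) w\2r\1d", "Hello world"),
    (r"[A-Z]", "E"),
    (r"[^AEIOU]", "Z"),
    (r"^@[a-z]+", "@jane"),
    (r"^.* the end$", "This is the end"),
]

# Collect every pattern that does NOT match its whole sample string.
failed = [p for p, sample in examples if re.fullmatch(p, sample) is None]
print(failed)  # []

# Parenthesized groups are numbered left to right: \1 is "l", \2 is "o".
groups = re.fullmatch(r"He(l)\1(o) w\2r\1d", "Hello world").groups()
print(groups)  # ('l', 'o')
```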

For more examples of regular expressions and for a more in-depth discussion, see the Wikipedia article.

To test out your regular expressions, use one of the online regex testers, e.g. Rubular or Regex101.
