Regular Expressions

Regular expressions (also "regex" or "regexp") are a powerful way to search for patterns in text. PyReconstruct uses them mainly to filter lists: when you filter a list, you are asked for an expression to match against the items in that list. You will therefore need to understand a little about how regular expressions are constructed. Here, we present the bare minimum to get up and running.

After browsing this page you might want to learn more, and YouTube has a wealth of information about using regular expressions in Python. Simply search for "regular expression python". Be aware that many programming languages have their own dialect of regular expressions. Regular expressions in Emacs and elisp, for example, differ from those in Python, so be sure you are looking specifically into Python regular expressions. Python regular expressions can become very complicated and match a wide variety of patterns. For a definitive, in-depth discussion, please see the Python documentation on the re module.

Basic characters


Expression  Explainer              Examples  Matches
a           Character              a         a (matches the character "a" exactly)
abc         String                 abc       abc (matches the string "abc" exactly)
.           Matches any character  d.        d followed by any character
|           Or                     d1|d2     d1 or d2 (note that d1|2 would instead match d1 or 2)
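
The basic characters can be tried out in Python itself. The following is a minimal sketch, not PyReconstruct's own code: it assumes the filter compares the expression against the whole name, which is approximated here with re.fullmatch. The helper filter_names and the object names are hypothetical.

```python
import re

# Hypothetical object names, used only for illustration.
names = ["d1", "d2", "d3", "dx", "2", "abc"]

def filter_names(pattern, names):
    """Return the names that the pattern matches in full (assumed behavior)."""
    return [n for n in names if re.fullmatch(pattern, n)]

print(filter_names("abc", names))    # ['abc']
print(filter_names("d.", names))     # ['d1', 'd2', 'd3', 'dx']
print(filter_names("d1|d2", names))  # ['d1', 'd2']
print(filter_names("d1|2", names))   # ['d1', '2']
```

The last two lines show that | separates whole alternatives: everything to its left is one option and everything to its right is the other.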

Sets

Sets are collections of characters enclosed in square brackets.


Expression   Explainer                           Examples      Matches
[abc]        Match a or b or c                   d[123]        d1, d2, d3, not d4
                                                 d0[123]       d01, d02, d03, not d04, not d10
                                                 [dD]01        d01, D01
                                                 d01[cp]01     d01c01, d01p01
[a-z]        Match any lowercase letter          d[a-z]        da, db, dc, not dA, not d1
                                                 d001[a-z]     d001a, d001b, d001c, etc.
[A-Z]        Match any uppercase letter          d[A-Z]        dA, dB, dC, not da, not d1
                                                 d001[A-Z]     d001A, d001B, d001C, etc.
[0-9]        Match range of digits               d[0-9]        d0, d1, d2, ..., d9 (i.e., match any digit)
                                                 d[1-2]        d1, d2, not d3, not d4, etc.
                                                 d[0-9][0-9]   d01, d02, d03, d25, etc. (Same as d[0-9]{2}, see below)
                                                 d1[0-9]       d10, d11, d12, not d01
[0-9A-Za-z]  Match any digit or letter           d[0-9A-Za-z]  d1, da, dC, d9, dX, etc.
[^Z]         Match any character not in the set  d[^c]         da, db, not dc, dd, etc.
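
Sets can be checked the same way. This is a sketch under the assumption that the filter matches the whole name (approximated with re.fullmatch); filter_names and the object names are hypothetical and not part of PyReconstruct.

```python
import re

# Hypothetical object names, used only for illustration.
names = ["d01", "d02", "d04", "d10", "D01", "d01c01", "d01p01", "d01x01"]

def filter_names(pattern, names):
    """Return the names that the pattern matches in full (assumed behavior)."""
    return [n for n in names if re.fullmatch(pattern, n)]

print(filter_names("d0[123]", names))      # ['d01', 'd02']
print(filter_names("[dD]01", names))       # ['d01', 'D01']
print(filter_names("d01[cp]01", names))    # ['d01c01', 'd01p01']
print(filter_names("d[0-9][0-9]", names))  # ['d01', 'd02', 'd04', 'd10']
print(filter_names("d01[^c]01", names))    # ['d01p01', 'd01x01']
```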

Quantifiers

Quantifiers modify the element immediately to their left and let you specify how many times that element should be matched. Combining sets with quantifiers is a powerful way to filter lists in PyReconstruct.


Expression  Explainer                        Examples     Matches
*           Any number of times (0 or more)  .*           Any character, any number of times (i.e., matches everything)
                                             d01.*        Anything prefixed with d01 (e.g., d01, d01_c01, d01mito)
                                             d[0-9]*      d followed by any number of digits (e.g., d, d1, d01, d12345, etc.)
+           One or more times                d01+         d01, d011, d0111, not d0, not d11
?           Zero or one time                 d1?          d, d1, not d11
                                             d01_?c01     d01c01 and d01_c01
{m}         Match exactly m times            d[0-9]{2}    d01, d02, not d1, not d003 (same as d[0-9][0-9])
{m,n}       Match m to n times               d[0-9]{2,3}  d01, d02, d001, d002, not d1, not d0001
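
The quantifiers can be explored with a short Python sketch. It assumes, as the "not" examples suggest, that an expression must match the whole name, approximated here with re.fullmatch. The helper filter_names and the object names are hypothetical.

```python
import re

# Hypothetical object names, used only for illustration.
names = ["d", "d1", "d01", "d011", "d001", "d0001",
         "d01c01", "d01_c01", "d01mito"]

def filter_names(pattern, names):
    """Return the names that the pattern matches in full (assumed behavior)."""
    return [n for n in names if re.fullmatch(pattern, n)]

print(filter_names("d01.*", names))
# ['d01', 'd011', 'd01c01', 'd01_c01', 'd01mito']
print(filter_names("d[0-9]*", names))
# ['d', 'd1', 'd01', 'd011', 'd001', 'd0001']
print(filter_names("d01+", names))         # ['d01', 'd011']
print(filter_names("d01_?c01", names))     # ['d01c01', 'd01_c01']
print(filter_names("d[0-9]{2}", names))    # ['d01']
print(filter_names("d[0-9]{2,3}", names))  # ['d01', 'd011', 'd001']
```

Note that d01+ repeats only the 1, because a quantifier applies to the single element directly before it.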

Special characters

You might actually want to match a special character literally. For example, a period in Python regular expressions is interpreted as "any character", but you might need to match a period in the name of an object. To do this, place a backslash before the character.


Expression  Explainer                                    Examples  Matches
\x          Interpret the special character x literally  d01\*c01  d01*c01
                                                         d01\+c01  d01+c01
                                                         d01\.c01  d01.c01
