Primality testing with Perl regexs

July 23, 2010 | Computer Science | By: Mark VandeWettering

Okay, here’s a little pet peeve of mine. Somebody brought up that it’s possible to do primality testing using Perl regular expressions. This has been kicking around for a while, I’m not sure who originated the idea, but you can find links to it with a Google search. As an example, consider the Perl one liner here:

Primality testing with Perl regexs@Everything2.com

It’s a very cute hack, so where is my peeve?

The regular expression they list isn’t a regular expression.

It seems a bit of a quibble. I’ve had people argue that whatever the meaning of regular expressions might be, the “regular expressions” matched by popular “regular expression” libraries like Perl and Python should supercede, because they are in fact so ubiquitous. I might argue, except for one thing:

Real regular expressions can be matched really quickly, but are seldom matched very quickly at all by Perl, Python, or Ruby. Russ Cox wrote a nice article about this. Yes, perl regular expressions can match all sorts of complex things that real regular expressions can’t, but if all you’ve got is a real regular expression, perl’s performance is typically an order of magnitude slower than necessary, and in fact, perl can be arbitrarily slow. This should come as no surprise: since the perl implementation of primality testing does work, it obviously must be slow: it is, after all, capable of any arbitrary computation.

It’s not like doing it correctly is even difficult: we implemented all this stuff back in Ginnie Lo’s compiler class back in the early 80s, and it was old back then. We made lexical analyzers that began by compiling NFAs for all the tokens in the language, and then converting them into DFAs, which marched along and processed files in linear time.

Okay, it’s my peeve, but there is some good, beautiful computer science underlying it all.

Comments

Comment from Steve VanDevender
Time 7/23/2010 at 9:08 pm

I saw some article by Damian Conway about creating a recursive descent parser using Perl “regular expressions” and thought much the same thing — you can’t actually do that with real regular expressions because they can’t handle unbounded recursive descent. The difference between what a finite-state machine and what a Turing machine can do is pretty clear and fundamental, and claiming these things are being done with “regular expressions” is a misnomer.