
Patterns replace RegEx in SmartPurge

Blog Post created by smillerjones on Mar 26, 2015

When we were developing SmartPurge, one of our objectives was to make purging the CDN intuitive and to ensure that what gets purged is predictable.  While you can enter single URLs directly into any Purge request, we also needed a way to support bulk purging - targeting multiple objects with a single command.

 

Before SmartPurge, bulk purging required complicated RegEx patterns in purge requests, which are not particularly intuitive, so in SmartPurge we have implemented the idea of patterns. A pattern can be thought of as a simplified regular expression that supports only the glob character (*), the equivalent of (.*) in RegEx.

 

What's a glob?  Globs are a simple form of pattern that can easily be used to match files or strings.  The Wikipedia article on globs provides a great overview.

 

Something important to understand about patterns is that they are anchored at both ends by default. This means that a glob must match a whole string (a filename or URL path string). So a glob of a* will not match the string cat, because it matches only the "at" portion, not the whole string. A glob of ca*, however, would match cat.
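
To see what anchoring means in practice, here is a minimal Python sketch. It assumes the glob * behaves like the regular expression .* as described above; the helper names are made up for illustration and are not part of SmartPurge.

```python
import re


def matches_anywhere(regex: str, text: str) -> bool:
    """Unanchored: succeeds if any substring of text matches."""
    return re.search(regex, text) is not None


def matches_whole(regex: str, text: str) -> bool:
    """Anchored at both ends: the whole string must match."""
    return re.fullmatch(regex, text) is not None


# The glob a* written as a regular expression is a.*
print(matches_anywhere("a.*", "cat"))  # True: "at" matches somewhere inside
print(matches_whole("a.*", "cat"))     # False: "cat" does not start with "a"
print(matches_whole("ca.*", "cat"))    # True: the glob ca* covers the whole string
```

The anchored check is the one that corresponds to how patterns behave.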

 

So far so good.  Let's think about an example from a real use case.

 

I have many different domains, each of which has a folder called "audio".  Every domain's audio folder contains many mp3 files which I want to purge from the Limelight CDN.  I don't know the individual file names, as they are programmatically generated.  I want to use one command to purge them all from the CDN.

 

In the past, using RegEx, I would have to create something like this:

http://([-A-Za-z0-9]+\.)+[A-Za-z]+([0-9]+)/?(\/[^?]*)?\/([^\?]*\/)*audio/([^\?]*\/)*([^\?\/]*\.mp3)(\?.*$|$)

 

With SmartPurge, all I need to do is enter the following pattern:

http://*/audio/*.mp3

 

So in our example, * matches any string of characters and "*.mp3" is a glob pattern.
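
As a rough illustration of which URLs this pattern would select, here is a small Python sketch of an anchored glob matcher that needs no regular expressions. The function name and the sample URLs are hypothetical, and the sketch ignores backslash escapes; it is not the SmartPurge implementation.

```python
def glob_match(pattern: str, text: str) -> bool:
    """Anchored match where * stands for any string, including an empty one."""
    parts = pattern.split("*")
    if len(parts) == 1:
        return pattern == text  # no glob: plain string comparison
    first, last = parts[0], parts[-1]
    if not text.startswith(first) or not text.endswith(last):
        return False
    pos = len(first)
    end = len(text) - len(last)
    if end < pos:
        return False  # the fixed prefix and suffix would overlap
    for literal in parts[1:-1]:
        # Each middle literal must appear, in order, between prefix and suffix.
        found = text.find(literal, pos, end)
        if found == -1:
            return False
        pos = found + len(literal)
    return True


PATTERN = "http://*/audio/*.mp3"
for url in (
    "http://www.example.com/audio/intro.mp3",
    "http://cdn.example.org/media/audio/2015/track-01.mp3",
    "http://www.example.com/video/intro.mp3",
    "http://www.example.com/audio/intro.wav",
):
    print(url, glob_match(PATTERN, url))
```

Because * matches any string of characters, including "/", the second URL is matched as well as the first; the last two are not.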

 

Here are some additional examples of how RegEx maps to Patterns.  Each Pattern is equivalent to the RegEx beside it.

RegEx                    Pattern
"^foo$"                  "foo"
"^foo .* bar$"           "foo * bar"
"^foo .* bar .*baz$"     "foo * bar *baz"
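
The mapping in these examples can be checked with a short Python sketch. The translation function below is an assumption about how a pattern corresponds to a regular expression (escape the literal text, turn each * into .*, anchor both ends); it is not taken from SmartPurge, and it does not handle backslash escapes.

```python
import re

# (RegEx, Pattern) rows from the examples above.
TABLE = [
    (r"^foo$", "foo"),
    (r"^foo .* bar$", "foo * bar"),
    (r"^foo .* bar .*baz$", "foo * bar *baz"),
]

SAMPLES = [
    "foo", "xfoo", "foo x bar", "foo  bar", "foo x bar!",
    "foo x bar ybaz", "foo x bar baz", "foo bar baz",
]


def to_regex(pattern: str) -> str:
    """Translate a pattern into an anchored regular expression string."""
    literals = [re.escape(part) for part in pattern.split("*")]
    return "^" + ".*".join(literals) + "$"


def rows_agree() -> bool:
    """True if every row's RegEx and Pattern accept exactly the same samples."""
    for regex, pattern in TABLE:
        translated = to_regex(pattern)
        for sample in SAMPLES:
            if bool(re.search(regex, sample)) != bool(re.search(translated, sample)):
                return False
    return True


print(rows_agree())
```

This only compares behavior on a handful of sample strings, so it is a sanity check rather than a proof of equivalence.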

 

 

I also need to mention a limitation of Patterns and take a look at how we handle query strings.

 

Pattern limitations

There is a limitation to pattern syntax: when glob characters are used in a pattern, separator characters should also be present to separate them.

 

These are valid separators:

' ', '&', '.', '/', '=', '?'
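
To make the separator idea concrete, here is a Python sketch that splits a pattern on the separators listed above. The check that follows reflects one possible reading of the limitation (no two globs without a separator between them); that reading, and both function names, are assumptions rather than documented SmartPurge behavior.

```python
import re

SEPARATORS = " &./=?"
_SPLIT_RE = re.compile("[" + re.escape(SEPARATORS) + "]")


def split_on_separators(pattern: str) -> list:
    """Split a pattern into tokens at every separator character."""
    return _SPLIT_RE.split(pattern)


def globs_are_separated(pattern: str) -> bool:
    """Assumed rule: no token may contain more than one glob character."""
    return all(token.count("*") <= 1 for token in split_on_separators(pattern))


print(split_on_separators("http://*/audio/*.mp3"))
print(globs_are_separated("http://*/audio/*.mp3"))  # globs sit in separate tokens
print(globs_are_separated("http://example.com/a**b"))  # two globs share a token
```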

 

Thankfully, we also took care of this for you in SmartPurge.  We split the patterns up using the separators, and provide conditional logic to make sure your pattern will work.  We enable you to enter patterns into the UI...

 

[Screenshot: entering patterns in the SmartPurge UI]

...and provide visual cues on what patterns are being used.

[Screenshot: visual cues showing which patterns are being used]

 

When submitting a Purge Request via the API, this is an example of how patterns are used:

 

POST http://purge.llnw.com/purge/v1/account/example/requests

{
  "patterns": [{
    "pattern": "http://*.example.com/images/*",
    "evict": false, "exact": false, "incqs": false
  }],
  "email": {
    "subject": "purge results",
    "to": "user@example.com"
  },
  "callback": {
    "url": "http://test.example.com/my_callback.php"
  }
}
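
If you want to build this request from a script, a minimal Python sketch might look like the following. It only constructs the request and does not send it. The endpoint and field names are copied from the example above; the function name, the Content-Type header and the absence of any authentication are assumptions, so check the API documentation before relying on it.

```python
import json
import urllib.request

PURGE_API = "http://purge.llnw.com/purge/v1/account/{account}/requests"


def build_purge_request(account, patterns, email_to, callback_url):
    """Build (but do not send) a purge request like the example above."""
    body = {
        "patterns": [
            {"pattern": p, "evict": False, "exact": False, "incqs": False}
            for p in patterns
        ],
        "email": {"subject": "purge results", "to": email_to},
        "callback": {"url": callback_url},
    }
    return urllib.request.Request(
        PURGE_API.format(account=account),
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},  # assumed, not from the post
        method="POST",
    )


request = build_purge_request(
    "example",
    ["http://*.example.com/images/*"],
    "user@example.com",
    "http://test.example.com/my_callback.php",
)
print(request.get_method(), request.full_url)
print(json.dumps(json.loads(request.data), indent=2))
```

Passing the resulting object to urllib.request.urlopen would submit it, most likely together with whatever credentials the API requires.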

 

Technical note:  The SmartPurge implementation limits each pattern to 4096 characters and to 8 globs.  The backslash ('\') character can be used to escape any character in a pattern.
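
These limits are easy to check before submitting a request. The Python sketch below uses the two limits from the note above; the function names are hypothetical, and it assumes that a backslash-escaped \* is a literal asterisk that does not count toward the glob limit.

```python
MAX_PATTERN_LENGTH = 4096
MAX_GLOBS = 8


def count_globs(pattern: str) -> int:
    """Count * characters that are not escaped with a backslash."""
    count = 0
    i = 0
    while i < len(pattern):
        if pattern[i] == "\\":
            i += 2  # skip the backslash and the character it escapes
            continue
        if pattern[i] == "*":
            count += 1
        i += 1
    return count


def check_limits(pattern: str) -> list:
    """Return a list of problems; an empty list means the pattern is within limits."""
    problems = []
    if len(pattern) > MAX_PATTERN_LENGTH:
        problems.append("pattern is longer than 4096 characters")
    if count_globs(pattern) > MAX_GLOBS:
        problems.append("pattern contains more than 8 globs")
    return problems


print(check_limits("http://*/audio/*.mp3"))
print(count_globs(r"http://example.com/a\*b/*"))
print(check_limits("*/" * 9))
```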

 

Query strings

As a final note, we also handle query strings for you. Each pattern has an "include query string" flag associated with it, which controls matching behavior.

  • If the flag is set (available in the UI and as a flag in the API), the Pattern is matched against the entire input string.
  • Otherwise, the input string is truncated at the leftmost '?' character before matching.
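
The effect of the flag can be sketched in a few lines of Python. This assumes that truncation drops the leftmost '?' and everything after it, and that the flag corresponds to the "incqs" field in the API example; the function names and the sample URL are hypothetical, and backslash escapes are ignored.

```python
import re


def matching_input(url: str, include_query_string: bool) -> str:
    """Return the part of the URL that the pattern is matched against."""
    if include_query_string:
        return url
    return url.split("?", 1)[0]  # assumed: the '?' and the query are dropped


def pattern_matches(pattern: str, url: str, include_query_string: bool = False) -> bool:
    """Anchored glob match, with * treated as 'any string of characters'."""
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.fullmatch(regex, matching_input(url, include_query_string)) is not None


url = "http://www.example.com/audio/intro.mp3?session=42"
print(pattern_matches("http://*/audio/*.mp3", url, include_query_string=False))   # True
print(pattern_matches("http://*/audio/*.mp3", url, include_query_string=True))    # False
print(pattern_matches("http://*/audio/*.mp3?*", url, include_query_string=True))  # True
```

With the flag set, the pattern has to account for the query string itself, which is why the second call fails and the third succeeds.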

 

For example:

 

 

 

 

So Patterns give us an intuitive purging method that can be used to target individual files, specific file types and entire directories, with confidence and ease.  No more RegEx needed.

 

I'm really interested in your real-world examples - please do share your experience of working with patterns with me in the comments.

 

Happy purging!

 

smj
