InDesign GREP styles gotcha…

I was mucking around with InDesign GREP styles for auto-formatting dollar values, and had a bit of trouble getting it to work.

If you’re not interested in wading through the technical details below and are just looking for an example of how to format prices in InDesign, scroll down: the working solution is at the end of this post.

After a lot of hemming and hawing, I figured out what it was that made it behave in a way I did not expect.

Lookbehind does not allow for variable-length patterns

I was formatting prices, and had the following patterns set up:

While experimenting, I had given all the character styles involved a different colored stroke, so the characters would ‘glow’ in different colors depending on the character style applied to them.

This makes things a lot easier when used together with the Preview option on the Paragraph Style dialog.

The raw text looks like this:

After applying the paragraph style, the result looked like this:

which means it did not format the cent values as expected.

As it turns out, positive lookahead (?=...) allows variable-length patterns, but positive lookbehind (?<=...) does not.

So these patterns, which have lookbehind, did not work:

(?<=\d+)\.(?=\d{2})
(?<=\d+\.)\d{2}

Both patterns look behind for one or more decimal digits.

\d means: a decimal digit
\d+ means: one or more decimal digits
(?<=...) means: look behind (i.e. to the left of) the character we’re currently working on
\. means: a literal period
(?=...) means: look ahead (i.e. to the right) from the character we’re currently working on
\d{2} means: exactly two decimal digits

The first pattern means: look for a period, and then look behind (i.e. to the left of) that period and verify you can see one or more digits. Then look ahead of the period and verify you can see exactly two decimal digits.

But this pattern, a lookahead, does work:

\$(?=\d+\.\d{2})

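Removing the + from the positive lookbehind patterns makes it all work: a lookbehind for a single \d has a fixed length, and one digit in front of the period is all we need to check.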
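If you want to see the same restriction outside InDesign, here is a small Python sketch. It is only an analogy: Python’s re module is a different engine from the one behind GREP styles, and where InDesign quietly failed to apply the style, Python refuses the pattern with an error. The labels in the dictionary are my own; the patterns are the ones from this post.

```python
import re

# Patterns from the post. The first two use a variable-length lookbehind.
PATTERNS = {
    "decimal point, lookbehind with +": r"(?<=\d+)\.(?=\d{2})",
    "cents, lookbehind with +": r"(?<=\d+\.)\d{2}",
    "dollar sign, lookahead with +": r"\$(?=\d+\.\d{2})",
    "decimal point, + removed": r"(?<=\d)\.(?=\d{2})",
    "cents, + removed": r"(?<=\d\.)\d{2}",
}


def compiles(pattern):
    """Return True if Python's re engine accepts the pattern."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        # Python rejects lookbehinds that are not fixed-width.
        return False


results = {label: compiles(pattern) for label, pattern in PATTERNS.items()}
for label, ok in results.items():
    print(f"{label}: {'accepted' if ok else 'rejected'}")
```

Only the two lookbehind patterns containing + are rejected; the lookahead with + and the fixed-length lookbehinds are accepted.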

My working solution:

I’ve put the styles into style groups named ‘GREPStyles’ (to keep things organized):

The character styles are:

DollarSign: [None] + superscript
DollarValue: [None]
DecimalPoint: [None] + size 0.1pt + color: [None]
CentValue: [None] + superscript

The paragraph GREP style is set to:

Apply DollarSign to:
\$(?=\d+\.\d{2})

Apply DollarValue to:
(?<=\$)\d+(?=\.\d{2})

Apply DecimalPoint to:
(?<=\d)\.(?=\d{2})

Apply CentValue to:
(?<=\d\.)\d{2}

This is not perfect, but it’ll do for me.
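To check which characters each of the four rules would pick up, the patterns can be run through Python. This is a rough simulation, assuming Python’s re treats these particular patterns the same way InDesign does. The sample text and the helper name styled_runs are made up for illustration.

```python
import re

# The four GREP style rules from the post, in the order listed above.
RULES = [
    ("DollarSign", r"\$(?=\d+\.\d{2})"),
    ("DollarValue", r"(?<=\$)\d+(?=\.\d{2})"),
    ("DecimalPoint", r"(?<=\d)\.(?=\d{2})"),
    ("CentValue", r"(?<=\d\.)\d{2}"),
]


def styled_runs(text):
    """Return (style, matched_text, start) for every rule match, sorted by position."""
    runs = []
    for style, pattern in RULES:
        for match in re.finditer(pattern, text):
            runs.append((style, match.group(), match.start()))
    return sorted(runs, key=lambda run: run[2])


sample = "Special offer: $12.95 each"  # hypothetical sample text
for style, chunk, start in styled_runs(sample):
    print(f"{start:2d}  {chunk!r:6}  {style}")
```

One way the patterns are ‘not perfect’ shows up in this simulation: a number without a dollar sign, such as 3.14159, still matches the DecimalPoint and CentValue rules, because those two rules only look one digit to the left.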

You could easily extend this to also handle thousands separators – I leave that as an exercise.
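One possible take on that exercise, sketched in Python and not tested in InDesign: allow comma-separated groups of three digits in the whole-dollar part. This assumes a comma is the thousands separator and that a non-capturing group (?:...) behaves the same in GREP styles. The name WHOLE is my own. The decimal point and cent rules stay unchanged, since they only ever look one digit to the left.

```python
import re

# Hypothetical extension: the whole-dollar part is one or more digits,
# optionally followed by comma-separated groups of exactly three digits.
WHOLE = r"\d+(?:,\d{3})*"

RULES = [
    ("DollarSign", r"\$(?=" + WHOLE + r"\.\d{2})"),
    ("DollarValue", r"(?<=\$)" + WHOLE + r"(?=\.\d{2})"),
    ("DecimalPoint", r"(?<=\d)\.(?=\d{2})"),  # unchanged from the post
    ("CentValue", r"(?<=\d\.)\d{2}"),         # unchanged from the post
]


def style_matches(text):
    """Map each style name to the list of substrings its pattern matches."""
    return {style: re.findall(pattern, text) for style, pattern in RULES}


print(style_matches("$1,234,567.89"))
```

The variable-length part only appears inside lookaheads or in the matched text itself, never inside a lookbehind, so the restriction described above does not get in the way.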

Postscriptum:

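David Blatner gave me a great tip which can be used to achieve more precise matching and less ‘iffy’ results: the \K pattern.

This pattern allows us to do ‘lookbehind without lookbehind’: everything matched before the \K must be present, but is left out of the final match. Because it is not a lookbehind, it does not have the same fixed-length restriction.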

More info here:
https://www.regular-expressions.info/keep.html
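As a rough illustration of the idea: in an engine that supports \K, the cents could presumably be matched with something like \d+\.\K\d{2} (I have not verified this exact pattern in InDesign). Python’s re module has no \K, so the sketch below imitates it with a capture group: the digits and the period must be there, but only the span of group 1 is reported. The names CENTS and cent_spans are made up for this example.

```python
import re

# Stand-in for  \d+\.\K\d{2}  in a \K-capable engine: the text before the
# parentheses is required, but only group 1 is what we want to style.
CENTS = re.compile(r"\d+\.(\d{2})")


def cent_spans(text):
    """Return (start, end) offsets of the cent digits only."""
    return [match.span(1) for match in CENTS.finditer(text)]


sample = "$1234.56"  # hypothetical sample text
print([sample[start:end] for start, end in cent_spans(sample)])
```

Note that the part before the group is \d+, a variable-length pattern that a lookbehind would not accept.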