ExtendScript eval() stumbles over U+2028 and U+2029

I think I found another obscure bug in ExtendScript.

This bug affects any ExtendScript code that is passed through eval().

eval() spits the dummy when you try to evaluate any JS code containing string literals with a raw U+2028 (line separator) or U+2029 (paragraph separator) character in them.

This can be worked around by replacing these characters with their escaped equivalents, \u2028 or \u2029.
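A minimal sketch of that workaround follows. The helper name escapeSeparators is my own invention for illustration, not something ExtendScript or ESON provides. It uses split/join and String.fromCharCode so the script itself never contains a raw separator character. Note that a blanket replacement like this is only correct when the separators occur inside string literals, which is the case for JSON data.

```javascript
// Hypothetical helper: replace raw U+2028/U+2029 with their escape sequences
// before the text is handed to eval().
function escapeSeparators(text) {
    var LS = String.fromCharCode(0x2028); // line separator
    var PS = String.fromCharCode(0x2029); // paragraph separator
    return text.split(LS).join("\\u2028").split(PS).join("\\u2029");
}

// Sample data with a raw line separator inside a string value.
var raw = '{"note":"line one' + String.fromCharCode(0x2028) + 'line two"}';
var safe = escapeSeparators(raw);

// The parentheses make eval() read the braces as an object literal.
var parsed = eval("(" + safe + ")");
```

After evaluation, parsed.note contains the real separator character again, because eval() decodes the escape sequence.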

I verified the 16-bit Unicode range, and these are the only two characters that cause problems. See

https://github.com/zwettemaan/ESON/blob/main/findBadCodes.jsx
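The idea behind such a scan can be sketched as below. This is not the code from findBadCodes.jsx, just my own rough illustration of the approach: wrap each character in a string literal, eval() it, and record the character codes that throw or come back changed. The function name and the choice to skip the four characters that are expected to break a double-quoted literal (line feed, carriage return, double quote, backslash) are assumptions.

```javascript
// Sketch: find character codes in [from, to] that eval() cannot handle
// when they appear raw inside a double-quoted string literal.
function findBadCodes(from, to) {
    var bad = [];
    for (var code = from; code <= to; code++) {
        // Skip characters that legitimately need escaping in a string literal.
        if (code == 0x0A || code == 0x0D || code == 0x22 || code == 0x5C) {
            continue;
        }
        var ch = String.fromCharCode(code);
        try {
            // A healthy eval() hands back exactly the character we put in.
            if (eval('"' + ch + '"') !== ch) {
                bad.push(code);
            }
        } catch (e) {
            bad.push(code);
        }
    }
    return bad;
}

// Quick sanity check on printable ASCII; a full scan would use (0, 0xFFFF).
var asciiProblems = findBadCodes(0x20, 0x7E);
```

Going by the finding described above, a full-range scan in ExtendScript should report only 0x2028 and 0x2029.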

It all started because I was curious and wondered what it would take to use eval/uneval as the basis for implementing JSON.parse/JSON.stringify.

json2.js

One of the most common solutions is to use Douglas Crockford’s library, JSON-js (aka json2.js).

https://github.com/douglascrockford/JSON-js

You can readily //@include this file into your ExtendScript code, and gain access to JSON.parse() and JSON.stringify().

This module is time-proven, and performs a proper parse of the input data, which is useful if you need to ingest JSON data from untrusted sources.

This protects your script from injection attacks, where a malicious actor crafts a ‘fake’ JSON file which contains some executable JavaScript code.

Executable JS code is not proper JSON, so json2.js will simply refuse to parse and will throw an exception. That’s a great feature!
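Here is a small sketch of what that looks like in practice. The include path is an assumption (put json2.js wherever suits your project), and tryParse is a hypothetical wrapper of my own. In engines with a native JSON object the include line is just a comment.

```javascript
// Assumed location of json2.js; only meaningful to the ExtendScript preprocessor.
//@include "json2.js"

// Hypothetical wrapper: report success or failure instead of throwing.
function tryParse(text) {
    try {
        return { ok: true, value: JSON.parse(text) };
    } catch (e) {
        return { ok: false, error: String(e) };
    }
}

var good = tryParse('{"a":1}');
// 'Fake' JSON that smuggles in a function call: a proper parser rejects it.
var evil = tryParse('{"a": (function(){ return 1; })()}');
```

good.ok is true and good.value.a is 1, whereas evil.ok is false because the parser throws before any of the embedded code can run.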

A disadvantage of using json2.js is that it is not very fast when used with ExtendScript.

Especially with larger amounts of JSON data, the time needed starts to balloon and parsing slows to a snail’s pace: the relation between data size and parse time is not linear. I’ve not properly benchmarked it, but I suspect it might be an exponential relation.

The best way to get around the slowness is to use a C++-based alternative, rather than a ‘pure JS’ solution like json2.js.

When it comes to ingesting large, multi-megabyte JSON files, C++ code will eat those in a tiny fraction of the time it takes json2.js to slog through them.

Note 2024-09-17: Marc Autret (Indiscripts) sent me some feedback on my blog post and pointed out a bunch of problems with json2.js.

https://www.indiscripts.com/

If you need a reliable JSON module for ExtendScript, make sure to look at Marc’s IdExtenso:

https://github.com/indiscripts/IdExtenso

eval()

Another alternative is to use the built-in eval() as a ‘quick and dirty’ replacement for JSON.parse().

JSON is very nearly a subset of JavaScript (the U+2028/U+2029 issue described above being the exception), so handing some JSON data over to eval() generally works: eval() will treat it as executable code, and evaluate the JSON.
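A minimal sketch, with evalParse being a name I made up for illustration. The wrapping parentheses are there because, in standard JavaScript, a bare leading brace would be read as the start of a block statement rather than an object literal.

```javascript
// Quick-and-dirty stand-in for JSON.parse(); trusted input only.
function evalParse(text) {
    return eval("(" + text + ")");
}

var data = evalParse('{"a":1,"b":[1,2,3],"c":{"d":9,"e":"x"}}');
```

The result is an ordinary object: data.b.length is 3 and data.c.d is 9.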

Advantage: this is much faster than json2.js.

Advantage: eval() can process JSON-C (i.e. JSON with comments).
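To illustrate, here is a sketch that feeds the same commented text to both eval() and JSON.parse(). It assumes a JSON object is available (from json2.js in ExtendScript, or natively elsewhere); the sample text is made up.

```javascript
// JSON with a line comment and a block comment sprinkled in.
var text = '{\n  // line comment\n  "a": 1, /* block comment */\n  "b": [1, 2]\n}';

// eval() skips the comments, like it would in any other JS source.
var viaEval = eval("(" + text + ")");

// A strict JSON parser refuses the same text.
var strictOk = true;
try {
    JSON.parse(text);
} catch (e) {
    strictOk = false;
}
```

viaEval ends up holding the data, while strictOk is false because comments are not part of proper JSON.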

BIG disadvantage: eval() has some serious problems, and it should only be used in environments where the JSON data comes from a trusted source.

eval() is problematic, because it does not verify that the JSON data is just that, data.

That means there is a risk that some malicious actor would be able to ‘inject’ a fake JSON file containing executable JS code into your workflow.
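The risk is easy to demonstrate with a harmless payload. In this sketch the ‘attack’ merely bumps a counter, and all names are made up for illustration; a real payload could just as well touch files or documents. It assumes a JSON object is available for the comparison (json2.js in ExtendScript, native elsewhere).

```javascript
var sideEffects = 0; // stands in for whatever damage a real payload might do

function evalParse(text) {
    return eval("(" + text + ")");
}

// Looks like JSON at a glance, but the value is an executable expression.
var fake = '{"a": (sideEffects = sideEffects + 1)}';

// eval() happily runs the embedded code while 'parsing'.
var result = evalParse(fake);

// A proper parser throws instead of executing anything.
var rejected = false;
try {
    JSON.parse(fake);
} catch (e) {
    rejected = true;
}
```

After this runs, sideEffects is 1: the code inside the fake JSON was executed by eval(), and only by eval().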

In short: in a pinch, eval() can be used as a quick-and-dirty replacement for JSON.parse(), but I think it’s better to be safe than sorry, and avoid this in production code.

uneval()

ExtendScript is ‘old time JavaScript’, and as such it still supports the uneval() function.

uneval() will produce output that is similar to JSON, but not quite.

var o = {
  a: 1,
  b: [1,2,3],
  c: {
    d: 9,
    e: "ab\u0101c"
  }
};
uneval(o);

will produce:

({a:1, b:[1, 2, 3], c:{d:9, e:"abāc"}})

Proper JSON would be:

{"a":1,"b":[1,2,3],"c":{"d":9,"e":"abāc"}}

Fake it till you make it

For the heck of it, I managed to create a usable implementation of JSON.parse/JSON.stringify based on eval/uneval, and it seems to work fine.

https://github.com/zwettemaan/ESON

ESON.stringify() will transmogrify the output of uneval() into proper JSON. It is slowed down a bit because it needs to make sure that U+2028 and U+2029 are properly encoded.

If you are 100% certain that U+2028 and U+2029 never occur in the data you’re processing, then ESON.stringify() can be sped up a fair bit by omitting the check for the ‘bad codes’.
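To give a feel for what such a conversion involves, here is a rough sketch. It is not ESON’s actual code. It assumes the uneval() output uses double-quoted strings and contains only plain data (objects with identifier-like keys, arrays, numbers, strings, booleans, null); the function name and the skipBadCodeCheck flag are hypothetical.

```javascript
// Sketch: turn uneval()-style text into JSON text.
function unevalTextToJson(src, skipBadCodeCheck) {
    var out = "";
    var inString = false;
    var i = 0;
    var end = src.length;
    // uneval() wraps object literals in parentheses; drop one outer pair.
    if (end >= 2 && src.charAt(0) == "(" && src.charAt(end - 1) == ")") {
        i = 1;
        end = end - 1;
    }
    while (i < end) {
        var c = src.charAt(i);
        if (inString) {
            out += c;
            if (c == "\\") {
                // Copy the escaped character verbatim.
                i++;
                out += src.charAt(i);
            } else if (c == "\"") {
                inString = false;
            }
            i++;
        } else if (c == "\"") {
            inString = true;
            out += c;
            i++;
        } else if (c == " ") {
            i++; // drop the padding uneval() adds between items
        } else if (/[A-Za-z_$]/.test(c)) {
            var start = i;
            while (i < end && /[A-Za-z0-9_$]/.test(src.charAt(i))) {
                i++;
            }
            var word = src.substring(start, i);
            // A bare word followed by ":" is an unquoted key; quote it.
            // Anything else (true, false, null) passes through unchanged.
            out += (src.charAt(i) == ":") ? "\"" + word + "\"" : word;
        } else {
            out += c;
            i++;
        }
    }
    if (!skipBadCodeCheck) {
        // The slow-but-safe part: encode the two 'bad codes'.
        out = out.split(String.fromCharCode(0x2028)).join("\\u2028");
        out = out.split(String.fromCharCode(0x2029)).join("\\u2029");
    }
    return out;
}

var json = unevalTextToJson('({a:1, b:[1, 2, 3], c:{d:9, e:"ab\u0101c"}})');
```

For the uneval() example shown earlier, this produces the ‘proper JSON’ form. Passing true as the second argument skips the final two passes over the text, which is where the speed-up comes from.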

ESON.parse() is a lot faster than the JSON.parse() from json2.js, but because it relies on eval() it is insecure and should be avoided in production code.

There is also a bunch of benchmarking code: it will generate random large objects, and run them through stringify/parse in order to compare json2.js with ESON.
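The general shape of such a benchmark can be sketched as follows. This is not the benchmarking code from the repo, and every name in it is made up. It uses a small seeded generator so that every run sees the same ‘random’ object, and takes the stringify and parse functions as parameters so different implementations can be timed the same way.

```javascript
// Tiny seeded pseudo-random generator (Park-Miller style); seed must be non-zero.
function makeRandom(seed) {
    var state = seed;
    return function () {
        state = (state * 16807) % 2147483647;
        return state / 2147483647;
    };
}

// Build a nested object; each level has 4 keys holding a child or a string.
function makeValue(rand, depth) {
    if (depth <= 0) {
        return Math.floor(rand() * 1000);
    }
    var obj = {};
    for (var i = 0; i < 4; i++) {
        obj["k" + i] = (rand() < 0.5)
            ? makeValue(rand, depth - 1)
            : "s" + Math.floor(rand() * 1000);
    }
    return obj;
}

// Wall-clock time of a single call, in milliseconds.
function timeMs(fn) {
    var t0 = new Date().getTime();
    fn();
    return new Date().getTime() - t0;
}

function benchmark(stringify, parse, depth) {
    var data = makeValue(makeRandom(42), depth);
    var text;
    var stringifyMs = timeMs(function () { text = stringify(data); });
    var parseMs = timeMs(function () { parse(text); });
    return { chars: text.length, stringifyMs: stringifyMs, parseMs: parseMs };
}

var result = benchmark(
    function (v) { return JSON.stringify(v); },
    function (t) { return JSON.parse(t); },
    6
);
```

To compare implementations, run benchmark() a second time with ESON.stringify and ESON.parse wrapped the same way, and increase the depth to see how the timings grow with the size of the data.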

Use at your own risk!

If you find this helpful, make sure to give me a positive reaction on LinkedIn! I don’t use any other social media platforms. My LinkedIn account is here:

https://www.linkedin.com/in/kristiaan