Most dynamic languages let you evaluate a string of code at runtime, for example with eval in JavaScript or Python. Eval is powerful (and mandatory) if you’re building an IDE. However, the benefits are usually greatly outweighed by the risks.
Evaluated code is much more difficult to write than inline code. In JavaScript, you have to escape things like quotes and line breaks and your editor probably won’t help you with syntax highlighting or type-ahead. Code hidden in strings also makes the code much more difficult to read, not to mention debug. However, these are minor compared to the potential security problems that eval introduces.
An injection attack exploits a vulnerability in which data supplied by a user is interpreted or executed in a malicious or unexpected way. SQL injection is one of the most common occurrences (“Little Bobby Tables” anyone?), but any code that is evaluated is susceptible.
For example, the following code from a naïve calculator application takes a mathematical expression and returns the answer.
    function calculate(expression) {
      return eval(expression);
    }

    'The answer is ' + calculate(request['expression']);
This works great for expressions like 1 + 1 or even Math.cos(3 * Math.PI). However, what if the user passed in System.shutdown() or database.clear() or users.findByID(1234).creditAccount(9999999999, '£')? The calculate() function would blindly execute these as well, with potentially dire consequences. Even if a user does not know what specific functionality is available in the target evaluation context, it is not very difficult to guess, or to get up to no good with just the core language. To implement our calculator safely, we should write our own expression parser that validates inputs to make sure they are valid math expressions and not arbitrary code.
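As a rough illustration of what such a parser could look like, here is a minimal sketch of a recursive-descent evaluator. The name safeCalculate() and the supported grammar (numbers, + - * /, unary minus, and parentheses) are assumptions made for this example, not part of the original calculator. Anything outside that grammar is rejected rather than executed.

```javascript
// Minimal recursive-descent evaluator: numbers, + - * /, unary minus, parentheses.
// Input that does not fit this grammar raises a SyntaxError and is never executed.
function safeCalculate(expression) {
  // Split into numbers, operators, parentheses, and any other single character.
  const tokens = String(expression).match(/\d+(?:\.\d+)?|[-+*\/()]|\S/g) || [];
  let pos = 0;

  function peek() { return tokens[pos]; }
  function next() { return tokens[pos++]; }

  // expr := term (('+' | '-') term)*
  function parseExpr() {
    let value = parseTerm();
    while (peek() === '+' || peek() === '-') {
      const op = next();
      const rhs = parseTerm();
      value = op === '+' ? value + rhs : value - rhs;
    }
    return value;
  }

  // term := factor (('*' | '/') factor)*
  function parseTerm() {
    let value = parseFactor();
    while (peek() === '*' || peek() === '/') {
      const op = next();
      const rhs = parseFactor();
      value = op === '*' ? value * rhs : value / rhs;
    }
    return value;
  }

  // factor := number | '(' expr ')' | '-' factor
  function parseFactor() {
    const token = next();
    if (token === undefined) throw new SyntaxError('Unexpected end of expression');
    if (token === '-') return -parseFactor();
    if (token === '(') {
      const value = parseExpr();
      if (next() !== ')') throw new SyntaxError('Expected ")"');
      return value;
    }
    if (/^\d/.test(token)) return parseFloat(token);
    throw new SyntaxError('Unexpected token: ' + token);
  }

  const result = parseExpr();
  if (pos < tokens.length) throw new SyntaxError('Unexpected token: ' + peek());
  return result;
}
```

With this sketch, safeCalculate('2 * (3 + 4)') returns 14, while safeCalculate('System.shutdown()') throws a SyntaxError because the input is only ever treated as data to be parsed.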
MarkLogic provides built-in APIs to evaluate code. These are most useful for running code in a context different from that of the calling request, for example in a different transaction, as another user, or asynchronously on the task server.
Take a look at the options to xdmp.eval() for other ways to affect the context of evaluated code.
Like JavaScript’s built-in eval, xdmp.eval() takes a string of JavaScript and configuration options and runs the passed-in code in the context specified by the options. For all of the reasons above, xdmp.eval() is generally to be avoided. A better option is to use xdmp.invoke(). Unlike xdmp.eval(), with xdmp.invoke() you specify a path to an existing module. Like xdmp.eval(), you can use the $vars argument to safely pass in dynamic parameters to the stored module. That’s a much safer way to parametrize evaluated code than building strings to eval. Unlike xdmp.eval(), however, the code that runs is fixed ahead of time, so there’s no chance that user input will be evaluated as code. xdmp.invoke() uses the same set of context options that xdmp.eval() uses, so you can invoke a module in a separate transaction or as a different user.
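The difference between building a string to eval and passing parameters to fixed code can be seen in plain JavaScript, outside MarkLogic. The sketch below is only an analogy: storedGreeting() stands in for a stored module, and its name argument plays the role of a value passed through $vars. All of the function names here are hypothetical.

```javascript
// Stand-in for a stored module: its source is fixed ahead of time.
function storedGreeting(name) {
  return 'Hello, ' + name;
}

// Unsafe: user input is spliced into the source text before evaluation.
function greetByStringBuilding(name) {
  return eval("'Hello, ' + '" + name + "'");
}

// Safer: user input is passed as data and is never parsed as code.
function greetByParameter(name) {
  return storedGreeting(name);
}

// Input crafted to break out of the string literal in the unsafe version.
const hostile = "' + (globalThis.injected = 'yes') + '";
```

Calling greetByStringBuilding(hostile) runs the embedded assignment and sets globalThis.injected as a side effect. Calling greetByParameter(hostile) simply returns the greeting with the hostile text included as a literal string.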
Unfortunately, it’s not always feasible or convenient to isolate your dynamic code into its own main module. xdmp.invokeFunction() allows you to invoke any in-context function, even anonymous ones that you build on the fly. Think of it as a MarkLogic-enhanced version of Function.prototype.apply(). Moreover, xdmp.invokeFunction() allows you to separate the concerns of what the function does from the context in which it’s evaluated. This makes for cleaner code and easier testing.
Take, for example, the following trivial illustration. The xdmp.transaction() function gives the ID of the current transaction. Because the xdmp.invokeFunction() call specifies that the second call to xdmp.transaction() be run in a separate transaction, you’ll get a different ID.
    [
      xdmp.transaction(),
      xdmp.invokeFunction(xdmp.transaction, { isolation: 'different-transaction' })
    ]
The first call returns the transaction assigned to the current request. The second, using xdmp.invokeFunction(), explicitly calls the xdmp.transaction() function in a different transaction. Note the use of xdmp.transaction sans parentheses. xdmp.transaction() calls the xdmp.transaction function. xdmp.transaction, no parens, is a reference to the function itself. The actual identifiers in the output below are not important. What matters is that they differ because of the evaluation context.
[ "4394203566847635840", "8340410512199485627" ]
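The distinction between referencing a function and calling it holds for any JavaScript function, so it can be checked without MarkLogic. In this small sketch, runElsewhere() and currentId() are hypothetical stand-ins for xdmp.invokeFunction() and xdmp.transaction().

```javascript
// Hypothetical stand-in for an API that runs a callback in another context.
function runElsewhere(fn) {
  if (typeof fn !== 'function') {
    throw new TypeError('Expected a function, got ' + typeof fn);
  }
  return fn();
}

// Hypothetical stand-in for a function that reports an identifier.
function currentId() {
  return 'tx-1';
}

// Passing the reference lets runElsewhere() decide when and where to call it.
const byReference = runElsewhere(currentId);
```

Passing currentId() instead would hand runElsewhere() the string that the call returned, not the function, and the sketch would throw a TypeError.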
xdmp.invokeFunction() is the best way to run code in a different context with Server-Side JavaScript in MarkLogic. However, it requires that you pass it a zero-arity function, i.e. one that takes no inputs, and it always returns a ValueIterator, even if the invoked function returns an atomic value. With the magic of first-class functions in JavaScript, we can provide a friendlier version.
    /**
     * Return a function proxy to invoke a function in another context.
     * The proxy can be called just like the original function, with the
     * same arguments and return types. Example uses: to run the input
     * as another user, against another database, or in a separate
     * transaction.
     *
     * @param {function} fct     The function to invoke
     * @param {object} [options] The `xdmp.eval` options.
     *                           Use `options.user` as a shortcut to
     *                           specify a user name (versus an ID).
     *                           `options.database` can take a `string`
     *                           or a `number`.
     * @param {object} [thisArg] The `this` context when calling `fct`
     * @return {function}        A function that accepts the same arguments as
     *                           the originally input function.
     */
    function applyAs(fct, options, thisArg) {
      return function() {
        var args = Array.prototype.slice.call(arguments);
        // Curry the function to include the params by closure.
        // `xdmp.invokeFunction` requires that invoked functions have
        // an arity of zero.
        var f = function () {
          // Nested ValueIterators are flattened. Thus if `fct` returns a ValueIterator
          // there’s no way to differentiate it from the ValueIterator that
          // `xdmp.invokeFunction` (or `xdmp.eval` or `xdmp.invoke` or `xdmp.spawn`)
          // returns. However, by wrapping the returned Sequence in something else
          // (an array here) we can “pop” the stack to get the actual return value.
          return [fct.apply(thisArg, args)];
        };
        options = options || {};
        // Allow passing in database name, rather than id
        if('string' === typeof options.database) {
          options.database = xdmp.database(options.database);
        }
        // Allow passing in user name, rather than id
        if(options.user) {
          options.userId = xdmp.user(options.user);
          delete options.user;
        }
        // Allow the functions themselves to declare their transaction mode
        if(fct.transactionMode && !(options.transactionMode)) {
          options.transactionMode = fct.transactionMode;
        }
        return fn.head(xdmp.invokeFunction(f, options)).pop();
      };
    }
applyAs() takes a function and the same options argument as xdmp.invokeFunction() and returns a new function that behaves just like the input, but will be invoked in the context determined by the options. Thus, downstream consumers don’t need to be aware that the function is being invoked in a different context and can call the function as if it were the original function. For example, the (contrived) insert() function below takes a URI and string message, saves a document to the database, and returns a string.
    function insert(uri, message) {
      xdmp.documentInsert(uri, { message: message },
        xdmp.defaultPermissions(), xdmp.defaultCollections());
      return message;
    }

    var myInsert = applyAs(insert, {
      database: 'Modules',
      transactionMode: 'update-auto-commit'
    });

    myInsert('/hello.json', 'Hello, world!');
myInsert() has the same “signature” as the insert function but hides its evaluation context, simplifying usage. This is very similar to applying around advice in aspect-oriented programming.
This approach is a lot cleaner and has a clearer separation of the logic and the orchestration than something like the following:
    function myInsert(uri, message) {
      return fn.head(
        xdmp.invokeFunction(function() {
          xdmp.documentInsert(uri, { message: message },
            xdmp.defaultPermissions(), xdmp.defaultCollections());
        }, {
          database: '3616783675111452341',
          transactionMode: 'update-auto-commit'
        })
      );
    }
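The closure technique that applyAs() relies on can be tried outside MarkLogic. The sketch below is an approximation under stated assumptions: fakeInvokeFunction() is a hypothetical stand-in that mimics only two behaviors described above (it accepts only zero-arity functions and always wraps the result in a collection), and wrapWith() is a simplified variant of applyAs() that takes the invoker as a parameter.

```javascript
// Hypothetical stand-in for xdmp.invokeFunction(): accepts only zero-arity
// functions and always returns a collection, never a bare value.
function fakeInvokeFunction(fn, options) {
  if (fn.length !== 0) {
    throw new TypeError('Invoked function must take no arguments');
  }
  return [fn()];
}

// Simplified variant of applyAs() with the invoker passed in for illustration.
function wrapWith(invoker, fct, options, thisArg) {
  return function () {
    const args = Array.prototype.slice.call(arguments);
    // Capture the arguments by closure so the thunk has an arity of zero.
    const thunk = function () {
      return fct.apply(thisArg, args);
    };
    // Take the first (and only) item to recover the original return value.
    return invoker(thunk, options || {})[0];
  };
}

function add(a, b) {
  return a + b;
}

// The wrapped function keeps the same call signature as add().
const addElsewhere = wrapWith(fakeInvokeFunction, add, {
  isolation: 'different-transaction'
});
```

Here addElsewhere(2, 3) returns 5, the same value add(2, 3) would return, even though add() itself takes two arguments and could not be passed to the invoker directly.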
To summarize, it’s almost always a bad idea to eval strings of code. This leaves you open to injection attacks and makes code more difficult to read and write. Instead, use xdmp.invokeFunction() in MarkLogic Server-Side JavaScript to run a function in another context, such as in a separate transaction, against another database, or as another user. First-class functions in JavaScript can help you write a better xdmp.invokeFunction() that can be used to wrap existing functions, hiding the change of context from consumers.
Stay safe out there.
View all posts from Justin Makeig on the Progress blog.