MarkLogic can resolve most deadlocks on its own. When two updates depend on each other’s locks, MarkLogic detects the deadlock and resolves it by restarting the update with the fewest locks. However, there’s one scenario where this solution doesn’t work: when one update nests another update in a separate transactional context and the deadlock happens between the two updates. A restart would only cause the same issue to happen again. These “unresolvable” deadlocks are essentially code bugs. They can happen when using xdmp:eval, xdmp:invoke, or xdmp:invoke-function, and in this article we’ll show techniques to avoid these problematic deadlocks.
When using these invoking functions, you can choose to run the invoked code in the same transactional context (isolation same-statement, in which case locks are shared and deadlocks between the two won’t happen) or in a different transactional context (isolation different-transaction, and this is where you need to be careful). The default is different-transaction, so you need to be careful unless you explicitly choose otherwise.
The following situations are examples of when a programmer risks unresolvable deadlocks:
The example below is a REST extension that allows an update of a record at most once every minute:
sample-update.xqy – a REST API extension

xquery version "1.0-ml";

module namespace ns = "http://marklogic.com/rest-api/resource/sample-update";
declare default function namespace "http://marklogic.com/rest-api/resource/sample-update";

declare variable $tracker-file := "/tracker.json";
declare variable $content-file := "/content.html";

(:
 : HTTP PUT runs in update transaction mode by default.
 : More information about transactions is available at
 : https://docs.marklogic.com/guide/app-dev/transactions
 :)
declare function put(
  $context as map:map,
  $params as map:map,
  $input as document-node()*
) as document-node()? {
  if (check-tracker()) then (
    xdmp:document-insert($content-file, $input),
    update-tracker(),
    document{ fn:true() }
  ) else (
    document{ fn:false() }
  )
};

(:
 : Limit updates to once per minute.
 : Other applications may want to deduct a certain balance for each
 : transaction made.
 :)
declare function check-tracker(
) as xs:boolean {
  (: built-in functions need the fn: prefix because the default
     function namespace is the module namespace :)
  let $tracker := fn:doc($tracker-file)
  (: cast explicitly so the subtraction is dateTime arithmetic :)
  let $age := fn:current-dateTime() - xs:dateTime($tracker/timestamp)
  return
    fn:not(fn:exists($tracker)) or $age gt xs:dayTimeDuration('PT60S')
};

declare function update-tracker(
) {
  (: runs in a different transaction by default :)
  xdmp:invoke-function(
    function() {
      xdmp:document-insert($tracker-file, object-node{'timestamp' : fn:current-dateTime()})
    }
  )
};
If we try to invoke this operation via curl, it will fail as follows:
$> curl -X PUT --anyauth -uadmin:admin "http://localhost:9999/v1/resources/sample-update" -H "Content-Type:application/json" -d '{"new" : "content"}'
{"errorResponse":{"statusCode":500, "status":"Internal Server Error", "messageCode":"INTERNAL ERROR", "message":"SVC-EXTIME: xdmp:document-insert("/tracker.json", object-node{"timestamp":text{"2018-04-11T18:40:51.4589003+08:00"}}) -- Time limit exceeded . See the MarkLogic server error log for further detail."}}
The above error produces a number of “Notice” level entries in the MarkLogic log. In MarkLogic 8 and earlier, these all go to ErrorLog.txt. In MarkLogic 9 and later, where each app server gets its own error log, the following entries appear in 9999_ErrorLog.txt:
2018-04-18 21:18:29.580 Notice: SVC-EXTIME: xdmp:document-insert("/tracker.json", object-node{"timestamp":text{"2018-04-18T21:08:29.47+08:00"}}) -- Time limit exceeded
2018-04-18 21:18:29.580 Notice:+in /marklogic.rest.resource/sample-update/assets/resource.xqy, at 46:6,
2018-04-18 21:18:29.580 Notice:+in function() as item()*() [1.0-ml]
2018-04-18 21:18:29.580 Notice:+in /marklogic.rest.resource/sample-update/assets/resource.xqy,
2018-04-18 21:18:29.580 Notice:+in xdmp:invoke(function() as item()*) [1.0-ml]
2018-04-18 21:18:29.580 Notice:+in /marklogic.rest.resource/sample-update/assets/resource.xqy, at 44:2,
2018-04-18 21:18:29.580 Notice:+in update-tracker() [1.0-ml]
The above logs provide us with the following information:
By reviewing sample-update.xqy for everything that touches “/tracker.json”, we find that the document is read via fn:doc in check-tracker() before the call to xdmp:invoke-function in update-tracker(). Since PUT runs in update mode, that read acquires a read lock on “/tracker.json”, and the lock is held until the main transaction completes. The child transaction therefore cannot acquire the write lock it needs on the same document, while the main transaction cannot complete until the child returns. Neither can proceed, so the request runs until it hits the time limit. (Read more information about locks and transactions)
There are several ways to resolve this issue. They can be applied individually or together. Let’s go through them one by one.
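To see the read lock described above, the locks held by the current transaction can be listed from a scratch main module (for example in Query Console). This is only a minimal sketch: it assumes MarkLogic 9 or later, where the xdmp:update option is available to force an update statement, and it reuses the “/tracker.json” URI from the example.

```xquery
xquery version "1.0-ml";

(: Assumption: MarkLogic 9+, where this option forces the statement
   to run as an update even though it performs no update itself. :)
declare option xdmp:update "true";

(: Reading a document inside an update statement takes a read lock on its URI. :)
let $tracker := fn:doc("/tracker.json")
return (
  (: fn:exists forces the read to happen before the locks are listed :)
  fn:exists($tracker),
  (: list the locks currently held by this transaction on this host :)
  xdmp:transaction-locks(xdmp:host(), xdmp:transaction())
)
```

If the read lock is held as described, the URI should appear among the read locks in the returned element. A child transaction that needs a write lock on the same URI would have to wait for this transaction to finish.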
We modify update-tracker() as follows:
declare function update-tracker(
) {
  xdmp:invoke-function(
    function() {
      xdmp:document-insert($tracker-file, object-node{'timestamp' : fn:current-dateTime()})
    },
    <options xmlns="xdmp:eval">
      <isolation>same-statement</isolation>
    </options>
  )
};
This minor change will allow the child transaction to share the lock that has been acquired by the main transaction. However, this approach may not be an option if you want the child transaction to execute regardless of the success or failure of the outer transaction, e.g. creating an audit trail of attempts.
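The audit-trail case mentioned above can be sketched as follows. The helper name local:audit-attempt, the “/audit/” URI prefix, and the DEMO-FAILURE error are all hypothetical and exist only for illustration. The point is that the child transaction writes to a URI the outer transaction never touches, so it can keep the default different-transaction isolation without risking a deadlock.

```xquery
xquery version "1.0-ml";

(: Hypothetical helper: records an attempt in its own transaction.
   The "/audit/" URI prefix is an assumption made for this sketch. :)
declare function local:audit-attempt()
{
  xdmp:invoke-function(
    function() {
      xdmp:document-insert(
        fn:concat("/audit/", sem:uuid-string(), ".json"),
        object-node { "attemptedAt" : fn:current-dateTime() }
      )
    },
    <options xmlns="xdmp:eval">
      <!-- the default, spelled out here for clarity -->
      <isolation>different-transaction</isolation>
    </options>
  )
};

(
  (: the child transaction commits on its own when the invoke returns :)
  local:audit-attempt(),
  (: simulated failure: the outer transaction rolls back :)
  fn:error(xs:QName("DEMO-FAILURE"), "Simulated failure in the outer transaction")
)
```

Because the audit document is committed by the child transaction before the error is raised, it should remain in the database even though the outer request fails. With same-statement isolation the audit write would be rolled back together with the outer transaction.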
We modify check-tracker() as follows:
declare function check-tracker(
) as xs:boolean {
  let $tracker := xdmp:invoke-function(
    function() {
      fn:doc($tracker-file)
    },
    <options xmlns="xdmp:eval">
      <transaction-mode>query</transaction-mode>
    </options>
  )
  (: cast explicitly so the subtraction is dateTime arithmetic :)
  let $age := fn:current-dateTime() - xs:dateTime($tracker/timestamp)
  return
    fn:not(fn:exists($tracker)) or $age gt xs:dayTimeDuration('PT60S')
};
This approach moves the fn:doc call into a separate read-only transaction, which gives access to the document content without holding a read lock for the rest of the main transaction.
However, this approach will not work if done within a multi-statement transaction as the invoke transaction will not see the temporary changes that are only available inside the multi-statement transaction.
Additionally, the query call runs at a higher timestamp than the source transaction and all other transactions before it. So this kind of implementation can become unpredictable:
let $query := cts:word-query('agent smith')
let $result1 := xdmp:invoke-function(
  function() {
    cts:search(/, $query)[1]
  },
  <options xmlns="xdmp:eval">
    <transaction-mode>query</transaction-mode>
  </options>
)
let $_ := xdmp:invoke-function(
  function() {
    xdmp:document-insert(concat('/item.', sem:uuid-string(), '.json'), object-node{'name' : 'Agent Smith'})
  },
  <options xmlns="xdmp:eval">
    <transaction-mode>update-auto-commit</transaction-mode>
  </options>
)
let $result2 := xdmp:invoke-function(
  function() {
    cts:search(/, $query)[1]
  },
  <options xmlns="xdmp:eval">
    <transaction-mode>query</transaction-mode>
  </options>
)
return document-uri($result1) = document-uri($result2)
$result1 and $result2 can return different documents, because the second search runs at a later timestamp and sees the insert. To address this, we acquire a timestamp value once and pass it to every query invoke. See the example below:
let $query := cts:word-query('agent smith')
let $timestamp := xdmp:invoke-function(
  function() {
    xdmp:request-timestamp()
  },
  <options xmlns="xdmp:eval">
    <transaction-mode>query</transaction-mode>
  </options>
)
let $result1 := xdmp:invoke-function(
  function() {
    cts:search(/, $query)[1]
  },
  <options xmlns="xdmp:eval">
    <transaction-mode>query</transaction-mode>
    <timestamp>{$timestamp}</timestamp>
  </options>
)
let $_ := xdmp:invoke-function(
  function() {
    xdmp:document-insert(concat('/item.', sem:uuid-string(), '.json'), object-node{'name' : 'Agent Smith'})
  },
  <options xmlns="xdmp:eval">
    <transaction-mode>update-auto-commit</transaction-mode>
  </options>
)
let $result2 := xdmp:invoke-function(
  function() {
    cts:search(/, $query)[1]
  },
  <options xmlns="xdmp:eval">
    <transaction-mode>query</transaction-mode>
    <timestamp>{$timestamp}</timestamp>
  </options>
)
return document-uri($result1) = document-uri($result2)
This makes both search transactions execute at the same timestamp. The second search does not see the insert that happened in the step before it.
The following guidelines can be used as reference when developing your applications: