Hi Rachel,
Can you pass a cts:query into your cts:uri-match call?
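For instance, a rough sketch of what I mean, assuming your job documents all live under a "/project/<id>/jobs/" directory (the $projectId value here is made up). The third argument to cts:uri-match is an optional cts:query that restricts which documents' URIs are considered. Whether it actually helps in your case is something you would need to profile:

```xquery
xquery version "1.0-ml";

(: Hypothetical project id, for illustration only. :)
let $projectId := "1234"
let $dir := "/project/" || $projectId || "/jobs/"
return
  cts:uri-match(
    $dir || "*",
    (),                                    (: no options :)
    cts:directory-query($dir, "infinity")  (: only URIs of documents under this directory :)
  )
```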
How many forests do you have? More forests might help depending upon what you are doing.
But if all of your URIs in your db follow this pattern, ultimately it is going to have to search through a lot of URIs. You could make your URI space a little more selective which might speed it up. Maybe the strings in your URIs are all very similar (the URI match is essentially a string compare)?
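For the no-wildcard test in your message below, where the full URI is already known, one alternative might be to check for the document directly instead of running a pattern match. This is only a sketch with made-up IDs, and I am assuming you just need to know whether that one document exists:

```xquery
xquery version "1.0-ml";

(: Hypothetical ids, for illustration only. :)
let $projectId := "1234"
let $contentId := "5678"
let $uri := "/project/" || $projectId || "/content/" || $contentId
return
  (: Return the URI if a document exists at it, otherwise the empty sequence. :)
  if (fn:doc-available($uri))
  then $uri
  else ()
```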
What kind of hardware are you running on? The speed of your memory and cpu can be a factor here too.
-Danny
From: general-***@developer.marklogic.com [mailto:general-***@developer.marklogic.com] On Behalf Of Rachel Wilson
Sent: Wednesday, October 22, 2014 9:05 AM
To: MarkLogic Developer Discussion
Subject: Re: [MarkLogic Dev General] Surprising slowness of cts:uri-match
Hi,
I was wondering if anyone had a reply to this.
We're digging even deeper into improving the performance of an API, and in several places (because we use it liberally) cts:uri-match ends up being the bottleneck. We are happy to redesign our data and queries where we can to avoid it, but it continues to surprise us: we thought URIs were indexed, and that the function, being a matcher, was designed to be used with wildcards.
A typical call would be
let $uris := cts:uri-match("/project/" || $projectId ||"/jobs/*",
But we're most surprised by this one, which we used as a test, because it doesn't even contain any wildcards:
let $thereShouldBeOnlyOne := cts:uri-match("/project/" || $projectId || "/content/" || $contentId)
Some insight into the inner workings of that function would be great.
From: Rachel Wilson <***@bbc.co.uk>
Date: Thursday, 16 October 2014 17:25
To: MarkLogic Developer Discussion <***@developer.marklogic.com>
Subject: Surprising slowness of cts:uri-match
In our experience cts:uri-match is surprisingly slow. For example, when profiling a pretty complicated query taking 0.7 seconds, the single cts:uri-match() call takes 70-80% of the total time (Shallow% and Deep% being the same).
But we thought it should be reading the URI lexicon and so, in a database with only 483,475 docs, should be lightning fast. We've had to stop using cts:uri-match calls in loops for this reason.
Are there any match patterns to be avoided perhaps? Wildcards in the middle of the pattern, rather than trailing wildcards for example?