Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BUG] where clauses do not appear to use indexes as they should, query consequently very slow #5352

Open
alanpaxton opened this issue Jul 3, 2024 · 0 comments
Labels
bug issue confirmed as bug performance bottlenecks, opportunities for rewriting, optimization

Comments

@alanpaxton
Copy link
Collaborator

alanpaxton commented Jul 3, 2024

Describe the bug
An XQuery using where clauses should be rewritten to use the defined attribute indexes, but it is not. It is consequently very slow.

When I run this query against a large database (the auction site generated by the xmlgen tool) xmlgen

xquery version "3.1";

for $p in /site/people/person
    let $a :=
        for $t in /site/closed_auctions/closed_auction
        where $t/buyer/@person = $p/@id
        return $t
    let $c := fn:count($a)
    where $c > 0
    return <item person="{$p/name/text()}" id="{$p/@id}">{count ($a)}</item>

I have created and loaded the following indexes (sample from collection.xconf)

<index>
    <!-- Range indexes -->
    <range>
        <create qname="@id" type="xs:string"/>
        <create qname="@person" type="xs:string"/>
    </range>
</index>

Expected behaviour
The performance should be much quicker than it is, and Monex should report the configured attribute index queries as being used.

By rewriting the above query to the following form, I get the same results, and can instead see that eXist-db uses the configured indexes, and therefore I also get better performance:

xquery version "3.1";

for $p in /site/people/person
    let $a := /site/closed_auctions/closed_auction[buyer/@person = $p/@id]
    let $c := fn:count($a)
    where $c gt 0
    return <item person="{$p/name/text()}" id="{$p/@id}">{count($a)}</item>

To quote @adamretter :

Basically, the Query Rewriter in eXist-db is very bad at re-writing where clauses to use indexes as an optimisation, it is better at re-writing predicates. So in general, when working with eXist-db, I advise people to use predicates instead of where clauses.

To Reproduce

  • Generate the default auction database with xmlgen ./xmlgen-unix -f 1 -o file-auction.xml
  • Install the above indexes in collection.xconf
  • load file-auction.xml from eXide using xmldb:store-files-from-pattern
  • paste the above query in eXide

Context

  • Build: eXist-7.0.0-SNAPSHOT
  • Java: openjdk version "21.0.2" 2024-01-16
  • OS: Mac OS X 14.5

Additional context

  • How is eXist-db installed?
    • It is built from source (exist-db/exist HEAD of develop branch)
  • Any custom changes in e.g. conf.xml?
    • No
@line-o line-o added the performance bottlenecks, opportunities for rewriting, optimization label Jul 3, 2024
@adamretter adamretter added the bug issue confirmed as bug label Jul 3, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug issue confirmed as bug performance bottlenecks, opportunities for rewriting, optimization
Projects
None yet
Development

No branches or pull requests

3 participants