Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement PathFindingOperator #93

Open
Dtenwolde opened this issue Feb 15, 2024 · 3 comments · May be fixed by #135
Open

Implement PathFindingOperator #93

Dtenwolde opened this issue Feb 15, 2024 · 3 comments · May be fixed by #135
Labels
enhancement New feature or request experiment For experimenting existing features path-finding Issues related to path-finding, such as the path mode or path prefix.

Comments

@Dtenwolde
Copy link
Contributor

Dtenwolde commented Feb 15, 2024

This issue serves as a way to track the progress on the PathFindingOperator

Working on in https://github.com/cwida/duckpgq-extension/tree/pathfindingoperator and https://github.com/cwida/duckdb-pgq/tree/pathfindingoperator (Make sure to be on the correct branch in both repositories)

The idea is to create a path-finding operator with two sinks. This acts similarly to the IEJoin. We insert that in this function, instead of the iterativelength() UDF. For this binding phase, we generate a logical query plan, so there cannot be a physical path-finding operator inserted quite yet. We need to create the two sinks here. One side is the src, and dst pairs (tasks) and the other side is the CSR. Importantly without the CREATE_CSR_EDGE() UDF because that will be done in one of the sinks of the new operator.

Can include optimizations such as #23

Plan for now:

  1. Get the CSR as the first sink to this new operator.
  2. Get the (src,dst)-pairs as the second sink.
  3. Implement the path-finding algorithm
  4. Look into how to parallelize.

Potential optimizations:

  1. Duplicate (src,dst)-pairs -> only execute it once, and later blow it up to get all the results again.
  2. Same src many times -> collapses into one src, then fully explore the graph

An example query for what we have for now (initial idea):

SELECT *
FROM pairs AS p
WHERE p.src BETWEEN (select csr_id from (SELECT
            0 as csr_id,
            (SELECT count(a.id) FROM Student a),
            CAST (
                (SELECT sum(CREATE_CSR_VERTEX(0,
                    (SELECT count(a.id) FROM Student a),
                      sub.dense_id,
                      sub.cnt)
                )
                FROM (
                    SELECT a.rowid as dense_id, count(k.src) as cnt
                    FROM Student a
                    LEFT JOIN Knows k ON k.src = a.id
                    GROUP BY a.rowid) sub
            ) AS BIGINT),
            a.rowid,
            c.rowid,
            k.rowid    FROM Knows k
                        JOIN student a on a.id = k.src
                        JOIN student c on c.id = k.dst)) AND p.dst;
@Dtenwolde Dtenwolde added enhancement New feature or request experiment For experimenting existing features path-finding Issues related to path-finding, such as the path mode or path prefix. labels Feb 15, 2024
@Dtenwolde
Copy link
Contributor Author

TODO: Figure out how to include the lower and upper bound into the query

@Dtenwolde
Copy link
Contributor Author

In the current implementation, when doing the following query, it first computes all shortest paths and only then filters out the pairs:

-FROM GRAPH_TABLE (pg
    MATCH p = ANY SHORTEST (a:Person)-[k:Knows]->{2,3}(b:Person)
    WHERE (a.id, b.id) in (SELECT (src, dst) FROM pairs)
    COLUMNS (a.id AS id1, b.id AS id2, element_id(p))
    ) tmp
    ORDER BY tmp.id1, tmp.id2;

It should ideally first do the filter on the pairs, and only then do the shortest path function. This could be a potential optimization rule if we can detect this.

Copy link

github-actions bot commented Sep 3, 2024

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@github-actions github-actions bot added the stale label Sep 3, 2024
@Dtenwolde Dtenwolde removed the stale label Sep 3, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request experiment For experimenting existing features path-finding Issues related to path-finding, such as the path mode or path prefix.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

1 participant