Factorize and solve the Laplacian on the GPU #84
PR #75 added GPU support to CombinatorialSpaces.jl via the CUDA.jl API.
Currently, we directly offer GPU support for the classic DEC primitive operators: $d, \star$, and $\wedge^{pp}$. Decapodes.jl is able to automatically create bindings for higher-order operators on the GPU - such as the Laplacian - by expanding their definitions - e.g. $d\star d\star$ - and pre-multiplying those matrices. The branch `llm/cuda-wedge-music` is adding GPU support for $\wedge_{10}^{pd}$, $\wedge^{dp}_{01}$, and the interpolating musical operator $\flat\sharp$. These operators are implemented as matrix multiplications, so they are relatively straightforward to port to the GPU.
However, some operators - such as the geometric Hodge star for 1-forms - are instead implemented by solving a linear system in the matrix that implements the operator. $\star^{-1}$ was implemented using `gmres` from Krylov.jl, which solves this system performantly since the matrix is "mostly" diagonal.
However, when we need to solve the heat equation by solving a system in the sparse $\Delta_0$ matrix, `gmres` does not provide adequate performance. Further, even factorizing this matrix on the GPU is prohibitively expensive.
So, we need to provide a canonical way of solving this problem quickly in the CombinatorialSpaces library. The current prototype factors the matrix on the CPU, sends its components to the GPU, and uses an in-house LU solve there.
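To make the factor-once, solve-many idea concrete, here is a minimal CPU-only sketch using Julia's standard `LinearAlgebra` and `SparseArrays` libraries. It is not the prototype's code: the matrix is a hypothetical stand-in for $\Delta_0$ (a shifted 1D path-graph Laplacian), and `lu_solve` is an illustrative helper name. It only shows which components a sparse LU factorization produces and which steps a solve built from them has to perform; moving those components to the GPU and running the triangular solves there is left out.

```julia
using LinearAlgebra, SparseArrays

# Hypothetical stand-in for the Δ₀ matrix: a shifted 1D path-graph Laplacian.
# The shift keeps it nonsingular; a real Δ₀ would be built from the mesh.
n = 6
A = spdiagm(-1 => fill(-1.0, n - 1),
             0 => fill(2.5, n),
             1 => fill(-1.0, n - 1))

# Factor once on the CPU (sparse LU via UMFPACK).
F = lu(A)

# Components of the factorization. They satisfy
#   F.L * F.U == (F.Rs .* A)[F.p, F.q]
Lf, Uf, p, q, Rs = F.L, F.U, F.p, F.q, F.Rs

# Solve A x = b using only the components (illustrative helper):
# scale and permute the right-hand side, forward-substitute,
# back-substitute, then undo the column permutation.
function lu_solve(Lf, Uf, p, q, Rs, b)
    y = (Rs .* b)[p]
    z = LowerTriangular(Lf) \ y
    w = UpperTriangular(Uf) \ z
    x = similar(w)
    x[q] = w
    return x
end

b = collect(1.0:n)
x = lu_solve(Lf, Uf, p, q, Rs, b)
```

The factorization is computed once, so each new right-hand side (for example, each time step of the heat equation) costs only two sparse triangular solves plus a scaling and two permutations. In a GPU version, `Lf`, `Uf`, `p`, `q`, and `Rs` would be the data transferred to the device.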
We are also experimenting with approximations to the Laplacian on well-structured meshes that can be exploited for superior factorizations.