
RecordBatch might have logical row mapping on physical arrays #974

Open
viirya opened this issue Sep 26, 2024 · 0 comments

viirya commented Sep 26, 2024

Describe the bug

This is related to #973. After applying that fix, a test we run locally against an Iceberg table with deleted rows fails with an incorrect query result.

This happens because Iceberg doesn't physically delete the rows from the arrays. Instead, it stores a row mapping that is used to skip deleted rows when iterating over the rows of a columnar batch.

However, once we export the underlying Arrow record batch of a columnar batch to the native side, the record batch is purely physical and carries no logical row mapping information. That is to say, the deleted rows will appear in the query result.

When exporting a record batch from an Iceberg columnar batch, we need to export the row mapping along with it.
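
To make the mismatch concrete, here is a minimal sketch in plain Python. It is not Comet or Iceberg code: the names `physical_ids`, `row_mapping`, `iterate_logical`, `export_physical`, and `export_with_mapping` are hypothetical, and it assumes the row mapping is a list of indices of the live rows.

```python
# Hypothetical model: one physical column plus a logical row mapping.
# Assumption: the mapping lists the physical indices of the live rows.
physical_ids = [10, 11, 12, 13, 14]  # rows as stored in the physical array
row_mapping = [0, 2, 4]              # rows at indices 1 and 3 are deleted


def iterate_logical(column, mapping):
    """What a row iterator over the columnar batch sees: live rows only."""
    return [column[i] for i in mapping]


def export_physical(column):
    """Exporting the raw array alone: the mapping is dropped."""
    return list(column)


def export_with_mapping(column, mapping):
    """One possible fix: apply the mapping (a 'take') as part of the export."""
    return [column[i] for i in mapping]


print(iterate_logical(physical_ids, row_mapping))      # live rows only
print(export_physical(physical_ids))                   # deleted rows reappear
print(export_with_mapping(physical_ids, row_mapping))  # matches the iterator
```

Applying the mapping at export time is only one option. Passing the mapping to the native side and filtering there would be another, and this sketch doesn't argue for either.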

Steps to reproduce

No response

Expected behavior

No response

Additional context

No response

@viirya viirya added the bug Something isn't working label Sep 26, 2024
@viirya viirya self-assigned this Sep 26, 2024