DataFrame.explode returns a PY[object] column, even if coercible to something better #377

Answered by jaychia
xcharleslin asked this question in Q&A
Thanks for raising this!

This is expected behavior: Daft currently cannot infer the return type of an operation on a Py column without some form of user input.

For example, if a user's Python list looks like this:

from daft import DataFrame

df = DataFrame.from_pydict({
    "a": [
        [1, 2, 3],
        ["foo", "bar", "baz"],
    ],
})

Daft can deduce only that the "a" column holds lists; it has no static type information about their contents. Here the first list holds integers and the second holds strings, so no single element type fits. Without inspecting the data itself ahead of time, Daft cannot decide on an element type.
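To make the point concrete, here is a minimal pure-Python sketch of what inferring an element type would require. It is not Daft's implementation, and `common_element_type` is a hypothetical helper introduced only for illustration. The key observation is that it has to walk every element, which is exactly the information that is unavailable before the data is read.

```python
from typing import Any


def common_element_type(rows: list[list[Any]]) -> type:
    """Return the one Python type shared by every element, else `object`.

    Hypothetical helper for illustration only; not part of Daft.
    """
    # Collect the runtime type of every element in every list.
    seen = {type(item) for row in rows for item in row}
    # Only a single shared type is usable; anything else falls back to object.
    return seen.pop() if len(seen) == 1 else object


mixed = [[1, 2, 3], ["foo", "bar", "baz"]]  # same data as the "a" column above
uniform = [[1, 2, 3], [4, 5]]

mixed_type = common_element_type(mixed)      # falls back to object
uniform_type = common_element_type(uniform)  # every element is an int
```

Even in the uniform case the answer is only known after scanning the values, which is why some user input is needed to get a more specific column type.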

However, by casting the resulting exploded dataframe, the user is asserting that the data is of a given type and Da…

This discussion was converted from issue #372 on December 07, 2022 18:30.