This repository has been archived by the owner on Aug 13, 2020. It is now read-only.

Plan to support recursive data structures? #6

Open

MichaelChirico opened this issue Sep 24, 2019 · 6 comments

Comments

@MichaelChirico

A lot of my common use cases store map & array data types. It would be great to have support to read such parquet with miniparquet.

Is this out of scope?
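
For concreteness, here is a minimal sketch of the call I have in mind, assuming miniparquet keeps a single parquet_read()-style entry point; the file name and the list-column representation below are purely illustrative:

library(miniparquet)

# hypothetical file containing array<double> and map<string,double> columns
df = parquet_read("nested.parquet")

# one natural R-side representation would be list-columns:
# df$arr -> a list of numeric vectors, one per row
# df$mp  -> a list of named numeric vectors (names = the map keys)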

@hannes
Owner

hannes commented Sep 24, 2019

Are they stored as nested tables or more complex values? Also, can you provide some sample files please?

@MichaelChirico
Author

I'm not sure how to answer about their storage, but the Hive types are array and/or map. Though those types are potentially recursive (and hence arbitrarily complex), I've only used one level of nesting (e.g. array(int) or map(int, varchar)).

Will try and create something & pass along. Any preferred medium?

@hannes
Owner

hannes commented Sep 25, 2019

Medium, e.g. WeTransfer?

@MichaelChirico
Author

Yes, or Dropbox; I could try a gist...

@MichaelChirico
Author

parquet_test.tar.gz

Seems I can upload a tar.gz here! I ran the following in SparkR; the attachment is the compressed output:

# spark start boilerplate
library(SparkR)
library(magrittr)  # for the %>% pipe
sparkR.session()

# copy the built-in iris and replace '.' in column names, which Spark SQL handles poorly
iris = iris
names(iris) = gsub('.', '_', names(iris), fixed = TRUE)
irisSDF = createDataFrame(iris)
irisSDF %>% createOrReplaceTempView('iris')

# one column per primitive type, plus a map column (mp) and an array column (arr)
sql("
select 1 as int, 'a' as str, 1.1 as dbl,
       timestamp('2019-09-20T12:34:56Z') as ts,
       true as bool, date('2019-09-21') as dt,
       map(Species, Sepal_Length) as mp,
       array(Sepal_Width) as arr
from iris
") %>% write.parquet('/path/to/output')

@hannes
Owner

hannes commented Sep 27, 2019

Thanks, will see what I can do.
