Improve bq_perform_upload to upload as PARQUET file #608

Closed
robayo opened this issue May 17, 2024 · 2 comments

Comments

@robayo
robayo commented May 17, 2024

Hi,
I'm looking to upload some large data frames (around 4 GB) to BigQuery. The current upload format is JSON, which is not very efficient. Could an option to upload in the PARQUET format be added to the package? It should improve compression and therefore upload times.

@apalacio9502
Contributor

Hello @robayo,

I have experienced the same issue, and I believe the library should let the user choose the data format used for the upload. I just submitted a pull request that you might want to try out: #609. You only need to add the source_format argument to bq_table_upload:

bq_table_upload(
  ...,
  source_format = "PARQUET"
)

Regards,

@apalacio9502
Contributor

Hi @robayo,

In the development version, which you can install with pak::pak("r-dbi/bigrquery"), you can choose the format in which the data is uploaded: bq_table_upload(..., source_format = "PARQUET").
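
For anyone trying this, here is a minimal sketch of how the call might look when wrapped in a small helper. The helper name `upload_as_parquet` and the project, dataset, and table names are hypothetical placeholders. It assumes the development version is installed so that `source_format` is accepted.

```r
# Sketch only: `upload_as_parquet` is a hypothetical helper, not part of
# bigrquery. It wraps the call shown in this thread.
upload_as_parquet <- function(project, dataset, table, values) {
  # Fail early if the input is not a data frame.
  stopifnot(is.data.frame(values))

  # bq_table() builds a reference to the destination table.
  dest <- bigrquery::bq_table(project, dataset, table)

  # source_format = "PARQUET" is the argument discussed in this thread;
  # it is assumed to require the development version of bigrquery.
  bigrquery::bq_table_upload(dest, values = values, source_format = "PARQUET")
}

# Example call with placeholder names (needs BigQuery credentials,
# so it is left commented out):
# upload_as_parquet("my-project", "my_dataset", "my_table", my_df)
```

Nothing is uploaded until the helper is called, so the definition itself can be sourced without credentials.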

Regards,

@robayo closed this as completed Jun 24, 2024