I have e-commerce product details stored in CSV format in a GCS
bucket. Each file contains the attributes of a single product. The
number of such files is approximately 4 billion. I am planning to use a
BigQuery external table to query the data. It w...
I have created a BigQuery external table. It refers to a GCS bucket
where I have CSV files containing e-commerce product details. There are
close to 1.5 million files; each file contains the details of a single
product in the form, . When I try to run a simpl...
I am reading product content in JSON format from Kafka and creating
files in the GCS bucket. The file content is JSON, and one of the fields
in the JSON is the update time. I am creating one file for each product.
I want to make sure that I do not overwrite th...
I have e-commerce product data in JSON format. This data is frequently
updated. I am planning to load this data into BigQuery. Given that the
JSON files are huge (a few hundred attributes) and there are many
updates, I found that it is not very cost-...
Hi @ms4446, I was wondering if there is a completely different approach
that can be explored. To summarize my use case, we are storing product
data. There are close to 3 billion products, stored as JSON.
Products are constantly being updated...
Thanks, @ms4446, for the reply. We are already using a native BigQuery
table for commonly used queries. We have identified the attributes that
are frequently queried; we have extracted them from JSON and stored them
in individual columns in a BigQuery...
Thank you, @MaxImbrox. I have a follow-up query. I understand the
approach where I am storing the update time in metadata, and then
retrieving it to decide whether to overwrite the file or not. I am
assuming this would be done on the client side. I did ...