How to export data from a BigQuery table to Cloud Storage?

I'm trying to export data from a BigQuery table to Cloud Storage, but the job fails with an error saying the operation cannot be performed on the nested schema field event_params. How can this be resolved?
 
code:
destination_uri = "gs://{}/{}".format(bucket_name, "shakespeare.json")
dataset_ref = bigquery.DatasetReference(project, dataset_id)
table_ref = dataset_ref.table(table_id)
print(destination_uri, table_ref)

extract_job = client.extract_table(
    table_ref,
    destination_uri,
    # Location must match that of the source table.
    location="us",
)  # API request
extract_job.result()  # Waits for job to complete.
 
error message:
Traceback (most recent call last):
File "/home/tangbo508/python-files/test.py", line 25, in <module>
extract_job.result() # Waits for job to complete.
File "/usr/local/lib/python3.9/dist-packages/google/cloud/bigquery/job/base.py", line 922, in result
return super(_AsyncJob, self).result(timeout=timeout, **kwargs)
File "/usr/local/lib/python3.9/dist-packages/google/api_core/future/polling.py", line 261, in result
raise self._exception
google.api_core.exceptions.BadRequest: 400 Operation cannot be performed on a nested schema. Field: event_params

The error you're encountering means the table contains a nested (RECORD) field, event_params, and the extract job is trying to write it out as CSV. BigQuery can export nested data to Cloud Storage, but not in every format: extract_table defaults to CSV, and CSV cannot represent nested or repeated fields. Naming the destination file shakespeare.json does not change the export format.
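
The most direct fix, then, may be to request an export format that does support nested data (newline-delimited JSON, Avro, or Parquet) on the extract job itself. A minimal sketch, reusing the client, table_ref, and destination_uri from your snippet:

job_config = bigquery.ExtractJobConfig(
    destination_format=bigquery.DestinationFormat.NEWLINE_DELIMITED_JSON
)
extract_job = client.extract_table(
    table_ref,
    destination_uri,
    job_config=job_config,
    location="us",  # Location must match that of the source table.
)
extract_job.result()  # Waits for the export to complete.

With that config the nested event_params field is written out as JSON objects, and no flattening is needed.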

If you do need a flat structure (for example, for a CSV export), you can instead flatten the nested schema with a SQL query before exporting. Two corrections to keep in mind: in standard SQL the flattening is done with UNNEST (the legacy FLATTEN function is not available), and a query job cannot write its results straight to a gs:// URI, so the query must write to a destination table, which you then export. Here's an example of how to modify your code:

from google.cloud import bigquery

client = bigquery.Client()
project = "your_project_id"
bucket_name = "your_bucket_name"
dataset_id = "your_dataset_id"
table_id = "your_table_id"
destination_uri = f"gs://{bucket_name}/shakespeare.json"

# A query job cannot write straight to a gs:// URI, so materialize the
# flattened rows into a destination table first, then export that table.
flat_table_ref = bigquery.DatasetReference(project, dataset_id).table(f"{table_id}_flat")

# Flatten event_params with UNNEST; the param.* columns assume the standard
# GA4 export layout, so adjust them to your actual schema.
query = f"""
SELECT * EXCEPT (event_params),
       param.key AS param_key,
       param.value.string_value AS param_string_value
FROM `{project}.{dataset_id}.{table_id}`, UNNEST(event_params) AS param
"""
job_config = bigquery.QueryJobConfig(destination=flat_table_ref, write_disposition="WRITE_TRUNCATE")
query_job = client.query(query, location="US", job_config=job_config)
query_job.result()  # Wait for the query job to complete.

# Export the now-flat table to Cloud Storage.
extract_job = client.extract_table(flat_table_ref, destination_uri, location="US")
extract_job.result()
print(f"Table data exported to {destination_uri}")



By flattening the nested schema before exporting, or by switching to an export format that supports nested data as shown above, you should be able to write the table to Cloud Storage without hitting the "nested schema" error. If the table contains other RECORD columns besides event_params (as GA4 export tables typically do), flatten or exclude those as well.
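
One more caveat: BigQuery cannot export more than 1 GB of table data to a single file. For larger tables, put a wildcard in the destination URI so the export is sharded across multiple files, for example (reusing the names from the snippets above):

# A wildcard shards the export across files; required for tables over 1 GB.
destination_uri = f"gs://{bucket_name}/shakespeare-*.json"
extract_job = client.extract_table(
    table_ref,
    destination_uri,
    job_config=bigquery.ExtractJobConfig(
        destination_format=bigquery.DestinationFormat.NEWLINE_DELIMITED_JSON
    ),
    location="us",
)
extract_job.result()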