Garbage Collection in Bigtable

Hi GCP Community, I have a couple of questions about the garbage collection process in Bigtable, and I hope someone can shed some light on them. Here are the two questions:

  1. If a row no longer has any cells under any column qualifier (that is, all cells in that row were deleted or have expired), will the row itself be deleted during garbage collection?
  2. If a column qualifier no longer has any cells in any row (that is, all cells in that column were deleted or have expired), will the column be deleted during garbage collection?

I would like to know the answers to these questions so that I can design my Bigtable tables better for my use case.


Hi @schitiz,

Welcome to Google Cloud Community!

In Bigtable, when a row or a column qualifier no longer has any cells, it is considered to be "tombstoned" and will eventually be removed during the garbage collection process.

Regarding your first question: if a row no longer has any cells under any column qualifier, meaning that all cells in that row were deleted or have expired, the row will be tombstoned and eventually removed during the garbage collection process.

Regarding your second question: if a column qualifier no longer has any cells in any row, meaning that all cells in that column were deleted or have expired, the column will be tombstoned and eventually removed during the garbage collection process.
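
To make the two answers above concrete, here is a minimal Python sketch. It is only a toy in-memory model, not Bigtable's actual implementation or client API: the `collect_garbage` and `qualifiers_in_use` helpers, the sample row keys, and the 30-day maximum age are all assumptions chosen for illustration. It shows the idea that a row or a column qualifier is only visible while at least one of its cells survives.

```python
def collect_garbage(table, now, max_age):
    """Return a copy of `table` that keeps only cells no older than `max_age`.

    `table` maps row key -> {column qualifier -> [(timestamp, value), ...]}.
    A qualifier or row is kept only if at least one of its cells survives.
    """
    kept = {}
    for row_key, columns in table.items():
        live_columns = {}
        for qualifier, cells in columns.items():
            live = [(ts, value) for ts, value in cells if now - ts <= max_age]
            if live:  # a qualifier with no cells left is dropped from the row
                live_columns[qualifier] = live
        if live_columns:  # a row with no cells left is dropped entirely
            kept[row_key] = live_columns
    return kept


def qualifiers_in_use(table):
    """Column qualifiers that still have at least one cell in some row."""
    return {q for columns in table.values() for q in columns}


DAY = 24 * 60 * 60  # seconds per day
now = 100 * DAY     # hypothetical "current time" for the example

table = {
    "user#1": {
        "name": [(now - 40 * DAY, "Ada")],               # older than 30 days
        "email": [(now - 5 * DAY, "ada@example.com")],   # still fresh
    },
    "user#2": {
        "name": [(now - 45 * DAY, "Bob")],               # older than 30 days
    },
}

after = collect_garbage(table, now, max_age=30 * DAY)
print(sorted(after))                     # ['user#1']
print(sorted(qualifiers_in_use(after)))  # ['email']
```

In this model, `user#2` disappears because its only cell expired, and the `name` qualifier disappears because no row holds a cell for it any more.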

It's important to note that the exact time at which a tombstoned row or column qualifier is removed during the garbage collection process is determined by a number of factors, including the overall load on the Bigtable cluster and the rate at which new data is being added.

Also, keep in mind that Bigtable can store multiple versions of the data in a cell. When you delete a row or cell, the system does not remove the data immediately; it marks the data as deleted, and garbage collection later removes the versions that are marked as deleted. This is why deleted data can continue to occupy storage for some time after the delete request.
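
The mark-then-remove behavior described above can be sketched with a small toy store. This is a simplified illustration under assumed names (`ToyStore`, `compact`, `stored_cell_count` are hypothetical and are not part of any Bigtable API); it only models the general idea that a delete first records a marker and a later cleanup pass physically drops the data.

```python
class ToyStore:
    """Toy model of delete markers followed by a later cleanup pass."""

    def __init__(self):
        self.cells = {}          # (row, qualifier) -> value, physically stored
        self.tombstones = set()  # keys marked as deleted but not yet removed

    def write(self, row, qualifier, value):
        key = (row, qualifier)
        self.cells[key] = value
        self.tombstones.discard(key)  # a fresh write replaces the marker

    def delete(self, row, qualifier):
        key = (row, qualifier)
        if key in self.cells:
            self.tombstones.add(key)  # only mark; the data stays on disk

    def read(self, row, qualifier):
        key = (row, qualifier)
        if key in self.tombstones:
            return None               # marked data is hidden from reads
        return self.cells.get(key)

    def compact(self):
        """Physically drop everything that was marked as deleted."""
        for key in self.tombstones:
            del self.cells[key]
        self.tombstones.clear()

    def stored_cell_count(self):
        return len(self.cells)


store = ToyStore()
store.write("row1", "col1", "hello")
store.delete("row1", "col1")

value_after_delete = store.read("row1", "col1")  # hidden by the marker
count_before = store.stored_cell_count()         # data still stored
store.compact()
count_after = store.stored_cell_count()          # data physically gone

print(value_after_delete, count_before, count_after)  # None 1 0
```

The point of the sketch is the gap between `delete` and `compact`: during that window the cell is no longer readable but still takes up space.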

In summary, if a row or column qualifier no longer has any cells, it will be tombstoned and eventually removed during the garbage collection process. Understanding this behavior is important when designing your Bigtable schema and when working with expired or deleted data.

Thank you

Hi, are both the row key and the values (all cells) removed during garbage collection if I am using an age-based garbage collection policy of, for example, 30 days?