Postgres disk usage is at 85%: how do we clean up space in Postgres (and on the instance)?

asurajpai
Participant V

Hi,

We found that Postgres disk usage is at 85%, and we need to clean it up. Please let us know which tables need to be truncated. I can also see that tables are created per proxy version.

Also let us know what recommended approach (RA) should be followed when Postgres fills up.


Hi @Suraj Pai, please refer to the section "Recurring Analytics Services Maintenance Tasks" in your Apigee Edge Operations Guide documentation for details on how to prune analytics data.

You may also be interested in these community articles that provide some extra details on autovacuum and deleting data.

In case you don't have them already: the docs for on-premises installations are not published publicly, so we cannot provide direct hyperlinks to them. They are available via the distribution site for on-premises software, ftp.apigee.com (login required), in the docs folder, or via the Apigee Support Portal (login also required) in the Libraries tab. If you do not have logins for these sites, you need to contact Apigee Support (you can use the web form here) to request access.
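
Before pruning, it can also help to confirm which analytics relations are actually consuming the space. A standard catalog query like the one below will list the largest tables; the 'analytics' schema name matches a default Edge install, so adjust it if yours differs:

-- Largest tables in the analytics schema, including their indexes and TOAST data.
SELECT n.nspname || '.' || c.relname AS relation,
       pg_size_pretty(pg_total_relation_size(c.oid)) AS total_size
FROM pg_class c
JOIN pg_namespace n ON n.oid = c.relnamespace
WHERE n.nspname = 'analytics'
  AND c.relkind = 'r'
ORDER BY pg_total_relation_size(c.oid) DESC
LIMIT 20;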

Also, if this has answered your question, please click "Accept". This will also help others searching for answers on the same topic in the future.

Hi @mschreuder,

Here are some further questions related to the topic.

1) As mentioned in the document, do we need to run the cleanup script on both the Postgres master and the slave, or only on the master and let it replicate to the slave? Please clarify.

2) Will cleaning up these analytics.childfactables reclaim the disk space?

3) What are these analytics.childfactables? Please provide more information.

4) I feel that NoOfDaysToPurgeBackFromCurrentDay does not have any effect. Initially I executed the step for 30 days, and when I executed the script again with NoOfDaysToPurgeBackFromCurrentDay=60 it reported that 0 tables were deleted. Please check whether the script's SQL needs to be changed.

5) After executing the script, all the analytics were still intact and available. Where is the analytics data coming from after deleting all the org.env.fact_## tables? Which tables contain the analytics data?

6) Do we need to schedule a job to run this script for periodic cleanup of the analytics.childfactables?

7) In the Apigee PostgreSQL replication setup, what type of master-slave replication is used? Is it streaming?

8) Is there any documentation or scripts from Apigee for DB maintenance jobs such as vacuuming, reindexing, or a DB refresh at regular intervals?

BR, Suraj

Hi @asurajpai, I have asked my colleagues to help with answering; due to the holidays the responses will be delayed. While we're waiting, there are some docs with more details on vacuuming here, direct from the Postgres site, that might help with some of your questions.
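
In the meantime, if you want to see whether autovacuum has been keeping up, a standard statistics query such as this one (nothing Apigee-specific, just the built-in pg_stat_user_tables view) shows dead-tuple counts and the last autovacuum time per table:

-- Tables with the most dead tuples; a large n_dead_tup with an old or NULL
-- last_autovacuum suggests autovacuum is not keeping up.
SELECT relname, n_live_tup, n_dead_tup, last_autovacuum
FROM pg_stat_user_tables
ORDER BY n_dead_tup DESC
LIMIT 10;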

Hi @mschreuder

Please let us know if there has been any reply to the queries above.


Not applicable

Hi @asurajpai,

Q1> Does the purge script need to be run on both master and slave?

Answer: No. Running it only on the master is sufficient, assuming replication is working fine between master and slave.

Q2> Will cleanup of childfactables reclaim disk space?

Answer: Yes. Cleanup essentially drops the old childfactables, which releases space.

Q3> What are childfactables?

Answer: Childfactables hold the daily-partitioned fact data. Every day new partitions are created and data is ingested into the daily partitioned tables. Later, when the old fact data is no longer required, the corresponding childfactables can be purged. The listing of childfactables against their respective dates [in epoch] is kept in "analytics.childfactables".
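
For illustration, this is the kind of thing the purge does under the hood; the partition name below is made up, and the exact columns of analytics.childfactables vary by release, so inspect the table first:

-- Inspect the partition catalog; check its columns with \d analytics.childfactables
-- before relying on any of them.
SELECT * FROM analytics.childfactables LIMIT 20;

-- Dropping an old daily partition releases its disk space immediately.
-- The table name here is purely illustrative.
DROP TABLE IF EXISTS analytics."myorg.prod.fact_1420070400";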

Q4> Why is purging not releasing space?

Answer: We had a bug in our code where data got inserted into the main fact table instead of the respective child-fact table [because of a timezone bug]. In that case, out of 100 records in a day, a high percentage could have landed in the main fact table instead of a child-fact table, so after a purge very little disk space is released. The bug has been resolved in a later OPDK release. Kindly confirm which version you are using; @sukruth / @Sanjoy Bose can confirm the version in which the fix was made.

Q5> How is data stored?

Answer: Explained earlier.

To fix your current situation: you need to DELETE data from analytics."<org>.<env>.fact" for the unwanted period [where client_received_start_timestamp <= now() - interval 'X days'], and then run VACUUM FULL. These operations will take a considerable amount of time, as a function of how much data needs to be cleared. There are other alternatives that are slightly faster, but they will still take time. A sketch of the two statements is below.
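
For reference, a sketch of those two steps in SQL; the 90-day interval is only an example, and the placeholder names need to be adapted to your setup:

-- Purge fact rows older than the retention window (90 days here is an example).
-- Replace <org> and <env> with your organization and environment names. If your
-- release stores client_received_start_timestamp as epoch milliseconds rather
-- than a timestamp, adjust the predicate accordingly.
DELETE FROM analytics."<org>.<env>.fact"
WHERE client_received_start_timestamp <= now() - interval '90 days';

-- Reclaim the space on disk. VACUUM FULL rewrites the table under an exclusive
-- lock, so run it in a maintenance window.
VACUUM FULL analytics."<org>.<env>.fact";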

Q6> Job to Purge

Answer: We have an updated version of the purging script that deletes records from the fact table in addition to dropping the child-fact tables. But don't run it right away before attempting the recommended fix above, as it can run for a long time. @sukruth / @Sanjoy Bose can give details of the script and its location.

Q7> Master-Slave replication type

Answer: Streaming of WAL records.
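
If you want to confirm that streaming replication is healthy before purging only on the master, the standard Postgres views can be queried like this (the column names below match the 9.x releases shipped with OPDK; Postgres 10+ renames the *_location columns to *_lsn):

-- On the master: one row per connected standby, showing how far it has replayed.
SELECT client_addr, state, sent_location, replay_location
FROM pg_stat_replication;

-- On the slave: returns true while the node is running as a standby.
SELECT pg_is_in_recovery();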

Q8> Documentation on DB Maintenance

Answer: We recommend keeping the default PG settings. If the aggregate tables have grown considerably in size, you may need to reindex them (a sketch is below). We usually do not recommend VACUUM on the main fact table, as the child-fact tables should ideally hold almost all of the records and dropping them should be sufficient. In your case you would need to DELETE and VACUUM FULL on the main fact table. @sukruth / @Sanjoy Bose can give details if there is any such documentation.
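
For the reindexing of aggregate tables mentioned above, a plain REINDEX is usually enough. The table name below is only illustrative; note that REINDEX holds an exclusive lock on the table while it runs:

-- Rebuild all indexes on an aggregate table that has bloated over time
-- (replace with your actual aggregate table name).
REINDEX TABLE analytics."myorg.prod.agg_api";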

Hi @SaikatDey, thanks for such detailed information.

We are on OPDK 4.15.07.00.

@sukruth / @Sanjoy Bose Please let us know when the fix for Question 4 will be available.

