xTuple.com xTupleU Blog & News Customer Support

Database Bloat

I have recently noticed a sharp climb in database size. It makes me nervous at the rate its growing.
Almost 200MB in one month. The total backup of my database was 243,000 KB at the beginning of 2019 And 325,283 KB at the beginning of 2020. Its now 1,353,290 KB. that is a pretty steep climb. What is a normal size? At first I blamed the size jump on enterprise VS distribution. but now I am not sure that was it at all

Have you started saving attached files in the database? That can make it grow in a hurry.

I did that for maybe 1 month and then realized quickly that I didn’t want to do that.

I did an SQL statement And got these tables as the top 6

gltrans 351 MB
invhist 266 MB
cohist 227 MB
invbal 223 MB
file 213 MB
comment 179 MB

I’m curious where your “file” table was at a month ago? “file” and “image” are usually the culprits when you start growing that quickly.

Another thing to look at is the xtbatch.eml table. xtConnect will save all outgoing EDI emails and their attachments, and the default PDF resolution if you haven’t set it explicitly is 1200dpi.

1 Like

I actually have been monitoring the problem for a while. The file table has held steady. I dont let anyone attach files to the database any more. we do everything strictly by url. the file table has actually been reduced in size. I have been going back and converting the attached files to URL’s and deleting the file record from PostgreSQL

I followed your advice to xtbatch.emlbody and found that it was a rather large table. I will probably clean this out more periodically.

As a quick update to how that turned out. The backup completed yesterday at midnight with a Size of 392 MB. that’s approximately 1 GB down. the only thing I did was

UPDATE xtbatch.emlbody SET emlbody_msg = ''

Or something similar to that to clear out all of the PDF’s I still keep the email history, removed the attachments, as I can probably get those from the sent history from my email provider

Couple of things to note:
[1] Deleting data in PostgreSQL does not necessarily remove the data or recover space. Make sure you run a VACUUM after deletion to recover the space.
[2] With the email body table emlbody PostgreSQL can store large binary data (such as PDF attachments and images) in a section called TOAST (The Oversized-Attribute Storage Technique). I have found removing data from the emlbody table does not necessarily clean out that data.

I use the following SQL to check table sizes (including TOAST) hopefully this can help you (and others) determine why their database is growing. Obviously this example is just looking at the emlbody table - adjust to suit your requirements.

SELECT *, pg_size_pretty(total_bytes) AS total
    , pg_size_pretty(index_bytes) AS INDEX
    , pg_size_pretty(toast_bytes) AS toast
    , pg_size_pretty(table_bytes) AS TABLE
  FROM (
  SELECT *, total_bytes-index_bytes-COALESCE(toast_bytes,0) AS table_bytes FROM (
      SELECT c.oid,nspname AS table_schema, relname AS TABLE_NAME
              , c.reltuples AS row_estimate
              , pg_total_relation_size(c.oid) AS total_bytes
              , pg_indexes_size(c.oid) AS index_bytes
              , pg_total_relation_size(reltoastrelid) AS toast_bytes
          FROM pg_class c
          LEFT JOIN pg_namespace n ON n.oid = c.relnamespace
          WHERE relkind = 'r'
  ) a
) a
WHERE table_schema = 'xtbatch'
AND table_name = 'emlbody'
order by total_bytes desc;
1 Like

One thing I mentioned casually but is worth pointing out explicitly: you can change the DPI of the printed PDF on every EDI Profile. By default it is 1200 which is an insanely high resolution for a PDF document. You can usually drop that down a lot and still get good results, and it would decrease the file size significantly.

I followed that advice as soon as I removed the old data. I have seen 300 DPI in a lot of places and find that works just fine

Such a simple adjustment. I’m guilty, I leave these at 1200 out of habit. I think this might even be the default in the old Quick Start databases? I’m making those changes at several clients ASAP.

I’m curious if 300 dpi will support the client’s logos, but it’s worth experimenting.

Is this DPI setting new? Honestly, in 8 years(?) of running xTuple, I don’t remember ever seeing it before.

The Print Resolution setting was added in 2016, xTuple Connect v3.8.0.