Splitting or distributing very large data on PostgreSQL 9.6

Question :

I have a dedicated server with 10x1TB disks. I can combine them with RAID into a single volume, but that only gives me 10 TB. What happens when the PostgreSQL data exceeds 10 TB? It will exceed that soon, because I run special software that needs terabytes of database storage and reads/writes constantly.

Is there any method or feature in PostgreSQL or pgpool-II to add more servers or HDDs when needed? What should I do to prepare for this scenario?

Also, I have a GlusterFS infrastructure. Should I deploy my database on that? Or is there a filesystem designed specifically for this?

Note: I know replication exists, but my situation is different: I want to split a very large database across multiple disks or servers, or distribute it on a filesystem like GlusterFS, so that when a server fills up I can add another and writes simply continue.

Answer :

As @a_vlad stated, you should have a plan. For future growth, I would recommend two things:

  1. Tablespace
  2. Citus Data (distribute data across workers)
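Tablespaces let you place tables and indexes on different mount points, so a new disk can be added without a dump/restore. A minimal sketch, assuming a freshly mounted disk at `/mnt/disk2` owned by the `postgres` OS user (the tablespace, table, and database names here are hypothetical):

```sql
-- Create a tablespace on the newly mounted disk
-- (the directory must exist, be empty, and be owned by the postgres user):
CREATE TABLESPACE extra_space LOCATION '/mnt/disk2/pgdata';

-- Move an existing large table onto it (rewrites the table, takes a lock):
ALTER TABLE big_events SET TABLESPACE extra_space;

-- Or make it the default location for newly created objects in a database:
ALTER DATABASE mydb SET default_tablespace = 'extra_space';
```

Note that tablespaces only spread data across disks attached to one server; they do not let you add more servers. For that, you need something like Citus.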

Citus is a distributed database that combines the low-latency requests that power real-time applications with high throughput, interactive analytics on billions of events.

Citus does this by extending PostgreSQL to distribute your workload across multiple hosts, leveraging the memory, storage, and processing power of multiple machines.
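As a sketch of what that looks like in practice: once the Citus extension is installed on a coordinator and its worker nodes, you register the workers and shard tables by a distribution column. The hostnames, table, and column below are hypothetical, and the node-management function was named `master_add_node` in Citus versions contemporary with PostgreSQL 9.6 (newer releases call it `citus_add_node`):

```sql
-- On the coordinator, after installing the extension on all nodes:
CREATE EXTENSION citus;

-- Register the worker nodes (hypothetical hostnames):
SELECT master_add_node('worker-1', 5432);
SELECT master_add_node('worker-2', 5432);

-- Create a table and shard it across the workers by device_id;
-- queries filtering on device_id are routed to the right shard:
CREATE TABLE events (device_id bigint, ts timestamptz, payload jsonb);
SELECT create_distributed_table('events', 'device_id');
```

Adding capacity later means adding another worker node and rebalancing shards onto it, which matches the "when a server is full, I add more" workflow from the question.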

