/ postgresql

Copy data from one postgres instance to another. remote copy options explored: Copy, CSV and STDIN

Its very common use case to copy data from one database instance to another be it from local to staging, staging to production etc.

For copying data from one PostgreSQL instance to another, there are a couple of techniques.

You can take a data dump as an SQL file or as a CSV from your PostgreSQL instance and do a restore.

But what happens when you are dealing with Gigs of data?

It might not be a good idea to do a dump of 100s of Gigs to your local system and then copy to the remote host and restore.

Well, PostgreSQL also has a \copy statement which can be quite handy in this case. What if I say that you can even do a remote copy of data from one database to another over ssh. Interesting isn't it.

Let's explore various options for copying data to from local/remote servers.

I am not going to cover the pg_dump and pg_restore methods here as there are a lot of resources out there explaining those in detail.

Understanding COPY and \copy statements

COPY command is to input and output data between database and file only in the database server. If we connect to the database remotely (E.g. from another computer), we cannot use COPY command from the remote computer. In other words, input file or output file has to be in the database server. This is because SQL statements that are entered at the psql prompt are sent to the server before they are executed. This means that any file paths included in SQL statements are interpreted by the server. Since the server doesn't know what directory a user is in when they execute a statement, this means that all file paths have to be absolute.

On the other hand, \COPY, the meta command provided by PostgreSQL, is to input or output file in the client computer. If we connect to database server remotely by utilizing psql command, we can input or use a file on the client computer. This meta command initiates copies from the client (which is the psql process in this case), and this allows it to interpret paths that are relative to the user's current directory.

Copy data from a CSV file to local database.

The simplest solution to copy data from one database to another is to save to a local file and then restore it

\COPY users TO 'users.csv' CSV HEADER

The above command selects the data that you want and then writes it to a CSV file using the \copy command.

Now its time to restore the CSV file to the second database.

Loading the data is also pretty straightforward with the same \copy command.

\COPY users FROM 'users.csv' WITH CSV HEADER;

Copy data using STDIN to a local database.

While it's useful to save the data to a local CSV file, it's not always needed. You can even copy the data from one database or table using the STDOUT-> STDIN technique.

psql
    -h localhost \
    -d your_primary_db \
    -U postgres -c \
    "\copy users (id, email, first_name, last_name) from STDIN
    with delimiter as ','" \ < /tmp/users.csv

Note that the above statement also leverages SQL statements inside the \copy statement thereby you can choose what data you need to copy.

Copy data from a CSV file to remote database.

The following command copies data from a local CSV file to a remote PostgreSQL database

psql \
    -h remotehost \
    -d your_primary_db \
    -U postgres \
    -c "\copy users (id, email, first_name, last_name)  from '/tmp/users.csv' with delimiter as ','"

Copy data using STDIN to a remote database.

Now, lets use STDIN for reading the CSV file and copying data to a remote host.

psql
    -h remotehost \
    -d your_primary_db \
    -U postgres -c \
    "\copy users (id, email, first_name, last_name) from STDIN
    with delimiter as ','" \ < /tmp/users.csv

Copy data from one server to another server

psql \
    -U user_name \
    -h production_server \
    -d database_name \
    -c "\\copy users to stdout" | \
psql -U user_name \
    -h staging_server \
    -d database_name \
    -c "\\copy users from stdin"

The above command STDOUTs the data from the production server and copies the same data over STDIN on the staging server.

Other techniques

There are also a few other techniques that can be used to copy data from one server to another like setting up replication between databases or doing a full snapshot replication of the db.

Reference:

https://www.postgresql.org/docs/current/static/sql-copy.html

Hope that someone finds it useful. 🖖🏻

Manu S Ajith

Manu S Ajith

Tech Entrepreneur, dating Elixir, in long-term ❤️ w/ Ruby, had multiple one night stands w/ Go. Into functional paradigms DDD/CQRS/EventSourcing architecture these days. @manusajith on the interwebs

Read More