Restoring a Postgres database to AWS RDS using Docker

In this post I look at using Docker to restore a Postgres dump file to a Postgres database running in the cloud on AWS RDS.

Keep it clean

One of the big selling points of Docker, for me, is that I can have lots of apps and utils running in nice containers on my dev laptop, without having to install them locally.  This keeps my laptop responsive, and means I don’t clutter or break it with lots of weird dependencies and running processes that I’m then too scared to delete.

Postgres is a good example – I don’t want to install it locally, but I do need access to the command line tools like psql and pg_restore, to be able to work with my databases effectively.

One way of accessing these tools would be to ssh onto the AWS cloud instances, but there are a bunch of reasons, most pertinently security (not to mention the faff), why you’d want to avoid that every time you want to run some SQL.  So let’s look at how we can use Docker to ease the pain instead.

Start Me Up

With Docker installed, you can build this simple Dockerfile to create a local Postgres container.  The user and password env vars aren’t strictly required, but if you want to actually connect to the containerised DB, they’re pretty handy.
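A minimal Dockerfile along these lines does the job – the Postgres version tag and the credentials are just placeholder assumptions, so use whatever suits your setup:

FROM postgres:9.6

# Not strictly required, but handy for connecting to the containerised DB
ENV POSTGRES_USER pguser
ENV POSTGRES_PASSWORD pgpassword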

You can build, run and connect to the container as follows (this assumes you’re on a Mac).
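Roughly, the sequence looks something like this – the image name, directory names and port mapping are just example values:

# 1. create a local directory to hold the dump file
mkdir data-load

# 2. build the image from the Dockerfile above
docker build -t local-postgres .

# 3. run the container, mapping data-load to /data-loader inside it
docker run -d -p 5432:5432 -v "$(pwd)/data-load":/data-loader local-postgres

# 4. find the running container's ID, then open a shell inside it
docker ps
docker exec -it <containerId> bash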

Note the -v flag, which maps the data-load directory created in the first step to a new directory called /data-loader inside the container.  This means that when I copy the Postgres dump file into my local data-load directory, it will be available to the Postgres tools inside the container.

The docker exec command allows me to connect to the container – swap <containerId> for your locally running container’s ID, which docker ps will show you.

Restoring your database with pg_restore

I’ll assume you already have a Postgres database set up within the AWS cloud.  Now that we’ve connected to our container, we can use pg_restore to restore our dumpfile into AWS (note this command will prompt you for the admin password).
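The command itself looks roughly like this – the RDS endpoint, master username, database name and dump file name are all placeholders for your own details:

pg_restore --verbose --no-owner \
  --host=mydatabase.xxxxxxxxxxxx.eu-west-1.rds.amazonaws.com \
  --port=5432 \
  --username=masteruser \
  --dbname=mydatabase \
  /data-loader/mydatabase.dump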

A note on schemas

If you’re doing a partial restore, you may want to restore your dumpfile to a separate schema.  Unfortunately there appears to be no way to do this from the command line.  What you have to do is rename the public schema, create a new public schema and restore into that, then reverse the process.

This StackOverflow answer outlines the process.
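In outline, the SQL (run with psql against the RDS instance) looks something like this – the schema names here are just examples:

-- move the existing public schema out of the way
ALTER SCHEMA public RENAME TO public_original;

-- create a fresh public schema and run pg_restore into it
CREATE SCHEMA public;

-- once the restore has finished, rename the newly populated schema...
ALTER SCHEMA public RENAME TO restored_schema;

-- ...and put the original public schema back
ALTER SCHEMA public_original RENAME TO public;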

Restore Complete

You should now have a complete restore of your dumpfile in the cloud.  Please add comments if anything is unclear.

Multi-Container Deployment with Docker Compose

Why Docker?

At thinkWhere we always aim to keep pace with the latest tech-industry trends, which is easier said than done in such a fast-paced sector! However, one unavoidable technology trend we’re now employing across our application deployment model is Docker. Docker is a software containerisation platform guaranteeing that software will always run the same, regardless of its environment. Docker offers many benefits over traditional application deployment, including:

Simplicity – Once an application is Dockerized, you have full control (start, stop, restart, etc) with only half a dozen commands. As these are generic Docker commands, it is easy for anyone unfamiliar with the specifics of an application to get started.

It’s already Dockerized – Docker Hub is the central marketplace for Docker images to be shared with other Docker users. Often you’ll find official Docker images for an application already exist, or you can build upon another user’s efforts. Docker images we have used include MapFish Print and Nginx. We have also containerised our own flavours of MapProxy and GeoServer.

Blueprint of application configuration – A Dockerfile provides the blueprint, or instructions, to build an application. This can be stored in source control and refined over time to improve the build. It also removes any ambiguity about build/configuration differences between various deployments.

Rapid to deploy – Having this blueprint for each application means that all we need is a server or virtual machine with Docker engine installed. This has drastically reduced the time spent deploying and configuring our applications.

Plays nicely with continuous integration – Amazon Web Services offers services dedicated to deploying and managing containerised applications. We recently constructed a Shippable Pipeline which builds and pushes new images to the Docker Hub repository as changes are merged into a code base. These new images are pulled down by the Amazon Elastic Container Service and deployed seamlessly. The ‘throw away’ nature of Docker lends itself to scalability, so services such as this can be scaled up or down with just a few clicks of the mouse.

Why Compose?

These days it’s rare to deploy applications which exist in a completely standalone context, without the need to communicate with at least one other application or service. If all these applications happen to be Dockerized, then Docker Compose is a great tool to create a multi-container application. In a nutshell, Compose lets us start/stop/build a cluster of related Docker containers.

The Compose File

In addition to the existing Dockerfile for each image, a single docker-compose.yaml file must be created for a Compose project. It’s here we define how the various containers will behave and communicate with each other.

This example compose file is from a Flask-restful application I wrote to serve GB Postcode GeoJSON from MongoDB. You can see it working and try your own postcode here.

This Compose configuration comprises three containers – a Python web application which talks to a Mongo database, all sitting behind an Nginx web server.

web:
  restart: always
  build: ./web
  expose:
    - "8000"
  volumes:
    - /usr/src/app/static
  env_file: .env
  links:
    - db
  command: /usr/local/bin/gunicorn -w 2 -b :8000 app:app

nginx:
  restart: always
  build: ./nginx/
  ports:
    - "80:80"
  volumes:
    - /www/static
  volumes_from:
    - web
  links:
    - web:web

db:
  restart: always
  build: ./mongo/
  ports:
    - "27017:27017"
  volumes:
    - /home/matt/postcode_search/mongo:/data/db

This Compose file is largely self-explanatory – as you can see, the three container configurations (web, db, nginx) are defined separately.

Some of the entries in the example file above will be familiar to Docker users, such as mapping a volume or forwarding ports to the host. The Compose file is best used for setting some of these parameters which may have previously been configured when starting a single container with the ‘docker run’ command. This is because a single command is used to start all the containers within a Compose cluster, rather than starting them individually.
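For example, the db service above is roughly equivalent to building and starting that one container by hand with something like this, where the mongo-db image tag is just an assumed name:

# build the image from the ./mongo Dockerfile, then run it with the same
# restart policy, port mapping and volume as the Compose db service
docker build -t mongo-db ./mongo
docker run -d --restart=always -p 27017:27017 -v /home/matt/postcode_search/mongo:/data/db mongo-db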

In order to allow communication between containers, a ‘links’ entry is used. You can see that the ‘web’ container will be linked to the ‘db’ container. The ‘links’ entry is used in the same way as the ‘--link’ option for ‘docker run’. When the Compose cluster is started, the links entries are used to determine the startup order of the containers.

Another unfamiliar entry may be ‘volumes_from’. As you may have guessed this simply mounts all the volumes from another container. In this case the Nginx container needs visibility of the static files from the Python application.

So to bring up the application we simply use the ‘docker-compose up’ command. With this single command we can build (if required) and start our three containers. Easy!
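For reference, a typical session with this Compose file looks something like the following (the -d flag simply runs everything detached in the background):

# start all three containers, building images first if required
docker-compose up -d

# check what's running and follow the web container's logs
docker-compose ps
docker-compose logs web

# stop and remove the containers when finished
docker-compose down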
