Backup / Restore Data to / from Azure Cosmos DB with the MongoDB API

Azure Cosmos DB (formerly known as Azure DocumentDB) is a PaaS offering from Microsoft Azure. As a document store, it falls into the same category as MongoDB, CouchDB, RethinkDB and other NoSQL databases and, just like those, it handles documents in JSON format.

Azure Cosmos DB automatically takes backups of all your data at regular intervals. These automated backups are currently taken approximately every four hours, and the latest two backups are stored at all times. If the data is accidentally dropped or corrupted, you can contact Azure support within eight hours.

But what happens if you find out only after eight hours that your data is lost, if it is corrupted in your development / staging environments, or if something accidentally went wrong with production while everyone was on holiday? Or perhaps internet connectivity is down but your developers still need to work on a critical feature. In this blog post, we’ll learn how to back up / restore data from Azure Cosmos DB using the MongoDB API.

Microsoft does provide a Data Migration tool, which can also be scripted, but it has two major limitations: it is not intuitive to use, and it does not currently support the Azure Cosmos DB MongoDB API either as a source or as a target.

Pre-requisites

Install Mongo DB

We’ll be using the native MongoDB executables to back up / restore data from Cosmos DB. To get them, we first need to install MongoDB on a physical or virtual machine, located either on-premises or in the cloud. MongoDB is available for download from the MongoDB download center.

Grab Azure Cosmos DB connection details

For this, go to Azure Portal -> Search for Cosmos DB -> Select your Cosmos DB account -> Connection String. Note down the host, port, username and password shown in the portal:

 

Connection details for Azure Cosmos DB
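
If you would rather copy the whole connection string than note down each value by hand, the four values can be pulled out with a small helper. This is only a sketch: the function name ConvertFrom-CosmosMongoConnectionString is made up for this post, and it assumes the string looks like mongodb://<username>:<password>@<host>:<port>/?<options>, so please compare that shape with what your portal actually shows. The password is returned exactly as it appears in the string, without any decoding.

```powershell
# Split a MongoDB-style connection string into the values mongodump needs.
# Assumed shape (verify against your portal):
#   mongodb://<username>:<password>@<host>:<port>/?<options>
function ConvertFrom-CosmosMongoConnectionString {
    param(
        [Parameter(Mandatory = $true)]
        [string]$ConnectionString
    )

    # The password group is greedy so that it may itself contain ':', '/' or '@';
    # the host is whatever follows the last '@' up to the port.
    $pattern = '^mongodb://(?<user>[^:]+):(?<password>.+)@(?<host>[^:/@]+):(?<port>\d+)'
    if ($ConnectionString -notmatch $pattern) {
        throw 'Connection string does not look like mongodb://user:password@host:port/'
    }

    [pscustomobject]@{
        Server   = '{0}:{1}' -f $Matches['host'], $Matches['port']  # host:port, as passed to --host
        User     = $Matches['user']
        Password = $Matches['password']                              # returned as-is, not decoded
    }
}
```

The Server, User and Password properties of the returned object map directly onto the --host, -u and -p parameters used in the rest of this post.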

Backup / Restore Azure Cosmos DB at command line window

Go to the location where you have installed MongoDB and navigate to the bin folder. By default, this should be C:\Program Files\MongoDB\Server\3.6\bin\ on the machine where MongoDB is installed.

Now, we can back up the content simply by using the command below:

mongodump.exe --host <hostname:port> -u <username> -p <password> --ssl --sslAllowInvalidCertificates --out <directory to store backups>

Since Cosmos DB has strict SSL requirements, we need to pass the --ssl parameter. Also, mongodump is generally a better option than mongoexport for backups, because mongodump writes data in BSON format and thus reliably preserves rich data types.

Below is one of the sample runs:

Using mongodump to export data from Cosmos DB

Similarly, we can restore data using the command below:

mongorestore.exe --host <hostname:port> -u <username> -p <password> --ssl --sslAllowInvalidCertificates --dir <directory which has backups>

We can also choose to restore at the account, database or collection level by slightly modifying the above command. We can also create a batch script and schedule it using Windows Task Scheduler, or integrate it with build and release pipelines, to run at designated intervals.
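
To make the three restore levels concrete, here is a rough sketch of how the mongorestore arguments change for each one. The function name Get-MongoRestoreArguments and its parameters are invented for illustration. It assumes the backup was written by mongodump, which places each database in its own sub-folder with one .bson file per collection, and it uses the --db and --collection options of mongorestore as shipped with the 3.6 tools.

```powershell
# Build the mongorestore argument list for an account, database or collection restore.
# Assumes the mongodump layout: <BackupDir>\<database>\<collection>.bson
function Get-MongoRestoreArguments {
    param(
        [Parameter(Mandatory = $true)][string]$Server,     # host:port
        [Parameter(Mandatory = $true)][string]$User,
        [Parameter(Mandatory = $true)][string]$Password,
        [Parameter(Mandatory = $true)][string]$BackupDir,
        [string]$Database,
        [string]$Collection
    )

    $arguments = @('--host', $Server, '-u', $User, '-p', $Password,
                   '--ssl', '--sslAllowInvalidCertificates')

    if ($Collection) {
        if (-not $Database) { throw 'A collection restore also needs -Database.' }
        # Collection level: point mongorestore at the single .bson file.
        $bsonFile = [System.IO.Path]::Combine($BackupDir, $Database, "$Collection.bson")
        $arguments += @('--db', $Database, '--collection', $Collection, $bsonFile)
    }
    elseif ($Database) {
        # Database level: point at the sub-folder mongodump created for that database.
        $arguments += @('--db', $Database, [System.IO.Path]::Combine($BackupDir, $Database))
    }
    else {
        # Account level: restore everything found under the backup directory.
        $arguments += @('--dir', $BackupDir)
    }

    # The leading comma stops PowerShell from unrolling the array on return.
    return ,$arguments
}
```

The result can be splatted onto the executable, for example $restoreArgs = Get-MongoRestoreArguments -Server $cosmosServer -User $cosmosUser -Password $cosmosPassword -BackupDir $backupDir -Database "mydb" followed by & "C:\Program Files\MongoDB\Server\3.6\bin\mongorestore.exe" @restoreArgs, where "mydb" is a placeholder name.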

Backup / Restore Azure Cosmos DB at PowerShell window

The MongoDB executables do not integrate cleanly with PowerShell. You can still download / upload data to Azure Cosmos DB without issues, but because the tools write their progress messages to the standard error stream, PowerShell treats all of their output as errors.

To begin, we define the connection details for Cosmos DB and store them in variables:

$cosmosServer = "mycosmos.documents.azure.com:10255" # Cosmos DB hostname and port
$cosmosUser = "mycosmos-username" # Username
$cosmosPassword = "mycosmos-password" # Password

Let’s also create a directory to store backups and make sure it exists:

$backupDir = "C:\cosmos-backup\$(Get-Date -Format yyyyMMddHHmmss)" # HH = 24-hour clock, so folder names stay unique and sortable
if (!(Test-Path $backupDir)) {
    Write-Host "Creating temporary backup directory as $backupDir"
    New-Item -Path $backupDir -ItemType Directory | Out-Null
}
Set-Location $backupDir

We can then call mongodump.exe and pass the required parameters using the commands below:

$mongoExe = "C:\Program Files\MongoDB\Server\3.6\bin\mongodump.exe"
$arguments = "--host $cosmosServer -u $cosmosUser -p $cosmosPassword --ssl --sslAllowInvalidCertificates --out $backupDir -vvvvv"
# Both streams are redirected to files in the current directory ($backupDir)
Start-Process $mongoExe $arguments -Wait -RedirectStandardError "standarderror.txt" -RedirectStandardOutput "standardoutput.txt"

Note that we have captured the output of Start-Process so that we are informed about any errors that occurred during the backup process. As mentioned above, mongodump writes its progress messages to the standard error stream, so even normal output ends up in standarderror.txt. You’ll need to read it carefully to tell real errors apart from progress messages.
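
One way to make that reading easier is to filter the captured file for lines that look like real failures. The sketch below is only a starting point: the function name Select-MongoToolFailure is made up, and the pattern it searches for ("Failed:" or the word "error") is an assumption about how the tools word their failures, so adjust it after looking at your own standarderror.txt.

```powershell
# Return only the log lines that look like failures rather than progress messages.
# Assumption: failures mention "Failed:" or the word "error" (matching is case-insensitive).
function Select-MongoToolFailure {
    param(
        [string[]]$LogLine = @()
    )

    $LogLine | Where-Object { $_ -match 'Failed:|\berror\b' }
}
```

It could be used as $problems = @(Select-MongoToolFailure -LogLine (Get-Content "standarderror.txt")), after which $problems.Count tells you whether anything needs a closer look. Adding -PassThru to Start-Process and checking the ExitCode of the returned process object is likely a more dependable signal than text matching, and the two can be combined.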

Below is the overall script that can be used for the backup process:
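
Putting the snippets above together, a complete backup run might look roughly like this sketch. The function names Get-MongoDumpArgumentString and Invoke-CosmosBackup are invented here, the default paths are the ones used earlier in this post, and every connection value is a placeholder. Unlike the snippets above, it does not change the current location and it stops with an error when mongodump returns a non-zero exit code; treat both as design choices you may want to change.

```powershell
# Build the mongodump argument string; the output path is quoted in case it contains spaces.
function Get-MongoDumpArgumentString {
    param(
        [Parameter(Mandatory = $true)][string]$Server,     # host:port
        [Parameter(Mandatory = $true)][string]$User,
        [Parameter(Mandatory = $true)][string]$Password,
        [Parameter(Mandatory = $true)][string]$BackupDir
    )

    "--host $Server -u $User -p $Password --ssl --sslAllowInvalidCertificates --out `"$BackupDir`" -vvvvv"
}

# Create a timestamped folder, run mongodump into it and keep both output streams.
function Invoke-CosmosBackup {
    param(
        [Parameter(Mandatory = $true)][string]$CosmosServer,
        [Parameter(Mandatory = $true)][string]$CosmosUser,
        [Parameter(Mandatory = $true)][string]$CosmosPassword,
        [string]$BackupRoot = 'C:\cosmos-backup',
        [string]$MongoDumpExe = 'C:\Program Files\MongoDB\Server\3.6\bin\mongodump.exe'
    )

    # One folder per run (HH = 24-hour clock).
    $backupDir = Join-Path $BackupRoot (Get-Date -Format 'yyyyMMddHHmmss')
    if (-not (Test-Path $backupDir)) {
        Write-Host "Creating backup directory $backupDir"
        New-Item -Path $backupDir -ItemType Directory | Out-Null
    }

    # mongodump logs to the error stream, so both streams are stored next to the dump.
    $startArgs = @{
        FilePath               = $MongoDumpExe
        ArgumentList           = (Get-MongoDumpArgumentString -Server $CosmosServer -User $CosmosUser -Password $CosmosPassword -BackupDir $backupDir)
        Wait                   = $true
        PassThru               = $true
        NoNewWindow            = $true
        RedirectStandardError  = (Join-Path $backupDir 'standarderror.txt')
        RedirectStandardOutput = (Join-Path $backupDir 'standardoutput.txt')
    }
    $process = Start-Process @startArgs

    if ($process.ExitCode -ne 0) {
        throw "mongodump exited with code $($process.ExitCode); check standarderror.txt in $backupDir"
    }

    return $backupDir
}
```

A run would then be a single call such as Invoke-CosmosBackup -CosmosServer "mycosmos.documents.azure.com:10255" -CosmosUser "mycosmos-username" -CosmosPassword "mycosmos-password", which returns the folder the dump was written to.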

You may tune it to suit your needs. Also, as mentioned above, we can use mongorestore.exe to restore data back to Azure Cosmos DB and script it along the same lines.
