Azure Cosmos DB (formerly known as Azure DocumentDB) is a PaaS offering from Microsoft Azure. As a document store, it falls into the same category as MongoDB, CouchDB, RethinkDB and other NoSQL databases and, just like those, it handles documents in JSON format.
Azure Cosmos DB automatically takes backups of all your data at regular intervals. These automated backups are currently taken approximately every four hours, and the latest two backups are stored at all times. If data is accidentally dropped or corrupted, you can contact Azure support within eight hours.
But what happens if you only notice after eight hours that your data is lost, or if it gets corrupted in your development / staging environments, or if something accidentally goes wrong in production while everyone is on holiday? Or what if internet connectivity is broken but your developers still need to work on a critical feature? In this blog post, we’ll learn how to back up and restore data from Azure Cosmos DB using the MongoDB API.
Microsoft does provide a Data Migration tool, which can also be scripted, but it has two major limitations: it is not very intuitive to use, and it does not currently support the Azure Cosmos DB MongoDB API as either a source or a target.
Install Mongo DB
We’ll be using the native MongoDB executables to back up and restore data from Cosmos DB. To get them, we first need to install MongoDB on a physical or virtual machine, located either on-premises or in the cloud. MongoDB is available for download from the MongoDB download center.
Grab Azure Cosmos DB connection details
For this, go to the Azure Portal -> search for Cosmos DB -> select your Cosmos DB account -> Connection String. Note down the host, port, username and password shown in the portal.
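If you copy the full connection string rather than the individual fields, the four values can be pulled out of it programmatically. The following is a minimal Python sketch, not something from the portal or from Microsoft: the function name `parse_cosmos_mongo_uri` and the class `CosmosMongoEndpoint` are made up for illustration, and it assumes the string has the usual `mongodb://user:password@host:port/?options` shape with any reserved characters in the password percent-encoded.

```python
from dataclasses import dataclass
from urllib.parse import unquote, urlsplit


@dataclass
class CosmosMongoEndpoint:
    """Hypothetical container for the values mongodump / mongorestore need."""
    host: str
    port: int
    username: str
    password: str


def parse_cosmos_mongo_uri(uri: str) -> CosmosMongoEndpoint:
    """Split a mongodb:// connection string into host, port, username, password."""
    parts = urlsplit(uri)
    if parts.scheme != "mongodb":
        raise ValueError("expected a mongodb:// connection string")
    if not (parts.hostname and parts.port and parts.username and parts.password):
        raise ValueError("connection string is missing host, port, username or password")
    return CosmosMongoEndpoint(
        host=parts.hostname,
        port=parts.port,
        username=unquote(parts.username),
        # Assumption: reserved characters in the key are percent-encoded.
        password=unquote(parts.password),
    )
```

The `host` and `port` values joined with a colon give the `<hostname:port>` argument used by the commands later in this post.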
Backup / Restore Azure Cosmos DB at command line window
Go to the location where you have installed MongoDB and navigate to the bin folder. By default this should be C:\Program Files\MongoDB\Server\3.6\bin\ on the machine where MongoDB is installed.
Now, we can back up the content simply by using the command below:
mongodump.exe --host <hostname:port> -u <username> -p <password> --ssl --sslAllowInvalidCertificates --out <directory to store backups>
Since Cosmos DB has strict SSL requirements, we need to pass the --ssl parameter. Also, mongodump is generally a better option than mongoexport for backups, because mongodump writes data in BSON format (which mongorestore reads back) and thus reliably preserves rich data types.
Below is one of the sample runs:
Similarly, we can restore data using the command below:
mongorestore.exe --host <hostname:port> -u <username> -p <password> --ssl --sslAllowInvalidCertificates --dir <directory which has the backups>
We can also choose to restore at the account, database or collection level by slightly modifying the above command. We can also create a batch script and schedule it using Windows Task Scheduler, or integrate it with build and release pipelines to run at designated intervals.
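To make the three restore scopes concrete, here is a small Python sketch that builds the mongorestore argument list for each one. It is an illustration under assumptions, not the only way to do it: the helper name `mongorestore_args` is hypothetical, it relies on mongodump's output layout of `<backup dir>/<database>/<collection>.bson`, and it uses the `--db` and `--collection` options documented for the 3.6 tools (newer versions also offer `--nsInclude`).

```python
from pathlib import Path
from typing import List, Optional


def mongorestore_args(host: str, username: str, password: str, backup_dir: str,
                      db: Optional[str] = None,
                      collection: Optional[str] = None) -> List[str]:
    """Build a mongorestore command line for a full, database or collection restore."""
    if collection and not db:
        raise ValueError("a collection restore also needs the database name")
    args = ["mongorestore", "--host", host, "-u", username, "-p", password,
            "--ssl", "--sslAllowInvalidCertificates"]
    root = Path(backup_dir)
    if db is None:
        # Everything that mongodump wrote under the backup directory.
        return args + ["--dir", str(root)]
    if collection is None:
        # One database: point at its sub-directory.
        return args + ["--db", db, "--dir", str(root / db)]
    # One collection: point at its .bson file.
    return args + ["--db", db, "--collection", collection,
                   str(root / db / f"{collection}.bson")]
```

The returned list can be passed to `subprocess.run`, or joined into a single line for a batch file.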
Backup / Restore Azure Cosmos DB at PowerShell window
The MongoDB executables are native commands rather than PowerShell cmdlets. You can still download / upload data to Azure Cosmos DB without issues, but the tools write their log output to the standard error stream, so PowerShell treats all of it as errors.
First, we need to define the connection details for Cosmos DB and store them in variables:
$cosmosServer = "mycosmos.documents.azure.com:10255" #Define the cosmos db hostname and port
$cosmosUser = "mycosmos-username" #Define username
$cosmosPassword = "mycosmos-password" #Define password
Let’s also create a directory to store backups and make sure it exists:
$backupDir = "C:\cosmos-backup\$(Get-Date -Format yyyyMMddHHmmss)"
Write-Host "Creating temporary backup directory as $backupDir"
New-Item -Path $backupDir -ItemType Directory | Out-Null
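One detail worth checking in any timestamped backup path is the hour field. In the format strings accepted by Get-Date -Format, hh is the 12-hour hour and HH the 24-hour one, so with a 12-hour field a morning and an evening backup can map to the same directory name. The same effect can be shown in Python, where the equivalents are %I and %H; the dates below are invented for the example.

```python
from datetime import datetime, timedelta

# Two backup runs exactly twelve hours apart (made-up example times).
morning = datetime(2024, 1, 15, 9, 30, 0)
evening = morning + timedelta(hours=12)

# 12-hour hour field: both runs produce the same name.
twelve_hour = "%Y%m%d%I%M%S"
print(morning.strftime(twelve_hour))  # 20240115093000
print(evening.strftime(twelve_hour))  # 20240115093000

# 24-hour hour field: the names stay distinct.
twenty_four_hour = "%Y%m%d%H%M%S"
print(morning.strftime(twenty_four_hour))  # 20240115093000
print(evening.strftime(twenty_four_hour))  # 20240115213000
```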
We can then call mongodump.exe and pass the required parameters using the commands below:
$mongoExe = "C:\Program Files\MongoDB\Server\3.6\bin\mongodump.exe"
$arguments = "--host $cosmosServer -u $cosmosUser -p $cosmosPassword --ssl --sslAllowInvalidCertificates --out $backupDir -vvvvv"
Start-Process $mongoExe $arguments -Wait -RedirectStandardError "standarderror.txt" -RedirectStandardOutput "standardoutput.txt"
Note that we have captured the output of Start-Process so that we can be informed about any errors that occurred during the backup process. Again, as mentioned above, since mongodump is not a native PowerShell command, even its normal progress messages are reported as errors, so you’ll need to read standarderror.txt carefully to tell real failures from routine output.
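Because routine progress messages and real failures end up in the same stream, the process exit code is usually a more dependable signal than the mere presence of error text. The Python sketch below illustrates the idea under that assumption (that the tool exits with a non-zero code on failure); `run_tool` is a hypothetical helper, and a tiny Python child process stands in for mongodump so the example runs anywhere.

```python
import subprocess
import sys
from typing import List, Tuple


def run_tool(args: List[str]) -> Tuple[bool, str]:
    """Run a command; report success from the exit code and return its stderr log."""
    completed = subprocess.run(args, capture_output=True, text=True)
    return completed.returncode == 0, completed.stderr


# Stand-in for a successful dump: it logs to stderr but exits with code 0.
ok, log = run_tool([sys.executable, "-c",
                    "import sys; print('done dumping', file=sys.stderr)"])
print(ok)    # True, even though stderr is not empty
print(log)

# Stand-in for a failed dump: non-zero exit code.
failed_ok, _ = run_tool([sys.executable, "-c", "import sys; sys.exit(3)"])
print(failed_ok)  # False
```

In a real script the stand-in command would be replaced by the mongodump argument list, with the captured log written to a file for later inspection.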
Below is the overall script which can be used for the backup process:
$cosmosServer = "mycosmos.documents.azure.com:10255" #Define the cosmos db hostname and port
$cosmosUser = "mycosmos-username" #Define username
$cosmosPassword = "mycosmos-password" #Define password
$backupDir = "C:\cosmos-backup\$(Get-Date -Format yyyyMMddHHmmss)"
Write-Host "Creating temporary backup directory as $backupDir"
New-Item -Path $backupDir -ItemType Directory | Out-Null
$mongoExe = "C:\Program Files\MongoDB\Server\3.6\bin\mongodump.exe"
$arguments = "--host $cosmosServer -u $cosmosUser -p $cosmosPassword --ssl --sslAllowInvalidCertificates --out $backupDir -vvvvv"
Start-Process $mongoExe $arguments -Wait -RedirectStandardError "standarderror.txt" -RedirectStandardOutput "standardoutput.txt"
Write-Host "Displaying standard output:"
Get-Content "standardoutput.txt" #Show what was written to the redirected output file
Write-Host "Displaying if any errors:"
Get-Content "standarderror.txt" #Show what was written to the redirected error file
You may tune it to suit your needs. Also, as mentioned above, we can use mongorestore.exe to restore data back to Azure Cosmos DB and script it along the same lines.