High Availability for MongoDB
MongoDB is a NoSQL database that is easy to set up and administer.
High availability is achieved with replica sets, which provide automatic failover. Automatic here means that no manual intervention is required when the primary goes down.
In this post we examine how replica sets are configured and how automatic failover works. It assumes that MongoDB is already installed.
To understand replica sets, it helps to understand the mongod process first. mongod is the process that listens for incoming requests on a given port and processes them. A single mongod process is a single point of failure. To add failover with replica sets, we run multiple mongod processes and designate them as primary, secondary, and arbiter:
- Primary -> the member that handles all write operations. It is the current master instance.
- Secondary -> a replica set member that replicates the data of the primary.
- Arbiter -> a member whose sole purpose is to vote in elections. It does not replicate data.
Create three mongod instances, one each for the primary, secondary, and arbiter.
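A minimal sketch of this step in Python, assuming the set is named rs0, the data directories live under /data, and the arbiter listens on port 50000 (none of these are given in this post; only ports 30000 and 40000 appear later). It only builds and runs mongod command lines with the standard --replSet, --port, --dbpath, --logpath and --fork options. Recent MongoDB versions bind to localhost by default, so reaching the instances from another host may also require --bind_ip.

```python
import subprocess

# Assumed values for illustration: set name, data directory, and arbiter port.
REPL_SET = "rs0"
INSTANCES = {
    "primary": 30000,
    "secondary": 40000,
    "arbiter": 50000,  # assumed port; not stated in the post
}


def mongod_command(role, port, repl_set=REPL_SET, base_dir="/data"):
    """Build the mongod command line for one replica set member."""
    return [
        "mongod",
        "--replSet", repl_set,
        "--port", str(port),
        "--dbpath", f"{base_dir}/{role}",
        "--logpath", f"{base_dir}/{role}/mongod.log",  # required by --fork
        "--fork",  # run in the background (not available on Windows)
    ]


def start_instances():
    """Start every instance; each data directory must already exist."""
    for role, port in INSTANCES.items():
        subprocess.run(mongod_command(role, port), check=True)
```

Calling start_instances() launches all three processes. At this stage they are identical mongod instances; their roles are only assigned when the replica set is configured.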
At this point, the three instances are running. You can verify this by checking that each one is listening on its port.
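One way to verify that the instances are up is to try a TCP connection to each port. The sketch below uses only the Python standard library; the helper names is_listening and check_instances are made up for this example. It confirms that something accepts connections on the port, not that the process behind it is a healthy mongod.

```python
import socket


def is_listening(host, port, timeout=1.0):
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or host unreachable.
        return False


def check_instances(host, ports):
    """Map each port to whether a process is listening on it."""
    return {port: is_listening(host, port) for port in ports}
```

For example, check_instances("192.200.15.72", [30000, 40000, 50000]) returns a dictionary with one True or False entry per port (50000 being the assumed arbiter port).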
Now, let's define which instance is the primary, which is the secondary, and which is the arbiter.
Connect to one instance.
At this point we are in the mongo shell. We now create a JavaScript (JSON-style) object that defines the primary, secondary, and arbiter. Giving one member a priority of 10, higher than the default of 1, makes it the preferred primary. Once the configuration is defined, rs.initiate is run with it.
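The same configuration can be sketched from Python instead of the mongo shell. The set name rs0 and the arbiter port are assumptions, and build_replica_set_config and initiate are hypothetical helper names. The document has the shape that rs.initiate accepts, and initiate sends it through the replSetInitiate admin command using pymongo.

```python
def build_replica_set_config(host, primary_port, secondary_port, arbiter_port,
                             name="rs0"):
    """Return a replica set configuration document for three members."""
    return {
        "_id": name,  # must match the --replSet value of every mongod
        "members": [
            # Highest priority: preferred candidate in elections.
            {"_id": 0, "host": f"{host}:{primary_port}", "priority": 10},
            # Default priority (1): replicates data, can become primary.
            {"_id": 1, "host": f"{host}:{secondary_port}"},
            # Votes in elections but holds no data.
            {"_id": 2, "host": f"{host}:{arbiter_port}", "arbiterOnly": True},
        ],
    }


def initiate(config, host, port):
    """Send the configuration to one member (requires pymongo)."""
    from pymongo import MongoClient  # imported lazily; pip install pymongo

    # directConnection lets us talk to a member of a not-yet-initiated set.
    client = MongoClient(host, port, directConnection=True)
    try:
        return client.admin.command("replSetInitiate", config)
    finally:
        client.close()
```

Priority does not assign the primary role directly. It makes that member the preferred winner of the election that follows initiation.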
At this point, the replica set is initiated. Now we will create a collection on the primary and verify that it is replicated to the secondary.
Now exit and connect to the secondary.
Check the data in the created collection. The query fails at first because reads on a secondary must be enabled explicitly (rs.slaveOk() in older mongo shells, rs.secondaryOk() in newer ones).
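From an application, the equivalent of enabling reads on a secondary is a read preference. The sketch below assumes the set name rs0 and a hypothetical test.demo collection. It writes through a client with the default read preference and reads back through a client with readPreference=secondary; replica_set_uri and write_then_read are illustrative names.

```python
def replica_set_uri(hosts, repl_set="rs0", read_preference="primary"):
    """Build a MongoDB connection URI for a replica set."""
    return "mongodb://{}/?replicaSet={}&readPreference={}".format(
        ",".join(hosts), repl_set, read_preference
    )


def write_then_read(hosts, repl_set="rs0"):
    """Insert on the primary, then read the document from a secondary."""
    from pymongo import MongoClient  # imported lazily; pip install pymongo

    writer = MongoClient(replica_set_uri(hosts, repl_set))
    reader = MongoClient(replica_set_uri(hosts, repl_set, "secondary"))
    try:
        # Hypothetical database and collection names.
        writer.test.demo.insert_one({"msg": "hello"})
        # Replication is asynchronous, so this can return None
        # if the secondary has not applied the write yet.
        return reader.test.demo.find_one({"msg": "hello"})
    finally:
        writer.close()
        reader.close()
```

A call such as write_then_read(["192.200.15.72:30000", "192.200.15.72:40000"]) exercises both members. If it returns None, retrying after a moment usually shows the replicated document.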
To test automatic failover, kill the primary from another window and watch the prompt of the open mongo shell change to PRIMARY. Replica set members exchange heartbeats with each other. When the primary stops responding, the remaining members hold an election, and with the arbiter's vote the secondary is elected primary.
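The role of the arbiter is easiest to see as arithmetic: a member can only become primary with votes from a majority of the voting members. The following is a simplified sketch of that rule (function names are illustrative), not MongoDB's actual election code.

```python
def votes_needed(voting_members):
    """Smallest number of votes that forms a strict majority."""
    return voting_members // 2 + 1


def can_elect_primary(voting_members, reachable_members):
    """True if the reachable members can still form a majority."""
    return reachable_members >= votes_needed(voting_members)
```

With three voting members, two votes are enough, so the secondary and the arbiter can elect a new primary after the old one dies. With only two members and no arbiter, the single survivor could not reach a majority and would stay secondary.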
If you can run a Python script, I suggest performing this test with Python. It will help you understand how an application reacts to automatic failover.
Install pymongo and run this program.
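A minimal sketch of such a program, assuming the replica set is named rs0 (the name is not given in this post) and using the two data-bearing members as the seed list. It prints the primary that pymongo currently sees once per second. watch_primary and format_address are illustrative helper names; MongoClient.primary is the pymongo property that holds the (host, port) of the current primary, or None when none is known.

```python
import time


def format_address(address):
    """Render a (host, port) tuple as 'host:port', or a placeholder for None."""
    if address is None:
        return "no primary"
    host, port = address
    return f"{host}:{port}"


def watch_primary(client, iterations, delay=1.0):
    """Print and collect the primary seen by `client`, once per `delay` seconds."""
    seen = []
    for _ in range(iterations):
        label = format_address(client.primary)
        print(label)
        seen.append(label)
        time.sleep(delay)
    return seen


def main():
    from pymongo import MongoClient  # imported lazily; pip install pymongo

    # Seed list of data-bearing members; "rs0" is an assumed set name.
    client = MongoClient(
        "mongodb://192.200.15.72:30000,192.200.15.72:40000/?replicaSet=rs0"
    )
    watch_primary(client, iterations=120)
```

Call main() to start watching. The first lines may read "no primary" while the client is still discovering the set, and the same placeholder can appear briefly during an election.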
While this program is running, kill the primary mongod process in another window and restart it after a few seconds.
The primary reported by the Python program switches between 192.200.15.72:30000 and 192.200.15.72:40000 as the original primary goes down and comes back.
No disruption is observed by the application.