In the last month, we've been encountering issues with our HA configuration when it comes to replication between the primary and secondary servers. We are currently running MongoDB v4.4.8 across our environment.

Site 2: 1 Secondary server, voting, electable

The specs on the primary and secondary are CPU: 28 cores, 56 threads, 256GB RAM, and 10Gbit network interfaces. The oplog is set to 450GB.

rs0:SECONDARY> rs.printReplicationInfo()
Log length start to end: 153741secs (42.71hrs)
Oplog first event time: Sat 12:05:09 GMT-0600 (CST)
Oplog last event time: Mon 06:47:30 GMT-0600 (CST)

Our secondary has fallen completely out of sync twice in the last month, which is very concerning considering that the oplog gives us 40+ hours of runway. It became even more of an issue when it came to creating indexes, as we had to remove our secondary's voting and priority rights to get our primary back to the expected performance for our application.

From research, this typically comes down to network speed/bandwidth, latency, hardware specs, or configuration issues. The network and hardware specs are not a concern between the sites, but how can I narrow down the latency and potential configuration issues? Long term, I think we need to scale out with sharding to handle our data intake, but we need to at least restore our desired HA configuration so we can work towards the desired end state.

We currently have a lot of pain points with a setup that was not designed for the current use case, so any feedback or suggestions would be greatly appreciated!

rs.conf():
"writeConcernMajorityJournalDefault" : true,
"heartbeatIntervalMillis" : NumberLong(2000),
"operationTime" : Timestamp(1636981009, 2780)

First of all, I would encourage you to upgrade as soon as possible to MongoDB 4.4.10 due to critical issues identified in 4.4.8.

It seems to me from your description that you have a lot of data inserted into MongoDB almost all the time. So much, in fact, that 450GB of oplog is not enough. Is this correct? How much data are you inserting?

A quick fix that I can think of off the top of my head is to use majority write concern for all your writes. This will allow natural throttling for the inserts so that they won't overwhelm your replica set. In short, using majority write concern means that a write is acknowledged only after it has been replicated to the secondary, so there's less chance of it falling off the oplog.

One caveat with majority write concern is that there might be issues with arbiters. Should the secondary go offline for some reason, writes could hang indefinitely waiting for confirmation from the offline secondary (since arbiters cannot acknowledge writes), so some mitigation plan is probably needed for this case.

Thanks for pointing out the 4.4.8 vulnerabilities; we definitely need to get on 4.4.10 as soon as we can.

We do have a ton of data that gets inserted into MongoDB daily. In the last year, our database has grown from 12TB to 22TB. Filtering out what data we truly need is another discussion, as we need to do some cleansing on our data. We typically see about 30GB of data ingest daily, but our 450GB oplog is enough to give us 42 hours' worth of runway. The concern is that we keep falling so far behind that we find ourselves outside of that oplog range.

I've seen numerous bits of documentation around write concern, and I think that is the right direction. I just want to make sure that I'm not sacrificing the health of the primary in order to keep the secondary caught up. I do have a question, and bear with me if I ask something conceptually incorrect: if we were to change that configuration, would we see our database not accepting writes until our secondary caught up in the oplog?
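To make the majority write concern suggestion concrete, here is a minimal sketch of what a write with `w: "majority"` plus a `wtimeout` could look like from the shell. The `wtimeout` is one possible mitigation for the "hang indefinitely" case mentioned in the thread, not something the thread itself prescribes. The helper name `insertWithMajority`, the constant `WTIMEOUT_MS`, and the 5000 ms value are all hypothetical and chosen only for illustration.

```javascript
// Hypothetical helper; the name and the timeout value are assumptions, not from the thread.
const WTIMEOUT_MS = 5000; // assumed upper bound on how long a write waits for acknowledgment

function insertWithMajority(collection, doc) {
  // w: "majority" asks the server to acknowledge the write only after a majority
  // of data-bearing voting members have it (arbiters hold no data and cannot acknowledge).
  // wtimeout bounds that wait, so the call reports a write concern error
  // instead of blocking forever when the secondary is unreachable.
  return collection.insertOne(doc, {
    writeConcern: { w: "majority", wtimeout: WTIMEOUT_MS },
  });
}
```

In the shell this would be called as something like `insertWithMajority(db.events, { ts: new Date() })`, where `db.events` is a placeholder collection name. Note that a `wtimeout` error does not undo the write on the primary; it only means the majority acknowledgment did not arrive in time, so the application still has to decide whether to retry, alert, or slow down.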