Numberly Moves from MongoDB to Scylla to Simplify Operations and Reduce Costs

Numberly is a leading programmatic advertising firm that helps brands collect and activate consumer data across digital channels. The company’s multi-platform, multi-device, multi-format solutions include real-time bidding, targeted display advertising, CRM programs, and an advanced data management platform. A big part of Numberly’s service offering is helping clients make sense of multiple sources of data to build and improve their relationships with customers and prospects.

Alexys Jacob-Monier, Numberly’s Chief Technology Officer, wrote a blog post about evaluating and choosing the right NoSQL data store, in which he explains, “Our line of business is tied to our capacity at mixing and correlating a massive amount of different types of events coming from various sources that all have their own identifiers.”

Challenges with MongoDB and Hadoop

Numberly had previously handled real-time ID matching using MongoDB and batch ID matching using Apache Hive. This required them to maintain two copies of every ID matching table. Neither MongoDB nor Hive could sustain the required mix of reads (lookups) and writes (updates) while staying within the low latencies that Numberly’s SLAs required, so the company was saddled with the operational burden of ensuring data consistency between the two data stores.

“ScyllaDB is the kind of Open Source project we love at Numberly: a core of talented and benevolent people powering a smart and lightning fast piece of software!” –Alexys Jacob-Monier, Chief Technology Officer, Numberly

They found that MongoDB’s primary/secondary architecture hindered performance because of a loss of write throughput. Alexys summarized his 7-year MongoDB experience with, “To say the least, it is inefficient and cumbersome to operate and maintain.”

Numberly looked at using Apache Cassandra but was apprehensive about adopting a Java-based solution because of heap sizing and tuning, and because garbage collection can temporarily stop the service.

Numberly was looking for a database that could sustain their ID matching tables’ workloads while maintaining consistently low upsert/write and lookup/read latencies. They realized that the right database would not only reduce costs but also lead to better data consistency and result in greater operational simplicity.

Scylla Lowers Cost and Complexity

Numberly started by running Scylla through rigorous load testing. Scylla passed with flying colors, leading Numberly to replace their 15-node MongoDB cluster with a 3-node Scylla cluster.


With its production Scylla Enterprise deployment, Numberly has realized a number of benefits, including:

  • Cost savings
  • Ease of operations, with the footprint reduced by 80%
  • Production reliability
  • Data consistency
  • Datacenter awareness
  • Performance boosts
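
The 80% footprint figure is consistent with the node counts quoted earlier, assuming it refers to the move from a 15-node MongoDB cluster to a 3-node Scylla cluster (the text does not state this explicitly). A minimal sketch of that arithmetic, where `footprint_reduction` is a hypothetical helper name used only for illustration:

```python
def footprint_reduction(nodes_before: int, nodes_after: int) -> float:
    """Return the fractional reduction in node count."""
    if nodes_before <= 0:
        raise ValueError("nodes_before must be positive")
    return (nodes_before - nodes_after) / nodes_before


# Node counts taken from the text: 15 MongoDB nodes replaced by 3 Scylla nodes.
# Assumption: "footprint" means node count, which the source does not confirm.
reduction = footprint_reduction(15, 3)
print(f"{reduction:.0%}")  # prints "80%"
```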

Next Steps

  • Learn more about Scylla from our product page.
  • See what our users are saying about Scylla.
  • Download Scylla. Check out our download page to run Scylla on AWS, install it locally in a Virtual Machine, or run it in Docker.
  • Take Scylla for a Test Drive. Our Test Drive lets you quickly spin up a running Scylla cluster so you can see for yourself how it performs.

Further Reading

http://www.ultrabug.fr/evaluating-scylladb-for-production-1-2/

http://www.ultrabug.fr/evaluating-scylladb-for-production-2-2/

https://www.slideshare.net/ScyllaDB/joining-billions-of-rows-in-seconds-replacing-mongodb-and-hive-with-scylla
