Tuesday, January 29, 2013

Polyglot Persistence



In computing, a polyglot is a computer program or script written in a valid form of multiple programming languages, which performs the same operation regardless of the language used to compile or interpret it.

In the NoSQL world, polyglot persistence means that any decent-sized enterprise will use a variety of different data storage technologies for different kinds of data. Complex applications combine different types of problems, so picking the right language for each job may be more productive than trying to fit all aspects into a single language; the same reasoning applies to data stores. In the Big Data era, there has been an explosion of interest in new languages, particularly functional languages like Clojure, Scala, and Erlang. In a new strategic enterprise application, persistence should no longer be assumed to be relational.

A common example is configuring an Apache Solr server to stay in sync with a SQL-based database. You can then run scored keyword, substring, synonym, and stemmed queries against Solr while doing aggregations in SQL.
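
To make the search-plus-SQL split concrete, here is a minimal sketch of the write path and the two query paths. It does not talk to a real Solr server: TinySearchIndex is a hypothetical in-memory stand-in for the search side, SQLite stands in for the SQL database, and the products table and save_product function are invented purely for illustration.

```python
import sqlite3
from collections import defaultdict


class TinySearchIndex:
    """Hypothetical stand-in for a search server such as Solr (assumption)."""

    def __init__(self):
        # token -> ids of the documents that contain it
        self._postings = defaultdict(set)

    def index(self, doc_id, text):
        for token in text.lower().split():
            self._postings[token].add(doc_id)

    def search(self, query):
        # Score each document by how many query tokens it contains.
        scores = defaultdict(int)
        for token in query.lower().split():
            for doc_id in self._postings.get(token, ()):
                scores[doc_id] += 1
        # Best score first; ties broken by id so the order is stable.
        return sorted(scores, key=lambda d: (-scores[d], d))


# The relational store remains the system of record (illustrative schema).
db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE products ("
    "id INTEGER PRIMARY KEY, name TEXT, category TEXT, price REAL)"
)
search_index = TinySearchIndex()


def save_product(product_id, name, category, price):
    """Write to SQL first, then mirror the searchable text into the index."""
    db.execute(
        "INSERT INTO products VALUES (?, ?, ?, ?)",
        (product_id, name, category, price),
    )
    db.commit()
    search_index.index(product_id, name)


save_product(1, "red running shoes", "shoes", 60.0)
save_product(2, "blue running jacket", "clothing", 90.0)
save_product(3, "red leather shoes", "shoes", 120.0)

# Keyword query goes to the search side ...
hits = search_index.search("red shoes")

# ... while the aggregation goes to SQL.
avg_by_category = dict(
    db.execute("SELECT category, AVG(price) FROM products GROUP BY category")
)
```

In a real deployment the mirroring step would more likely be an import job or a change feed than an inline call, but the division of labour is the same: text relevance on one side, aggregation on the other.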

Another example is to use the same datastore, but store the same data in multiple aggregated formats. For example, having a dataset that is rolled up by date (each day getting a record) can also be stored rolled up by user (each user getting a record). Depending on the query you want to run, you choose the set that will give you the best performance. If the data is large enough, the overhead of keeping the two sets synchronized more than pays for itself in increased query speed.
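
The dual-rollup idea can be sketched in a few lines. This is only an illustration under assumed names: the event fields (date, user, num_bytes) and the record_event function are hypothetical, and plain dictionaries stand in for whatever datastore actually holds the two sets.

```python
from collections import defaultdict

# Two rollups of the same events: one record per day, one record per user.
by_date = defaultdict(lambda: {"events": 0, "bytes": 0})
by_user = defaultdict(lambda: {"events": 0, "bytes": 0})


def record_event(date, user, num_bytes):
    """Apply one event to both rollups so they stay synchronized."""
    for rollup, key in ((by_date, date), (by_user, user)):
        rollup[key]["events"] += 1
        rollup[key]["bytes"] += num_bytes


record_event("2013-01-28", "alice", 100)
record_event("2013-01-28", "bob", 250)
record_event("2013-01-29", "alice", 50)

# Each query reads the rollup that answers it with a single lookup.
daily = by_date["2013-01-28"]
alice = by_user["alice"]
```

Every write costs two updates instead of one, which is the synchronization overhead mentioned above; in exchange, a per-day or per-user question becomes a single key lookup rather than a scan over all events.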

Here is the attached reference architecture of Polyglot Persistence in a typical web app.
