In order to fully understand OLTP and OLAP, it's necessary to provide a bit of context. In the early days of software existence, data was typically stored in a single file. However, as it began to address bigger problems, relational database management systems (DBMS) took the market by storm. Throughout the following decades, those were everyone's solution for data storage.
With the advent of the web, everything changed massively. Search engines and social networks are now modeling data in domains in which relations between data are not easily identifiable or, sometimes, not even needed (e.g. search engines indexing documents).
Nevertheless, a few terms of the old world are still used today, and it's important to look at them with a modern approach. Two of them are, precisely, OLTP and OLAP. But for an overall context, let's take a look at the following image that shows the relation between OLTP and OLAP.
What the image above underlines is that OLTP and OLAP are not competing approaches to the same issue, but processes that complement each other. Next you'll find a more in depth explanation of each one of those terms.
What is OLTP?
The term OLTP refers to Online Transaction Processing. It's often used to mention databases that store and manage data relevant to the day-to-day operations of a system or company. In the past this term was usually linked to relational databases in operation, where the main focus was to gather data from what was happening in a given context.
In short: OLTP is used to store and manage data for day-to-day operations.
As the information being stored on an OLTP data store was often critical to the business, a huge effort was put to ensure the Atomicity, Consistency, Isolation and Durability (ACID) of the data. Data stored according to these four principles are marked as ACID compliant, and this is where relational database management systems excel.
But having an ACID compliant Datastore does not mean that we don't have to make any additional effort to ensure our data is compliant with those principles. The way we process the data matters. For instance, how can we guarantee that data is consistent if we allow redundancy in our data store?
If we are storing clients' addresses, it's important to ensure that when the client moves to another place, the address is updated everywhere. But storing addresses in several places makes it hard to keep the data in a consistent state. This is why relational databases are often designed to match the 5th normal form - a way of designing relational data that avoids redundancy.
As said previously, the world has changed since the OLTP term was defined, and nowadays it's easy to store data on non-relational databases. Most of those data stores comply only with some of the four principles of ACID. But depending on the use case, it's OK to relax on one or more of these principles in exchange for other benefits (speed, scalability, etc).
For example, if we are storing "likes" in a post on a social network, is it really important to ensure that the number of likes is 100% accurate? Or is it OK to display 995 likes instead of 998, in exchange for a faster response to millions of users?
As OLTP refers to Online Transaction Processing, we see that the term is not bounded to relational databases or even fully compliant ACID databases. It simply refers to the way these data stores are used. If, for example, we're using a document data storage (e.g. MongoDB) to store and process data from the day-to-day operations of a social app (e.g. to register users, store likes, etc), we're OK to say that it's OLTP.
What is OLAP?
The term OLAP refers to Online Analytical Processing, and is often used to mention databases that store and manage data relevant for data analysis and decision making.
OLAP is strongly connected to Business Intelligence (BI), a specialization of software development targeted at delivering applications for business analysis. In other words, the objective of BI is to allow top level executives to query and explore data without the help of involving IT staff.
In short: OLAP is used analyze data and make decisions.
The biggest advance that this area has brought was the capacity to generate reports on the fly. It ended the need to call the IT department to ask for a custom report, or to automate the generation of specific reports. A BI system can now answer questions that the developers didn't had the need to know it in advance that the question was going to be asked.
BI systems are made possible by organizing the data in a form called Hypercube. This form explores the many dimensions of the data, and allows users to aggregate or drill down data by navigating the dimensions of the cube.
The fun part is that, with the right interface, top level management can generate reports on the fly, without the help of IT.
OLAP systems can be implemented using relational databases, and this technique is often named ROLAP (Relational OLAP). But for that, we need to design the database not in the 5th normal form but in the 3rd normal form.
We can live with redundant data when analyzing data. What really matters is the capacity to navigate through the dimensions of the data. And this is where ROLAP shines, as a database schema in the 3rd normal form is suited for aggregations and drill downs.
Putting it all together
When encountering the terms OLTP and OLAP for the first time, it's easy to question: which one is better? When in fact one should be asking: how does one complement the other?
We now know that:
- OLTP is used to store and manage data for day-to-day operations;
- OLAP is used to analyze that data.
This is exactly how they are used in an existing business.
The data from the upper part of the above example (HR Database, CRM, Billing System) is usually processed in batch - often overnight - via a process called Extract, Transform and Load (ETL). It's the name given to the operation that collects data from several OLTP sources and puts it in an OLAP data warehouse, allowing cross-system analysis. In the lower part of the figure, you can see that the data was properly stored and organized in the OLAP cube.
That way, the people performing the analysis can work with up to date information and make timely decisions, without disrupting operations.an