From Development

PostgreSQL vs MySQL: how to choose?

There are many Relational Database Management Systems (RDBMS) available in the market, and PostgreSQL and MySQL are among the two most popular ones. Both options offer many advantages and are highly competitive. Therefore, it is essential to understand their differences in order to choose the most appropriate one for each case.

In that sense, this article provides a deep comparison between PostgreSQL and MySQL, considering aspects such as the data types, ACID compliance, indexes, replication, and more. Further, it entails which one to choose and highlights the importance of considering the application's requirements.

Table of Contents

What is PostgreSQL?
What is MySQL?
RDBMS vs ORDBMS
PostgreSQL vs MySQL: which one is more popular?
PostgreSQL vs MySQL: key differences
    ➤  Data Types
    ➤  Coding
    ➤  SQL
    ➤  Indexes
    ➤  ACID
    ➤  MVCC features
    ➤  Replication and Clustering
PostgreSQL vs MySQL: comparison table
PostgreSQL vs MySQL: how to choose?
Conclusion

What is PostgreSQL?

PostgreSQL was first released in 1996 and created in the University of California, at the Computer Science Department. Currently, its development is under PostgreSQL Global Development Group.

PostgreSQL is an open-source Relational Database Management System (RDBMS) that can also be considered an Object-Relational Database Management System (ORDBMS) since it supports some object-oriented features, such as table inheritance and function overloading.

What is MySQL?

MySQL was introduced to the market in 1995, shortly before PostgreSQL. It is an open-source (available under the GNU GLP) Relational Database Management System (RDBMS). Moreover, this database is managed and owned by the Oracle Corporation.

Over the years, MySQL has built quite an impressive and reliable reputation. Plus, it also stands out in the community for its ease of use.

RDBMS vs ORDBMS

Before moving on to the comparison between PostgreSQL vs MySQL, let's first understand how ORDBMS differs from RDBMS. MySQL is a purely relational database. Hence the data is stored in a structured format (with columns and rows). Plus, the values within each table can be related to each other, and tables can even relate to other tables.

PostgreSQL is an ORDBMS. These systems consist of a relational model, meaning that it is still possible to relate values and tables while also following the object-oriented model's principles. Thus, ORDBMS can include the concept of classes, objects, and inheritance.

In terms of structure, MySQL and PostgreSQL are actually pretty similar. They both use tables as their core component, meaning that the data is organized into rows and columns. Additionally, PostgreSQL also integrates stored procedures, views, constraints, triggers, roles and further supports NoSQL. In turn, MySQL offers almost the same features (or very identical ones), and since the MySQL 5.7 version release (2015), it also started including NoSQL features.

PostgreSQL vs MySQL: which one is more popular?

Both PostgreSQL and MySQL are among the most popular databases available in the market. However, to be more precise, according to popularity statistics from May 2021, MySQL is still more popular than PostgreSQL, as the DB-Engines Ranking highlights.

Furthermore, this data is aligned with the TOPDB Top Database Index, which is based on Google search results. According to the index, MySQL is in the second position and PostgreSQL in the sixth.

TOPDB Top Database Index

Regarding community support, both PostgreSQL and MySQL benefit from active communities as well as extensive documentation support.

PostgreSQL vs MySQL: key differences

Data Types

Currently, both PostgreSQL and MySQL enable developers to work with JSON as a data type in tables. However, things were not always this way. Up until the launch of the MySQL 5.7.8 version, the database system did not support JSON files.

So far, JSON support remains one of the leading NoSQL features that MySQL has integrated. In contrast, PostgreSQL further supports XML, arrays, user-defined type, and hstore, thus offering the ability to operate with more data types than MySQL. The main benefit of having a variety of options available is that it can increase functionality. For instance, by accepting arrays as a data type, PostgreSQL can also offer host functions that are compatible with those arrays.

Nonetheless, despite the benefits of using alternative formats to store data, it can also be more complex to implement such data formats, considering they do not follow a well-established standard. Therefore components used alongside the database might not comply with PostgreSQL formats. This does not have to be a disadvantage, but rather something to watch out for.

Coding

When it comes to coding in PostgreSQL vs. MySQL, a couple of differences should be considered. Let's start by case sensitivity. On the one hand, PostgreSQL is case-sensitive. This means that developers must capitalize strings as they appear in the database; otherwise, the query will fail. On the other hand, MySQL is not case-sensitive. Hence, there is no need to capitalize strings while querying.

Another difference in terms of coding lies in the characters' sets and strings. PostgreSQL does not allow UTF-8 syntax; thus, there is no need to convert sets and strings to this syntax. Contrarily, some versions of MySQL require this conversion.

SQL

SQL stands for Structured Query Language, and it is considered the standard when it comes to querying data languages. However, it is not necessarily applied in the same way across all database systems.

SQL's fundamentals are SELECT, INSERT, DELETE, and UPDATE. Additionally, it may also involve some extra features and differences in terms of syntax.

MySQL is only partly SQL compliant because it does not support all the features (e.g., no check constraint). However, it does provide many extensions. In turn, PostgreSQL is more SQL compliant than MySQL, conforming to most of the main features. To be more precise, PostgreSQL supports at least 160 out of the 179 mandatory features.

Indexes

The more extensive a database is, the more crucial indexes become. When handling tables with millions of rows, indexes can be extremely helpful and can improve database performance. Before looking at the particularities of each database system, let's first indicate what they have in common: PostgreSQL and MySQL offer support for B-trees and hash indexes. Now that that is clear let's take a deep look at their approaches toward indexes.

On the one hand, in MySQL, most indexes (PRIMARY KEY, UNIQUE, INDEX, and FULLTEXT) are stored in B-trees. However, there are some exceptions:

  • Indexes on spatial data are stored in R-trees;
  • MEMORY tables also support hash indexes;
  • InnoDB adopts inverted lists for FULLTEXT indexes

On the other hand, in PostgreSQL, the indexes are considered secondary indexes. Hence, the indexes are stored separately from the table's heap, which is the main data area. Consequently, when executing an index scan, the data must be fetched from both the heap and the index. To solve this inconvenience, PostgreSQL developed support for index-only scans, meaning that developers no longer have to ask for heap access to query an index. Two measures must be followed to apply this method: the index type must support index-only scans, and the query can only reference columns stored in the index.

To make the most out of the index-only scan method, developers can create a covering index. The covering index retrieves every needed column. Thus, it includes the columns that are required by a specific type of query that runs frequently.

Covering indexes only became available in PostgreSQL since version 9.2 (2012). However, by then, MySQL was already using them to retrieve data by scanning the index without having to touch the table data. Last but not least, PostgreSQL indexes support additional features that MySQL has not yet developed, such as partial indexes, expression indexes, and bitmap indexes.

ACID

ACID stands for atomicity, consistency, isolation, and durability. It describes the properties that a robust database system must have to ensure transactions are reliable and consistent. As we explain in our SQL vs NoSQL article, many (not to say most) relational database management systems are ACID compliant. This does not mean that NoSQL databases cannot be ACID compliant. In fact, MongoDB, Apache's CouchDB, and IBM Db2 serve as an example of NoSQL database systems that can integrate and follow ACID principles.

MySQL is not since it does not support some principles, such as consistency, isolation, and durability. However, MySQL incorporates components such as the InnoDB and the NDB Cluster storage engines, allowing developers to adhere to the ACID model closely if they wish to.

In comparison, PostgreSQL is ACID-compliant since it provides all the required features to adopt the ACID model fully. Nonetheless, implementing these features and following the respective properties can slow performance.

As mentioned, the "I" in ACID stands for "isolation", which is not very easy to achieve. For truly proper isolation, developers need to ensure that transactions are serializable, which means that the outcome of performing a set of transactions should be the same as some serial execution of those transactions. Thus, a database with serializability offers arbitrary read/write transactions and is consequently able to guarantee consistency. Unfortunately, even though ACID properties ensure reliability and consistency (a plus for PostgreSQL), complete isolation can also limit concurrency and overall slower performance (PostgreSQL's drawback).

MVCC features

PostgreSQL introduced multi-version concurrency control (MVCC) features first than MySQL, and this used to be one of its most significant advantages.

MVCC features provide developers concurrent access to the database without having to lock the data. Therefore, every developer connected to the database sees a "snapshot" of the data while querying the data. Until a transaction is fully executed, the other users/developers do not see the changes in the database. Simply put, readers and writers do not block each other, and MVCC makes it easier for them to interact. This feature provides "transaction isolation" (or "snapshot isolation", as Oracle names it) throughout each database session, thus avoiding transactions to seem inconsistent and possible conflicts with locks.

In MySQL, it is possible to benefit from the MVCC feature by using InnoDB. InnoDB is the default MySQL engine that enables the database system to be ACID compliant and to have MVCC. Developers can choose to use other engines; however, it might imply losing these two characteristics.

Replication and Clustering

As the name suggests, replication consists of a process that allows developers to copy data from a database to replica databases, enabling every user to have the same level of information. Plus, replication entails several benefits, such as automated backups, fault tolerance, scalability, and the ability to perform long queries without disturbing the main cluster.

Both PostgreSQL and MySQL support replication. In MySQL, replication is one-way asynchronous; thus, one database server acts as the master one, and the others are "slaves" (the replicas). In contrast, PostgreSQL offers synchronous replication, meaning that it has two databases running simultaneously, and the primary database is synched with the replica database. Further, cascade and synchronous replication can also be performed when using PostgreSQL.

Another aspect that both PostgreSQL and MySQL support is clustering. Clustering utilizes shared storage to replicate an equal set of data to every node in an environment. This allows databases to tolerate failures, due the redundancy created by replicating data across multiple nodes in an environment.

Despite having one-way asynchronous replication, the MySQL cluster adopts synchronous replication internally. This way, MySQL removes single points of failure from the system and ensures that the data is written to various nodes, avoiding any negative impact and failures on the transactions. Plus, MySQL developers can also utilize MySQL Cluster, a multimaster technology that prioritizes linear scaling.

Regarding clustering, PostgreSQL supports streaming or synchronous replications and also has Postgres-XL, which is a database clustering environment.

Comparison Table

Database System PostgreSQL MySQL

RDBMS vs ORDBMS

ORDBMS - relational database that supports some object-oriented features, such as table inheritance and function overloading.

RDBMS - MySQL is a purely relational database.

Data Types

Supports JSON, XML, arrays, user-defined type, and hstore.

Supports JSON.

Coding

PostgreSQL is case-sensitive, and does not allow PostgreSQL does not allow UTF-8 syntax.

MySQL is not case sensitive and some versions require sets and strings to convert to UTF-8 syntax.

SQL

PostgreSQL supports at least 160 out of the 179 mandatory SQL features; more SQL compliant than MySQL.

MySQL is only partly SQL compliant.

Indexes

Indexes are stored separately from the table's heap, but developers can use index-only scans and can also create covering indexes.

Most indexes are stored in B-trees.

ACID

ACID compliant.

MySQL incorporates components such as the InnoDB and the NDB Cluster storage engines, allowing developers to closely adhere to the ACID model.

MVCC Features

Introduced the MVCC feature first than MySQL.

MVCC feature by using InnoDB.

Replication

Cascade and synchronous replication

Replication is one-way asynchronous

Clustering

supports streaming or synchronous replications and also has Postgres-XL.

Adopts synchronous replication internally; can also use MySQL Cluster, a multimaster technology.

PostgreSQL vs MySQL: how to choose?

Regarding the differences discussed so far, the choice between both database systems is not always that clear. One thing we know can say for sure is that no matter what, there's not a wrong answer. Both database systems are popular and have a reliable reputation. However, one might be more suitable than the other, depending on the context.

As mentioned, MySQL is an RDBMS, whereas PostgreSQL is an ORDBMS since it includes object-oriented features, such as function overloading and table inheritance. This difference itself might be enough for some developers to opt for PostgreSQL, considering that it makes it easier for developers to model complex application object structures.

Moreover, PostgreSQL is more SQL compliant than the competitor alternative and is also known for sustaining data integrity upon transactions, adopting the ACID model. Contrarily, in order to be ACID compliant, MySQL requires the use of the InnoDB and the NDB Cluster storage engines. However, not having to be ACID compliant necessarily can also make MySQL faster when it comes to reading data.

In fact, opting for MySQL also has its advantages. So far, it remains more popular than PostgreSQL and benefits from an extensive community as well as a vast number of third-party tools. Another great plus is that MySQL stands out for being fast, reliable, and an uncomplicated database system that is easy to understand and set up. Further, over the recent years, MySQL has continued to introduce relevant features (such as the MVCC).

All things considered, PostgreSQL is richer in terms of in-built features, and it has proved its ability to handle complex queries (e.g., subqueries, filtered results, joins, etc.), as well as extensive databases. However, if the priority is to have a database system that is fast, reliable, and fairly easy to manage, then MySQL is also an excellent choice.

Ultimately, the choice between PostgreSQL vs MySQL will always depend on the application's requirements. For instance, when handling a database with a lot of unstructured data, opting for PostgreSQL might be more beneficial since it supports more data types.

Conclusion

The comparison between PostgreSQL vs MySQL should not focus on which one is better but rather on choosing one or the other for a specific application. In other words, the application requirements should always be aligned with the database system's characteristics and abilities.

Consequently, in order to choose wisely, it is crucial to understand how PostgreSQL and MySQL differ regarding critical aspects and how developers can make the most out of each option.

Found this article useful? You might like these ones too!

At Imaginary Cloud, we simplify complex systems, delivering interfaces that users love. If you’ve enjoyed this article, you will certainly enjoy our newsletter, which may be subscribed below. Take this chance to also check our latest work and, if there is any project that you think we can help with, feel free to reach us. We look forward to hearing from you!