Understanding the difference between relational and graph databases
When you pick your PLM / PDM system, you also choose a certain type of database whether you know it or not. So why not take some time and understand how the different types of databases work and how they impact the performance of your product lifecycle or product data management system?
The relational model was proposed in 1970 by E. F. Codd at IBM, and relational databases went on to become the foundation of the vast majority of PLM systems. With the rise of graph databases, some PLM systems have started to rely on this newer technology. However, most PLM applications today are still based on the traditional relational database.
To understand the impact of the various kinds of databases on your product data management system, let’s first take a look at how a relational database works. It is very easy to understand if you just imagine a traditional table with many rows and columns in it. The relational database consists of multiple tables that store your data much as you would in an Excel spreadsheet. What makes it different from a spreadsheet is how it can resolve relationships between different pieces of data across multiple tables.
To do that, the relational database extends its tables with ‘foreign key’ columns that refer to the primary keys of other tables. By matching foreign key values with primary key values, rows from different tables are paired together.
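To make the key matching concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table names, column names and sample rows (parts, drawings) are made up for illustration and are not taken from any particular PLM system.

```python
import sqlite3

# In-memory database with two hypothetical tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE parts (
        part_id INTEGER PRIMARY KEY,   -- primary key
        name    TEXT NOT NULL
    );
    CREATE TABLE drawings (
        drawing_id INTEGER PRIMARY KEY,
        title      TEXT NOT NULL,
        part_id    INTEGER NOT NULL REFERENCES parts(part_id)  -- foreign key
    );
    INSERT INTO parts VALUES (1, 'Bracket'), (2, 'Housing');
    INSERT INTO drawings VALUES
        (10, 'Bracket rev A', 1),
        (11, 'Bracket rev B', 1),
        (12, 'Housing rev A', 2);
""")

# Pair each drawing with its part by matching foreign key to primary key.
rows = conn.execute("""
    SELECT parts.name, drawings.title
    FROM drawings
    JOIN parts ON drawings.part_id = parts.part_id
    ORDER BY drawings.drawing_id
""").fetchall()

for part_name, drawing_title in rows:
    print(part_name, "->", drawing_title)
```

Each drawing row stores only the id of its part; the part's name is found at query time by matching that id against the parts table.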
When you run a query that spans several tables, the relational database has to match foreign keys against primary keys for every table involved. Without suitable indexes, this means scanning through all rows of the tables in question, and even with indexes every additional join adds lookup work. This is resource intensive on its own, and as the number of relationships between data grows, the time and memory needed for the task can increase steeply. Many-to-many relationships also require the introduction of so-called join tables (also known as junction tables), which contain nothing but pairs of foreign keys referring to the primary keys of the tables they link.
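A join table can be sketched in the same way. The example below again uses Python's sqlite3 module with hypothetical tables (assemblies, parts, assembly_parts) and invented sample data. Answering a simple question such as "which assemblies use this part?" already takes two joins.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE assemblies (assembly_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE parts      (part_id     INTEGER PRIMARY KEY, name TEXT NOT NULL);

    -- Join table: holds only pairs of foreign keys, no data of its own.
    CREATE TABLE assembly_parts (
        assembly_id INTEGER NOT NULL REFERENCES assemblies(assembly_id),
        part_id     INTEGER NOT NULL REFERENCES parts(part_id),
        PRIMARY KEY (assembly_id, part_id)
    );

    INSERT INTO assemblies VALUES (1, 'Gearbox'), (2, 'Pump');
    INSERT INTO parts VALUES (1, 'Shaft'), (2, 'Bearing'), (3, 'Seal');
    INSERT INTO assembly_parts VALUES (1, 1), (1, 2), (2, 2), (2, 3);
""")

def assemblies_using(part_name):
    """Return the names of all assemblies that contain the given part."""
    rows = conn.execute("""
        SELECT a.name
        FROM parts AS p
        JOIN assembly_parts AS ap ON ap.part_id = p.part_id
        JOIN assemblies AS a ON a.assembly_id = ap.assembly_id
        WHERE p.name = ?
        ORDER BY a.name
    """, (part_name,)).fetchall()
    return [name for (name,) in rows]

print(assemblies_using("Bearing"))
```

Every further level of relationship (supplier of the part, documents of the supplier, and so on) would add another join to the query.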
Graph theory was born in the 18th century, but it was not until the 1990s and 2000s that it was widely applied to the way data is stored and queried. The history of graph databases is closely linked to Google, Facebook, Twitter, Amazon, and eBay. Working with graphs helped these companies become some of the most impactful organizations in the world, as it gave them a pioneering way to map the social graph and the web graph. Nowadays graph databases are used across many different industries such as hospitality, healthcare and retail. They are also used by the manufacturing industry to some extent, but I believe that it could benefit hugely from a wider adoption of graph database technology.
A graph database is built from nodes (vertices) and edges (relationships). Nodes are data records that hold a list of their relationships, organized by type and direction. The edges are the connections between the nodes, and they can carry labels as well as a direction. When you query this database, it follows the relationships that lead directly to the nodes in question, without having to go through the full list of records. This structure enables us to build sophisticated models that are very flexible at the same time: the model can be extended with new types of data at a large scale, usually without having to perform migrations. A graph database describes data the way it exists in real life, small and richly interconnected, allowing us to search our data from any point of interest.
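The idea of nodes that keep their own relationships can be illustrated with a few lines of Python. This is only a toy in-memory model, not how any specific graph database is implemented; the Node class, the connect helper and the sample data are all invented for this sketch.

```python
from collections import defaultdict

class Node:
    """A data record that keeps its relationships by direction and type."""
    def __init__(self, name):
        self.name = name
        self.outgoing = defaultdict(list)  # relationship type -> target nodes
        self.incoming = defaultdict(list)  # relationship type -> source nodes

def connect(source, rel_type, target):
    """Create a directed, labelled edge and register it on both nodes."""
    source.outgoing[rel_type].append(target)
    target.incoming[rel_type].append(source)

gearbox = Node("Gearbox")
shaft = Node("Shaft")
bearing = Node("Bearing")
supplier = Node("Acme Bearings")

connect(gearbox, "CONTAINS", shaft)
connect(gearbox, "CONTAINS", bearing)
connect(supplier, "SUPPLIES", bearing)

# Each lookup starts at one node and follows its own edges;
# no other records are inspected.
contained = [n.name for n in gearbox.outgoing["CONTAINS"]]
used_in = [n.name for n in bearing.incoming["CONTAINS"]]
suppliers = [n.name for n in bearing.incoming["SUPPLIES"]]

print(contained, used_in, suppliers)
```

Adding a new kind of relationship, say "APPROVED_BY", needs no change to existing nodes: it is simply a new label passed to connect.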
Furthermore, a graph database is great at describing hierarchical data. The structure of engineering product data can benefit from a database that can effectively support the complex relationships between different assemblies, sub-assemblies, individual components and all related business data.
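As a rough illustration of such a hierarchy, the sketch below walks a small, invented product structure from an assembly down to its individual components. The "contains" mapping and the item names are assumptions made for the example; a real graph database would run this kind of traversal inside its own query engine.

```python
# Hypothetical product structure: each item maps to the items it contains.
contains = {
    "Bike": ["Frame", "Wheel"],
    "Wheel": ["Rim", "Hub"],
    "Hub": ["Axle", "Bearing"],
}

def all_components(item):
    """Return every item below `item`, depth-first, in declaration order."""
    result = []
    for child in contains.get(item, []):
        result.append(child)
        result.extend(all_components(child))
    return result

print(all_components("Bike"))
```

The traversal only ever touches items that are actually connected to the starting point, however large the rest of the data set is.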
Considering the above, we can conclude that a graph database typically provides the best performance when it comes to richly connected data, taking less time and memory to perform a query. A graph database is also more flexible, especially when adding a new type of data. Furthermore, it can capture information as it exists in real life, with rich connections, allowing users to gain a much deeper insight into their data.
When we consider the impact of phenomena such as Industry 4.0, big data, digital twins and the Internet of Things, it becomes clear that in the years to come, the success of manufacturing companies will rely on the way they manage their data and, more importantly, on the extent to which they can gain insight into the patterns and relationships that exist within their data, especially throughout the design phase. A graph database is a great choice for manufacturers when it comes to deciding on their PLM system. It allows them to gain invaluable insights into their product data in many different ways, and that can provide a huge competitive advantage.