Database joins are widely used to extract information from tables in a database linked by a relationship. While familiar to database developers, they are less understood by other professionals, such as accountants or auditors, who may need them to perform data analytics. This article explains database joins using a practical example, based on purchase orders and invoices tables present in most modern financial/accounting systems.
We have two tables: Purchase Orders (POs) and Invoices that we need to analyze, shown on figure 1. The 1st (left) table is called POs, while the 2nd (right) table is called Invoices. POs and Invoices are linked through a common field containing a PO number. We need to obtain a dataset containing all fields from POs and Invoices in order to perform various analyses.
Extracting corresponding/linked POs and Invoices is usually done by executing a query against a database. In the query we need to specify the field(s) that link the POs and Invoices and how to link them. Linking can be done in one of four ways, described below.
Inner Join — returns all records with a match in the other table. This query will return all POs with a corresponding invoice and all invoices with a corresponding PO.
Outer Join — returns records from both tables, regardless of having a corresponding record in the other table. This query will return all POs, including those without an invoice and all invoices, including those without a PO.
Left join — returns all records from the 1st table (POs). If there is no corresponding record in Invoices, fields populated from there will be empty. Records from Invoices with no corresponding record in POs are not returned.
Right join — returns all records from the 2nd table (Invoices). If there is no corresponding record in POs, fields populated from there will be empty. Records from POs with no corresponding record in Invoices are not returned.
The database joins don’t need to be just between two tables. They can be used to link more than two tables depending on the business objectives of the analytics we’re running. Extending the above example, we can create a query to perform 3-way match, a fundamental accounting control. In a 3-way match we want to link analyse POs, Invoices and Receipts, which requires three tables to be linked. In addition, we may decide that we also need vendor information to be included in the analyses, which is usually available in a separate table. The diagram below shows the tables needed for a 3-way match analytics.
Database joins are a powerful, yet not too complicated, tool that allows complex analyses to be carried out. By familiarising themselves with joins non-technical professionals, such as accountants, auditors, HR, etc., with an interest in analytics can become very productive in their work and obtain important business answers in very short times.