Working With Joins, Aggregations, and Built-In Functions

An introduction to the terminology used in querying multiple tables simultaneously.

Stepping Up Your Game

Up till now we’ve queried data in one table in a dataset at a time. While there is information to be had from single-table queries, you can do so much more if you can pull bits of data from different tables and present it all together in one result set.

You might wonder, if storing all the data in a dataset in one table works, why bother to split it up into multiple tables across the dataset? Why not just leave it all in one huge table? The short answer is for efficiency and for the integrity of the data. It’s what differentiates a relational database from a spreadsheet. Large enterprise data especially will be stored in many different tables that you will want to access from a single query.

Basic Terminology

Running queries against multiple tables at a time requires the use of joins. There are several ways to specify the conditions used to create joins, and there are also types of joins which differ in the amount and original location of the data being queried. Joins are the mechanism used to link tables in a dataset together for the duration of a query so that related data in the linked tables can be combined into one query-result table. There are several different types of joins which each describe a different relationship between the joined tables, and we will examine them in depth in the following sections.

As you add more data into your results there will be times when, instead of seeing all the rows in a table, you might want to see a sum, or average, or count across rows. This functionality comes from creating data groups, or aggregations from the use of aggregate functions.

In addition to combining tables you can also combine queries. This functionality is mostly of value when you are trying to get information from datasets which are not optimally designed for the for the queries you need to run.

Finally there are times when you might need to manipulate the data in strings or date format for your query. SQL has a host of built-in functions created specifically for that purpose.

In the following sections we will explore joins, aggregation, additional expressions, combining queries, and built-in functions in depth.