dbx

Selecting Data & Getting Summary Statistics

By dbx on 2024-09-19

1. Basic Data Selection with SELECT

The SELECT statement allows you to retrieve data from any table. For example, to retrieve all columns from the persons table:
SELECT * FROM persons;
SELECT firstname, lastname, city FROM persons;
You can also specify particular columns:
This retrieves the first names, last names, and cities of all persons.

2. Using JOIN to Query Related Tables

Since there are relationships between the persons, houses, and street tables via the house_id column, you can use JOIN to combine data from these related tables.

Example: Join persons and houses

SELECT p.firstname, p.lastname, h.address, h.city FROM persons p JOIN houses h ON p.house_id = h.id;
This query retrieves each person's first and last name along with the address and city of their house by joining the persons and houses tables through the house_id relationship.

Example: Join street and houses

SELECT s.name, s.postal_code, h.address FROM street s JOIN houses h ON s.house_id = h.id;
This query retrieves street names, postal codes, and house addresses by joining the street and houses tables.
 
notion image
 

3. Getting Summary Statistics

SQL provides several functions to get summary statistics from your data. You can use these to analyze counts, sums, averages, and more.

a. COUNT() - Count the number of rows

SELECT COUNT(*) FROM persons;
This query counts the number of persons in the persons table.

b. AVG() - Calculate the average value of a numeric column

You can calculate averages across a column of numeric data:
SELECT AVG(salary) FROM persons;
(Assuming salary exists in the persons table.)

c. GROUP BY - Group data and apply an aggregate function

You can use GROUP BY to group rows that share a common value, and then apply aggregate functions to each group.
SELECT city, COUNT(*) FROM persons GROUP BY city;
This groups the persons by city and counts how many persons live in each city.

4. Using WHERE to Filter Data

The WHERE clause allows you to filter the rows returned by a query. For example, you can select only persons who live in a specific city:
SELECT firstname, lastname FROM persons WHERE city = 'New York';

Example: Filtering with Related Tables

You can also filter data across related tables. For example, to find all persons living in houses on a specific street:
SELECT p.firstname, p.lastname FROM persons p JOIN houses h ON p.house_id = h.id JOIN street s ON s.house_id = h.id WHERE s.name = 'Main Street';
This retrieves the names of all persons living in houses on 'Main Street'.

5. Ordering Results with ORDER BY

You can sort the results of a query using ORDER BY. For example, you can order persons by their last name:
SELECT firstname, lastname FROM persons ORDER BY lastname ASC;
This orders the results by last name in ascending order. You can use DESC for descending order.

6. Joining All Three Tables: A Comprehensive Query Example

Now that you understand how to join tables, here’s a more advanced query that combines all three tables:
SELECT p.firstname, p.lastname, h.address, s.name as street_name FROM persons p JOIN houses h ON p.house_id = h.id JOIN street s ON s.house_id = h.id WHERE s.city = 'Los Angeles';
This query retrieves each person’s first name, last name, house address, and street name for those living in houses located in Los Angeles.

7. Using Aggregate Functions on Joined Data

You can also combine joins with aggregate functions. For example, you can count the number of people living on each street:
SELECT s.name AS street_name, COUNT(p.id) AS num_persons FROM street s JOIN houses h ON s.house_id = h.id JOIN persons p ON p.house_id = h.id GROUP BY s.name;
This counts how many persons live on each street by joining the street, houses, and persons tables.

Summary:

In this tutorial, you learned how to:
  • Use the SELECT statement to query data.
  • Retrieve related data from multiple tables using JOIN.
  • Use filtering with WHERE and sort results with ORDER BY.
  • Apply summary statistics such as COUNT, AVG, and GROUP BY to analyze your data.
  • Work with a relational schema involving foreign key relationships between persons, houses, and street.
By combining these techniques, you can effectively query and analyze data from complex relational schemas like the one you are working with.