Jed Rembold
March 19, 2025
Accessing data using dblink is a
two-step process: connect and then query
To connect, use dblink_connect
(shocking!)
SELECT dblink_connect(|||connection string|||)The connection string can either be a full database url, or a space delimited sequence of keywords, e.g.:
'host=localhost user=postgres password=1234'Once the connection has been made (within a session) you can send across queries
SELECT * FROM dblink(|||query string|||)dblink is essentially blind, in that
it has no idea what would be returned by the query
SELECT *
FROM dblink(|||query string|||)
AS |||table alias||| ( |||col₁||| |||type₁|||, |||col₂||| |||type₂|||, |||etc.|||);If you have a command to run remotely, and don’t need a table
back, you can use dblink_exec
This may be useful for creating, updating or deleting data from a remote table.
SELECT dblink_exec(
'CREATE TABLE test (col1 text, col2 int)'
);postgres_fdw extension is a bit
more of a setup, but makes working with remote tables a breezeYou first need to define a foreign server
CREATE SERVER |||server_name|||
FOREIGN DATA WRAPPER postgres_fdw
OPTIONS (|||server config settings|||)Server name can be whatever you like, provided it is unique
Server settings are a comma delimited, space separated sequence of host, port, dbname, etc
You essentially need to give your user permission to access the remote server
CREATE USER MAPPING FOR CURRENT_USER
SERVER |||server_name|||
OPTIONS (|||user options|||)Server name will be the name you just used for the created server
User options should include the corresponding remote user info like username and password
Last but not least, you can import in the tables you want
Can do one at a time with
CREATE FOREIGN TABLE, but I far prefer
IMPORT FOREIGN SCHEMA
IMPORT FOREIGN SCHEMA |||schema name|||
FROM SERVER |||server_name|||
INTO |||local schema name|||Can also include LIMIT TO or
EXCEPT followed by a list of table names to
fine-tune which tables are imported
CREATE TABLE |||new table||| AS (
SELECT * FROM |||og table|||
);Think back to HW3 where you needed to filter out all the outlier taxi-ride speeds
Subqueries can combine those steps! You can directly compare a value to the subquery output so long as the query outputs only a single value
SELECT *
FROM |||table 1|||
WHERE |||column 1||| > (
SELECT avg(|||column 2|||)
FROM |||table 2|||
);You can use the output of a subquery anywhere you would normally reference a table name
Such a table is then called a derived table
You must give such subqueries a table alias, else you have no way to refer to them
SELECT *
FROM (
SELECT |||column 1, column 2|||
FROM |||table name|||
) AS |||table alias|||; Sometimes it is very useful to have a constant value assigned to an entire column
You can include a subquery in your selected columns if it outputs only a single value
Give it an alias so the column heading has some meaning!
SELECT
|||column 1|||,
|||column 2|||,
(SELECT avg(|||column 3|||) FROM |||table 2|||) as |||column alias|||
FROM |||table 1|||;|||expr||| IN (|||subquery|||) and
|||expr||| NOT IN (|||subquery|||)EXISTS (|||subquery|||) and
NOT EXISTS (|||subquery|||)|||expr||| |||op||| ANY (|||subquery|||)|||expr||| |||op||| ALL (|||subquery|||)|||expr||| is a column name or value, and
|||op||| is a boolean operator
(=,>,>=,etc)NOT IN just reverses the situation1 IN (1,2,3) gives True1 IN (2,3) gives False1 IN (NULL,2,3) gives
NULLNOT IN, since
1 NOT IN (NULL,2,3) will give
NULL, and not True as you might expectSELECT 1 FROM ... is usually usedSELECT |||column₁, column₂|||
FROM |||table₁|||
WHERE EXISTS (
SELECT 1
FROM |||table₂|||
WHERE |||table₁.column₁||| = |||table₂.column₂|||
);At times you don’t want to see if a value is in the subquery output, but rather how it compares to the output
A boolean expression needs to return a single True or False though
ANY and ALL
“broadcast” the boolean comparison across all the subquery rows
ANY will return True if any of
the rows evaluate to True with the expressionALL will only return True if all
of the rows evaluate to True with the expression|||expr||| = ANY (|||subquery|||) is
thus identical to
|||expr||| IN (|||subquery|||)
SELECT |||column|||
FROM |||table 1|||
WHERE |||column||| < ALL ( SELECT |||column||| FROM |||table 2|||);subq1 schema in your database of choice which
will contain three common tables: roster,
assignments,
submissionsWITH keyword:WITH
|||derived table name||| (|||new column names|||) AS (
|||subquery|||
)
SELECT ...
Using a CTE and other subqueries, let’s extract the middle 50% of the taxi_rides by duration.
| Name | Color | Condition | Number |
|---|---|---|---|
| Apple | Red | Good | 5 |
| Orange | Orange | Bad | 3 |
| Cherry | Red | Good | 10 |
| Banana | Yellow | Good | 1 |
| Pineapple | Yellow | Bad | 4 |
| Color | Good | Bad |
|---|---|---|
| Red | 15 | 0 |
| Orange | 0 | 3 |
| Yellow | 1 | 4 |
Standard SQL has no way to create these cross-tabulations
Postgres has a function to help, but it is located in an extension or module
To use an extension, you need only enable it for your particular database
CREATE EXTENSION |||extension_name|||;To facilitate making pivot tables, we need to add the
tablefunc extension to our database, which
will add the crosstab function
You can always remove an extension using
DROP if you need later
crosstab function has a lot going on,
so let’s break things downSELECT *
FROM crosstab(
|||subquery 1 string|||,
|||subquery 2 string|||
)
AS (
|||row label column||| TEXT,
|||column 1||| |||type 1|||,
|||column 2||| |||type 2|||,
⋮
);
Let’s construct a pivot table comparing the total number of taxi rides at different days of the week across different hours.
CASE statement to
accomplish this
CASE
WHEN |||some condition||| THEN |||output|||
WHEN |||some other condition||| THEN |||different output|||
⋮
ELSE |||fallback output|||
END
Comparisons are made in order, so the first condition that matches, that output is used
If you do not include an ELSE part,
then NULL will be output if nothing else
matches
CASE statements most often show up in
SELECT statements where column outputs are
being selected, but they could potentially also show up in filtering or
ordering statements
A CASE statement will not evaluate
results where the condition is not met, so they can be used to prevent
certain errors as well (such as dividing by 0)
SELECT
CASE WHEN a != 0 THEN 5/a END
FROM tablename;Let’s compute how many taxi rides were considered:
CASE, SQL also has the
COALESCE statementCOALESCE returns the first non-null value
in the listNULL valuesSELECT
student_id,
COALESCE(grade, 0)
FROM grades;
Using the same grade book tables as earlier (here if you need them again), do the following:
||
operator. So fname || ' ' || lname would give
the first and last name with a space between as a single string. Feel
free to use joins here.| student_name | Quiz 1 | Quiz 2 | Quiz 3 |
|---|---|---|---|
| Hailey Gray | 1 | 1 | 21 |
| ⋮ | ⋮ | ⋮ | ⋮ |