Importing and Exporting

Jed Rembold

Monday, September 8, 2025

Announcements

  • HW2 is due at the end of Wednesday!
    • You’ll have everything you need to do all the problems after today
    • Feedback for HW1 went out yesterday!
      • Didn’t see an email about it? Go to your repository, and then go to the “Pull Requests” tab and then click the “Feedback” pull request
  • Look over the first part of Ch 6 for Wednesday
  • Polling: polling.jedrembold.prof

Review Question

All the following values could be added to a NUMERIC(6,4) type column except for one, which would result in an error. Which number would result in the error?

  1. 0.981213
  2. 101.3
  3. 60
  4. 34.192834

Back to Types

Timestamps

  • A TIMESTAMP field holds basically both a date and a time (in that order)
  • The same properties and formats that apply to each individually apply here as well
  • Also have the option of with or without a timezone
    • TIMESTAMP by itself is without a timezone, as per SQL standards
    • TIMESTAMP WITH TIME ZONE is with a timezone. Shorthand in Postgres is TIMESTAMPTZ
  • Internally, timestamps with a timezone are always stored in UTC, but then when displayed are converted back to the local timezone
  • Examples:
    • 2022-01-18 14:30:00
    • Jan-18-2022 2:30 PM PST

Intervals

  • Represents a span of time
  • Generally given by first a number and then by a unit
    • Possible units: microsecond, millisecond, second, minute, hour, day, week, month, year, decade, century, millennium
    • Abbreviations or plurals of the above also work
  • Examples:
    • 1 day
    • 3 century 2 mins
    • 45 ms
    • 1 mon 87 us
  • Intervals can be used in calculations with other date/time data types

MORE!

  • These are just a small sample of some of the most common data types
  • Postgres supports many more!
    • Booleans
    • Geometric types
    • Network address types
    • JSON types
    • A full list can be found here
  • We’ll address others as the come up sporadically through the rest of the semester, but you have the core basics now

Conversions

  • There are plenty of instances where you may need to convert between types

    • Sometimes calculations (coming soon!) need a certain data type
    • Sometimes the information was just not stored in the most ideal type
  • Can use the CAST function to convert between reasonable data types

  • Core syntax is

    CAST(|||column name||| AS |||new data type|||)
  • Postgres has a shorthand conversion syntax that uses double colons:

    |||column name|||::|||new data type|||

Delimited Files

SQL I/O

  • INSERT INTO is great for small tables or adding a few rows, not so great for bulk populating of a table
  • Frequently, data is stored in delimited text files for flexibility and portability
  • Postgres can import data from or write data to these sorts of delimited text files using its COPY command
    • This command is Postgres specific, though other SQL variants have their similar methods
  • Postgres needs to have permission to access the file or folder in order to import or export!
    • If having issues:
      • On Windows: The C:\Users\Public should be universally accessible
      • On Mac: The /tmp folder should be universally accessible, or just choose a particular folder and give Postgres permissions to read and write to it

Delimited Files

  • A delimited text file simply uses a special character, called the delimiter, to indicate where column breaks should be
  • Each line contains the information for a single row
  • Most common delimiter is the comma, and hence the term CSV or “Comma-Separated Values”
    • Other character delimiters exist though, so may need to adjust
  • If the delimiter naturally appears in an entry, and isn’t indicating a column break, then that entry needs to be surrounded with a text qualifier
    • The most common text qualifier is a pair of double quotes, but it could be other symbols
  • Often times, the first row of the file is a delimited description of each column name or what it represents
    • Not always present, in which case you (hopefully) have other documentation to consult to understand the column meanings

The Classic CSV

id,first_name,last_name,birthday
1,Michael,Scott,"Mar 15, 1964"
2,Dwight,Schrute,"Jan 20, 1970"
3,Pam,Beesly,"Mar 25, 1979"
4,Jim,Halpert,"Oct 1, 1978"
5,Kelly,Kapoor,"Feb 5, 1980"
  • Note:
    • The birthday column entries are quoted owing to the comma within them
    • No spaces show up anywhere except within quoted blocks
    • The first line contains a header, which is useful for understanding but contains no actual data
    • Make sure you have no empty line at the end!

Imports and Exports

Importing Data

  • The COPY command will copy information into an existing table, it won’t create the table

    • Thus you are still responsible for creating the table (and associated data types for each column) before COPY will be useful
      • Yes, this can still be painful for huge tables. There are some scripts that can help with it or do some automated checking, but they tend to be far from perfect.
  • Syntax of the COPY statement:

    COPY |||tablename|||
    FROM |||full path to file|||
    WITH ( |||import options||| );

Importing Details

  • The full text filename needs to be the entire path that points to your file
    • On Windows, that would start with C:\
    • On MacOS, that would start with /
  • You have a basic selection of import options:
    • FORMAT CSV sets the delimiter to a comma and the default text qualifier to double quotes
    • HEADER specifies that there is a header, and so the first row of the file should be skipped
    • DELIMITER 'x' sets x to be the delimiter instead of the default comma (or tab)
    • QUOTE 'x' sets x to be the new text qualifier, instead of double quotes
  • If you run the COPY command multiple times, it will just keep adding the CSV contents to the end of the current table

Subset Imports

  • Sometimes an CSV might not have all the data you want in your SQL table

  • You can import all the information from the CSV into only a subset of the SQL table columns

    • As far as I know, you can’t easily go the other direction, importing only a portion of the CSV columns
  • Just requires that you specify the target SQL columns after specifying the table

    COPY |||tablename||| (|||column₁|||, |||column₂|||, |||column₃|||)
    FROM |||full path to file|||
    WITH (FORMAT CSV, HEADER);

Understanding Check

CREATE TABLE purchases (
  "date" DATE,
  "store" VARCHAR(20),
  "purchased" TEXT,
  "tot_price" NUMERIC(10,2));

How to import the data to the right into the above table?

date:store name:purchased
Jan-5-22:Walmart:eggs,milk
Jan-10-22:Roths:cereal,steak
Jan-16-22:Roths:milk,butter
Jan-18-22:Winco:beans
COPY purchases
FROM 'C:\DATA\purchases.csv'
WITH (FORMAT CSV, HEADER);
COPY purchases (date, store, purchased)
FROM 'C:\DATA\purchases.csv'
WITH (FORMAT CSV, HEADER, 
      DELIMITER ':');
COPY purchases (date, store, purchased)
FROM 'C:\DATA\purchases.csv'
WITH (FORMAT CSV, DELIMITER ':');
COPY purchases (date, "store name")
FROM 'purchases.csv'
WITH (FORMAT CSV, HEADER,
      DELIMITER ',');

Exporting Data

  • Exporting data takes information from your SQL table and allows you to store it in a text file

    • Note that this process isn’t lossless, as the data types of each column are not stored, just the contents and (maybe) column names
  • Syntax-wise, it is almost exactly like copying into a SQL table, except using TO instead of FROM

    COPY |||tablename|||
    TO |||full path to file|||
    WITH ( |||import options||| );
  • This exports all columns from the specified table to the filename with the desired format

Exporting Subsets

  • In many instances, you may not want all the columns to be exported. In that case, you have a few options:
  • Exporting only particular columns:
    • Just specify the desired columns after the table name
    COPY |||tablename||| (|||column₁|||, |||column₂|||, |||column₃|||)
    TO |||full path to file|||
    WITH ( |||import options||| );

Exportings Queries

  • Queries return a table, so you can also export that!
    • Embed the query instead of a named table
    COPY (
      SELECT |||col₁|||, |||col₂||| FROM |||tablename|||
      ORDER BY |||col₂|||
      )
    TO |||full path to file|||
    WITH ( |||import options||| );

Misc

Some Commentary

  • Like many programming languages, SQL has the concept of a comment, which is a piece of text that is entirely ignored by the system when running commands
  • In SQL, you can comment a single line at a time using double dash: --
  • To block comment, you begin with a /* and end with a */
-- This is a comment
SELECT *
FROM |||table_name||| -- this is my table name
ORDER BY |||column|||;/* this
is all ignored */
// reveal.js plugins // Added plugins