Jed Rembold
January 29, 2025
terminal on MacOScmd on Windowsurxvt, xterm,
etc. on LinuxThe shell dictates how the user interacts with the terminal
There are many possible shells, some of which have been around a very long time, and many of which build on one another
| Command | Description |
|---|---|
pwd |
Prints out the current working directory |
cd arg |
Changes directory to the provided directory path |
ls |
Lists the contents of the current directory |
/ in
Linux/Mac and \ on Windows/ on Mac or Linux, usually
C:\ on Windows).. refers to the parent directory. refers to the current directory"/teaching/class/data science in the natural sciences"/teaching/class/data\ science\ in\ the\ natural\ sciencesNavigating your file structure is fun, but not useful by itself
You have several commands available to you for creating, moving, or removing files:
| Command | Description |
|---|---|
mkdir |||arg||| |
Makes a new directory at the given argument location. Defaults to relative |
mv |||og_file||| |||new_file||| |
Moves the |||og_file||| to the new
location (and name) as |||new_file||| |
cp |||og_file||| |||dup_file||| |
Copies the |||og_file||| to the location
(and name) of |||dup_file||| |
rm |||arg||| |
Removes the file given by |||arg||| from
the system. This bypasses any trashcan! |
Many commands have various options that can be toggled or set through the use of flags
A flag is a short command prefaced by a
- or -- that comes
after a shell command
Short forms are usually a - followed
by a single character
ls -sLong forms are usually a -- followed by
a longer term or phrase
ls --sizeShort form flags can be concatenated to effectively combine multiple flags
ls -sh = ls -s -hHow do you know what options are available? The help option!
ls --helpSometimes the help options for a command doesn’t give you quite enough information or details to understand what the option does
Almost all BASH commands have built in help pages in the form of a manual
To access and read a manual, use the BASH command
man cmd, where cmd
is the command you want to get information about
man lsman actually opens up the
documentation in another program called less,
which allows you to scroll around and then press
q when done to return to your original
screen
Choose a folder on your system that has a lot of files in it, and then see if you can achieve the following in the shell:
.)The path is just a colon delimited list of directory locations
It is stored in a special variable called an environment variable
echo $PATHecho will generally print
whatever argument comes after it to the screen| Variable | Contains |
|---|---|
| PATH | The list of all directories searched for a command |
| USER | The currently logged in user |
| HOME | The home directory of the current user |
| SHELL | The current shell being used |
| TERM | The current terminal emulator being used |
| LANG | The current language encoding |
stdin is your keyboard and
stdout is the screen being printed to< and >
commands< changes
stdin to be whatever comes after> changes
stdout to be whatever comes afterThe BASH cat function concatenates any
of its arguments together, sticking them together end to end, and
outputs them to stdout
cat is frequently just used with a single
argument to display it to the screenIf given no arguments, cat instead
reads from stdin
We can redirect any of these as we might see fit:
cat > temp.txt
cat < temp.txt
cat < temp.txt > new.txtBe aware that >
overwrites the contents of a file if you redirect
there. You can use >> to append the contents
to a file if you prefer
Because so many shell commands accept input from
stdin and output to
stdout, it can be very useful to take the
output of one and pass it into another
This could be done with intermediate files, but it is more convenient to pass the information directly, using what are called pipes
Pipes utilize the pipe character | and
basically say: “take the stdout from the left
command and use it as the stdin on the right
command”
ls | catYou can chain as many pipes together as you like
This is where the real flexibility of shell scripting comes into play! It allows you to combine many smaller tools effortlessly to achieve a final product
The wc command stands for “word count” and
will give you information about the number of words, characters or lines
read in from a file or stdin. Use this in
conjunction with ls to write the number of
current objects in a folder to a file called
obj_count.txt. Don’t forget you can check the
help flag or man page of wc for some more
information on how you can customize its output!
Probably have come across instances where you don’t want to have
a bunch of NULL values in your output
query
You can not filter them out using
colname != NULL
NULL value comparison always
gives NULL, no matter the comparisonNeed to use the keywords IS NOT NULL
(or IS NULL if you want to find the null
values)
SELECT |||column|||
FROM |||table|||
WHERE |||column||| IS NOT NULL;-- to turn
everything following on that line to a comment/* and
*/ to bulk comment everything between the two
symbols-- This is a comment
SELECT *
FROM |||table||| -- this is my table name
ORDER BY |||column||| /*
Nothing here of note!
Still nothing! */
;
SELECT statements on values
directly if needed, omitting a reference to a table!
Just leave off the FROM portion of the
statement
SELECT CAST('Jan 25, 2022' AS DATE);
SELECT 'aardvark' > 'Zebra';| Operator | Description | Example |
|---|---|---|
+ |
Addition | 4 + 5 |
- |
Subtraction | 8 - 5 |
* |
Multiplication | 2 * 4 |
/ |
Division | 8 / 2 |
| Operator | Description | Example |
|---|---|---|
% |
Modulo (remainder) | 7 % 5 |
^ |
Exponentiation | 3 ^ 2 |
|/ |
Square root | |/9 |
||/ |
Cube root | ||/27 |
! |
Factorial | 5! |
@ |
Absolute value | @ -42 |
pg_typeof() function can be useful at
times if you need to check a data typeSQL follows your standard order of operations
If you want some other ordering, you need to use parentheses to group things accordingly, as operations in parentheses happen first
SELECT 4 ^ (9 % 2 + 1);| Function | Description |
|---|---|
sqrt(num) |
Square root |
abs(num) |
Absolute value |
factorial(num) |
Factorial of a number |
sin(num) |
Sine of radians |
sind(num) |
Sine of degrees |
asin(num) |
Inverse sine in radians |
degrees(num) |
Convert radians to degrees |
log(num) |
Base 10 logarithm |
ln(num) |
Base e logarithm |
round(num,scale) |
Round number to scale decimal places |
You may have noticed that the default column names when selecting a calculation can be nonsensical
SQL provides a way for you to name a column on the fly, using the
AS keyword
Technically forms what is commonly called an alias
Syntax:
SELECT ||| fancy calculation ||| AS |||column name|||;You can then refer to that column name later in your command if you want or need
ORDER BY or (later)
GROUP BYWHERE or (later)
HAVINGYou can do operations on entire columns at a time
Just use the column name in the calculation expression
Calculations are done on a row by row basis
Can use anywhere else you would use a calculation (selections, filters, ordering, etc.)
SELECT col1, col2, col1 - col2 FROM tablename;


NULL values
are ignored in the calculations| Function | Description |
|---|---|
avg(col) |
Finds the average or mean of a column |
sum(col) |
Computes the sum of a column |
count(*) |
Computes the number of input rows in the table |
count(col) |
Computes the number of non-null input rows in that column |
max(col) |
Finds the maximum value in a column |
min(col) |
Finds the minimum value in a column |
WHERE statementsThe median is commonly desired as it is a measure of the distribution center less affected by outliers
Postgres (and most other SQL variants) do not
have a median() function
Instead they offer more general percentile functions
Ordering matters here, so the syntax needs to be a bit different:
SELECT
percent_func(|||desired frac|||) WITHIN GROUP (ORDER BY |||column|||)
FROM |||table|||;
WITHIN GROUP functions entirely
ignore NULL valuespercentile_cont is best used for
continuous values
percentile_disc is best used for discrete
values
mode()Sometimes, especially as we continue to build sophistication in our queries, you may want to save the output of a query to another database table
This can give you a multistep process in approaching more complicated actions
Sometimes useful to use a temporary table as well by
adding the TEMP keyword
To set a table to a query output, use the
AS keyword
CREATE TEMP TABLE |||temporary table name||| AS (
SELECT |||desired columns||| FROM |||existing table|||
);