Jed Rembold
Wednesday, November 5, 2025
Which of the below phrases would match to the following regular expression:
[a-z]+-\d{2}$
~ is a case sensitive match using the
following regex~* is a case insensitive match using the
following regex! in front to negate
the search (where things do not match the regex)SELECT colname
FROM tablename
WHERE colname ~ '[a-z]*\s\d{2}';
SUBSTRING(|||text|||, |||regex|||) is
probably the most straightforward method
SELECT substring('today is November 6, 2024', '(\w+) \d{2}');
regexp_match(|||str|||, |||regex|||) is what
you want
UNNESTSELECT regexp_match(
'today is November 6, 2024',
'(\w+) \d{2}.*(\d{4})')
);
regexp_replace(|||text|||, |||re|||, |||replacement|||)
will replace the first matches of re in text with the
replacement
'g' argument on the end to do
so globally (replacing all matches)regexp_split_to_array(|||text|||, |||re|||)
will split the text into an array on each match to reregexp_split_to_table(|||text|||, |||re|||)
will split the text into a table column on each match to re. This is
just like unnesting the array.1 1/4 form)tsvectortsqueryto_tsvector will break a
string down into its component lexemes, and keep track of where each
occured in the original stringSELECT to_tsvector('I flew back to Salem on Monday');
>> 'back':3 'flew':2 'monday':7 'salem':5
to_tsvectorto_tsqueryto_tsquery takes a sequence of words with
symbols connecting them conveying relationships
& - and| - or! - not<-> - followed bySELECT to_tsquery('fly & monday')To actually complete a full text search, you ask Postgres to look
through a tsvector object for a particular
tsquery
The syntax to do so utilizes the match operator, which is two
“at” symbols: @@
SELECT |||some tsvector||| @@ |||some tsquery|||;Using @@ is a true/false search, so
the query is either found or not
WHERE as well!Individual tsvectors can not be easily
ordered, so indexing a column with tsvector
contents using the normal B-Tree method would not be effective
For this sort of content, use a Generalized Inverted Index or GIN indexing method instead
CREATE INDEX |||index name||| ON |||table|||
USING GIN(|||column|||);Seriously consider adding an index to your
tsvector column, as it can
significantly speed up these sorts of searches
ts_headline function can capture
snippets of text around a match and display them
ts_headline operates on the original
text, not the tsvector! This will absolutely make it
slower, so use it wisely!ts_headline:
StartSel/StopSel:
the delimiters that will showcase the wordMinWords/MaxWords:
the min or max number of words to show around the matchMaxFragments: a max number of fragments
to show if the match occurs multiple timests_rank will give an arbitrary rank based
on how many times your query words appear in the text
ts_rank_cd does similar, but also
considers the proximity of searched lexemesThe file here contains the
SQL commands to generate and populate a simple table
alice which hold the raw chapter contents of
the book: Alice in Wonderland. You will need to set up your own
tsvector column and index. In the same groups
as earlier, see if you can use the data to answer the following: