Debugging 10 Representations

Jed Rembold

September 15, 2025

Announcements

  • Homework
    • Problem Set 2 due tonight!
      • The autochecks should be working now!
    • Problem Set 3 posted by the end of today
  • Wordle Project coming up after PSet 3
  • Polling: polling.jedrembold.prof

Today

  • Touch on debugging strategies
  • Focusing today on how a computer can internally store more complex and abstract information
    • Initially will look at numbers
    • New data type coming Wednesday though!

Purging Debugs

Debugging

If debugging is the process of removing software bugs, then programming must be the process of putting them in.

Edsger W. Dijkstra

  • Everyone makes mistakes when writing code
  • A core skill, then, is efficiently finding the bugs that you introduce
  • We’ll spend the first part of today looking at some good practices
    • As always though, practice makes perfect

Strategy #1

Concentrate on what your program IS doing, instead of what it SHOULD be doing.

  • It is impossible to find code that is missing
  • Instead focus on determining what your program is doing, or why it is behaving a certain way
  • Only once you understand what it is currently doing can you entertain thinking about how to change it productively

Strategy #2

Let Python help you: print or log the state of different variables.

  • Many errors are caused by you expecting a variable to have some content that it doesn’t have
  • Get Python to help you by adding print statements to print those variables out
  • Add print statements in blocks of code that you aren’t sure are being accessed to see if you see a message
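
A minimal sketch of this strategy. The average function here is a hypothetical example (not from the course materials); the print statements expose the running state and reveal whether a particular branch is ever reached:

```python
def average(values):
    """Returns the mean of a list of numbers (hypothetical example function)."""
    total = 0
    for v in values:
        total += v
        # Debug print: show the loop variable and running total on each pass
        print(f"DEBUG: v={v}, total={total}")
    if len(values) == 0:
        # Debug print: tells us whether this branch is ever reached
        print("DEBUG: empty-list branch reached")
        return 0
    return total / len(values)


print(average([2, 4, 6]))
print(average([]))
```

Once the bug is found, remove the debug prints (or comment them out) so they don't clutter your output.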

Strategy #3

Stop and read. The documentation. The error messages.

Parsing Error Messages

  • Start at the bottom! That is where the general type of error and a short description will show up.
  • Want to know where it happened? Look one line up from that.
    • Will show a copy of the line where the error occurred
    • One line up from that will include the line number
  • Want nicer error messages?
    • The rich library offers some very pretty error messages: install with pip install rich

    • At the top of your code, then include:

      from rich.traceback import install
      install(show_locals=True)

Strategy #4

Use PythonTutor or a debugger to track EXACTLY what is happening in your program.

Strategy #5

Don’t make random changes to your code in the hopes that it will miraculously start working.

  • Making random changes is easy, fast, and doesn’t require a lot of thought
  • Unfortunately it is, at best, a wildly inefficient method of debugging, and at worst, actively detrimental
  • If you don’t know what you need to fix yet, you either haven’t:
    • Defined what you are attempting to do clearly enough, or
    • Understood / tracked your program well enough to know what it is currently doing

Strategy #6

Talk it out.

  • Explaining things verbally, in plain English, uncovers a shocking number of misconceptions or mistakes
  • Find someone to talk at about your programming issues
    • It isn’t even important that they understand how to code, or even can talk back to you (though that might help in some cases)
    • Rubber Duck Debugging is where a software developer explains an issue out loud to an inanimate rubber duck

Strategy #7

Test your code as you go! Either manually or automatically.

  • Know that everyone makes mistakes. The longer you go without testing that something in your program works, the harder it is to find where the mistake eventually is.
  • Write code that you test in small pieces as you go
    • Decomposition into smaller functions is great for this: test each function individually as you go
    • In the projects we try to construct the Milestones for this exact same purpose

Assertions

  • You can use Python’s assert statement to write test functions, which take the form:

    assert condition

    where condition is any expression that evaluates to True or False

  • Assert statements “expect” a condition to yield a True, and if they do, nothing happens

    • No news is good news in this case
  • If an assert condition evaluates to False, an error is raised

  • Naming your test functions starting with the word test_ will make them automatically discoverable by other tools

Testing Example

  • Suppose we wanted to write some checks of the greatest_factor function from last class

    def test_greatest_factor():
        """Runs several tests on greatest_factor"""
        assert greatest_factor(10) == 5
        assert greatest_factor(7) == 1
        assert greatest_factor(51) == 17
        assert greatest_factor(9) == 3
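
To run a test like this on its own, the function being tested has to exist. The greatest_factor below is only an assumed stand-in (the version from class may well differ); it is here just so the sketch is self-contained:

```python
def greatest_factor(n):
    """Returns the largest factor of n that is smaller than n.

    Assumed stand-in for the function written in class.
    """
    # Count down from n // 2, since no proper factor can be larger than that
    for candidate in range(n // 2, 0, -1):
        if n % candidate == 0:
            return candidate
    return 1  # fallback for n = 1, where the loop never runs


def test_greatest_factor():
    """Runs several tests on greatest_factor"""
    assert greatest_factor(10) == 5
    assert greatest_factor(7) == 1
    assert greatest_factor(51) == 17
    assert greatest_factor(9) == 3


# If every assertion passes, this call prints nothing at all
test_greatest_factor()
```

If any of the conditions were False, Python would stop at that line with an AssertionError, pointing you at the failing check.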

Binary Bits

Bit Power

  • The fundamental unit of memory inside a computer is called a bit
    • Coined from a contraction of the words binary and digit
  • An individual bit exists in one of two states, usually denoted as 0 or 1.
  • More complex data can be represented by combining larger numbers of bits:
    • Two bits can represent 4 (\(2\times 2\)) values
    • Three bits can represent 8 (\(2\times2\times2\)) values
    • Four bits can represent 16 (\(2\times2\times2\times2\) or \(2^4\)) values, etc
  • My laptop here has 16GB of system memory, and can therefore keep track of approximately \(2^{16\times 1,000,000,000 \times 8}\) states!
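
A quick sketch of the counting above. The helper name num_values is made up for this illustration, and 1 GB is taken to be 1,000,000,000 bytes as in the estimate on the slide:

```python
import math


def num_values(bits):
    """Number of distinct values representable with the given number of bits."""
    return 2 ** bits


for bits in range(1, 5):
    print(bits, "bits ->", num_values(bits), "values")

# 16 GB of memory, at 1,000,000,000 bytes per GB and 8 bits per byte
memory_bits = 16 * 1_000_000_000 * 8

# 2 ** memory_bits is far too large to print, so estimate how many
# decimal digits it would take to write it out instead
decimal_digits = int(memory_bits * math.log10(2)) + 1
print("Bits of memory:", memory_bits)
print("Digits needed to write the number of states:", decimal_digits)
```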

An old code, but it checks out

  • Binary notation is an old idea
    • Described by German mathematician Leibniz back in 1703
  • Leibniz describes his use of binary notation in an easy-to-follow style
  • Leibniz’s paper notes that the Chinese had discovered binary arithmetic 2000 years earlier, as illustrated by the patterns of lines in the I Ching!
Gottfried Wilhelm von Leibniz

Back to Grade school

[Figure: decimal place values 1, 10, 100, using the 10 digits 0-9]

Now in Binary

[Figure: binary place values 1, 2, 4, 8, using the 2 digits 0 and 1; the value is the sum of each digit times its place value]

Representing Integers

  • The number of symbols available to count with determines the base of a number system
    • Decimal is base 10, as we have 10 symbols (0-9) to count with
      • Each new digit position counts for 10 times as much as the previous
    • Binary is base 2, as we only have 2 symbols (0 and 1) to count with
      • Each new digit position counts for twice as much as the previous
  • Can always determine what number a representation corresponds to by adding up the individual contributions
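
Adding up the individual contributions can be sketched in a few lines. The function name to_number is an assumption for this illustration, and it only handles bases up to 10 since it relies on the digits 0-9:

```python
def to_number(digits, base):
    """Adds up each digit's contribution: digit * base ** position."""
    total = 0
    # Reverse the string so that position 0 is the rightmost digit
    for position, digit in enumerate(reversed(digits)):
        total += int(digit) * base ** position
    return total


print(to_number("1000", 2))      # 8
print(to_number("00101010", 2))  # 42
print(to_number("42", 10))       # 42
```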

Specifying Bases

  • So the binary number 00101010 is equivalent to the decimal number 42
  • We distinguish by subscripting the bases: \[ 00101010_2 = 42_{10}\]
  • The number itself still is the same! All that changes is how we represent that number.
    • Numbers do not have bases – representations of numbers have bases
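
Python makes the same point: a binary literal and a decimal literal are just two ways of writing the same integer. A short sketch using only built-in functions:

```python
a = 0b00101010   # the 0b prefix marks a binary literal
b = 42           # an ordinary decimal literal

print(a == b)                # True: same number, different representation
print(bin(42))               # 0b101010
print(format(42, "08b"))     # 00101010 (padded to 8 binary digits)
print(int("00101010", 2))    # 42
```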

Understanding Check

How could you represent the number of items shown to the right in a binary representation?

  1. 1011
  2. 10110
  3. 10010
  4. 11010

Binary Clocks

Other Bases

  • Binary is not a particularly compact representation to write out, so computer scientists will often use more compact representations as well
    • Octal (base 8) uses the digits 0 to 7
    • Hexadecimal (base 16) uses the digits 0 to 9 and then the letters A through F


  • Why octal or hexadecimal over our trusty old decimal system?
    • Both bases are powers of 2, so it is easy to convert to and from binary
      • 1 octal digit = 3 binary digits, 1 hex digit = 4 binary digits
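
The grouping trick can be sketched as follows. The function regroup and the sample bit pattern are made up for this illustration; Python's built-in oct and hex do the same conversion directly:

```python
def regroup(bits, group_size):
    """Converts a binary string to base 2 ** group_size by grouping its bits."""
    symbols = "0123456789ABCDEF"
    # Pad with zeros on the left so the length is a multiple of group_size
    padded_length = -(-len(bits) // group_size) * group_size
    bits = bits.zfill(padded_length)
    result = ""
    for i in range(0, len(bits), group_size):
        group = bits[i:i + group_size]
        # Each group of bits becomes exactly one digit in the new base
        result += symbols[int(group, 2)]
    return result


print(regroup("11111010", 3))   # 372  (groups: 011 111 010)
print(regroup("11111010", 4))   # FA   (groups: 1111 1010)

# The built-ins agree
print(oct(0b11111010))          # 0o372
print(hex(0b11111010))          # 0xfa
```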

Base(ic) Practice

  • The Java compiler has a fun quirk where every binary file it produces begins with


  • What is this in decimal? octal? hexadecimal?

Representation Matters

Representation

  • Sequences of bits have no intrinsic meaning!
    • Just the representations we assign to them by convention or by building certain operations into hardware
    • A 32-bit sequence represents an integer only because we have designed hardware to manipulate those sequences arithmetically: applying operations like addition, subtraction, etc
  • By choosing an appropriate representation, you can use bits to represent any value you could imagine!
    • Characters represented by numeric character codes
    • Floating-point representations to support real numbers
    • Two-dimensional arrays of bits representing images
  • To be useful though, everyone needs to agree on a representation!
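
As a rough illustration, the sketch below reads one 32-bit pattern three different ways. The particular bytes are an arbitrary choice for this example; the struct module is part of Python's standard library:

```python
import struct

# Four bytes of raw data (an arbitrary pattern chosen for this illustration)
raw = bytes([0x42, 0x28, 0x00, 0x00])

# The same 32 bits written out as 0s and 1s
bits = "".join(format(byte, "08b") for byte in raw)
print(bits)

# Interpretation 1: an unsigned 32-bit integer, most significant byte first
as_integer = int.from_bytes(raw, "big")
print(as_integer)

# Interpretation 2: a 32-bit floating-point number in the IEEE 754 format
as_float = struct.unpack(">f", raw)[0]
print(as_float)

# Interpretation 3: the first two bytes as ASCII character codes
as_text = raw[:2].decode("ascii")
print(as_text)
```

Nothing about the bits changed between the three print statements; only the convention used to read them did.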

Representation Pitfalls

  • How we choose to represent values has consequences!
  • Python represents floating point (fractional) numbers using two integers
    • One to represent the significant digits
    • One to represent the exponent (where the decimal place is)
  • \(1\frac{1}{4}\) Example
    • In decimal: \(\quad\displaystyle 1\frac{1}{4} = \frac{1}{1} + \frac{2}{10} + \frac{5}{100} = 1.25 = (125, -2)\)
    • In binary: \(\quad\displaystyle 1\frac{1}{4} = \frac{1}{1} + \frac{0}{2} + \frac{1}{4} = 1.01 = (101, -10)\)
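
The two-integer picture can be recovered from a Python float with the built-in as_integer_ratio method. This is a simplified sketch: real floats use the fixed-width IEEE 754 layout, and the helper name significand_and_exponent is made up for this example:

```python
def significand_and_exponent(value):
    """Splits a float into (significand, exponent) so that
    value == significand * 2 ** exponent."""
    numerator, denominator = value.as_integer_ratio()
    # For a float the denominator is always a power of 2,
    # so its bit length tells us the (negative) exponent
    exponent = -(denominator.bit_length() - 1)
    return numerator, exponent


sig, exp = significand_and_exponent(1.25)
print(bin(sig), exp)     # 0b101 -2   (and -2 is -10 in binary)
print(sig * 2 ** exp)    # 1.25
```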

Floating Binary

  • Say we wanted to convert the value \(\tfrac{7}{8}\) to a binary floating point representation: \[\frac{7}{8} = \frac{0}{1} + \frac{1}{2} + \frac{1}{4} + \frac{1}{8} = 0.111 = (111, -11)\]
  • Now how would we convert \(\frac{1}{10}\) to binary??
    • We run into a problem! An infinitely repeating sequence! \[\frac{1}{10} = \frac{0}{1} + \frac{0}{2} + \frac{0}{4} + \frac{0}{8} + \frac{1}{16} + \frac{1}{32} + \frac{0}{64} + \frac{0}{128} + \frac{1}{256} + \cdots = 0.0001100110011\ldots\]
    • Have to stop the sequence somewhere and approximate it: \[\frac{3}{32} = 0.09375\quad\text{or}\quad\frac{25}{256} = 0.09765625\]
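
The binary digits can be generated by repeated doubling: double the value, record whether it reached 1, and carry on with whatever is left over. A minimal sketch using exact fractions (the function name binary_fraction_digits is made up for this example):

```python
from fractions import Fraction


def binary_fraction_digits(value, count):
    """Returns the first `count` binary digits after the point for 0 <= value < 1."""
    digits = ""
    for _ in range(count):
        value *= 2
        if value >= 1:
            digits += "1"
            value -= 1   # keep only the part still left to represent
        else:
            digits += "0"
    return digits


print("0." + binary_fraction_digits(Fraction(7, 8), 3))     # 0.111
print("0." + binary_fraction_digits(Fraction(1, 10), 13))   # 0.0001100110011

# Stopping early gives the approximations from the slide
print(float(Fraction(3, 32)), float(Fraction(25, 256)))     # 0.09375 0.09765625
```

For 7/8 the leftover value reaches zero after three digits, while for 1/10 it cycles forever, which is why the digits repeat.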

Consequences

  • The best approximation to \(\frac{1}{10}\) we can manage within the range of normal integers is: \[\frac{3602879701896397}{2^{55}} = 0.10000000000000000555111512312578270\ldots\]
  • When Python operates on or prints these numbers, the extra digits often get rounded off, making the value look exact. A tiny rounding error can still be hiding in any floating point value, though.
  • So be careful using == for floating numeric comparisons! Rounding might result in unexpected falsehoods
    • 0.1 + 0.1 + 0.1 != 0.3
    • Far better to check if two numbers are within a small margin of one another, or greater or less than the other
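
A short sketch of the safer comparison, using math.isclose from the standard library. The tolerance of 1e-9 in the manual check is an arbitrary choice for this example:

```python
import math
from decimal import Decimal

total = 0.1 + 0.1 + 0.1

print(total == 0.3)               # False: the rounding errors don't match
print(math.isclose(total, 0.3))   # True: equal within a small relative tolerance
print(abs(total - 0.3) < 1e-9)    # True: the same idea, written by hand

# The exact value Python actually stores for 0.1
print((0.1).as_integer_ratio())   # (3602879701896397, 36028797018963968)
print(Decimal(0.1))
```

The denominator printed by as_integer_ratio is exactly 2 to the 55th power, matching the fraction shown above.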