Writing and Analysis
Jed Rembold
July 1, 2025
Announcements
- I plan to review the data summaries by the weekend and get you
feedback
- Online Portfolio due July 15
- This is analysis and modeling month!
- Rough Draft due July 29
- I am working on getting some other faculty to visit in the later
halves of classes to consult with you on statistics or ML work
- We are halfway through the project: do you need to revisit your team
charter?
Overview
- At some point this month you are going to need to start writing
- Lots of potential ways to go about this
- Word / Google Docs
- LaTeX
- RMarkdown
- Quarto
- Most of these have their pros and cons, but I would
highly suggest working with a markup language for any
large document
- Your final document needs to be in a web HTML format, so that can
also limit options
Quarto
- If I were writing such a document, I would currently use Quarto
- You already know RMarkdown, and Quarto is essentially a superset of
RMarkdown that can work with other languages as well
- It can work seamlessly with Rmd files, with Jupyter notebooks, or
with its own Qmd files
- Editors can thus include RStudio, Jupyter notebooks, or an IDE such
as VSCode
- Basic usage mimics RMarkdown, and Quarto has excellent documentation for the many other
things you might want to do
Organization
- A little planning up-front can smooth a lot of headaches down the
line
- Some things to consider:
- This needs to be published as HTML eventually
- Keeping source materials separate from published materials helps
distinguish roles and stay organized
- How much of your analysis code do you want “baked-in” to your
document, versus how much should run independently?
- You will often want multiple people to be able to edit or
write at the same time without conflicts
Organization Recommendations
- Use GitHub Pages for HTML publishing
- Publishing from a separate gh-pages branch will keep your source
material and your published material totally separate
- May require adding preview content to .gitignore
- I would probably limit “baked-in” code to figure generation. Any
other analysis I would execute from independent scripts/notebooks.
- Break up your document into sections, which can all be included in
the final render. This facilitates being able to work on different
sections at the same time.
- Organize other content that your document needs into folders:
images, scripts, csvs, etc.
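
As a rough illustration of the recommendation to limit "baked-in" code to figure generation, the sketch below splits the work in two: an independent step summarizes the data and saves a CSV, and the document side only loads that CSV before plotting. The function names, the csvs/summary.csv path, and the toy data are all hypothetical, chosen only for illustration.

```python
import csv
import statistics
from pathlib import Path


def run_analysis(raw_rows, out_path):
    """Independent analysis step: summarize (group, value) pairs and save a CSV."""
    groups = {}
    for group, value in raw_rows:
        groups.setdefault(group, []).append(value)

    out_path = Path(out_path)
    out_path.parent.mkdir(parents=True, exist_ok=True)
    with out_path.open("w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["group", "mean", "n"])
        for group in sorted(groups):
            values = groups[group]
            writer.writerow([group, statistics.mean(values), len(values)])
    return out_path


def load_summary(path):
    """Document-side step: read the saved summary so a figure chunk can plot it."""
    with Path(path).open(newline="") as f:
        return [
            {"group": row["group"], "mean": float(row["mean"]), "n": int(row["n"])}
            for row in csv.DictReader(f)
        ]


if __name__ == "__main__":
    # Toy data standing in for a real dataset (hypothetical values).
    saved = run_analysis([("a", 1.0), ("a", 3.0), ("b", 5.0)], "csvs/summary.csv")
    print(load_summary(saved))
```

Under this arrangement, a code chunk inside the document would only call something like load_summary and draw the figure, so rendering stays fast and the heavier analysis can be rerun on its own schedule.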
Making it Happen
- Make sure you have Quarto installed and configured in your editor of
choice
- Accepting the assignment here will get you and
your partner a starting point on most of this
- A gh-pages branch has already been created, along with appropriate
.gitignore settings
- A handful of files corresponding to different sections have been
created and included in the main capstone.qmd file
- Still to do:
- In the repo settings:
- You have been given admin permission on this repo so that you can
change the repo name from the default to one that better reflects your
project. Keep in mind that this name will be part of the eventual URL.
- Under the Pages option, select the
gh-pages branch and publish from root.
- Clone locally to your system
- You need to clone here, not just grab the zip, as the zip will not
contain the other branch info and will not work for publishing
Project Contents
- There are many ways to structure one, but larger written research papers
generally include the following information:
- An introduction: what problem are you solving, why is it
interesting, and how are you approaching it?
- Background: what is the current state of research in this
area? What have others done, and how does your work fit within that
body of work? What general knowledge would a reader need to have in this
area to understand the rest of your document? Sometimes this blends with
the introduction.
- Data: how did you go about collecting your data and what
data were you eventually working with?
- Analysis: what analysis was done on the data?
- Results: what results did you arrive at?
- Conclusions: what does it all mean? What should the main
takeaways be, and where could the project be extended in the
future?
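
One way to connect this outline to the earlier advice about splitting the document into section files is sketched below. It writes one stub file per section plus a main capstone.qmd that pulls them in with Quarto's include shortcode. The section file names, the leading-underscore naming, and the title are assumptions for illustration; the files in the assignment repo may be named and arranged differently.

```python
from pathlib import Path

# Hypothetical section names taken from the outline above.
SECTIONS = ["introduction", "background", "data", "analysis", "results", "conclusions"]


def include_line(section):
    """Return a Quarto include shortcode for one section file."""
    return "{{< include _" + section + ".qmd >}}"


def scaffold(root, title="Capstone"):
    """Create content folders, section stubs, and a main file that includes them."""
    root = Path(root)
    for folder in ("images", "scripts", "csvs"):
        (root / folder).mkdir(parents=True, exist_ok=True)

    for section in SECTIONS:
        path = root / f"_{section}.qmd"
        if not path.exists():  # never overwrite a section someone has started
            path.write_text(f"# {section.title()}\n\n")

    front_matter = f'---\ntitle: "{title}"\nformat: html\n---\n\n'
    body = "\n\n".join(include_line(s) for s in SECTIONS) + "\n"
    main = root / "capstone.qmd"
    main.write_text(front_matter + body)
    return main


if __name__ == "__main__":
    print(scaffold("demo_project").read_text())
```

Because each section lives in its own file, teammates can edit different sections at the same time and rarely touch the same lines.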
Research Writing Dos
- Every figure should have a caption, and the figure should be explicitly
referenced in the main document.
- Important equations should be explained and referenced in text before
they are used anywhere in shown code.
- Always include units, whether in graphs, labels, or when talking about
values in text!
- It should go without saying, but visuals should always have labeled
axes.
- Cite your data and references. If referring to other works or publications,
make sure to give them credit. You can use BibTeX collections to
facilitate this.
- Use peers and AI to proofread your writing and suggest
improvements.
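
As a small sketch of the units and labeled-axes points, the helper below builds axis labels that always carry a unit and flags any label that lacks one. Both function names and the "(unitless)" convention are assumptions for illustration, not part of any plotting library.

```python
def axis_label(quantity, unit=None):
    """Build a label such as 'Time (s)'; unitless quantities are marked explicitly."""
    if unit is None:
        return f"{quantity} (unitless)"
    return f"{quantity} ({unit})"


def check_labels(labels):
    """Return the labels that do not end with a parenthesized unit."""
    return [label for label in labels if not ("(" in label and label.endswith(")"))]


if __name__ == "__main__":
    print(axis_label("Time", "s"))
    print(check_labels(["Time (s)", "Speed"]))
```

With matplotlib, for example, the result could be passed straight to ax.set_xlabel(axis_label("Time", "s")).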
Research Writing Don’ts
- You can show some code, but it is generally a poor idea to show
much. The code is the nitty-gritty details, while a research paper
generally tries to communicate at a much higher level. Explain what the
code did instead; flowcharts or similar visuals can be very useful for this.
Your code will be available in your GitHub repo.
- Don’t rely on screenshots or snapshots. If information is important
enough to show in the paper, it is important enough to render
properly.
- Ensure that your analysis supports your conclusions. Don’t claim
something that your data doesn’t really support. Overreaching diminishes
the other great work you did.
- You don’t need to tell the reader every wrong path and dead end you
went down! In fact, you almost assuredly shouldn’t.
- Don’t feel like you need to write sections in order. You can always
return to them later to help them blend with earlier sections.
Team work
- The rest of the evening is set aside for you to start work on your
analysis and machine-learning portions of your project!
- You have spent the last month focused on data acquisition: take some
time tonight to plan for what the coming month is going to look like.
When will you be working on what? What deadlines are you setting for
yourself?