Chapter 16: The Portfolio
Learning Objectives
- Understand the advantages of reproducible reports over traditional documents
- Create Quarto documents integrating code, analysis, and narrative
- Structure research reports following academic conventions
- Generate tables and figures automatically from code
- Manage citations using BibTeX
- Render documents to multiple formats (PDF, HTML, Word)
- Implement the “one-click report” workflow for reproducibility
- Debug common Quarto rendering issues
You’ve completed your research. You have:
- A research question grounded in theory
- 200 songs systematically coded using a pilot-tested codebook
- Cleaned data in R
- Visualizations showing patterns
- Statistical tests quantifying relationships
- Effect sizes contextualizing findings
- An honest interpretation acknowledging limitations
Now you need to package it all into a research report.
You could, theoretically, write everything in Microsoft Word. Copy-paste tables from R. Save figures as images and insert them manually. Type your results section by hand. Update the tables every time your data changes. Re-export every figure when you adjust the color scheme.
But this workflow is fragile, tedious, and error-prone.
Quarto solves this by integrating code, analysis, and narrative into a single reproducible document. Change your data? Re-render, and every table, figure, and statistic updates. Adjust a visualization? The next render picks it up. Add a citation? The bibliography updates. Everything is scripted, documented, and reproducible.
This chapter teaches you to create professional research reports that embody the reproducibility principles you’ve practiced throughout this book.
What is Quarto?
Quarto is a scientific publishing system that turns plain text files into polished documents: PDFs, HTML web pages, Word documents, presentations, even books.
The magic: You write in Markdown (simple formatting syntax), embed R code chunks for analysis, and Quarto combines them into a final document with:
- Automatic table and figure numbering
- Citations and references
- Formatted code output
- Professional styling
The advantage over Word:
- Reproducible: Rerun the entire analysis with one click
- Version-controllable: Track changes with Git
- Automated: Tables and figures update when code changes
- Multi-format: Same source file → PDF, HTML, or Word
The learning curve: Higher than Word initially, but pays off quickly for research projects.
Installing Quarto
Already have R and RStudio? You’re almost ready.
Install Quarto:
1. Download from quarto.org
2. Install for your operating system
3. Restart RStudio
Check installation:
Open RStudio, go to File → New File → Quarto Document. If the option appears, you’re ready.
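The same check can be sketched from the R console. This assumes the `quarto` command-line tool is on your system PATH; RStudio also bundles its own copy of Quarto, so an empty result here does not necessarily mean rendering will fail inside RStudio.

```r
# Look for the Quarto command-line tool on the system PATH (base R only)
quarto_bin <- unname(Sys.which("quarto"))

if (nzchar(quarto_bin)) {
  # Ask Quarto to report its version number
  quarto_version <- system2(quarto_bin, "--version", stdout = TRUE)
  message("Quarto found: version ", paste(quarto_version, collapse = " "))
} else {
  message("Quarto not found on PATH; check the installation or use RStudio's bundled copy.")
}
```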
Anatomy of a Quarto Document
A Quarto document (.qmd file) has three components:
1. YAML Header (Metadata)
At the top of the file, between --- markers:
---
title: "Lyric Sentiment and Chart Performance"
author: "Your Name"
date: "2026-02-10"
format: pdf
---
This controls document properties: title, author, output format, styling.
2. Markdown Text (Narrative)
Plain text with simple formatting:
# Introduction
Popular music has increasingly embraced negative emotional themes. This study
examines whether lyric sentiment relates to chart performance.
## Research Question
**RQ1**: Do songs with negative lyric sentiment achieve higher chart positions
than songs with positive sentiment?
Markdown syntax:
- # Heading 1, ## Heading 2, ### Heading 3
- **bold**, *italic*
- - bullet point
- [link text](url)
3. Code Chunks (Analysis)
R code embedded in special blocks:
```{r}
#| label: load-data
#| message: false
library(tidyverse)
music_data <- read_csv("data_clean/final_unified_dataset.csv")
```
Code chunk options (after #|):
- label: Names the chunk (useful for debugging)
- message: false: Suppresses messages when loading packages
- echo: false: Hides code, shows only output
- warning: false: Suppresses warnings
Your First Quarto Report
Create a new Quarto document:
File → New File → Quarto Document → Fill in title and author → OK
Default template appears:
---
title: "Untitled"
format: html
---
## Quarto
Quarto enables you to weave together content and executable code into a
finished document...
Click “Render” (the blue arrow at the top of the editor).
RStudio runs the code, combines it with text, and produces an HTML file that opens in a browser.
Congratulations: You just created a reproducible document.
Building a Research Report
Let’s create a complete research report structure.
Basic Structure
---
title: "Lyric Sentiment and Chart Performance in Popular Music"
author: "Your Name"
date: "2026-02-10"
format:
  pdf:
    toc: true
    number-sections: true
bibliography: references.bib
---
# Abstract
This study examined the relationship between lyric sentiment and chart
performance using systematic content analysis of 200 Billboard Hot 100
songs (2015-2024). Results indicate...
# Introduction
[Background and context]
# Literature Review
[Theory and prior research]
# Method
## Sample
## Coding Procedure
## Reliability
# Results
# Discussion
# References
::: {#refs}
:::
Key YAML options:
- toc: true: Creates table of contents
- number-sections: true: Automatic section numbering
- bibliography: references.bib: Specifies citation file
Loading and Preparing Data
Create a setup chunk at the beginning:
```{r}
#| label: setup
#| include: false
# Load packages
library(tidyverse)
library(knitr)
library(kableExtra)
# Load cleaned data
music_data <- read_csv("data_clean/final_unified_dataset.csv")
# Set global chunk options
knitr::opts_chunk$set(
  echo = FALSE,     # Don't show code by default
  warning = FALSE,  # Suppress warnings
  message = FALSE,  # Suppress messages
  fig.width = 7,    # Default figure width
  fig.height = 5    # Default figure height
)
```
include: false means this entire chunk is hidden: no code, no output. It just runs silently.
Creating Tables
Descriptive Statistics Table
# Results
## Sample Characteristics
```{r}
#| label: tbl-descriptives
#| tbl-cap: "Descriptive Statistics for Music Dataset"
music_data %>%
  group_by(Lyric_Sentiment) %>%
  summarize(
    N = n(),
    `Mean Tempo` = round(mean(Tempo, na.rm = TRUE), 1),
    `SD Tempo` = round(sd(Tempo, na.rm = TRUE), 1),
    `Mean Chart Peak` = round(mean(Max_Rank, na.rm = TRUE), 1)
  ) %>%
  kable(
    col.names = c("Sentiment", "N", "Mean Tempo (BPM)", "SD", "Mean Chart Peak"),
    align = "lcccc"
  ) %>%
  kable_styling(bootstrap_options = c("striped", "hover"))
```
Output (in rendered document):
Table 1: Descriptive Statistics for Music Dataset
| Sentiment | N | Mean Tempo (BPM) | SD | Mean Chart Peak |
|---|---|---|---|---|
| Positive | 50 | 119.2 | 22.3 | 46.5 |
| Negative | 50 | 128.5 | 24.1 | 32.4 |
| Neutral | 30 | 115.8 | 20.9 | 51.2 |
| Mixed | 50 | 123.6 | 23.7 | 38.9 |
tbl-cap creates the table caption. Quarto automatically numbers tables.
Reference the table in text: “As shown in @tbl-descriptives, negative sentiment songs had higher average tempo…”
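Before trusting a descriptive table, it can help to confirm that the group sizes add up to the number of rows in the data, since songs with a missing sentiment code can drop out along the way. A minimal base R sketch, using a small made-up data frame (the values are illustrative, not the chapter's data):

```r
# Made-up illustrative data: 6 songs, one with a missing sentiment code
toy_songs <- data.frame(
  Lyric_Sentiment = c("Positive", "Negative", "Negative", NA, "Mixed", "Positive"),
  Tempo = c(118, 131, 126, 110, 122, 120)
)

# Count songs per category; useNA = "ifany" keeps missing codes visible
group_n <- table(toy_songs$Lyric_Sentiment, useNA = "ifany")

# Group sizes should account for every row in the data
stopifnot(sum(group_n) == nrow(toy_songs))

# Number of rows that would silently drop out if NA codes were excluded
n_missing <- sum(is.na(toy_songs$Lyric_Sentiment))
print(group_n)
```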
Contingency Table
```{r}
#| label: tbl-contingency
#| tbl-cap: "Sentiment by Top-10 Chart Status"
music_data %>%
  mutate(top10 = if_else(Max_Rank <= 10, "Top 10", "Not Top 10")) %>%
  count(Lyric_Sentiment, top10) %>%
  pivot_wider(names_from = top10, values_from = n, values_fill = 0) %>%
  kable(
    col.names = c("Sentiment", "Not Top 10", "Top 10")
  ) %>%
  kable_styling()
```
Creating Figures
Bar Chart
```{r}
#| label: fig-sentiment-dist
#| fig-cap: "Distribution of Lyric Sentiment Categories"
#| fig-width: 6
#| fig-height: 4
ggplot(music_data, aes(x = Lyric_Sentiment, fill = Lyric_Sentiment)) +
  geom_bar() +
  scale_fill_manual(values = c("Positive" = "#2E86AB",
                               "Negative" = "#A23B72",
                               "Neutral" = "#F18F01",
                               "Mixed" = "#C73E1D")) +
  labs(
    x = "Lyric Sentiment",
    y = "Count"
  ) +
  theme_minimal() +
  theme(legend.position = "none")
```
fig-cap creates the figure caption. Quarto automatically numbers figures.
Reference in text: “@fig-sentiment-dist shows the distribution of sentiment categories in our sample.” Quarto replaces the reference with “Figure” and the figure number when it renders.
Scatterplot with Regression Line
```{r}
#| label: fig-tempo-chart
#| fig-cap: "Relationship Between Tempo and Chart Performance by Sentiment"
#| fig-width: 8
#| fig-height: 5
ggplot(music_data, aes(x = Tempo, y = Max_Rank, color = Lyric_Sentiment)) +
  geom_point(alpha = 0.6, size = 2) +
  geom_smooth(method = "lm", se = FALSE, linewidth = 0.8) +
  scale_y_reverse() +
  scale_color_manual(
    values = c("Positive" = "#2E86AB", "Negative" = "#A23B72",
               "Neutral" = "#F18F01", "Mixed" = "#C73E1D"),
    name = "Lyric Sentiment"
  ) +
  labs(
    x = "Tempo (BPM)",
    y = "Peak Chart Position (lower = better)"
  ) +
  theme_minimal() +
  theme(legend.position = "bottom")
```
Reporting Statistical Results
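You can embed R code inline within sentences. The example below assumes music_data already has a top10 column; the mutate() call in the contingency table chunk creates it only inside that pipeline, so assign the result back to music_data first.
A chi-square test revealed a significant relationship between sentiment and
top-10 status, χ²(`r chisq.test(music_data$Lyric_Sentiment, music_data$top10)$parameter`)
= `r round(chisq.test(music_data$Lyric_Sentiment, music_data$top10)$statistic, 2)`,
*p* = `r round(chisq.test(music_data$Lyric_Sentiment, music_data$top10)$p.value, 3)`.
Renders as: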
“A chi-square test revealed a significant relationship between sentiment and top-10 status, χ²(3) = 8.42, p = 0.038.”
The advantage: If you add more data and rerun, all statistics update automatically.
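One practical wrinkle with inline statistics is that `round(p, 3)` prints a very small p-value as 0. The sketch below shows one way to pull the pieces out of a chi-square result and format them for reporting. The counts are made up for illustration (they are not the chapter's data), and `format_p()` is a hypothetical helper, not part of any package.

```r
# Made-up counts for illustration only: sentiment (rows) by chart status (columns)
toy_counts <- matrix(
  c(40, 10,
    30, 20,
    25,  5,
    35, 15),
  nrow = 4, byrow = TRUE,
  dimnames = list(
    Sentiment = c("Positive", "Negative", "Neutral", "Mixed"),
    Status = c("Not Top 10", "Top 10")
  )
)

chi_result <- chisq.test(toy_counts)

# Pull out the pieces needed for reporting
chi_stat <- unname(chi_result$statistic)
chi_df <- unname(chi_result$parameter)
chi_p <- chi_result$p.value

# Cramér's V by hand: sqrt(chi-square / (N * (min(rows, cols) - 1)))
cramers_v <- sqrt(chi_stat / (sum(toy_counts) * (min(dim(toy_counts)) - 1)))

# Hypothetical helper: APA-style p-value without a leading zero
format_p <- function(p) {
  if (p < 0.001) "< .001" else paste("=", sub("^0", "", sprintf("%.3f", p)))
}

report <- sprintf("chi-square(%d) = %.2f, p %s, Cramer's V = %.2f",
                  as.integer(chi_df), chi_stat, format_p(chi_p), cramers_v)
print(report)
```

In a report, the same helper could be called from an inline expression, so that a tiny p-value reads "< .001" rather than "= 0".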
Creating a Results Code Chunk
Store test results once, reference multiple times:
```{r}
#| label: stats-tests
#| include: false
# Add the top-10 indicator to the data so the test can use it
music_data <- music_data %>%
  mutate(top10 = if_else(Max_Rank <= 10, "Top 10", "Not Top 10"))
# Run chi-square test
chi_result <- chisq.test(music_data$Lyric_Sentiment, music_data$top10)
# Extract values
chi_stat <- round(chi_result$statistic, 2)
chi_df <- chi_result$parameter
chi_p <- round(chi_result$p.value, 3)
# Calculate Cramér's V
library(rcompanion)
cramers_v <- round(cramerV(music_data$Lyric_Sentiment, music_data$top10), 2)
```
Then reference in text:
χ²(`r chi_df`) = `r chi_stat`, *p* = `r chi_p`, Cramér's V = `r cramers_v`
Managing Citations
Creating a Bibliography File
Create references.bib in your project folder:
@article{Wickham2014,
  author = {Wickham, Hadley},
  title = {Tidy Data},
  journal = {Journal of Statistical Software},
  year = {2014},
  volume = {59},
  number = {10},
  pages = {1--23}
}
@book{Cohen1988,
  author = {Cohen, Jacob},
  title = {Statistical Power Analysis for the Behavioral Sciences},
  publisher = {Routledge},
  year = {1988},
  edition = {2nd}
}
Citing in Text
Data were organized following tidy data principles [@Wickham2014].
Effect sizes were interpreted using Cohen's benchmarks [@Cohen1988].
Multiple citations can be grouped [@Wickham2014; @Cohen1988].
Renders as:
“Data were organized following tidy data principles (Wickham, 2014). Effect sizes were interpreted using Cohen’s benchmarks (Cohen, 1988).”
Automatic Bibliography
At the end of your document:
# References
::: {#refs}
:::
Quarto automatically generates a formatted reference list from all citations in the document.
Getting BibTeX entries:
- Google Scholar: Click “Cite” → “BibTeX”
- Zotero: Right-click entry → Export → BibTeX
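R can also produce BibTeX entries for itself and for installed packages, which is handy when citing your software. A minimal sketch using base R's `citation()` and `toBibtex()`; the output file name is an assumption, and the generated entry may come without a citation key, so check it and add one before citing it.

```r
# BibTeX entry for R itself, as a character vector of lines
r_bib <- as.character(toBibtex(citation()))

# Written to a temporary file here (assumed name);
# in a project you would paste or append the entry into references.bib
bib_path <- file.path(tempdir(), "r-citation.bib")
writeLines(r_bib, bib_path)

# Inspect the entry
cat(r_bib, sep = "\n")
```

Passing a package name, as in `citation("knitr")`, gives the entry for that package, provided it is installed.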
Rendering to Different Formats
PDF Output
---
title: "My Report"
format:
  pdf:
    documentclass: article
    fontsize: 12pt
    geometry: margin=1in
    toc: true
    number-sections: true
---
Requires: LaTeX installed on your system
- Mac: MacTeX
- Windows: MiKTeX or TinyTeX
- Linux: TeX Live
Quick LaTeX install (from R):
install.packages("tinytex")
tinytex::install_tinytex()
HTML Output
---
title: "My Report"
format:
  html:
    toc: true
    toc-location: left
    code-fold: true
    theme: cosmo
---
Advantages:
- No LaTeX required
- Interactive elements possible
- Easy to share (add embed-resources: true under html to produce a single self-contained HTML file)
code-fold: true lets readers click to see code if interested.
Word Output
---
title: "My Report"
format:
  docx:
    toc: true
    number-sections: true
    reference-doc: custom-template.docx
---
Use when: Collaborators need to edit in Word
Custom template: Create a Word document with your preferred styles, save as custom-template.docx, and Quarto applies those styles.
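Rendering can also be scripted rather than clicked, which is convenient when you need several formats at once. The sketch below builds calls to the Quarto command-line tool (`quarto render <file> --to <format>`) from R. The file name `report.qmd` and the helper `render_args()` are assumptions for illustration, and the loop only runs when both Quarto and the file are actually present.

```r
# Hypothetical helper: build the arguments for `quarto render`
render_args <- function(input, format) {
  c("render", input, "--to", format)
}

input_file <- "report.qmd"            # assumed file name
formats <- c("html", "pdf", "docx")
quarto_bin <- unname(Sys.which("quarto"))

if (nzchar(quarto_bin) && file.exists(input_file)) {
  for (fmt in formats) {
    # system2() returns the exit status; 0 means the render succeeded
    status <- system2(quarto_bin, render_args(input_file, fmt))
    if (status != 0) warning("Rendering to ", fmt, " failed")
  }
} else {
  message("Skipping render: Quarto or ", input_file, " not found.")
}
```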
The One-Click Report Workflow
The power of Quarto is full reproducibility:
- Someone sends you new data
- You replace the old CSV with the new one
- You click “Render”
- Every table, every figure, every statistic updates automatically
No manual copying. No stale numbers. No version confusion.
This is what reproducible research looks like.
Example Workflow
Monday: Analyze 200 songs, render report
Wednesday: Advisor says “Can you add 50 more songs?”
Thursday: You code 50 songs, append to CSV
Friday: Click “Render” → entire report updates with n=250 statistics
Traditional workflow: Manually update every table, re-export every figure, retype every statistic. Hours of tedious work. High error risk.
Quarto workflow: Click one button. Within seconds to minutes, depending on the analysis, every number in the report matches the current data.
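The Thursday step, appending new rows to the CSV, is where mismatched columns tend to creep in. A minimal sketch of a guarded append, using made-up rows and a hypothetical helper `append_songs()`; the `Song_ID` column is an assumption for illustration and may not match your codebook.

```r
# Hypothetical helper: append newly coded songs only if the columns line up
append_songs <- function(existing, new_rows) {
  if (!identical(names(existing), names(new_rows))) {
    stop("Column names differ between the existing data and the new rows")
  }
  combined <- rbind(existing, new_rows)
  # Flag songs that were coded twice (assumes a Song_ID column)
  if (anyDuplicated(combined$Song_ID) > 0) {
    warning("Duplicate Song_ID values found after appending")
  }
  combined
}

# Made-up illustrative rows, not the chapter's data
existing <- data.frame(Song_ID = 1:3,
                       Lyric_Sentiment = c("Positive", "Negative", "Mixed"),
                       Max_Rank = c(12, 3, 40))
new_rows <- data.frame(Song_ID = 4:5,
                       Lyric_Sentiment = c("Neutral", "Negative"),
                       Max_Rank = c(55, 8))

updated <- append_songs(existing, new_rows)
print(nrow(updated))
```

After a check like this passes, the combined data can be written back to the CSV that the report reads, and the report rendered again.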
Advanced Features
Figure Panels (Subfigures)
```{r}
#| label: fig-panels
#| fig-cap: "Tempo and Chart Performance by Sentiment"
#| fig-subcap:
#|   - "Boxplot comparison"
#|   - "Scatterplot with trend lines"
#| layout-ncol: 2
# Panel A
ggplot(music_data, aes(x = Lyric_Sentiment, y = Tempo)) +
  geom_boxplot() +
  theme_minimal()
# Panel B
ggplot(music_data, aes(x = Tempo, y = Max_Rank, color = Lyric_Sentiment)) +
  geom_point() +
  geom_smooth(method = "lm", se = FALSE) +
  theme_minimal()
```
Creates Figure 1 with panels (a) and (b).
Cross-References
As shown in @tbl-descriptives, the mean tempo varied by sentiment.
This pattern is visualized in @fig-panels (panel b).
## Method {#sec-method}
Our coding procedure (see @sec-method) ensured high reliability...
Section references: Use {#sec-label} to label sections, then @sec-label to reference.
Conditional Content
```{r}
show_detailed_analysis <- FALSE
```
```{r}
#| eval: !expr show_detailed_analysis
# This chunk only runs if show_detailed_analysis is TRUE
# Useful for supplementary analyses you might want to toggle
```
Common Issues and Solutions
Issue 1: “LaTeX Error”
Symptom: PDF rendering fails with LaTeX error
Solution:
- Install TinyTeX: tinytex::install_tinytex()
- Or switch to HTML output temporarily
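Before reinstalling anything, it can help to see whether R can find a LaTeX engine at all. A small base R sketch; the three engine names are common ones, and a TinyTeX installation may live outside the PATH that this check searches, so treat an empty result as a hint rather than proof.

```r
# Common LaTeX engines to look for on the system PATH
latex_engines <- c("pdflatex", "xelatex", "lualatex")

# Sys.which() returns "" for programs it cannot find
engine_paths <- Sys.which(latex_engines)
available <- names(engine_paths)[nzchar(engine_paths)]

if (length(available) == 0) {
  message("No LaTeX engine found on PATH.")
} else {
  message("LaTeX engines found: ", paste(available, collapse = ", "))
}
```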
Issue 2: “Object not found”
Symptom: Error says variable doesn’t exist
Cause: Code chunks run in order. If you reference music_data before loading it, it fails.
Solution: Ensure the data is loaded in an early chunk, before any analysis chunks.
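One way to make this failure easier to diagnose is to fail early with a clear message. A minimal sketch; `require_object()` is a hypothetical helper, not a knitr or Quarto function, and the small data frame is a stand-in for the real data.

```r
# Hypothetical helper: stop with a readable message if an object is missing
require_object <- function(name, envir = parent.frame()) {
  if (!exists(name, envir = envir)) {
    stop("'", name, "' not found: make sure the chunk that creates it runs first")
  }
  invisible(TRUE)
}

# Stand-in object defined for this demonstration only
music_data <- data.frame(Tempo = c(120, 98))

# Passes silently because music_data exists at this point
require_object("music_data")
```

Placing a call like this at the top of an analysis chunk turns a cryptic error deep in a pipeline into a message that names the missing object.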
Issue 3: Figures don’t appear
Symptom: Blank space where figure should be
Solutions:
- Check fig-width and fig-height aren’t too large
- Check the chunk doesn’t set eval: false or include: false, which suppress the plot (echo: false hides only the code)
- Verify the ggplot code actually produces a plot when run in console
Issue 4: Citations don’t work
Symptom: @Author2020 appears as literal text, not formatted citation
Solutions:
- Verify bibliography: references.bib in YAML
- Check .bib file is in same folder as .qmd
- Ensure citation keys match exactly (case-sensitive)
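The key-matching check can be sketched in base R by comparing the keys cited in the document against the keys defined in the .bib file. The input lines below are made up for illustration (note the deliberately mis-cased `cohen1988`); in practice you would read both files with `readLines()`. The regular expressions are a simplification and will not cover every valid citation syntax.

```r
# Made-up example inputs; in practice use readLines() on your own files
qmd_lines <- c(
  "Data were organized following tidy data principles [@Wickham2014].",
  "Effect sizes were interpreted using Cohen's benchmarks [@cohen1988].",
  "As shown in @tbl-descriptives, tempo varied by sentiment."
)
bib_lines <- c(
  "@article{Wickham2014,",
  "  author = {Wickham, Hadley},",
  "}",
  "@book{Cohen1988,",
  "  author = {Cohen, Jacob},",
  "}"
)

# Keys defined in the .bib file: the text between "{" and "," on entry lines
entry_lines <- grep("^@[A-Za-z]+\\{", bib_lines, value = TRUE)
bib_keys <- sub("^@[A-Za-z]+\\{([^,]+),.*$", "\\1", entry_lines)

# Keys cited in the document: "@" followed by letters, digits, "_" or "-"
matches <- regmatches(qmd_lines, gregexpr("@[A-Za-z][A-Za-z0-9_-]*", qmd_lines))
cited <- unique(sub("^@", "", unlist(matches)))

# Drop Quarto cross-references to tables, figures, and sections
cited <- cited[!grepl("^(tbl|fig|sec)-", cited)]

# Cited keys with no matching entry (matching is case-sensitive)
missing_keys <- setdiff(cited, bib_keys)
print(missing_keys)
```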
Practice: Building Your Report
Exercise 16.1: Basic Document
Create a Quarto document with:
1. YAML header (title, author, date, PDF format)
2. Three sections: Introduction, Method, Results
3. One code chunk that loads your music data
4. Render to PDF
Exercise 16.2: Tables and Figures
Add to your document:
1. A descriptive statistics table (mean tempo by sentiment)
2. A bar chart showing sentiment distribution
3. Proper captions for both
4. Cross-references in text
Exercise 16.3: Inline Statistics
Write a Results paragraph that:
1. Stores a t-test result in a hidden chunk
2. Reports the statistics inline using `r variable` syntax
3. Includes t-statistic, df, p-value, and means
Exercise 16.4: Citations
- Create a references.bib file with 3 entries
- Cite them in your Introduction
- Add a References section
- Render and verify the bibliography appears correctly
Reflection Questions
Reproducibility Trade-offs: Quarto requires more upfront learning than Word. Is the reproducibility worth it? For what kinds of projects would you choose Quarto over Word? When would Word be sufficient?
The Automation Question: When you automate table and figure generation, you lose manual control. What if you want a table formatted differently than the default? How do you balance automation’s efficiency with customization needs?
Transparency: Quarto makes your entire analysis visible and rerunnable. This is good for science but potentially uncomfortable for researchers (it exposes all decisions). Should all research code be shared publicly? Where’s the line between transparency and privacy?
Chapter Summary
This chapter taught reproducible document creation with Quarto:
- Quarto integrates code, analysis, and narrative into reproducible documents
- Three components: YAML metadata, Markdown text, R code chunks
- Advantages over Word: Reproducibility, version control, automation, multi-format output
- Tables: Create with kable(), auto-numbered with tbl-cap, reference with @tbl-label
- Figures: Auto-numbered with fig-cap, reference with @fig-label
- Inline code: Embed statistics in text with `r code`
- Citations: Manage with BibTeX, cite with @key, auto-generate bibliography
- Multiple formats: Same source → PDF (via LaTeX), HTML, or Word
- One-click reports: Update entire document when data changes
- Chunk options: Control visibility (echo), execution (eval), output (include)
- Cross-references: Link to tables (@tbl-), figures (@fig-), sections (@sec-)
Key Terms
- BibTeX: Citation format used by Quarto and LaTeX
- Code chunk: Block of R code embedded in Quarto document
- Cross-reference: Link to numbered table, figure, or section
- Inline code: R code embedded within text (`r code`)
- Knit/Render: Process of executing code and generating final document
- Markdown: Lightweight markup language for formatting text
- Quarto: Scientific publishing system combining code and narrative
- Reproducible document: Document that can be regenerated from source code
- YAML: Metadata format at top of Quarto documents (YAML Ain’t Markup Language; originally “Yet Another Markup Language”)
Looking Ahead
Chapter 17 (Going Live) concludes the textbook with research dissemination and sharing. You’ve created a reproducible research report—now you’ll learn to share it responsibly. This final chapter covers archiving data and code on the Open Science Framework and GitHub, writing abstracts and plain-language summaries for different audiences, creating online research portfolios, ethical considerations in sharing findings, and understanding open science practices. You’ll learn how to make your work discoverable, reproducible, and impactful beyond the classroom.