Interactive Data Visualization in R with plotly and ggiraph

health analysis
data science
data wrangling
R
data visualization
R projects
Build interactive graphics in R using plotly and ggiraph. Explore sunshine data across U.S. cities, compare patterns, and highlight anomalies.
Author

Oliver F. Anderson, MS

Published

February 16, 2023

Keywords

R, plotly, ggiraph, interactive graphics, data visualization, sunshine dataset, US cities

This project demonstrates how to create interactive plots in R using the plotly and ggiraph packages. The dataset, sourced from Kaggle, reports the percentage of possible sunshine for major U.S. cities.

The goal was twofold:
- Use plotly to create a line plot where users can isolate individual cities for closer inspection.
- Use ggiraph to produce interactive bar charts that compare annual sunshine levels across cities.

Workflow

  1. Data preparation
    • Cleaned and reformatted city names.
    • Converted monthly sunshine percentages into long format for plotting.
  2. Interactive line plot with plotly
    • Highlighted cities of personal interest (Portland, Los Angeles, Honolulu, Chicago, Boston).
    • Enabled legend interactivity so double-clicking a city isolates its data.
  3. Interactive bar charts with ggiraph
    • Built grouped bar charts of annual sunshine by city.
    • Added custom tooltips with CSS styling for clarity.

Code & Results

# load packages and read in data
library(tidyverse)
library(ggiraph)
library(plotly)
library(ggpubr)
library(patchwork)

sunshine <- readr::read_csv('./data/avg_sunshine.csv')

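# get cities into the right format: title case, a space after the comma,
# and an upper-case final letter so state codes read "OR" rather than "Or"
sunshine$CITY <- str_to_title(sunshine$CITY)
sunshine$CITY <- str_replace(sunshine$CITY, ",", ", ")
sunshine$CITY <- gsub("(\\w$)", "\\U\\1", sunshine$CITY, perl = TRUE)

# get rid of duplicates
# first drop rows whose CITY value is the literal header text "CITY",
# which the formatting steps above have turned into "CitY"
sunshine <- sunshine %>% 
  filter(CITY != "CitY")

# then keep only the first row for each city
sunshine <- sunshine[!duplicated(sunshine$CITY), ]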

# get data into the right format: one row per city and month
# note: despite its name, "temp" holds the percent of possible sunshine
sunshine <- pivot_longer(sunshine, JAN:DEC, names_to = "month", values_to = "temp")

# drop the fourth column (selected by position)
sunshine <- sunshine[, -4]
sunshine$ANN <- as.numeric(sunshine$ANN)
sunshine$temp <- as.numeric(sunshine$temp)
sunshine$month <- str_to_title(sunshine$month)
sunshine$month <- as.factor(sunshine$month)
sunshine$perc_temp <- sunshine$temp / 100


sun_cities <- sunshine %>% 
  filter(CITY %in% c("Portland, OR", "Los Angeles, CA", "Honolulu, HI", "Chicago, IL", "Boston, MA"))

sun_cities$month <- factor(sun_cities$month, levels=c("Jan", "Feb", "Mar", "Apr", "May", "Jun", "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"))



# basic line plot, converted to plotly with ggplotly() below
gg_sunshine <- sun_cities %>%  
  ggplot(aes(x = month, y=temp, text = paste0("Percent possible sunshine in\n",CITY, ": ", temp, "% in ", month))) +
  geom_line(aes(x = month, y = temp, color = CITY, group = CITY), alpha=0.8)+
  labs(x="Month", y="Average percent of possible sunshine", color = "City", caption = "Data from Kaggle.com: uploaded by user thedevastator", title = "How sunny are the cities that \nare important to me?", subtitle = "Measured by time of sunshine reaching earth from sunrise to sunset")+
  theme_minimal()+
  theme(
        plot.title = element_text(vjust = 2),
        axis.text.x = element_text(angle = 90, vjust = 0.5, hjust=1))+
  scale_y_continuous(labels = scales::percent_format(scale = 1))+
  scale_color_manual(values=c("black","#0454a4", "#5688c1","#7ca4d4", "#b8cee6"))

ggplotly(gg_sunshine, tooltip = 'text')
[Interactive line plot: "How sunny are the cities that are important to me?" x-axis: Month (Jan to Dec); y-axis: Average percent of possible sunshine; legend: City (Boston, MA; Chicago, IL; Honolulu, HI; Los Angeles, CA; Portland, OR)]
# working on the ggiraph

tooltip_css <- "background-color:#7ca4d4;color:white;padding:5px;border-radius:3px;"

# one row per city with its annual value; summarize() sorts the result by city
avg_sun <- sunshine %>% group_by(CITY, ANN) %>% summarize(.groups = "drop") %>% na.omit()

# Creating plots
# tooltip text shown on hover: city name followed by its annual value
avg_sun$tooltip <- paste0(avg_sun$CITY, ": ", avg_sun$ANN, "%")

# the four panels differ only in the rows they show, so build them with one helper
make_sun_plot <- function(rows) {
  avg_sun[rows, ] %>% 
    ggplot() +
    geom_col_interactive(aes(x = reorder(CITY, ANN), y = ANN / 100,
                             tooltip = tooltip,
                             data_id = ANN), fill = "#0454a4") +
    coord_flip() +
    labs(x = "", y = "Average annual sunshine") +
    theme_minimal() +
    theme(axis.text.x = element_text(vjust = 2)) +
    # set the limits inside the scale: a separate lims(y = c(0, 1)) call is
    # replaced by a later scale_y_continuous(), so its limits never take effect
    scale_y_continuous(labels = scales::percent, limits = c(0, 1)) +
    # city names are hidden on the axis and shown in the tooltips instead
    scale_x_discrete(labels = NULL)
}

sun1 <- make_sun_plot(1:38)
sun2 <- make_sun_plot(39:76)
sun3 <- make_sun_plot(77:115)
sun4 <- make_sun_plot(116:153)

p <- ggarrange(sun1, sun2, sun3, sun4, ncol = 2, nrow = 2, labels = c("A", "B", "C", "D"))
girafe(
  code = print(p + plot_annotation(
    title = "How sunny is your hometown?",
    subtitle = "Measured by time of sunshine reaching earth from sunrise to sunset",
    caption = "A: Abilene, TX - Des Moines, IA\nB: Detroit, MI - Las Vegas, NV\nC: Lihue, HI - Providence, RI\nD: Pueblo, CO - Yap- W Caroline Is., PC\nData from Kaggle.com: uploaded by user thedevastator",
    theme = theme(plot.caption.position = "plot", plot.caption = element_text(hjust = 0))
  )),
  height_svg = 9,
  width_svg = NULL,
  options = list(
    opts_tooltip(css = tooltip_css, opacity = 1),
    opts_sizing(width = .7),
    opts_hover(css = "fill:#0454a4;stroke-width:2;"),
    opts_hover_inv(css = "opacity:0.1;"),
    opts_selection(
      type = "single", 
      only_shiny = FALSE,
      css = "fill:#0454a4"),
    opts_zoom(max=4)
  )
)
[Interactive figure: "How sunny is your hometown?" Four bar-chart panels (A to D), each showing average annual sunshine by city, with cities split alphabetically across the panels. Data from Kaggle.com: uploaded by user thedevastator]

Discussion

The interactive visualizations make it easy to explore both seasonal and geographic variation in sunshine. With plotly, users can highlight specific cities and compare seasonal changes. With ggiraph, tooltips and hover effects provide immediate context for annual averages across more than 150 cities.

One interesting finding is Los Angeles’s dip in May–June sunshine compared to adjacent months, a feature that might reflect local climate effects (e.g., “June Gloom”).

FAQ: Working with ggiraph and plotly in R

How do I make a plot interactive with plotly?

Wrap a ggplot object in ggplotly() or build directly with plot_ly(). Interactivity (zoom, pan, legend toggling) is handled automatically.
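
As a minimal sketch of both routes, the example below uses a small made-up data frame (`toy` and its columns are hypothetical, not the sunshine data) and builds the same line chart once through ggplotly() and once through plot_ly().

```r
library(ggplot2)
library(plotly)

# Toy data, made up for illustration: two series over three time points
toy <- data.frame(
  x     = rep(1:3, times = 2),
  y     = c(1, 3, 2, 2, 2, 4),
  group = rep(c("a", "b"), each = 3)
)

# Route 1: build a static ggplot, then convert it
p <- ggplot(toy, aes(x = x, y = y, color = group)) +
  geom_line()
fig_from_gg <- ggplotly(p)

# Route 2: build the figure directly with plot_ly()
fig_direct <- plot_ly(
  toy,
  x = ~x, y = ~y, color = ~group,
  type = "scatter", mode = "lines"
)

# Printing either object (e.g. fig_from_gg) renders the interactive widget
```

Either object can be printed in the console, an R Markdown chunk, or a Quarto chunk to display the widget.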

How do I add tooltips in ggiraph?

Use geom_col_interactive() or geom_point_interactive() with a tooltip aesthetic. Custom styling can be passed through opts_tooltip().
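
A minimal sketch of that pattern, assuming a made-up three-row data frame (`toy`, `item`, `value`, and `tip` are hypothetical names). The tooltip CSS is the same string used in the project code above.

```r
library(ggplot2)
library(ggiraph)

# Toy data, made up for illustration
toy <- data.frame(
  item  = c("a", "b", "c"),
  value = c(10, 25, 15)
)
# Text shown when hovering over each bar
toy$tip <- paste0(toy$item, ": ", toy$value)

# The tooltip and data_id aesthetics are what make the bars interactive
p <- ggplot(toy) +
  geom_col_interactive(
    aes(x = item, y = value, tooltip = tip, data_id = item)
  )

# girafe() turns the ggplot into an interactive widget;
# opts_tooltip() styles the tooltip box with plain CSS
widget <- girafe(
  ggobj = p,
  options = list(
    opts_tooltip(
      css = "background-color:#7ca4d4;color:white;padding:5px;border-radius:3px;"
    )
  )
)
```

Swapping geom_col_interactive() for geom_point_interactive() follows the same structure.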

Can I combine multiple ggiraph plots?

Yes. Arrange them with packages like patchwork or ggpubr, then pass the combined object into girafe().
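
One way this could look with patchwork is sketched below. The data and the `make_panel()` helper are hypothetical and exist only for illustration. Because both panels use the same `data_id` values, hovering a bar in one panel should also highlight its counterpart in the other.

```r
library(ggplot2)
library(ggiraph)
library(patchwork)

# Toy data, made up for illustration
toy <- data.frame(
  item  = c("a", "b", "c"),
  value = c(10, 25, 15)
)

# Hypothetical helper: one interactive bar chart with a given fill color
make_panel <- function(fill) {
  ggplot(toy) +
    geom_col_interactive(
      aes(x = item, y = value, tooltip = item, data_id = item),
      fill = fill
    )
}

left  <- make_panel("#0454a4")
right <- make_panel("#7ca4d4") + coord_flip()

# patchwork: `+` places the panels side by side
combined <- left + right + plot_annotation(title = "Two panels, one widget")

# Pass the combined plot to girafe() by printing it inside `code`
widget <- girafe(
  code = print(combined),
  options = list(opts_hover(css = "stroke:black;stroke-width:2;"))
)
```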

What file formats can I export to?

Both packages support exporting to HTML. ggiraph visualizations can also be integrated into Shiny apps, R Markdown, and Quarto documents.
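
As a rough sketch of the HTML export, plotly and ggiraph objects are both htmlwidgets, so htmlwidgets::saveWidget() can write either one to a file. The data and the output path here are made up for illustration.

```r
library(ggplot2)
library(plotly)
library(htmlwidgets)

# Toy data, made up for illustration
toy <- data.frame(x = 1:3, y = c(2, 5, 3))
fig <- ggplotly(ggplot(toy, aes(x = x, y = y)) + geom_line())

# Write the widget to an HTML file in a temporary directory.
# selfcontained = FALSE keeps the JavaScript dependencies in a folder next
# to the file, which avoids the need for pandoc.
out_file <- file.path(tempdir(), "toy_plot.html")
saveWidget(fig, file = out_file, selfcontained = FALSE)
```

The same call should work for an object returned by girafe().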

Oliver F. Anderson, MS – Computational Biologist, Data Scientist, and Research Consultant based in Portland, Oregon. I design data-driven solutions in bioinformatics, machine learning, and AI automation for research and biotech.
