Illuminating Insights: Exploring Interactive Graphics with plotly and ggiraph
health
analysis
code
data wrangling
R
data visualization
Delve into interactive data visualization using plotly and ggiraph in this post. Explore sunshine levels in US cities, analyze intriguing patterns, and uncover potential dream vacation spots.
Published
February 16, 2023
In this post, I explore interactive data visualization using the R packages plotly and ggiraph, featuring a dataset on sunshine levels across the U.S. sourced from Kaggle.
I begin with plotly, creating an interactive line plot that highlights cities of personal significance. Users can double-click cities in the legend to isolate their data for deeper analysis.
Next, I use ggiraph to build a bar chart comparing cities with similar sunshine levels. One anomaly caught my attention: Los Angeles shows a dip in sunshine during May and June compared to April and July. I plan to investigate whether this pattern is widespread or unique to Los Angeles.
Experimenting with these R packages and the sunshine dataset was rewarding. I hope these visualizations inspire you to explore new destinations for your next sunny getaway.
Look at the code
# load packages and read in data
library(tidyverse)
library(ggiraph)
library(plotly)
library(ggpubr)
# library(widgetframe)
library(patchwork)

sunshine <- readr::read_csv('~/Desktop/quarto-portfolio/posts/avg_sunshine.csv')

# get cities into the right format
sunshine$CITY <- str_to_title(sunshine$CITY)
sunshine$CITY <- str_replace(sunshine$CITY, ",", ", ")
# upper-case the last character so the state abbreviation reads "OR", not "Or"
sunshine$CITY <- gsub("(\\w$)", "\\U\\1", sunshine$CITY, perl = TRUE)

# get rid of duplicates
# (repeated header rows containing "CITY" have become "CitY" after the steps above)
sunshine <- sunshine %>%
  filter(CITY != "CitY")
sunshine <- sunshine[!duplicated(sunshine$CITY), ]

# get data into the right format
sunshine <- pivot_longer(sunshine, JAN:DEC, names_to = "month", values_to = "temp")
sunshine <- sunshine[, -4]
sunshine$ANN <- as.numeric(sunshine$ANN)
sunshine$temp <- as.numeric(sunshine$temp)
sunshine$month <- str_to_title(sunshine$month)
sunshine$month <- as.factor(sunshine$month)
sunshine$perc_temp <- sunshine$temp / 100

sun_cities <- sunshine %>%
  filter(CITY %in% c("Portland, OR", "Los Angeles, CA", "Honolulu, HI", "Chicago, IL", "Boston, MA"))
sun_cities$month <- factor(
  sun_cities$month,
  levels = c("Jan", "Feb", "Mar", "Apr", "May", "Jun", "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")
)

# basic line plot, converted to an interactive plotly widget below
gg_sunshine <- sun_cities %>%
  ggplot(aes(
    x = month, y = temp,
    text = paste0("Percent possible sunshine in\n", CITY, ": ", temp, "% in ", month)
  )) +
  geom_line(aes(x = month, y = temp, color = CITY, group = CITY), alpha = 0.8) +
  labs(
    x = "Month",
    y = "Average percent of possible sunshine",
    color = "City",
    caption = "Data from Kaggle.com: uploaded by user thedevastator",
    title = "How sunny are the cities that \nare important to me?",
    subtitle = "Measured by time of sunshine reaching earth from sunrise to sunset"
  ) +
  theme_minimal() +
  theme(
    plot.title = element_text(vjust = 2),
    axis.text.x = element_text(angle = 90, vjust = 0.5, hjust = 1)
  ) +
  scale_y_continuous(labels = scales::percent_format(scale = 1)) +
  scale_color_manual(values = c("black", "#0454a4", "#5688c1", "#7ca4d4", "#b8cee6"))

ggplotly(gg_sunshine, tooltip = 'text')
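The city-name cleanup in the code above chains three string operations, and the reason the later filter looks for "CitY" is easy to miss. Here is a minimal sketch of those three steps on two made-up inputs. The raw values are assumptions about what the file looks like (an all-caps "CITY,ST" entry and a repeated header row); they are not copied from the Kaggle data.

```r
library(stringr)

# Hypothetical raw values (assumed format, not taken from the real file)
raw <- c("PORTLAND,OR", "CITY")

step1 <- str_to_title(raw)                             # title-case each word
step2 <- str_replace(step1, ",", ", ")                 # add a space after the first comma
step3 <- gsub("(\\w$)", "\\U\\1", step2, perl = TRUE)  # upper-case the final character

step3
```

With these inputs, `step3` should come out as "Portland, OR" and "CitY": the final-character rule that repairs the state abbreviation also turns a stray header value into "CitY", which is the string the filter removes.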
Look at the code
# working on the ggiraph
tooltip_css <- "background-color:#7ca4d4;color:white;padding:5px;border-radius:3px;"

# one row per city with its annual average
avg_sun <- sunshine %>%
  group_by(CITY, ANN) %>%
  summarize(.groups = "drop") %>%
  na.omit()

# Creating plots
avg_sun$tooltip <- paste0(avg_sun$CITY, ": ", avg_sun$ANN, "%")

# build one interactive bar-chart panel for a slice of rows in avg_sun;
# data_id = ANN means bars with the same annual value highlight together on hover
sun_panel <- function(rows) {
  avg_sun[rows, ] %>%
    ggplot() +
    geom_col_interactive(
      aes(x = reorder(CITY, ANN), y = ANN / 100, tooltip = tooltip, data_id = ANN),
      fill = "#0454a4"
    ) +
    coord_flip() +
    labs(x = "", y = "Average annual sunshine") +
    theme_minimal() +
    theme(axis.text.x = element_text(vjust = 2)) +
    # set limits inside the scale; a separate lims() call would be replaced by this scale
    scale_y_continuous(labels = scales::percent, limits = c(0, 1)) +
    scale_x_discrete(labels = NULL)
}

sun1 <- sun_panel(1:38)
sun2 <- sun_panel(39:76)
sun3 <- sun_panel(77:115)
sun4 <- sun_panel(116:153)

# sun_plots <- ggarrange(
#   sun1, sun2, sun3, sun4,
#   common.legend = TRUE,
#   legend = "right"
# )

p <- ggarrange(sun1, sun2, sun3, sun4, ncol = 2, nrow = 2, labels = c("A", "B", "C", "D"))

girafe(
  code = print(
    p + plot_annotation(
      title = "How sunny is your hometown?",
      subtitle = "Measured by time of sunshine reaching earth from sunrise to sunset",
      caption = "A: Abilene, TX - Des Moines, IA\nB: Detroit, MI - Las Vegas, NV\nC: Lihue, HI - Providence, RI\nD: Pueblo, CO - Yap- W Caroline Is., PC\nData from Kaggle.com: uploaded by user thedevastator",
      theme = theme(plot.caption.position = "plot", plot.caption = element_text(hjust = 0))
    )
  ),
  height_svg = 9,
  width_svg = NULL,
  options = list(
    opts_tooltip(css = tooltip_css, opacity = 1),
    opts_sizing(width = .7),
    opts_hover(css = "fill:#0454a4;stroke-width:2;"),
    opts_hover_inv(css = "opacity:0.1;"),
    opts_selection(type = "single", only_shiny = FALSE, css = "fill:#0454a4"),
    opts_zoom(max = 4)
  )
)
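As a possible starting point for the follow-up on the Los Angeles dip mentioned in the introduction, here is a rough sketch of how each city could be flagged when both May and June fall below both April and July. It uses base R and a small toy data frame with the same column names as the long-format `sunshine` data (`CITY`, `month`, `temp`). The toy values and the helper name `has_spring_dip` are made up for illustration, and this particular definition of a "dip" is only one reasonable choice.

```r
# Toy data in the same long format as `sunshine`.
# The numbers are invented for illustration; they are NOT from the Kaggle file.
toy <- data.frame(
  CITY  = rep(c("City A", "City B"), each = 4),
  month = rep(c("Apr", "May", "Jun", "Jul"), times = 2),
  temp  = c(70, 60, 62, 80,   # City A: May and June dip below April and July
            55, 60, 65, 70)   # City B: rises steadily, no dip
)

# Hypothetical helper: TRUE when both May and June are below both April and July
has_spring_dip <- function(city_rows) {
  s <- setNames(city_rows$temp, city_rows$month)
  max(s[c("May", "Jun")]) < min(s[c("Apr", "Jul")])
}

# Apply the check city by city; the result is a named logical vector
dip_by_city <- vapply(split(toy, toy$CITY), has_spring_dip, logical(1))
dip_by_city
```

Swapping `toy` for the real `sunshine` data frame should give one TRUE/FALSE per city, and `mean(dip_by_city, na.rm = TRUE)` would then give the share of cities showing the same pattern. Cities with missing months would come back as `NA` under this sketch.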