
A Sharp Sight Labs reader (and now student), Jason P. recently started learning data science.

He has a background in data analysis (primarily with Excel and related tools in the Microsoft ecosystem), but he wanted to start learning some of the harder skills of data science.

He contacted me after he had diligently reviewed past blog posts on data exploration and foundational data visualization techniques.

When he contacted me, he wanted to apply some of what he had learned by working on a small project.

He also had a dataset in mind: he had found an open dataset about Velib’, a large-scale public bike-sharing system in Paris.

So, Jason downloaded the data and created the following visualization:

[Image: map of Velib’ bike stations in Paris]

And here’s the code that created it:

#---------------
# LOAD PACKAGES
#---------------

library(ggmap)
library(ggplot2)


#--------------
# GET THE DATA 
#--------------

df.velib_bikes <- read.csv(url("https://vrzkj25a871bpq7t1ugcgmn9-wpengine.netdna-ssl.com/wp-content/uploads/2015/03/paris_velib-bike-stations.txt"), sep=",", header=TRUE)


#--------------------------
# PLOT THE DOTS ON THE MAP
#--------------------------
map.paris <- qmap("paris", source="stamen", zoom=12, maptype="toner", darken=c(.3, "#BBBBBB")) 

map.paris +
  geom_point(data=df.velib_bikes, aes(x=longitude, y=latitude, size=available_bike_stands, color=available_bike_stands), alpha=.9, na.rm=TRUE) +
  scale_color_gradient(low="#33CC33", high="#003300", name="Number of available\nbike stands") +
  scale_size(range=c(1,11) , guide="none") + 
  ggtitle("Available bike stands in Paris,\n(in open Velib' stations)") +
  theme(text = element_text(family = "Trebuchet MS", color="#666666")) +
  theme(plot.title = element_text( size=32, face="bold", hjust=0, vjust=.5))

 
What I love about this is that within only a few short weeks, Jason created a pretty cool map.

As I’ve emphasized numerous times, it’s essential to learn the foundational visualizations, because the foundational techniques lead to more advanced ones.

In this case, the map is simply a variation on the scatterplot. Once you learn the syntax for the scatterplot, a map like this becomes fairly easy to create.

Here, Jason learned a little about the ggplot2 scatterplot syntax, and used it along with R’s ggmap package to create a dot distribution map of Paris’ bike stands.
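To make the scatterplot connection concrete, here's a minimal sketch of the same geom_point() layer drawn as an ordinary scatterplot, with ggplot() as the base instead of a ggmap layer. The data frame below (df.toy_stations) is a hypothetical stand-in with made-up coordinates and counts, not real Velib' data; only the column names are taken from Jason's code.

```r
#---------------
# LOAD PACKAGES
#---------------

library(ggplot2)


#------------------------------------------------------------
# CREATE TOY DATA
# (hypothetical values for illustration, NOT real Velib' data)
#------------------------------------------------------------

df.toy_stations <- data.frame(
  longitude = c(2.29, 2.32, 2.35, 2.37, 2.40),
  latitude = c(48.85, 48.87, 48.86, 48.84, 48.88),
  available_bike_stands = c(3, 12, 25, 8, 40)
)


#------------------------------------------
# PLOT THE DOTS AS AN ORDINARY SCATTERPLOT
#------------------------------------------

# Same aesthetics as the map: x and y for position,
# size and color mapped to the number of available stands
plot.scatter <- ggplot(data=df.toy_stations, aes(x=longitude, y=latitude)) +
  geom_point(aes(size=available_bike_stands, color=available_bike_stands), alpha=.9) +
  scale_color_gradient(low="#33CC33", high="#003300", name="Number of available\nbike stands") +
  scale_size(range=c(1,11), guide="none")

print(plot.scatter)
```

The only structural difference from the map code is the first line: the map version starts from map.paris and passes the data to geom_point() directly, while this version starts from ggplot().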

So to reiterate, if you use the right tools (namely, ggplot2), focus on the basics and really drill them, it’s entirely possible to create something like this within a few weeks.