The 3 Data Visualization Packages You Need for Machine Learning

Obviously, AI has taken off in the last year in ways that were hard for most people to predict. AI went from being a somewhat niche technical subject that nerdy guys talked about on college campuses to being so popular that Boomer parents and grandparents are saying that “AI will change everything.” And although I think … Read more

How to do Simple EDA for Machine Learning

An old image of the Titanic in Belfast, Ireland, with the Python logo off to the upper right-hand side of the image.

In this tutorial, I’ll show you how to do some simple exploratory data analysis (EDA) for a machine learning project. We’ll look at the Titanic dataset, which is commonly used in machine learning tutorials and has previously been used as a Kaggle dataset. This tutorial will really only scratch the surface. There’s … Read more

How to Make a Seaborn Lineplot

An image that shows a simple line chart made in Python with Seaborn.

In this tutorial, I’ll show you how to create a Seaborn lineplot with the Seaborn Objects interface. I’ll explain the syntax for building a lineplot with the Seaborn Objects system, and I’ll walk through step-by-step examples. If you need something specific, just click on any of the links below. Each … Read more

A Quick Introduction to the Seaborn Objects System

An image that shows the high-level syntax of the Seaborn Objects data visualization system.

Have you ever been frustrated with data visualization in Python? Matplotlib – as powerful as it is – has a very clumsy syntax. It’s hard to use. Plotly is OK, but still feels complicated for more advanced visualizations. Personally, I’ve been frustrated with the data visualization options in Python, and I’ve been waiting for a … Read more

How to Use geom_smooth in R

An image that shows a ggplot2 scatterplot with a smooth trend line created with geom_smooth.

This tutorial will show you how to use the geom_smooth function in R. It explains what geom_smooth does, explains the syntax, and shows step-by-step examples of how to use this function. If you need something specific, you can click on any of the following links. These links will take you directly to the appropriate place … Read more

The Best Python Package for Data Visualization

An image of a laptop, playing a video that explains how to create a scatterplot in Seaborn.

Data visualization is extremely important in data science. Although you often hear about the importance of data manipulation (i.e., “80% of data science is data manipulation”), data visualization is just as important. I explained why in a recent blog post about why you need to master data visualization. That blog post has a detailed explanation … Read more

Why R is My Favorite Language for “First Time” Data Scientists

Probably the most common question I get from new data science students is, “Which language should I learn … R or Python?” This is a somewhat complex question to answer, because it depends on who you are and what your goals are. Having said that, I do have a preference for first-time data scientists. … Read more

How to Create Plotly Small Multiple Charts

In this tutorial, I’ll show you how to create small multiple charts with Plotly Express. I’ll explain the syntax and show you a few clear examples, so you can see how it’s done. The tutorial has several sections. If you need something specific, just click … Read more