A Quick Introduction to Numpy Random Normal

An image of normally distributed data created with np.random.normal in Python.

This tutorial will cover the Numpy random normal function (AKA, np.random.normal). If you’re doing any sort of statistics or data science in Python, you’ll often need to work with random numbers. And in particular, you’ll often need to work with normally distributed numbers. The Numpy random normal function generates a sample of numbers drawn from … Read more
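As a quick illustration of what the excerpt describes, here is a minimal sketch that draws normally distributed numbers with np.random.normal. The seed, mean, standard deviation, and sample size are arbitrary values chosen for the example and do not come from the tutorial.

```python
import numpy as np

# Seed NumPy's global generator so the draw is reproducible (arbitrary seed).
np.random.seed(42)

# Draw 1,000 values from a normal distribution with
# mean (loc) 50 and standard deviation (scale) 10.
sample = np.random.normal(loc=50, scale=10, size=1000)

print(sample.shape)                # shape of the returned array
print(sample.mean(), sample.std()) # should land close to 50 and 10
```

The sample mean and standard deviation will be close to, but not exactly, the loc and scale values, because the numbers are random draws.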

The Best Python Package for Data Visualization

An image of a laptop, playing a video that explains how to create a scatterplot in Seaborn

Data visualization is extremely important in data science. Although you often hear about the importance of data manipulation (i.e., “80% of data science is data manipulation”), data visualization is just as important. I explained why in a recent blog post about why you need to master data visualization. That blog post has a detailed explanation … Read more

How to Use the Sklearn Linear Regression Function

A visual example of simple linear regression, where we fit a line to the training data, and then use that line as a model to predict new values.

In this tutorial, I’ll show you how to use the Sklearn Linear Regression function to create linear regression models in Python. I’ll quickly review what linear regression is, explain the syntax of Sklearn LinearRegression, and I’ll show you step-by-step examples of how to use the technique. If you need something specific, just click on any … Read more
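To make the idea of fitting a line and then predicting new values concrete, here is a minimal sketch using Sklearn LinearRegression. The training data is made up for illustration (it follows y = 2x + 1 exactly), and the variable names are my own assumptions rather than anything from the tutorial.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Toy training data that follows y = 2x + 1 (illustrative values only).
# Sklearn expects X to be 2-dimensional: one row per observation.
X_train = np.array([[1.0], [2.0], [3.0], [4.0]])
y_train = np.array([3.0, 5.0, 7.0, 9.0])

# Fit a straight line to the training data.
model = LinearRegression()
model.fit(X_train, y_train)

# Use the fitted line as a model to predict values for new inputs.
X_new = np.array([[5.0], [6.0]])
predictions = model.predict(X_new)

print(model.coef_, model.intercept_)
print(predictions)
```

Because the toy data lies exactly on a line, the fitted slope and intercept recover 2 and 1 (up to floating point error). Real data will not fit this cleanly.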

How to Use the Sklearn Predict Method

In this tutorial, I’ll show you how to use the Sklearn predict method to predict outputs using a machine learning model in Python. So I’ll quickly review what the method does, I’ll explain the syntax, and I’ll show an example of how to use the technique. If you need something specific, just click on the … Read more

Why R is My Favorite Language for “First Time” Data Scientists

Probably the most common question I get from new data science students is, “Which language should I learn … R or Python?” This is a somewhat complex question to answer, because it depends on who you are and what your goals are. Having said that, I do have a preference for first time data scientists. … Read more

How to Use Numpy Argsort in Python

An image that shows how Numpy Argsort returns the index values that would sort a Numpy array.

This tutorial explains how to use the Numpy argsort function. It explains the syntax of np.argsort, and also shows clear examples. If you need help with something specific, you can click on any of these links. The links will take you to the appropriate part of the tutorial. Table of Contents: Introduction to Numpy Argsort … Read more
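Here is a small sketch of the idea that np.argsort returns the index values that would sort an array. The array contents are arbitrary example values.

```python
import numpy as np

values = np.array([30, 10, 20])

# argsort returns the positions of the elements in sorted order,
# not the sorted values themselves.
order = np.argsort(values)
print(order)          # [1 2 0]

# Indexing the original array with those positions yields the sorted array.
sorted_values = values[order]
print(sorted_values)  # [10 20 30]
```

The smallest value (10) sits at index 1, so 1 comes first in the output, followed by the index of 20 and then the index of 30.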

How to Use Pandas Get Dummies in Python

An image that shows how the Pandas get dummies function creates dummy variables from categorical data, in Python.

In this tutorial, I’ll show you how to use the Pandas get dummies function to create dummy variables in Python. I’ll explain what the function does, explain the syntax of pd.get_dummies, and show you step-by-step examples. If you need something specific, just click on any of the following links. Table of Contents: Introduction Syntax Examples … Read more
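As a rough sketch of how pd.get_dummies turns categorical data into dummy variables, consider the example below. The dataframe and its "color" column are hypothetical and exist only for illustration.

```python
import pandas as pd

# A hypothetical dataframe with one categorical column.
df = pd.DataFrame({"color": ["red", "green", "red", "blue"]})

# Create one indicator column per category in "color".
dummies = pd.get_dummies(df, columns=["color"])

print(list(dummies.columns))  # ['color_blue', 'color_green', 'color_red']
print(dummies)
```

Each new column marks whether a row belongs to that category. Depending on the Pandas version, the indicator values are shown as True/False or as 1/0.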

How to Use the Pandas Astype Function in Python

A simple image that shows how the Pandas astype method changes the datatype of Pandas dataframes or Series objects.

In this tutorial, I’ll explain how to use the Pandas astype function to modify the datatype of Pandas dataframe columns and Pandas objects. I’ll explain what the technique does, explain the syntax, and show you step-by-step examples. If you need something specific, you can click on any of the following links. Table of Contents: Introduction … Read more
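To illustrate how astype changes the datatype of a dataframe column, here is a minimal sketch. The dataframe, column names, and target type are assumptions made for the example.

```python
import pandas as pd

# A hypothetical dataframe where "quantity" was read in as strings.
df = pd.DataFrame({"quantity": ["1", "2", "3"], "price": [1.5, 2.0, 3.25]})

# astype returns a new dataframe; passing a dict converts only the named columns.
converted = df.astype({"quantity": "int64"})

print(converted.dtypes)
print(converted["quantity"].sum())  # 6, now that the column is numeric
```

Note that astype does not modify the original dataframe, so the result needs to be assigned to a variable if you want to keep it.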