{"id":7050,"date":"2023-08-28T15:00:06","date_gmt":"2023-08-28T20:00:06","guid":{"rendered":"https:\/\/www.sharpsightlabs.com\/?p=7050"},"modified":"2024-02-06T15:02:42","modified_gmt":"2024-02-06T21:02:42","slug":"sklearn-make_classification","status":"publish","type":"post","link":"https:\/\/www.sharpsightlabs.com\/blog\/sklearn-make_classification\/","title":{"rendered":"Sklearn make_classification, Explained"},"content":{"rendered":"
With the rise of AI, machine learning has suddenly become very popular.
Machine learning has been around for decades, but machine learning systems are becoming increasingly important in a range of fields, from healthcare to finance to marketing.
Python, with a range of libraries for data science and ML, has arguably become the top language for machine learning. And the most popular machine learning library in Python is scikit-learn (often referred to as sklearn).
In this post, we're going to take a close look at one particular function from scikit-learn: make_classification.
This function generates synthetic datasets for classification problems, which makes it very useful for practicing machine learning and for evaluating machine learning algorithms.
We'll look at what the make_classification function does and how the syntax is structured, and I'll also show you a simple example.
The blog post is divided into sections, and if you need anything specific, just click on one of the following links.
**Table of Contents:**

- A Quick Introduction to Sklearn make_classification
- The Syntax of Sklearn make_classification
- Examples of How to Use Make Classification

That said, let's dive into the sklearn make_classification function.
## A Quick Introduction to Sklearn make_classification

The sklearn make_classification function allows Python users to create datasets that they can use for classification models.

It allows you to make data with binary labels and multiclass labels.

For example, here is a plot of a binary dataset that I made with make_classification:

[image: scatterplot of a binary classification dataset generated with make_classification]

(I'll show you how to create this exact dataset later.)

And importantly, it provides functionality that allows you to specify things like:

- the number of samples
- the number of features
- the number of classes
- the amount of separation between the classes
- the number of clusters per class
Now that we've seen a brief overview of its capabilities, let's delve deeper into the syntax of make_classification to understand how we can use it properly.
## The Syntax of Sklearn make_classification

Here, I'm going to explain the syntax of the Scikit Learn make_classification function.

I'll explain the high-level syntax, but also some of the details about the most important parameters.
#### A quick note

Everything I'm about to explain assumes that you have Scikit Learn installed on your machine, and that you've imported make_classification as follows:
```python
from sklearn.datasets import make_classification
```

With that said, let's look at the syntax.
### make_classification syntax

The basic syntax is very, very simple.
Assuming that you've imported the function as described above, you can call the function by typing `make_classification()`.
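For instance, here's a minimal sketch of what a bare call returns (the shapes simply reflect the defaults discussed below):

```python
from sklearn.datasets import make_classification

# Call the function with all defaults; it returns two Numpy arrays.
X, y = make_classification()

print(X.shape)  # (100, 20): the default n_samples and n_features
print(y.shape)  # (100,): one label per sample
```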
There are a few important parameters as well that you can specify inside the parentheses:

[image: annotated syntax of make_classification, showing its main parameters]

In some sense, the parameters are the most important part of the function, because they determine the exact structure and content of the output dataset.

That being the case, let's quickly discuss the important parameters.

### The Parameters of make classification

The Scikit Learn make classification function has quite a few parameters, but I believe that the most important are:
- `n_samples`
- `n_features`
- `n_classes`
- `n_informative`
- `n_redundant`
- `class_sep`
- `n_clusters_per_class`
- `random_state`

Let's look at each of these, one at a time.
###### `n_samples`

The `n_samples` parameter controls the number of samples in the output dataset.

Said differently, it controls the number of examples (or the number of rows of data, if you're thinking of a simple row-and-column dataset).

By default, this is set to 100.
###### `n_features`

The `n_features` parameter controls the number of features in the output dataset.

Remember, the features are like the inputs to a machine learning model. They are the columns that a machine learning algorithm learns from in order to make a prediction. Labels/targets, in turn, are like the outputs of a model (to learn a bit more about features and labels, read our blog post on Supervised vs Unsupervised machine learning).

This count includes the informative features, redundant features, and repeated features (if you use them when you create your dataset).

By default, this parameter is set to 20.
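As a quick sketch of both `n_samples` and `n_features` together (the specific values below are just for illustration):

```python
from sklearn.datasets import make_classification

# n_samples sets the number of rows; n_features sets the number of columns.
X, y = make_classification(n_samples=500, n_features=8, random_state=0)

print(X.shape)  # (500, 8)
print(y.shape)  # (500,)
```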
###### `n_classes`

The `n_classes` parameter controls the number of classes in the output dataset.

As mentioned above, the classes are the different possible categories for the target variable (remember that in supervised learning the dataset has a target/label variable that we're trying to predict).

By default, `n_classes = 2`, so make_classification will produce a binary dataset.
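For instance, here's a small sketch of a 3-class dataset. One detail to be aware of: scikit-learn requires that `n_classes * n_clusters_per_class` be at most `2**n_informative`, so I'm also raising `n_informative` above its default of 2:

```python
import numpy as np
from sklearn.datasets import make_classification

# 3 classes * 2 clusters per class = 6 <= 2**3 = 8, so this is valid.
X, y = make_classification(n_classes=3, n_informative=3, random_state=0)

print(np.unique(y))  # [0 1 2]
```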
###### `n_informative`

The `n_informative` parameter controls the number of *informative* features in the output dataset.

So what does informative mean?

An informative feature is one that has a relationship with the target label. It carries information that enables us to learn how to predict the categorical values in the data.

So the rest of the features (the un-informative ones) may be noisy or otherwise irrelevant.

Introducing uninformative (or noisy) features into a dataset can be useful, especially for experimental or educational purposes. For example, uninformative features can make a synthetic dataset messier and more realistic, and they let you test how well an algorithm handles irrelevant inputs (see the sketch below).

So it may sound a bit strange to have uninformative features, but if we're making a dataset for machine learning practice or algorithm evaluation, it may actually be useful for the synthetic dataset to have uninformative features.
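One way to see the difference (a small sketch, using `shuffle=False` so that the informative columns come first, followed by any redundant, repeated, and noise columns):

```python
import numpy as np
from sklearn.datasets import make_classification

# 4 features total: 1 informative, 0 redundant, so columns 1-3 are pure noise.
# With shuffle=False, the informative column is column 0.
X, y = make_classification(n_samples=1000, n_features=4, n_informative=1,
                           n_redundant=0, n_clusters_per_class=1,
                           shuffle=False, random_state=0)

print(np.corrcoef(X[:, 0], y)[0, 1])  # informative column: strongly related to y
print(np.corrcoef(X[:, 3], y)[0, 1])  # noise column: correlation near 0
```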
###### `n_redundant`

The `n_redundant` parameter enables you to specify how many redundant features there are. In make_classification, redundant features are generated as linear combinations of the informative features, so they carry no new information of their own.

It might seem odd, but redundant features can be useful if you're practicing machine learning or testing a particular algorithm.

We can use redundant features to mimic the correlated, partially duplicated columns that often show up in real-world data, to test feature-selection and dimensionality-reduction techniques, and more (see the sketch below).

So like the "uninformative" features discussed earlier, redundant features can serve a useful purpose when we practice ML or try to evaluate algorithm performance.
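Here's a small sketch that makes the redundancy visible (the parameter values are illustrative): since redundant columns are linear combinations of the informative ones, they don't increase the rank of the feature matrix.

```python
import numpy as np
from sklearn.datasets import make_classification

# 5 features total: 2 informative + 3 redundant (no noise columns).
X, y = make_classification(n_samples=200, n_features=5, n_informative=2,
                           n_redundant=3, shuffle=False, random_state=0)

# The 3 redundant columns are linear combinations of the first 2,
# so the matrix rank stays at 2 instead of 5.
print(np.linalg.matrix_rank(X))  # 2
```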
###### `class_sep`

The `class_sep` parameter (short for "class separation") controls the amount of separability between the generated classes.

Said differently, `class_sep` allows you to control the degree to which the classes overlap.

There are some algorithm types where you want (or need) the classes to be perfectly separable.

And there are some algorithm types that allow the classes to overlap somewhat (so overlapping classes are good for testing such algorithms).
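As a rough sketch (the values are illustrative), a larger `class_sep` pushes the class centers further apart, which makes the classification problem easier:

```python
import numpy as np
from sklearn.datasets import make_classification

# Compare the distance between class centroids for a small vs. large class_sep.
for sep in (0.5, 3.0):
    X, y = make_classification(n_samples=1000, n_features=2, n_informative=2,
                               n_redundant=0, n_clusters_per_class=1,
                               class_sep=sep, random_state=0)
    dist = np.linalg.norm(X[y == 0].mean(axis=0) - X[y == 1].mean(axis=0))
    print(f"class_sep = {sep}: centroid distance = {dist:.2f}")
```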
###### `n_clusters_per_class`

The `n_clusters_per_class` parameter allows you to specify how many clusters will be generated for every class.

By default, this is set to 2.

Why would you want to use this?

In some classification datasets, all of the data points for a particular class will form a tight "cluster". They will be grouped together in feature space.

But other times, members of a single class might form multiple clusters of data … they might form separate groups.

Datasets where classes have multiple clusters are generally more complex, and a *synthetic* dataset with multiple clusters per class may be more "realistic."

Essentially, the n_clusters_per_class parameter lets you emulate this complexity and real-worldness in the synthetic data created by make_classification.
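A short sketch (illustrative values), keeping in mind the constraint mentioned earlier, that `n_classes * n_clusters_per_class` must be at most `2**n_informative`:

```python
from sklearn.datasets import make_classification

# Each class is drawn from 2 separate Gaussian clusters in feature space.
# Valid because 2 classes * 2 clusters = 4 <= 2**2 = 4.
X, y = make_classification(n_samples=500, n_features=2, n_informative=2,
                           n_redundant=0, n_clusters_per_class=2,
                           class_sep=2, random_state=0)

print(X.shape, y.shape)  # (500, 2) (500,)
```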
###### `random_state`

The `random_state` parameter allows us to set a seed for the random number generator.

This ensures that any process or function that utilizes random numbers can be reproduced exactly every time we run it.

Essentially, this enables reproducibility when the code is run multiple times, whether by the same individual or different people.

If you want to learn more about seeds and random number generators, read our tutorial on Numpy Random Seed.
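For example (a quick sketch), two calls with the same seed produce identical arrays:

```python
import numpy as np
from sklearn.datasets import make_classification

# The same random_state reproduces the exact same dataset every time.
X1, y1 = make_classification(random_state=42)
X2, y2 = make_classification(random_state=42)

print(np.array_equal(X1, X2))  # True
print(np.array_equal(y1, y2))  # True
```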
##### Other Parameters

There are several other parameters that I'm leaving out here for the sake of brevity, like `weights`, `flip_y`, `hypercube`, and several others.

However, many of these will be somewhat rarely used, so in the beginning, you may want to avoid using them unless absolutely necessary.
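That said, `weights` is worth a quick sketch, since it's the usual way to simulate class imbalance (the 90/10 split below is just an illustration):

```python
import numpy as np
from sklearn.datasets import make_classification

# weights sets the approximate proportion of samples assigned to each class.
X, y = make_classification(n_samples=1000, weights=[0.9, 0.1], random_state=0)

# Roughly 900 samples in class 0 and 100 in class 1
# (flip_y defaults to 0.01, which adds a little label noise).
print(np.bincount(y))
```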
### The Output of make_classification

The output of the Scikit Learn make_classification function is 2 Numpy arrays.

The first is a Numpy array with shape `(n_samples, n_features)`. This is the so-called `X` array, which contains the feature data.

The second array is a Numpy array with shape `(n_samples,)`. This is the so-called `y` array, which contains the labels. It's essentially a vector of labels associated with every example in `X`. Importantly, the `y` array contains integers representing the classes, with the number of unique integers being determined by the `n_classes` parameter.
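Putting that together, here's a quick sketch of inspecting both outputs (the parameter values are just for illustration):

```python
import numpy as np
from sklearn.datasets import make_classification

X, y = make_classification(n_samples=300, n_features=5, n_classes=3,
                           n_informative=3, random_state=0)

print(X.shape)       # (300, 5): the X array of feature data
print(y.shape)       # (300,): the y array of labels
print(np.unique(y))  # [0 1 2]: integer classes, determined by n_classes
```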
Now that I've shown you the syntax of make_classification, let's look at a couple of examples.

## Examples of How to Use Make Classification

**Examples:**

- Generate data for logistic regression
#### Run this code first

Before you run the examples, make sure that you import the `make_classification` function with this code:
```python
from sklearn.datasets import make_classification

import matplotlib.pyplot as plt
import seaborn as sns
```

Here, we're also importing Seaborn and Matplotlib's Pyplot, which we'll use to visualize the data we generate.

Once you run it, you'll be ready to get started.
### EXAMPLE 1: Generate Data For Logistic Regression

Here, we're going to generate some data that will be well suited for a Logistic Regression model.

We're going to make a dataset with:

- 1000 samples
- 1 feature (which is informative)
- 0 redundant features
- 1 cluster per class
- a class separation of 2

And we're going to initialize the random number generator with `random_state = 2`, so the dataset is exactly reproducible:
```python
X, y = make_classification(n_samples=1000,
                           n_features=1,
                           n_informative=1,
                           n_redundant=0,
                           n_clusters_per_class=1,
                           class_sep=2,
                           random_state=2)
```
And let's visualize this data with a Seaborn scatterplot, so you can see it:

```python
plt.style.use('fivethirtyeight')
sns.scatterplot(x=X.flatten(), y=y, hue=y)
```

OUT:

[image: Seaborn scatterplot of the generated dataset, with the single feature on the x axis and the class label on the y axis]

Here, we have a dataset with 1 feature and 2 classes.

We'll be able to fit a logistic regression model to this, which I'll show you how to do in a future blog post.