
I would like to start this blog off by giving credit where it's due. Ismael Araujo published a fascinating post about a month ago that can be found here. He explains quite nicely how you can run over forty models in around ten lines of code using a library called Lazy Predict. In this post I will share my thoughts on the library and demonstrate how to use it with both classification and regression models.

Datasets…

To start, I will demonstrate how to get classification predictions using a drug type dataset. …


In statistics, exploratory data analysis (EDA) is an approach to analyzing data sets to summarize their main characteristics, often using statistical graphics and other data visualization methods. That is the textbook definition of what the EDA process involves. It may seem like a daunting task when you are first starting out in data science, but luckily a very intuitive tool has been designed to help with it. …


Hyperparameters are some of the most useful and powerful tools in data science once you know what they are, how they work, and how and when to use them most effectively. In this post I will discuss two classes that are included in the model_selection module of scikit-learn (sklearn). The first one that I am going to cover is RandomizedSearchCV. The second is GridSearchCV.
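To give a feel for the difference between the two, here is a minimal sketch rather than the code from the post itself. The iris dataset, the random forest model, and the parameter ranges are all assumptions chosen only to keep the example small and fast.

```python
from scipy.stats import randint
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, RandomizedSearchCV

# Illustrative data and model (assumptions, not taken from the post).
X, y = load_iris(return_X_y=True)
model = RandomForestClassifier(random_state=0)

# GridSearchCV evaluates every combination in the grid: 2 x 2 = 4 candidates.
grid = GridSearchCV(
    model,
    param_grid={"n_estimators": [10, 25], "max_depth": [2, 4]},
    cv=3,
)
grid.fit(X, y)

# RandomizedSearchCV samples a fixed number of candidates (n_iter)
# from the given distributions instead of trying everything.
rand = RandomizedSearchCV(
    model,
    param_distributions={
        "n_estimators": randint(10, 50),
        "max_depth": randint(2, 6),
    },
    n_iter=5,
    cv=3,
    random_state=0,
)
rand.fit(X, y)

print("Grid search best parameters:", grid.best_params_)
print("Randomized search best parameters:", rand.best_params_)
```

The grid search cost grows with every value you add to the grid, while the randomized search cost is capped by n_iter, which is usually the reason to reach for it when the search space is large.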

Photo by Marten Newhall on Unsplash

For this post it is assumed that you have a general understanding…


This is my second blog post describing how to import CSV (comma-separated values) files into the platform you want to work with. My first post discusses how to import data into Google Colaboratory, and a link to it can be found here. In this post I will explain methods for working specifically with Jupyter Notebook.

Method 1…

Both of the methods that I will discuss require pandas to work successfully. You can use the cell found below to import it. …
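The original cell is not reproduced in this excerpt, so the following is only a minimal sketch of what importing pandas and reading a CSV can look like. The sample data and column names are hypothetical, and an in-memory string stands in for a file path so the example runs anywhere.

```python
import io

import pandas as pd

# Hypothetical CSV content; in a notebook you would normally pass
# a file path to pd.read_csv instead of an in-memory buffer.
csv_text = "drug,age\nA,23\nB,47\n"

# read_csv accepts a path or any file-like object.
df = pd.read_csv(io.StringIO(csv_text))

print(df.head())
```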


Data science is a world of fascination and freedom to do whatever your heart may desire once you have some understanding of what is happening. Beyond programming skills or an understanding of data analysis, the very first thing you need to do to work with an actual set of data is import it into whatever platform or coding software you choose to work with. In this short blog post I will demonstrate three different methods for importing CSV (comma-separated values) files specifically into Google's free platform, Google Colaboratory.


Cars are one of the most essential tools in just about all of our daily lives, and the money we spend from day to day certainly matters just as much. With that being said, it is reasonable to ask what is going to have the most impact on the price that you will most likely end up paying for your vehicle.

The Dataset…

The dataset that I chose to use was originally scraped from www.auctionexport.com, which is a website dedicated to car sales in the US. It was then uploaded to kaggle.com, where I found it and put the numbers…


This link is to a Kaggle post that explains in detail how cardiovascular diseases are the #1 cause of death globally and the many things that may have an impact on them.

Heart failure is a very common event caused by cardiovascular diseases, and this dataset contains many variables that can be used to predict mortality in the case of heart failure.

@leerowe

Data Scientist | ML Engineer | Create Value
