An Exploratory Data Analysis (EDA) is crucial when working on data science projects. Understanding your underlying data, its nature, and structure can simplify decision making on features, algorithms or hyperparameters. A critical part of the EDA is the detection and treatment of outliers. Outliers are observations that deviate strongly from the other data points in a random sample of a population.
In two previously published articles, I discussed how to detect different types of outliers using well-known statistical methods. One article focuses on univariate and the other on multivariate outliers.
In this final post, I want to discuss how to…
An Exploratory Data Analysis (EDA) is essential when working on data science projects. Understanding your underlying data, its nature, and structure can simplify decision making on features, algorithms, and hyperparameters. One crucial part of the EDA is the detection of outliers. Outliers are observations that are far away from the other data points in a random sample of a population.
In a previously posted article, I introduced statistical methods to detect univariate outliers commonly used in practice. In this post, I want to discuss what multivariate outliers are, how they can be detected, and visualized during EDA. …
In my job search process at the beginning of this year, I had the opportunity to meet a recruitment specialist from a renowned business school. Although his usual work involves dealing with business students, he shared some valuable advice with me, which, I believe, could benefit any job seeker. Please note that the examples I will use for illustration in this article are tailored to my personal job hunt in data science but can be exchanged by other roles.
At the top of your CV, implement an “About me” section consisting of 2–3 sentences where you briefly describe yourself, state…
An Explorative Data Analysis (EDA) is crucial when working on data science projects. Knowing your data inside and out can simplify decision making concerning the selection of features, algorithms, and hyperparameters. One essential part of the EDA is the detection of outliers. Simply said, outliers are observations that are far away from the other data points in a random sample of a population.
Because in data science, we often want to make assumptions about a specific population. Extreme values, however, can have a significant impact on conclusions drawn from data or machine learning models. …
There are many free resources on the web to become a better data scientist. One of them being podcasts. Podcasts are a useful source to learn from professionals with experience in the field, technical hacks, or to get used to the data science jargon. Most episodes are between 30–60 minutes and, therefore, a great companion for your daily trip to work or university, while working out or just in between.
You can find some great collections of the best podcasts in data science, engineering, and business already on Medium. While I find some of these podcasts very technical, often covering…
Data Scientist / Idea sharing / Learning & Personal Growth