Machine-learning on dirty data in Python: a tutorial

In data science, machine-learning applications often spend significant effort preparing, tidying, and cleaning the data before any learning takes place.

Here we give a set of Python tutorials showing how some of these operations can be simplified with appropriate machine-learning tools.

Machine learning with missing values
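
To give a flavor of the problem, here is a minimal sketch of one common baseline: fill each missing entry with the mean of the observed values and keep a mask recording which entries were missing, so that a downstream model can still use the fact that a value was absent. The function name `impute_mean_with_indicator` and the sample data are hypothetical, chosen only for illustration, and this is not necessarily the approach taken in the tutorial itself.

```python
import math


def impute_mean_with_indicator(column):
    """Fill missing entries (None or NaN) with the mean of the observed ones.

    Returns a pair (filled_values, missing_mask), where missing_mask[i] is
    True when column[i] was missing in the input.
    """
    def is_missing(value):
        return value is None or (isinstance(value, float) and math.isnan(value))

    mask = [is_missing(value) for value in column]
    observed = [value for value, missing in zip(column, mask) if not missing]
    if not observed:
        raise ValueError("column has no observed values to impute from")

    mean = sum(observed) / len(observed)
    filled = [mean if missing else value for value, missing in zip(column, mask)]
    return filled, mask


# Hypothetical example column with two missing entries.
ages = [22.0, None, 35.0, float("nan"), 27.0]
filled, mask = impute_mean_with_indicator(ages)
```

Keeping the mask alongside the filled values matters when missingness is itself informative; dropping it would hide that signal from the learner.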

Dirty categories: learning with non-normalized strings
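
As an illustration of what "dirty" categories involve, the sketch below encodes each string by its character 3-gram similarity (Jaccard overlap) to a few reference categories, so that typos and spelling variants land close to the category they resemble instead of becoming unrelated one-hot columns. The helper names (`char_ngrams`, `ngram_similarity`, `similarity_encode`) and the example strings are hypothetical, and the tutorial's own method may differ.

```python
def char_ngrams(text, n=3):
    """Return the set of character n-grams of a lowercased, space-padded string."""
    padded = f" {text.strip().lower()} "
    return {padded[i:i + n] for i in range(len(padded) - n + 1)}


def ngram_similarity(a, b, n=3):
    """Jaccard similarity between the character n-gram sets of two strings."""
    grams_a, grams_b = char_ngrams(a, n), char_ngrams(b, n)
    if not grams_a and not grams_b:
        return 1.0  # two empty strings are treated as identical
    return len(grams_a & grams_b) / len(grams_a | grams_b)


def similarity_encode(values, prototypes, n=3):
    """Encode each value as its vector of similarities to the prototypes."""
    return [[ngram_similarity(value, proto, n) for proto in prototypes]
            for value in values]


# Hypothetical reference categories and two misspelled entries.
prototypes = ["police officer", "firefighter"]
rows = similarity_encode(["Police Oficer", "fire fighter"], prototypes)
```

Each row of `rows` has one column per prototype; the misspelled entries score highest against the category they were meant to be, which is the property a learner can exploit.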
