Published on: August 25, 2018

HAL8999 – 5/100

Today, while going back through Chapter 2 of the Hands On Machine Learning book, I learned that the CategoricalEncoder referenced in the section on handling categorical attributes still isn’t in scikit-learn. I checked the book’s requirements.txt, which pins scikit-learn to 0.19.1. My virtualenv has that version, so I should be good.
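A quick way to run that kind of check from inside the virtualenv might look like the sketch below. It only reads the installed version string and asks whether `sklearn.preprocessing` exposes a `CategoricalEncoder` attribute; the variable names are mine, not the book's.

```python
import sklearn
import sklearn.preprocessing

# Version string of whatever scikit-learn is installed in the active environment.
installed_version = sklearn.__version__

# True only if this build of scikit-learn actually ships a CategoricalEncoder class.
has_categorical_encoder = hasattr(sklearn.preprocessing, "CategoricalEncoder")

print("scikit-learn version:", installed_version)
print("CategoricalEncoder available:", has_categorical_encoder)
```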

Turns out the CategoricalEncoder isn’t going to be in scikit-learn until 0.20, so to get it you have to grab 0.20 from GitHub rather than just use pip.

Fucking hell… 

So, if you’re going to write a book, it’s probably a good idea to use the stable branch of your libraries rather than the bleeding edge dev branch.

It will be a good exercise to convert the book’s example code to work with the standard OneHotEncoder, but I’ve always been a fan of “just works” as a design principle.
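For reference, here is a minimal sketch of what that conversion could look like. It assumes a single column of string categories like the `ocean_proximity` attribute in the book's housing data; the sample rows are made up for illustration. Since the 0.19 OneHotEncoder expects integer input, the sketch goes through LabelEncoder first. It is one possible workaround, not the book's own code.

```python
import numpy as np
from sklearn.preprocessing import LabelEncoder, OneHotEncoder

# Made-up sample of a string-valued categorical column (illustrative only).
categories = np.array(["INLAND", "NEAR BAY", "INLAND", "NEAR OCEAN"])

# Step 1: map each string category to an integer code (classes are sorted).
label_encoder = LabelEncoder()
int_codes = label_encoder.fit_transform(categories)

# Step 2: expand the integer codes into one-hot columns.
# OneHotEncoder wants a 2D array, hence the reshape to a single column.
one_hot_encoder = OneHotEncoder()
one_hot_sparse = one_hot_encoder.fit_transform(int_codes.reshape(-1, 1))

# The result is a sparse matrix; convert to a dense array to inspect it.
one_hot_dense = one_hot_sparse.toarray()

print(label_encoder.classes_)
print(one_hot_dense)
```

Each row ends up with a single 1 in the column for its category, with columns in the order given by `label_encoder.classes_`.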

