One of these things is not like the other…

Categories: Journal, Machine Learning
Published on: August 20, 2018

HAL8999 – 3/100

  • Chapter 2 of Hands on ML
    • Cost functions
    • virtualenv setup
    • code to get the dataset

The chapter follows a rudimentary machine learning project from business case to final product. California census data is analyzed to build a model that predicts the median housing price in a district from other factors. The model is a linear regression, and a Root Mean Square Error (RMSE) function is used to measure its performance, i.e. as a cost function.

\(\displaystyle \mathrm{RMSE}(X, h) = \sqrt{\frac{1}{m}\sum_{i=1}^{m}\left(h(x^{(i)}) - y^{(i)}\right)^{2}}\)

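The function h is the “hypothesis” function, which operates on the feature vector \(x^{(i)}\) of the i-th district to produce a prediction. \(y^{(i)}\) is the actual value for that district and m is the number of districts in the dataset. RMSE isn’t the only cost function by any stretch of the imagination, but it seems to get a lot of use.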
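To make the formula concrete, here is a minimal sketch in plain Python. The function name `rmse` and the sample numbers are made up for illustration; they don't come from the book or the housing dataset.

```python
import math


def rmse(predictions, labels):
    """Root Mean Square Error between predicted values h(x) and actual values y."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    m = len(labels)
    if m == 0:
        raise ValueError("need at least one instance")
    # Sum of squared errors (h(x_i) - y_i)^2, averaged over m, then square-rooted.
    squared_errors = [(p - y) ** 2 for p, y in zip(predictions, labels)]
    return math.sqrt(sum(squared_errors) / m)


# Hypothetical predictions and actual prices (in thousands of dollars).
predicted = [200.0, 150.0, 310.0]
actual = [210.0, 140.0, 300.0]
print(rmse(predicted, actual))  # every prediction is off by 10, so this prints 10.0
```

Because the errors are squared before averaging, one large miss raises RMSE more than several small ones.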

From this point the author goes through the dev environment setup process I went through a few days ago. It’s pretty clear from the instructions that the work is being done on a Mac.

The code to download the housing tarball is a little sloppy and would have downloaded it every time I ran the cell, so I added a simple test to only do the download if the file didn’t already exist.
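A check like that could look something like the sketch below. This is not the book's code. The function name `fetch_tarball`, its parameters, and the default filename `housing.tgz` are all assumptions made for illustration, and the URL has to be supplied by the caller.

```python
import os
import tarfile
import urllib.request


def fetch_tarball(url, dest_dir, filename="housing.tgz",
                  download=urllib.request.urlretrieve):
    """Download a tarball into dest_dir and extract it there.

    The download is skipped when the file is already on disk.
    Returns True if a download happened, False if the local copy was reused.
    """
    os.makedirs(dest_dir, exist_ok=True)
    tgz_path = os.path.join(dest_dir, filename)

    downloaded = False
    if not os.path.isfile(tgz_path):  # the "simple test": only fetch when missing
        download(url, tgz_path)
        downloaded = True

    # Extracting is cheap compared to the download, so do it on every call.
    with tarfile.open(tgz_path) as tgz:
        tgz.extractall(path=dest_dir)
    return downloaded


# Usage in a notebook cell (hypothetical URL and path):
# fetch_tarball("https://example.com/housing.tgz", os.path.join("datasets", "housing"))
```

The existence check only looks at whether the file is there, so a partially downloaded tarball would need to be deleted by hand before rerunning the cell.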

