{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Supervised Learning II\n", "by Mauricio Araya\n", "## (A.K.A. Supervised Learning Lab)\n", "\n", "Credit: Guillermo Cabrera, Matthew Graham, and scikit-learn\n", "\n", "(Yes... I will force you to code now...)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 1.- Regularized Regression... or riding alone\n", "\n", "### Objective\n", "* Understand the effect of the regularization parameters\n", "* Compare Ridge and Lasso\n", "* Warm up!\n", "\n", "### Regularization Reminder (slide)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Exercise\n", "### a) Forging your own Galaxy Photometry/Redshift dataset\n", "* Use my notebook from Tuesday to load `sdss_gal.csv`\n", "* Downsample the data (use $n=10000$ for example)\n", "* Divide it into training and test/validation sets\n", "* Select the 'u-g' feature (and hack it into a 2-D matrix, as scikit-learn expects)" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "import pandas as pd\n", "import numpy as np\n", "galaxy_feat = pd.read_csv('sdss_gal.csv', low_memory=False)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "### b) Use scikit-learn to perform a ridge regression\n", "* Now use a polynomial model (degree = 10 for example)\n", "* Plot your data and the fitted curve (use the `.predict()` function, not manually)\n", "* Plot the parameters of the regression using a bar plot" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "from sklearn.linear_model import Ridge, Lasso\n", "from sklearn.preprocessing import PolynomialFeatures" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "### c) See how regularization works\n", "* Pack the code into a 
function that receives the degree of the polynomial and the alpha parameter that regularizes Ridge.\n", "* This function should overplot the regression line on the data and, in a separate plot, show the parameters of the regression as a bar plot.\n", "* Use `interact` to play with the two values ($d \\in [2,15]$, $\\alpha \\in [0,1]$)\n", "* Do the same for LASSO regularization" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [], "source": [ "def train_and_plot(d=10, a=0.0):\n", "    pass" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 2.- Linear Regression... without training wheels\n", "\n", "### Objective\n", "\n", "As an exercise, let's do a linear regression without training wheels:\n", "* Left wheel: $x \\in \\mathbb{R}$ --> $\\mathbf{x} \\in \\mathbb{R}^n$ variable\n", "* Right wheel: `scikit-learn` package --> just `pandas`, `numpy` and `scipy`\n", "\n", "### Theory\n", "What is a linear model?\n", "$$ y = f(\\mathbf{x};\\mathbf{w}) = \\sum_j w_j \\phi_j(\\mathbf{x}) = \\mathbf{w}^\\textrm{T} \\boldsymbol{\\phi}(\\mathbf{x}) $$\n", "\n", "Examples:\n", "* Polynomial: $\\phi_j(\\mathbf{x}) = \\|\\mathbf{x}\\|^j$\n", "* Gaussian: $\\phi_j(\\mathbf{x}) = \\exp\\left\\{\\frac{- \\|\\mathbf{x} - \\boldsymbol{\\mu}_j\\|^2}{2s^2}\\right\\}$\n", "* Sigmoidal: $\\phi_j(\\mathbf{x}) = \\sigma\\left(\\frac{\\|\\mathbf{x} - \\boldsymbol{\\mu}_j\\|}{s}\\right)= \\frac{1}{1 + \\exp\\left( \\frac{-\\|\\mathbf{x} - \\boldsymbol{\\mu}_j\\|}{s}\\right)}$" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "28373cb2f3aa4eb297d6b28e9e3154d8", "version_major": 2, "version_minor": 0 }, "text/html": [ "
\n" ], "text/plain": [ "interactive(children=(FloatSlider(value=0.0, description='mu', max=2.0, min=-2.0), FloatSlider(value=1.0, description='s', max=10.0, min=0.1), Output()), _dom_classes=('widget-interact',))" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/plain": [ "
\n" ], "text/plain": [ "interactive(children=(FloatSlider(value=0.0, description='mu', max=2.0, min=-2.0), FloatSlider(value=1.0, description='s', max=10.0, min=0.1), Output()), _dom_classes=('widget-interact',))" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/plain": [ "
\n" ], "text/plain": [ "interactive(children=(IntSlider(value=1, description='w1', max=5, min=-5), FloatSlider(value=1.0, description='w2', max=1.0, min=0.001), FloatSlider(value=1.0, description='w3', max=1.0, min=0.001), Output()), _dom_classes=('widget-interact',))" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/plain": [ "