Assignment 2 - Linear and Logistic Regression

CSCI 447
Machine Learning
Spring 2019


ASSIGNMENT 2

The goal of this assignment is to gain experience coding the linear and logistic regression algorithms. It is due Wed. 2/6/19 at midnight. As before, if you encounter problems, let me know. You can use the NumPy library for the linear algebra functions, but I expect you to write your own code for the regression models rather than use the scikit-learn library.

1: Run linear regression on the graduate admissions data.
Use Python in a Jupyter notebook to do this. Whether you choose to do this using AWS resources or on a local machine is up to you. The graduate admissions data has 7 independent variables (ignoring the serial number "feature") and 1 dependent/output variable. There are 500 observations. Before running linear regression, divide the dataset into training and testing sets, and fit the model on the training set only.
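One way to do the split with NumPy alone is sketched below. The function name `train_test_split`, the 80/20 ratio, and the fixed seed are illustrative assumptions, not requirements of the assignment, and the random arrays only stand in for the real data (500 rows, 7 features).

```python
import numpy as np

def train_test_split(X, y, test_fraction=0.2, seed=0):
    """Shuffle the row indices once, then split them into train and test parts.

    The name, the 80/20 default, and the seed are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(X))            # random ordering of row indices
    n_test = int(round(test_fraction * len(X)))
    test_idx, train_idx = order[:n_test], order[n_test:]
    return X[train_idx], X[test_idx], y[train_idx], y[test_idx]

# Placeholder arrays with the same shape as the admissions data.
X = np.random.default_rng(1).random((500, 7))
y = np.random.default_rng(2).random(500)

X_train, X_test, y_train, y_test = train_test_split(X, y)
print(X_train.shape, X_test.shape)
```

Fixing the seed keeps the split reproducible, so the reported test error does not change from one notebook run to the next.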

The output should be the coefficients of a linear equation. Use them to make predictions on your test data set and report the result, that is, the mean squared error obtained against the test set.
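As a rough sketch of what this step could look like, the code below fits the coefficients by least squares using `np.linalg.lstsq` and computes the test mean squared error. The function names are hypothetical, the data is synthetic rather than the admissions file, and gradient descent would be an equally valid way to fit the model.

```python
import numpy as np

def fit_linear(X, y):
    """Return least-squares coefficients; entry 0 is the intercept."""
    A = np.column_stack([np.ones(len(X)), X])   # prepend a column of ones
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def predict_linear(coef, X):
    """Evaluate the linear equation intercept + X . weights."""
    return coef[0] + X @ coef[1:]

def mean_squared_error(y_true, y_pred):
    return float(np.mean((y_true - y_pred) ** 2))

# Synthetic stand-in data: 500 rows, 7 features, known coefficients plus noise.
rng = np.random.default_rng(0)
X = rng.random((500, 7))
true_coef = np.array([0.5, 1.0, -2.0, 0.0, 3.0, 0.25, -1.0, 2.0])
y = true_coef[0] + X @ true_coef[1:] + rng.normal(scale=0.01, size=500)

coef = fit_linear(X[:400], y[:400])             # fit on the training rows only
test_mse = mean_squared_error(y[400:], predict_linear(coef, X[400:]))
print("coefficients:", coef.round(3))
print("test MSE:", test_mse)
```

On the synthetic data the recovered coefficients should land close to `true_coef`, which is a quick sanity check before switching to the real data.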

2: Transform the graduate admissions data so that the output variable is binary - accept or reject.
If you use 0.73 as the cutoff value between accept and reject, you will get close to a 50/50 split between the two classifications.
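The transformation can be a single thresholding step, as in this sketch. The function name `to_binary` is made up for illustration, and treating a value exactly equal to the cutoff as "accept" is an assumption; the assignment only gives 0.73 as the cutoff.

```python
import numpy as np

CUTOFF = 0.73  # cutoff suggested in the assignment text

def to_binary(chance_of_admit, cutoff=CUTOFF):
    """Map each value to 1 (accept) or 0 (reject).

    Assumption: a value equal to the cutoff counts as accept.
    """
    return (np.asarray(chance_of_admit) >= cutoff).astype(int)

labels = to_binary([0.50, 0.73, 0.92])
print(labels)
```

Checking `labels.mean()` on the full output column is an easy way to confirm the roughly 50/50 split.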

3: Run logistic regression on the transformed data.
As with linear regression, use Python in a Jupyter notebook. Once again, you can choose to do this on AWS or locally, and once again you should build your model on the training part of the data only.

The output of this is once again a set of coefficients, but these are to be used within the sigmoid transformation function. Run the model on your test data set and report the results.
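A minimal sketch of logistic regression trained by batch gradient descent on the log-loss is shown below. The function names, learning rate, iteration count, 0.5 decision threshold, and the use of accuracy as the reported result are all assumptions made for illustration, and the data is synthetic.

```python
import numpy as np

def sigmoid(z):
    """Sigmoid transformation; clipping avoids overflow in exp for extreme z."""
    return 1.0 / (1.0 + np.exp(-np.clip(z, -500, 500)))

def fit_logistic(X, y, learning_rate=0.1, n_iterations=2000):
    """Batch gradient descent on the mean log-loss; entry 0 is the intercept."""
    A = np.column_stack([np.ones(len(X)), X])
    w = np.zeros(A.shape[1])
    for _ in range(n_iterations):
        p = sigmoid(A @ w)                     # predicted probabilities
        gradient = A.T @ (p - y) / len(y)      # gradient of the mean log-loss
        w -= learning_rate * gradient
    return w

def predict_logistic(w, X, threshold=0.5):
    """Pass the linear combination through the sigmoid, then threshold it."""
    prob = sigmoid(w[0] + X @ w[1:])
    return (prob >= threshold).astype(int)

# Synthetic stand-in data: two features, label is 1 when their sum is positive.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 2))
y = (X[:, 0] + X[:, 1] > 0).astype(int)

w = fit_logistic(X[:400], y[:400])             # fit on the training rows only
accuracy = float(np.mean(predict_logistic(w, X[400:]) == y[400:]))
print("coefficients:", w.round(3))
print("test accuracy:", accuracy)
```

With real features on very different scales, a fixed learning rate often converges poorly, so standardizing each column using the training set's mean and standard deviation before fitting is usually worth trying.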

Ideally, I would like to see this all in one notebook, just so it's easier to see all your work.

Submission: If you have done your work on your local machine, please submit the .ipynb file(s) and the transformed graduate admissions data file to the Moodle dropbox. If you have done it on AWS, I should be able to access your work.

Page last updated: February 01, 2019