CSCI 447
1: Run linear regression on the graduate admissions data.
Use Python in a Jupyter notebook to do this. Whether you choose to do this using AWS resources or on a local machine is up to you. The graduate admissions data has 7 independent variables (ignoring the serial number "feature") and 1 dependent/output variable. There are 500 observations. When running linear regression with the data, you should divide the dataset into training and testing sets.
The output of this should be the coefficients for a linear equation. Use these to make predictions with your test data set and report the results - that is, the mean squared error obtained against the test set.
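As a rough illustration of the workflow described above (split, fit, predict, report mean squared error), here is a minimal sketch using only NumPy. It is not the required approach: the data is a synthetic stand-in with the same shape as the admissions data (500 observations, 7 features), and the helper names, the 80/20 split, and the random seeds are all assumptions made for the example. A library such as scikit-learn would work equally well.

```python
import numpy as np

def train_test_split(X, y, test_fraction=0.2, seed=0):
    """Shuffle the rows and hold out a fraction of them for testing."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(X))
    n_test = int(round(len(X) * test_fraction))
    test_idx, train_idx = idx[:n_test], idx[n_test:]
    return X[train_idx], X[test_idx], y[train_idx], y[test_idx]

def fit_linear(X, y):
    """Ordinary least squares; returns [intercept, coef_1, ..., coef_k]."""
    A = np.column_stack([np.ones(len(X)), X])  # prepend an intercept column
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def predict_linear(coef, X):
    """Evaluate the fitted linear equation on new rows."""
    return coef[0] + X @ coef[1:]

def mean_squared_error(y_true, y_pred):
    return float(np.mean((y_true - y_pred) ** 2))

# Synthetic stand-in data (an assumption): replace X and y with the
# 7 feature columns and the output column of the real dataset.
rng = np.random.default_rng(42)
X = rng.normal(size=(500, 7))
true_coef = np.array([0.5, -0.2, 0.1, 0.0, 0.3, 0.7, -0.4])
y = 0.7 + X @ true_coef + rng.normal(scale=0.05, size=500)

X_train, X_test, y_train, y_test = train_test_split(X, y)
coef = fit_linear(X_train, y_train)  # fit on the training rows only
test_mse = mean_squared_error(y_test, predict_linear(coef, X_test))

print("intercept and coefficients:", coef)
print("test MSE:", test_mse)
```

The key point is that the coefficients are estimated from the training rows only, and the reported error comes from rows the model has not seen.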
2: Transform the graduate admissions data so that the output variable is binary - accept or reject.
If you use 0.73 as the cutoff value between accept and reject, you will get close to a 50/50 split between the two classifications.
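The thresholding step can be sketched as follows. The six sample values are made up for illustration, the function name is hypothetical, and treating a value exactly equal to the cutoff as "accept" is an assumption, since the assignment does not say which side the boundary falls on.

```python
import numpy as np

def binarize(chance_of_admit, cutoff=0.73):
    """Map a continuous output to 1 (accept) or 0 (reject).

    Values at or above the cutoff count as accept; that boundary
    choice is an assumption of this sketch.
    """
    return (np.asarray(chance_of_admit) >= cutoff).astype(int)

# Hypothetical output values, not taken from the real dataset.
chance = np.array([0.92, 0.76, 0.72, 0.80, 0.65, 0.73])
labels = binarize(chance)

print(labels)         # [1 1 0 1 0 1]
print(labels.mean())  # fraction of rows labelled "accept"
```

On the real data, checking `labels.mean()` is a quick way to confirm that the split is close to 50/50.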
3: Run logistic regression on the transformed data.
As with linear regression, use Python in a Jupyter notebook. Once again, you can choose to do this on AWS or locally. And once again, you should be building your model on the training part of the data.
The output of this is once again a set of coefficients, but these are to be used within the sigmoid transformation function. Run this on your test data set and report the results.
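To make the role of the sigmoid concrete, here is a minimal sketch of logistic regression fitted by plain gradient descent in NumPy. Everything specific in it is an assumption for illustration: the synthetic data, the learning rate and iteration count, the 400/100 split, the 0.5 decision threshold, and the use of accuracy as the reported result. A library implementation such as scikit-learn's `LogisticRegression` would be a reasonable alternative.

```python
import numpy as np

def sigmoid(z):
    """Squash a linear score into a probability in (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def fit_logistic(X, y, lr=0.1, n_iter=2000):
    """Gradient descent on the mean log-loss.

    Returns [intercept, coef_1, ..., coef_k].
    """
    A = np.column_stack([np.ones(len(X)), X])  # prepend an intercept column
    w = np.zeros(A.shape[1])
    for _ in range(n_iter):
        p = sigmoid(A @ w)             # current predicted probabilities
        grad = A.T @ (p - y) / len(y)  # gradient of the mean log-loss
        w -= lr * grad
    return w

def predict_proba(w, X):
    """Plug the coefficients into the sigmoid to get P(accept)."""
    return sigmoid(w[0] + X @ w[1:])

# Synthetic stand-in data (an assumption): 500 rows, 7 features, binary label.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 7))
score = X @ np.array([1.0, -0.5, 0.8, 0.0, 0.3, 1.2, -0.7])
y = (score + rng.normal(scale=0.5, size=500) >= 0).astype(int)

# The rows are already in random order, so a simple slice serves as the split.
X_train, X_test = X[:400], X[400:]
y_train, y_test = y[:400], y[400:]

w = fit_logistic(X_train, y_train)  # fit on the training rows only
pred = (predict_proba(w, X_test) >= 0.5).astype(int)
accuracy = float(np.mean(pred == y_test))

print("intercept and coefficients:", w)
print("test accuracy:", accuracy)
```

The synthetic features here are already on a common scale. With real features on very different scales, standardizing them first usually helps gradient descent converge.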
Page last updated: February 01, 2019