CSCI 447
Machine Learning
Spring 2019

Montana Technological University
Computer Science & Software Engineering



ASSIGNMENT 4

The goal of this assignment is to gain experience learning the parameters (probabilities) and structure of a Bayesian network. Due date is Mon. 4/15, midnight. If you encounter problems, let me know. You may use numpy and other libraries, but I expect you to write your own code for learning the parameters and structure. The output of your program need not be graphical, but it should indicate the structure of the network and the CPTs associated with each variable.


1: Transform the graduate admissions data so that the data is discrete.
As before, using 0.73 as the cutoff value between accept and reject gives close to a 50/50 split between the two classifications and makes the outcome variable discrete. The midpoints for turning the other variables into binary discrete variables are:
GRE: Range 290-340, midpoint 315
TOEFL: Range 92-120, midpoint 106
URating: Range 1-5, 1-2 good, 3-5 fair
SOP: Range 1-5, midpoint 3
LOR: Range 1-5, midpoint 3
CGPA: Range 6.8-9.92, midpoint 8.36
Research: Range 0-1, already discrete
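
The cutoffs above can be applied with a few lines of code. The following is only a minimal sketch in plain Python: the column names (GRE, TOEFL, URating, SOP, LOR, CGPA, Research, Admit) are hypothetical and should be adjusted to the headers in your data file, the example record is made up for illustration, and sending values exactly at a midpoint to the upper bin is an assumption you are free to change.

```python
# Hypothetical column names; adjust to match the headers in your data file.
CUTOFFS = {
    "GRE": 315,
    "TOEFL": 106,
    "SOP": 3,
    "LOR": 3,
    "CGPA": 8.36,
    "Admit": 0.73,  # outcome: chance of admit, cut at 0.73
}

def discretize_row(row):
    """Map one record of raw values to binary values (0 or 1)."""
    out = {}
    for name, cutoff in CUTOFFS.items():
        # Assumption: a value equal to the cutoff goes in the upper bin.
        out[name] = 1 if float(row[name]) >= cutoff else 0
    # URating uses the grouping listed above: 1-2 is "good" (1), 3-5 is "fair" (0).
    out["URating"] = 1 if float(row["URating"]) <= 2 else 0
    # Research is already 0/1, so it is only converted to int.
    out["Research"] = int(row["Research"])
    return out

# Made-up record, for illustration only.
example = {"GRE": 337, "TOEFL": 118, "URating": 4, "SOP": 4.5,
           "LOR": 4.5, "CGPA": 9.65, "Research": 1, "Admit": 0.92}
print(discretize_row(example))
```

Applying discretize_row to every record (for example, rows read with csv.DictReader) yields the transformed data file requested in the submission.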

2: Write code to generate structure from the graduate admissions data.
Use Python in a Jupyter notebook to do this. Please use a local machine, since I don't have any AWS credit left. You should come up with some way to generate initial configurations; the most widely accepted approach is to generate a minimum spanning tree, evaluate edge directionality from that, and then generate incremental changes to the tree and evaluate them. Use the scoring function with the MDL penalty term that we went over in class to evaluate your structures. In doing this, you will be generating learned CPTs for each of your variables in the context of the structure you have defined.
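
One possible way to build the initial spanning tree is sketched below. It assumes a Chow-Liu style construction: each pair of variables is weighted by its empirical mutual information and the heaviest edges are kept, which is the same as a minimum spanning tree on the negated weights. This is an assumption about what "minimum spanning tree" means here, not necessarily the construction from class. The function names and the toy data are hypothetical, and the result is an undirected skeleton only; choosing edge directions and making incremental changes are left to you.

```python
import math
from collections import Counter
from itertools import combinations

def mutual_information(rows, a, b):
    """Empirical mutual information (in nats) between columns a and b."""
    n = len(rows)
    joint = Counter((r[a], r[b]) for r in rows)
    count_a = Counter(r[a] for r in rows)
    count_b = Counter(r[b] for r in rows)
    mi = 0.0
    for (x, y), c in joint.items():
        # p(x,y) * log(p(x,y) / (p(x) * p(y))), written in terms of counts
        mi += (c / n) * math.log(c * n / (count_a[x] * count_b[y]))
    return mi

def spanning_tree(rows, variables):
    """Kruskal's algorithm over edges sorted by decreasing mutual information."""
    edges = sorted(
        ((mutual_information(rows, a, b), a, b)
         for a, b in combinations(variables, 2)),
        reverse=True,
    )
    parent = {v: v for v in variables}  # union-find forest

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path compression
            v = parent[v]
        return v

    tree = []
    for _, a, b in edges:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:  # the edge joins two components, so no cycle
            parent[root_a] = root_b
            tree.append((a, b))
    return tree

# Toy data, for illustration only: B copies A, C is nearly independent.
toy_rows = [
    {"A": 0, "B": 0, "C": 0},
    {"A": 0, "B": 0, "C": 1},
    {"A": 1, "B": 1, "C": 0},
    {"A": 1, "B": 1, "C": 1},
    {"A": 1, "B": 1, "C": 1},
]
print(spanning_tree(toy_rows, ["A", "B", "C"]))
```

With n variables the tree has n - 1 edges. Picking a root and pointing every edge away from it gives one candidate directed structure to score.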

The output should be the final structure, the conditional probabilities associated with each node in the network, and the score for this structure.
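
To make the expected output concrete, here is a hedged sketch of learning CPTs by counting and scoring a structure. The scoring function from class is not reproduced here, so the code assumes one common MDL form: log-likelihood minus (log N / 2) times the number of free parameters, with higher scores being better. Check it against your notes before relying on it. The function names, the dictionary representation of a structure (each variable mapped to a tuple of its parents), and the toy data are all hypothetical.

```python
import math
from collections import Counter

def learn_cpts(rows, structure):
    """Maximum-likelihood CPTs; structure maps each variable to a tuple of parents."""
    cpts = {}
    for var, parents in structure.items():
        joint = Counter((tuple(r[p] for p in parents), r[var]) for r in rows)
        parent_counts = Counter(tuple(r[p] for p in parents) for r in rows)
        # P(var = x | parents = pv) = count(pv, x) / count(pv)
        cpts[var] = {(pv, x): c / parent_counts[pv]
                     for (pv, x), c in joint.items()}
    return cpts

def mdl_score(rows, structure, cpts):
    """Assumed form: log-likelihood - (log N / 2) * number of free parameters."""
    n = len(rows)
    loglik = 0.0
    for r in rows:
        for var, parents in structure.items():
            pv = tuple(r[p] for p in parents)
            loglik += math.log(cpts[var][(pv, r[var])])
    # A binary variable needs one free parameter per parent configuration.
    n_params = sum(2 ** len(parents) for parents in structure.values())
    return loglik - 0.5 * math.log(n) * n_params

# Toy data and structure (A -> B), for illustration only.
toy_rows = [{"A": 0, "B": 0}, {"A": 0, "B": 0},
            {"A": 1, "B": 1}, {"A": 1, "B": 0}]
toy_structure = {"A": (), "B": ("A",)}

toy_cpts = learn_cpts(toy_rows, toy_structure)
toy_score = mdl_score(toy_rows, toy_structure, toy_cpts)

for var, parents in toy_structure.items():
    print(var, "<-", parents)
    for (pv, x), p in sorted(toy_cpts[var].items()):
        print(f"  P({var}={x} | {dict(zip(parents, pv))}) = {p:.3f}")
print("score:", toy_score)
```

Parent configurations that never occur in the data get no CPT entry in this sketch; you may want to add smoothing or handle them explicitly in your own code.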

As you write your code, please use the Jupyter notebook capability of "commenting" sections of your code to explain the intent of each step.


Submission: Please submit the .ipynb file(s) and transformed graduate admissions data file to the Moodle dropbox for Assignment 4.



Page last updated: April 10, 2019