Celia Schahczenski
Museum 103, 496-4383

Office Hours: Mon & Fri 10:00-10:50am and Wed 4:00-4:50pm.
Also feel free to make an appointment at another time or to just drop by my office.

This class does not have a required text.
Material will be presented from the following:

Software: Python, Jupyter Notebook, RStudio, Weka

Meeting times and places:
Mon, Wed & Fri, 3:00-3:50pm, in CBB 112.


What is in this course?

In this course you will mine data using decision trees, rule-based systems, statistical approaches, instance-based approaches, linear techniques, and clustering. Using small sample datasets, you will explore simplified versions of these algorithms to get a solid understanding of what each involves. You will then apply data mining techniques to real-life data to see how effective they can be. You will also explore and report on data mining success stories, competitions, and specialized techniques. A substantial portion of this class is devoted to applying the data mining process (data acquisition, data cleaning, transformation and integration, feature extraction, data mining, and evaluation) to an area of your choosing, in order to gain insight into a question of your choice. You are free to use Python, R, Weka, or other data mining packages.


Activity                                                              Percentage
Exams (Sept. 27 & Nov. 1)                                             20%
Assignments (approximately weekly, includes a reflection paper)       15%
Presentations (case studies, data sets, and specialized data mining)  15%
Data mining competition workshop (done in pairs)                      10%
Project                                                               40%

Catalog description of the course:

Provides grounding in data mining techniques and prepares students to design, use, and evaluate these techniques in a variety of application domains and for the purpose of decision support. Topics covered include decision trees, rule based systems, statistical approaches, neural networks, and instance based approaches.