Office Hours:
Texts:
This class does not have a required text.
Material will be presented from the following:
- Data Mining, 4th edition, by Ian H. Witten, Eibe Frank, Mark A. Hall and Christopher Pal
- Practical Statistics for Data Scientists: 50 Essential Concepts by Peter Bruce & Andrew Bruce
- An Introduction to R, 2nd edition, by W.N. Venables, D.M. Smith and the R Development Core Team
Software:
Python,
Jupyter Notebook,
RStudio,
Weka
Meeting times and places:
Mon, Wed & Fri, 3:00-3:50pm, in CBB 112.
Prerequisites:
- CSCI 114 (Programming with C#), CSCI 117 (Programming with Matlab) or CSCI 135 (Fundamentals of Computer Science), and
- CSCI 340 (Database Design) or BMIS 375 (Data Analytics), and
- M 141 (Math for Business & Social Science) or higher
What is in this course?
In this course you will mine data using decision trees, rule-based systems, statistical approaches, instance-based approaches, linear techniques and clustering. Using small sample datasets, you will explore simplified versions of these algorithms to get a solid understanding of what each involves. You will then apply data mining techniques to real-life data to see how effective they can be. You will also explore and report on data mining success stories, competitions and specialized techniques. A substantial portion of the class is devoted to applying the data mining process (data acquisition, data cleaning, transformation and integration, feature extraction, data mining and evaluation) to an area of your choosing, in order to gain insight into a question of your choice. You are free to use Python, R, Weka or other data mining packages.
Grading:
Activity | Percentage |
---|---|
Exams (Sept. 27 & Nov. 1) | 20% |
Assignments (approximately weekly, includes a reflection paper) | 15% |
Presentations (case studies, data sets, and specialized data mining) | 15% |
Data mining competition workshop (done in pairs) | 10% |
Project | 40% |
Catalog description of the course:
Provides grounding in data mining techniques and prepares students to design, use, and evaluate these techniques in a variety of application domains and for the purpose of decision support. Topics covered include decision trees, rule-based systems, statistical approaches, neural networks, and instance-based approaches.