ITCS6162/ITCS8162

Knowledge Discovery in Databases - KDD


Prerequisites: ITCS6160, full graduate standing or consent of the department.
Textbook: "Introduction to Data Mining", by Pang-Ning Tan, Michael Steinbach, Vipin Kumar, Addison Wesley.


Course Outline

Lectures

Association Rules (PPT format)

Association Rules (Video by L. Powell)

Classification Trees (PPT format)

Classification Trees (Video by L. Powell)

Granular Computing (PPT format)

Reducts and Discretization (PPT format)

Reducts (Video by L. Powell)

Discretization (Video by L. Powell)

LERS/ERID (PPT format)

Mining Incomplete Data (PPT format)

Chase Algorithms (PPT format)

Action Rules and Meta-Actions (PPT format)

Sample Problems (Midterm Exam) (WORD format)

Midterm Exam (WORD format)

Clustering Methods (PPT format)

TV Trees (PPT format)

Clustering I (Textbook) (PPT format)

Clustering I (Video by L. Powell)

Clustering II (Textbook) (PPT format)

Clustering - Sample problems (WORD format)

Evaluation Methods (PPT format)

Temporal DB Mining (PPT format)


Sample Problems with Solutions (PDF format)

Sample Problems II (WORD format)

Solutions (WORD format)

Sample Problems III (WORD format)



FINAL


Sample Problems (Final Exam) (WORD format)


A Study Report is required for ADKD Certificate Program students. It should be submitted to ras@uncc.edu any time before the Final exam.


Project (maximum 4 students in a group): Use Lisp Miner (http://lispminer.vse.cz/) to extract 15-20 action rules from a dataset of your choice. The dataset should contain a minimum of 500 tuples. A large variety of datasets is available at http://kdd.ics.uci.edu/ or https://archive.ics.uci.edu/ml/datasets.html
Lisp Miner manual: https://webpages.uncc.edu/ras/Paper-AR.pdf
Your documentation should explain the action rule discovery process used by Lisp Miner and provide the meaning/interpretation of each discovered action rule.
Deadline for project submission: November 30, 2018


Rubrics will be used for grading the Study Report and the Project: Rubrics


Midterm: October 17
Final: December 12 (Wednesday), 2:00pm-4:30pm
Alternate Final (for students having two overlapping exams): December 13 (Thursday), 2:00-4:30pm, COED 038
Points for MS/PhD students: 30 points Test, 30 points Final, 30 points Project, 10 points Attendance
Points for ADKD students (not in MS/PhD Program): 30 points Test, 30 points Final, 30 points Project + Study Report, 10 points Attendance
Grades: A [90-100], B [80-89], C [65-79]




Instructor: Zbigniew W. Ras

Office:
Location: Woodward Hall 430C
Telephone: 704-687-8574
Office Hours: Tuesday: 10:00am-12:00pm, 2:00-3:00pm
e-mail: ras@uncc.edu



GTA: Yuehua Duan

Office:
Location: KDD Lab. (Woodward Hall 402)
Telephone: 704-687-8546
Office Hours: Thursday, 12:00-4:00pm
e-mail: yduan2@uncc.edu


Lisp Miner (by Jan Rauch)

Lisp Miner Manual (by Jan Rauch's student)

Rough Set Exploration System (RSES)

Bratko's ORANGE

Random Forests

WEKA

iAQ

More software for data mining

Repository of large datasets

Medical Data

GMU KDD Software

LERS vs ERID (WORD format)

Extracting Rules from Incomplete Table (WORD format)

Lance & Williams Distance (WORD format)