Data Mining Course Syllabus
Author: Administrator  Date: 2017-04-11

Course Name

Data Mining


Prof. Juanying Xie

Course Type

Elective Course

Prerequisite Courses

Statistics, Machine Learning, Pattern Recognition, Database, Artificial Intelligence


Computer Science

Learning Method

Mentoring, discussion, and programming


1st semester

1. Objectives & Requirements

Data mining is a multidisciplinary subject that draws on computer science, mathematics, and statistics. It is widely used in many fields, including bioinformatics, biomedical sciences, biochemistry, biogeography, financial data analysis, and medical data analysis, as well as in areas of computer science such as the very active research fields of big data analysis and big search over cyberspace.

Data mining provides many techniques for uncovering hidden patterns and previously unknown, potentially useful knowledge in vast amounts of data. This course explores the concepts and techniques of knowledge discovery and data mining. It gives broad coverage of the related methods, from the classic topics of clustering and classification, through database methods (e.g., association rules, data cubes), to more recent and advanced topics (e.g., SVD/PCA, wavelets, support vector machines). The objective of the course is to establish the fundamental concepts of data mining and to teach students how to analyze data using these methods.

The course will be taught in English. All graduate students, both master's and PhD students, are welcome to take it as an elective, whether their major is related to computer science or they are interested in data mining techniques and plan to use them to analyze data in their own research fields. The prerequisite courses for Data Mining are Statistics, Machine Learning, Pattern Recognition, Database, and Artificial Intelligence.

2. Topics and the specific contents to be covered in this course

The course covers the following core topics in knowledge discovery and data mining.

1. Introduction

1.1. Why Data Mining?

1.2. What Is Data Mining?

1.3. What Kinds of Data Can Be Mined?

1.4. What Kinds of Patterns Can Be Mined?

1.5. Which Technologies Are Used?

1.6. Which Kinds of Applications Are Targeted?

1.7. Major Issues in Data Mining

1.8. Summary

2. Getting to Know Your Data

2.1. Data Objects and Attribute Types

2.2. Basic Statistical Descriptions of Data

2.3. Data Visualization

2.4. Measuring Data Similarity and Dissimilarity

2.5. Summary

3. Data Preprocessing

3.1. Data Preprocessing: An Overview

3.2. Data Cleaning

3.3. Data Integration

3.4. Data Reduction

3.5. Data Transformation and Data Discretization

3.6. Summary

4. Data Warehousing and Online Analytical Processing

4.1. Data Warehouse: Basic Concepts

4.2. Data Warehouse Modeling: Data Cube and OLAP

4.3. Data Warehouse Design and Usage

4.4. Data Warehouse Implementation

4.5. Data Generalization by Attribute-Oriented Induction

4.6. Summary

5. Data Cube Technology

5.1. Data Cube Computation: Preliminary Concepts

5.2. Data Cube Computation Methods

5.3. Processing Advanced Kinds of Queries by Exploring Cube Technology

5.4. Multidimensional Data Analysis in Cube Space

5.5. Summary

6. Mining Frequent Patterns, Associations, and Correlations

6.1. Basic Concepts

6.2. Frequent Itemset Mining Methods

6.3. Which Patterns Are Interesting?—Pattern Evaluation Methods

6.4. Summary

7. Advanced Pattern Mining

7.1. Pattern Mining: A Road Map

7.2. Pattern Mining in Multilevel, Multidimensional Space

7.3. Constraint-Based Frequent Pattern Mining

7.4. Mining High-Dimensional Data and Colossal Patterns

7.5. Mining Compressed or Approximate Patterns

7.6. Pattern Exploration and Application

7.7. Summary

8. Classification: Basic Concepts

8.1. Basic Concepts

8.2. Decision Tree Induction

8.3. Bayes Classification Methods

8.4. Rule-Based Classification

8.5. Model Evaluation and Selection

8.6. Techniques to Improve Classification Accuracy

8.7. Summary

9. Classification: Advanced Methods

9.1. Bayesian Belief Networks

9.2. Classification by Backpropagation

9.3. Support Vector Machines

9.4. Classification Using Frequent Patterns

9.5. Lazy Learners (or Learning from Your Neighbors)

9.6. Other Classification Methods

9.7. Additional Topics Regarding Classification

9.8. Summary

10. Cluster Analysis

10.1. Cluster Analysis

10.2. Partitioning Methods

10.3. Hierarchical Methods

10.4. Density-Based Methods

10.5. Grid-Based Methods

10.6. Evaluation of Clustering

10.7. Summary

11. Advanced Cluster Analysis

11.1. Probabilistic Model-Based Clustering

11.2. Clustering High-Dimensional Data

11.3. Clustering Graph and Network Data

11.4. Clustering with Constraints

11.5. Summary

12. Outlier Detection

12.1. Outliers and Outlier Analysis

12.2. Outlier Detection Methods

12.3. Statistical Approaches

12.4. Proximity-Based Approaches

12.5. Clustering-Based Approaches

12.6. Classification-Based Approaches

12.7. Mining Contextual and Collective Outliers

12.8. Outlier Detection in High-Dimensional Data

12.9. Summary

13. Data Mining Trends and Research Frontiers

13.1. Mining Complex Data Types

13.2. Other Methodologies of Data Mining

13.3. Data Mining Applications

13.4. Data Mining and Society

13.5. Data Mining Trends

13.6. Summary

3. Textbook

Jiawei Han, Micheline Kamber, & Jian Pei. Data Mining: Concepts and Techniques (3rd edition). Morgan Kaufmann Publishers, 2012.

4. References

  1. Pang-Ning Tan, Michael Steinbach, & Vipin Kumar. Introduction to Data Mining. Pearson Education, Inc., 2006.

  2. David J. Hand, Heikki Mannila, & Padhraic Smyth. Principles of Data Mining (Adaptive Computation and Machine Learning). MIT Press, 2001.

  3. Mohammed J. Zaki & Wagner Meira Jr. Data Mining and Analysis: Fundamental Concepts and Algorithms. Cambridge University Press, 2014.

  4. Jiawei Han & Micheline Kamber. Data Mining: Concepts and Techniques (2nd edition). Morgan Kaufmann Publishers, 2006.

5. Course Evaluation (Tentative)

Assignments       30%

Course Project    40%

Exam              30%