Syllabus for DTM-535

Data Mining and Knowledge Management


COURSE DESCRIPTION

This course introduces students to data mining and knowledge management. Data mining (DM) is concerned with the discovery of “hidden” knowledge in large data sets. This knowledge represents one aspect of an organization’s intellectual capital and is often expressed in the form of trends or major themes that recur in the data. Knowledge management (KM) systems are designed to exploit the results of data mining and facilitate the analysis and evaluation of both tangible and intangible knowledge assets. In this course, students will explore data mining methods used for prediction and knowledge discovery. These methods include regression, nearest neighbor, clustering, K-means, decision trees, association rules, and neural networks. In addition, students will become familiar with the current theories, practices, tools, and techniques used to manage knowledge assets.

COURSE TOPICS

COURSE OBJECTIVES

After completing this course, you should be able to:

CO1        Explain data mining concepts, principles, and tasks.

CO2        Implement efficient algorithms for preprocessing large data sets.

CO3        Generate a plan to design and implement the different phases of the data mining process.

CO4        Explore large data sets to discover patterns using existing algorithms.

CO5        Apply data classification techniques to analyze data sets.

CO6        Identify and implement clustering algorithms pertaining to data mining.

CO7        Evaluate analytical problems in various areas of computational data analysis and knowledge management.

CO8        Use an open source data mining tool to analyze data sets.

COURSE MATERIALS

You will need the following materials to complete your coursework. Some course materials may be free, open source, or available from other providers. You can access free or open-source materials by clicking the links provided below or in the module details documents. To purchase course materials, please visit the University's textbook supplier.

Required Textbook

ISBN-13: 978-0321321367

Textbook Resources from Authors

Required Open Source Software

COURSE STRUCTURE

Data Mining and Knowledge Management is a three-credit, online course consisting of six modules. Modules include an overview, topics, learning objectives, study materials, and activities. Module titles are listed below.

ASSESSMENT METHODS

For your formal work in the course, you are required to participate in online discussion forums, complete written assignments, and finish programming projects. See below for details.

Consult the Course Calendar for due dates.

Promoting Originality

One or more of your course activities may utilize a tool designed to promote original work and evaluate your submissions for plagiarism. More information about this tool is available in this document.

Discussion Forums

You will be required to participate in ten graded online discussion assignments. There is also one ungraded but required Introductions Forum in Module 1.

Discussion forums are on a variety of topics associated with the course modules. The purpose of the discussion forums is to help make the connection between the course concepts and the goals of the course. In discussion posts, you express your opinions and thoughts, provide support and evidence for the position(s) you take on a subject, and have the opportunity to ask questions and expand on insights provided by your classmates. Active participation is vital to your overall success in this course.

The rubric used to grade all online discussion assignments is located in the Evaluation Rubrics section of the course website.

Written Assignments

You are required to complete six written assignments. The written assignments are on a variety of topics associated with the course modules.

Programming Assignments

You are required to complete three programming projects using R (an open source programming language and software environment for statistical computing and graphics) and RStudio (a set of integrated tools designed to help you be more productive with R). Both R and RStudio are widely used for data mining and analysis. You will use them to generate reports on the statistical properties of a data set, to implement a classification technique on different data sets, and to apply a clustering technique to another data set.

GRADING AND EVALUATION

Your grade in the course will be determined as follows:

All activities will receive a numerical grade of 0–100. You will receive a score of 0 for any work not submitted. Your final grade in the course will be a letter grade. Letter grade equivalents for numerical grades are as follows:

A    =  93–100          B   =  83–87
A–   =  90–92           C   =  73–82
B+   =  88–89           F   =  Below 73

To receive credit for the course, you must earn a letter grade of C or higher on the weighted average of all assigned course work (e.g., assignments, discussion postings, projects). Graduate students must maintain a B average overall to remain in good academic standing.

STRATEGIES FOR SUCCESS

First Steps to Success

To succeed in this course, take the following first steps:

Study Tips

Consider the following study tips for success:

ACADEMIC POLICIES

To ensure success in all your academic endeavors and coursework at Thomas Edison State University, familiarize yourself with all administrative and academic policies including those related to academic integrity, course late submissions, course extensions, and grading policies.

For more, see:

Copyright © 2017 by Thomas Edison State University. All rights reserved.