Syllabus
Note: The contents may change during the semester.

Academic Year 2020


31691012 

△Advanced Knowledge Discovery in Databases (E)
2 Units  Fall Semester  Kyotanabe  Lecture

  大崎 美穂

<Course Content Summary>

This course covers the underlying concepts, principles, and theories of Machine Learning (ML) and Knowledge Discovery (KD). It consists of four parts (I to IV). Each part explains individual ML/KD methods and discusses the relationships among them from a unifying view to support systematic understanding. 
 
Part I: Unsupervised Learning 
This part focuses on Clustering and Component/Factor Analysis, and covers K-Means, Principal Component Analysis, Factor Analysis, Spectral Clustering, Kernel Principal Component Analysis, and related methods. 
 
Part II: Supervised Learning 
This part focuses on the conceptual and theoretical background of Supervised Learning, and covers Model Complexity, Statistical Decision Theory, the Curse of Dimensionality, and related topics. 
 
Part III: Basic Methods for Regression and Classification 
This part focuses on basic supervised learning methods, and covers Linear Regression, Ridge Regression, Lasso, Linear Discriminant Analysis, Logistic Regression, K-Means-based Classification, and K-Nearest-Neighbors. It also covers standard objective functions, including Likelihood, 0-1 Loss, and Hinge Loss, and the relationships among them. 
 
Part IV: Advanced Methods for Regression and Classification 
This part focuses on advanced supervised learning methods. It discusses how to make models nonlinear (Prototypes, Kernelization, and Neural Networks) and how to control the complexity of models (Bias-Variance Decomposition, Information Criteria, and Cross Validation). 
 
Part IV then covers Support Vector Machines, Neural Networks (NN), Deep NNs, Convolutional NNs, Recurrent NNs, and related models. Furthermore, Explainability is discussed by comparing NN-based models with clear-box models such as the Autoregressive Model and the Autoregressive Conditional Heteroscedastic Model. 

<Goals, Aims>

Students are expected to understand the underlying concepts, principles, and theories of Machine Learning (ML) and Knowledge Discovery (KD), and to acquire the ability to apply ML/KD methods to various real-world problems. 
 

<Schedule>

(Week)  (Contents)  (Assignments)
(Week) 01  (Contents) ******************** 
Introduction 
1. Course Information 
2. What are ML/KD? 
3. Key Concepts of ML/KD (Model Structure, Objective Function, and Optimization) 
4. Key Concepts of ML/KD (Aspects of ML/KD) 
5. Key Concepts of ML/KD (Supervised and Unsupervised Learning) 
 
(Assignments) Read the textbook and prepare for the quizzes, QA, midterm report, and final report (about 1 to 2 hours). The same applies to the following classes. 
 
(Week) 02  (Contents) ******************** 
Introduction (Cont'd) 
6. Key Concepts of ML/KD (A Unifying View of Individual Methods) 
Part I-1: Unsupervised Learning (Clustering) 
1. Variable Types and Terminology 
2. Clustering (K-means) 
3. Clustering (Gaussian Mixture) 
4. Quiz on Today's Topics 
 
(Assignments)  
(Week) 03  (Contents) ******************** 
Part I-2: Unsupervised Learning (Component/Factor Analysis) 
1. Component/Factor Analysis (PCA) 
2. Component/Factor Analysis (FA, ICA) 
3. Nonlinear Component/Factor Analysis (Spectral Clustering, Kernel PCA) 
4. Quiz on Today's Topics 
 
(Assignments)  
(Week) 04  (Contents) ******************** 
Part II-1: Supervised Learning (Types of Prediction, Model Complexity) 
0. Review of the Quizzes in Lectures 01 to 03 
1. What are Regression and Classification? 
2. Insight Given by Two Simple Approaches, Linear Model and Nearest Neighbors (Model Complexity) 
3. Quiz on Today's Topics 
 
(Assignments)  
(Week) 05  (Contents) ******************** 
Part II-2: Supervised Learning (Statistical Decision Theory, Curse of Dimensionality) 
1. Insight Given by Two Simple Approaches, Linear Model and Nearest Neighbors (Statistical Decision Theory, Likelihood) 
2. Insight Given by Two Simple Approaches, Linear Model and Nearest Neighbors (Curse of Dimensionality) 
3. Quiz on Today's Topics 
 
(Assignments)  
(Week) 06  (Contents) ******************** 
Part III-1: Basic Methods for Regression and Classification (Linear Regression, Objective Functions, Importance of Model Selection) 
1. Linear Regression (Least Squares, Orthogonal Projection) 
2. Objective Functions for Regression (RSS, MSE, and EPE) 
3. Why is Model Selection Important? 
4. Quiz on Today's Topics 
 
(Assignments)  
(Week) 07  (Contents) ******************** 
Part III-2: Basic Methods for Regression and Classification (Methods to Control Model Complexity) 
0. Review of the Quizzes in Lectures 04 to 06 
1. Methods to Control Model Complexity (Feature Selection) 
2. Methods to Control Model Complexity (Feature Extraction) 
3. Methods to Control Model Complexity (Shrinkage by Ridge and Lasso) 
4. Quiz on Today's Topics 
 
(Assignments)  
(Week) 08  (Contents) ******************** 
Part III-3: Basic Methods for Regression and Classification (Classification Approaches, Linear-regression-based Classification, Linear Discriminant Analysis) 
1. Three Approaches to Classification 
2. Linear Classification (Linear Regression of an Indicator Matrix) 
3. Linear Classification (Linear Discriminant Analysis) 
4. Quiz on Today's Topics 
 
(Assignments)  
(Week) 09  (Contents) ******************** 
Part III-4: Basic Methods for Regression and Classification (Logistic Regression, Objective Functions, Optimization Methods) 
1. Linear Classification (Logistic Regression, Relation between LDA and LOGR, Relation between LOGR and NN) 
2. Objective Functions (Likelihood, 0-1 Loss, Hinge Loss) 
3. Optimization Methods (Gradient Descent) 
4. Quiz on Today's Topics 
 
(Assignments)  
(Week) 10  (Contents) ******************** 
Part III-5: Basic Methods for Regression and Classification (Prototype-based Nonlinear Methods) 
0. Review of the Quizzes in Lectures 07 to 09 
1. Nonlinear Regression and Classification (K-means, Gaussian Mixture, Learning Vector Quantization) 
2. Nonlinear Regression and Classification (K-Nearest-Neighbors, Bayes Error) 
3. How to Train Gaussian Mixture (EM Algorithm) 
4. Quiz on Today's Topics 
 
(Assignments)  
(Week) 11  (Contents) ******************** 
Part IV-1: Advanced Methods for Regression and Classification (Kernel-based Nonlinear Methods) 
1. Nonlinear Regression and Classification (Kernelization) 
2. Nonlinear Regression and Classification (Support Vector Machine) 
3. Nonlinear Regression and Classification (Various Kernel Methods) 
4. Quiz on Today's Topics 
 
(Assignments)  
(Week) 12  (Contents) ******************** 
Part IV-2: Advanced Methods for Regression and Classification (Neural-network-based Nonlinear Methods) 
1. Nonlinear Regression and Classification (Model Structure and Objective Function of Neural Network) 
2. Nonlinear Regression and Classification (Backpropagation, Automatic Differentiation) 
3. Shallow or Deep? 
4. Deep Neural Networks for Images (CNN, Dropout, Regularization) 
5. Quiz on Today's Topics 
 
(Assignments)  
(Week) 13  (Contents) ******************** 
Part IV-3: Advanced Methods for Regression and Classification (Neural-network-based Nonlinear Methods, Stochastic Time Series Modeling, Model Selection) 
1. Deep Neural Networks for Time Series (RNN, LSTM) 
2. Explainability of Deep Neural Networks 
3. Clear-Box Models for Time Series (AR, ARCH, AIC) 
4. Quiz on Today's Topics 
 
(Assignments)  
(Week) 14  (Contents) ******************** 
Part IV-4: Advanced Methods for Regression and Classification (Model Selection, Theory Behind Model Selection) 
0. Review of the Quizzes in Lectures 10 to 12 
1. What is Model Selection? (Generalization, Bias, Variance, Model Complexity, BV Decomposition) 
2. Theoretical Model Selection (Ideas Underlying AIC, BIC, and MDL) 
3. Empirical Model Selection (Cross Validation) 
4. Quiz on Today's Topics 
 
(Assignments)  
(Week) 15  (Contents) ******************** 
Conclusion: Review of ML/KD 
0. Review of the Quizzes in Lectures 13 to 14 
1. Course Summary 
2. Review of What We Learned 
3. Final Report Assignment 
 
(Assignments)  

The schedule is subject to change depending on the students' understanding. 

<Evaluation Criteria>

Quizzes in Classes: 25%
QA in Classes: 25%
Midterm Report: 25%
Final Report: 25%
Remarks: The details of the evaluation criteria are explained in the first class. 

<Reference URLs>

The Elements of Statistical Learning (ESL) 
 
Download the textbook (ESL by T. Hastie) from this URL. 
 
Pattern Recognition and Machine Learning (PRML) 
 
Download the reference book (PRML by C. Bishop) from this URL. 
 

<Remarks>

If needed, you can contact the instructor via e-class or by visiting her office. 
 

 

For inquiries, contact the office of the relevant faculty or graduate school at Doshisha University.
 
Copyright (C) 2020 Doshisha University. All Rights Reserved. Unauthorized reproduction is prohibited.