1. Title: The Monk's Problems

2. Sources:
    (a) Donor: Sebastian Thrun
               School of Computer Science
               Carnegie Mellon University
               Pittsburgh, PA 15213, USA

               E-mail: thrun@cs.cmu.edu

    (b) Date: October 1992

3. Past Usage:

   - See File: thrun.comparison.ps.Z

   - Wnek, J., "Hypothesis-driven Constructive Induction," PhD dissertation,
     School of Information Technology and Engineering, Reports of Machine
     Learning and Inference Laboratory, MLI 93-2, Center for Artificial
     Intelligence, George Mason University, March 1993.

   - Wnek, J. and Michalski, R.S., "Comparing Symbolic and
     Subsymbolic Learning: Three Studies," in Machine Learning: A
     Multistrategy Approach, Vol. 4, R.S. Michalski and G. Tecuci (Eds.),
     Morgan Kaufmann, San Mateo, CA, 1993.

4. Relevant Information:

   The MONK's problems were the basis of a first international comparison
   of learning algorithms. The results of this comparison are summarized in
   "The MONK's Problems - A Performance Comparison of Different Learning
   Algorithms" by S.B. Thrun, J. Bala, E. Bloedorn, I. Bratko, B.
   Cestnik, J. Cheng, K. De Jong, S. Dzeroski, S.E. Fahlman, D. Fisher,
   R. Hamann, K. Kaufman, S. Keller, I. Kononenko, J. Kreuziger, R.S.
   Michalski, T. Mitchell, P. Pachowicz, Y. Reich, H. Vafaie, W. Van de
   Velde, W. Wenzel, J. Wnek, and J. Zhang, published as Technical Report
   CMU-CS-91-197, Carnegie Mellon University, in December 1991.

   One significant characteristic of this comparison is that it was
   performed by a collection of researchers, each of whom was an advocate
   of the technique they tested (often they were the creators of the
   various methods). In this sense, the results are less biased than in
   comparisons performed by a single person advocating a specific
   learning method, and more accurately reflect the generalization
   behavior of the learning techniques as applied by knowledgeable users.

   There are three MONK's problems. The domain is the same for all three
   (described below). One of them, MONK-3, has class noise added to its
   training set. For each problem, the domain has been partitioned into a
   training set and a test set.

5. Number of Instances: 432

6. Number of Attributes: 8 (including class attribute)

7. Attribute information:
    1. class: 0, 1
    2. a1:    1, 2, 3
    3. a2:    1, 2, 3
    4. a3:    1, 2
    5. a4:    1, 2, 3
    6. a5:    1, 2, 3, 4
    7. a6:    1, 2
    8. Id:    (A unique symbol for each instance)

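   The attribute domains listed above determine the size of the instance
   space: 3 x 3 x 2 x 3 x 4 x 2 = 432, which matches the number of
   instances given in item 5. The following is a minimal Python sketch of
   that enumeration. The names DOMAINS and enumerate_domain are
   illustrative assumptions and are not part of the original distribution.

```python
from itertools import product

# Value sets for a1..a6 as listed under "Attribute information".
# The class attribute and the Id symbol are not part of the input domain.
DOMAINS = {
    "a1": (1, 2, 3),
    "a2": (1, 2, 3),
    "a3": (1, 2),
    "a4": (1, 2, 3),
    "a5": (1, 2, 3, 4),
    "a6": (1, 2),
}


def enumerate_domain():
    """Return every combination of attribute values as a list of dicts."""
    names = list(DOMAINS)
    return [dict(zip(names, values)) for values in product(*DOMAINS.values())]


instances = enumerate_domain()
print(len(instances))  # size of the full instance space
```

   Each dict maps an attribute name to one of its listed values, so the
   list can be filtered or labelled directly.
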
8. Missing Attribute Values: None

9. Target Concepts associated with the MONK's problems:

   MONK-1: (a1 = a2) or (a5 = 1)

   MONK-2: EXACTLY TWO of {a1 = 1, a2 = 1, a3 = 1, a4 = 1, a5 = 1, a6 = 1}

   MONK-3: (a5 = 3 and a4 = 1) or (a5 /= 4 and a2 /= 3)
           (5% class noise added to the training set)
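
   The three target concepts can be written as Boolean predicates over
   a1..a6. The sketch below is one possible Python rendering, not code
   from the original distribution; the function names monk1, monk2,
   monk3 and count_positives are assumptions made for illustration.
   Each function returns the noise-free class (0 or 1), so for MONK-3 it
   will disagree with the roughly 5% of training labels that carry class
   noise.

```python
from itertools import product


def monk1(a1, a2, a3, a4, a5, a6):
    """MONK-1: (a1 = a2) or (a5 = 1)."""
    return int(a1 == a2 or a5 == 1)


def monk2(a1, a2, a3, a4, a5, a6):
    """MONK-2: exactly two of the six attributes have the value 1."""
    return int(sum(v == 1 for v in (a1, a2, a3, a4, a5, a6)) == 2)


def monk3(a1, a2, a3, a4, a5, a6):
    """MONK-3: (a5 = 3 and a4 = 1) or (a5 /= 4 and a2 /= 3), without noise."""
    return int((a5 == 3 and a4 == 1) or (a5 != 4 and a2 != 3))


def count_positives(concept):
    """Count class-1 instances of a concept over the full attribute domain."""
    domain = product((1, 2, 3), (1, 2, 3), (1, 2), (1, 2, 3), (1, 2, 3, 4), (1, 2))
    return sum(concept(*values) for values in domain)


for name, concept in (("MONK-1", monk1), ("MONK-2", monk2), ("MONK-3", monk3)):
    print(name, count_positives(concept))
```

   Applying one of these predicates to the attribute values of a record
   and comparing the result with its class field is a quick consistency
   check on a parsed data file.
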