Rule-based learning
Rule-based learning is closely related to decision trees: a tree can be converted to rules, and rules can be converted to a tree. In this section we will cover three topics: the 1R algorithm, the PRISM algorithm, and converting between rules and trees.
1R stands for "One Rule". It produces a 1-level decision tree, that is, a set of rules that all test a single attribute. Why would you use only one rule? Because it is simple, and because it serves as a baseline against which more complex models can be compared.
Pseudo-code for 1R:
  For each attribute:
    For each value of the attribute, make a rule as follows:
      count how often each class appears
      find the most frequent class
      make the rule assign that class to this attribute-value
    Calculate the error rate of the rules
  Choose the rules with the smallest error rate
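The pseudo-code above can be sketched in Python roughly as follows. This is a minimal illustration, not a reference implementation: the function name `one_r`, the list-of-dicts data layout, and the tiny weather-style data set are all assumptions made for the example, and the sketch compares raw error counts, which is equivalent to comparing error rates because every attribute is evaluated on the same rows.

```python
from collections import Counter, defaultdict


def one_r(rows, attributes, target):
    """Return (attribute, rules, errors) for the attribute with the fewest errors."""
    best = None
    for attr in attributes:
        # Count how often each class appears for each value of this attribute.
        counts = defaultdict(Counter)
        for row in rows:
            counts[row[attr]][row[target]] += 1
        # Each attribute value predicts its most frequent class.
        rules = {value: c.most_common(1)[0][0] for value, c in counts.items()}
        # Errors are the rows that do not belong to the predicted class.
        errors = sum(sum(c.values()) - c[rules[value]] for value, c in counts.items())
        if best is None or errors < best[2]:
            best = (attr, rules, errors)
    return best


# Hypothetical toy data, made up for this illustration.
rows = [
    {"outlook": "sunny", "windy": "no", "play": "no"},
    {"outlook": "sunny", "windy": "yes", "play": "no"},
    {"outlook": "overcast", "windy": "no", "play": "yes"},
    {"outlook": "rainy", "windy": "no", "play": "yes"},
    {"outlook": "rainy", "windy": "yes", "play": "no"},
    {"outlook": "overcast", "windy": "yes", "play": "yes"},
    {"outlook": "rainy", "windy": "no", "play": "yes"},
]

attribute, rules, errors = one_r(rows, ["outlook", "windy"], "play")
print(attribute, rules, f"{errors}/{len(rows)} errors")
```

On this toy data the rules for "outlook" misclassify one row while the rules for "windy" misclassify two, so 1R picks "outlook".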
To see a worked example, please read the first 10 slides of Chapter 4 from Data Mining by I. Witten and E. Frank.
PRISM is another rule-based learning algorithm. Here is a worked example:
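As a rough illustration, PRISM is usually described as a covering algorithm: for each class in turn, it grows a rule by adding the attribute-value test with the best accuracy p/t (p correct out of t covered), stops when the rule covers only that class, removes the covered instances, and repeats until none of that class remain. The Python sketch below follows that description under several assumptions: the function name `prism`, the list-of-dicts layout, the tie-breaking rule (larger p wins, then first encountered), and the toy data are all hypothetical choices made for the example.

```python
def prism(rows, attributes, target):
    """Return a list of (conditions, class) rules built by sequential covering."""
    rules = []
    for cls in sorted({row[target] for row in rows}):
        remaining = list(rows)  # each class starts again from the full data set
        while any(row[target] == cls for row in remaining):
            conditions, covered = {}, remaining
            # Grow the rule until it covers only instances of cls,
            # or until every attribute has been used.
            while (any(r[target] != cls for r in covered)
                   and len(conditions) < len(attributes)):
                best = None
                for attr in attributes:
                    if attr in conditions:
                        continue
                    for value in sorted({r[attr] for r in covered}):
                        subset = [r for r in covered if r[attr] == value]
                        p = sum(r[target] == cls for r in subset)
                        # Prefer higher accuracy p/t; break ties by larger p.
                        score = (p / len(subset), p)
                        if best is None or score > best[0]:
                            best = (score, attr, value, subset)
                _, attr, value, covered = best
                conditions[attr] = value
            rules.append((conditions, cls))
            # Remove the instances covered by the finished rule.
            remaining = [r for r in remaining
                         if any(r[a] != v for a, v in conditions.items())]
    return rules


# Hypothetical toy data, made up for this illustration.
rows = [
    {"outlook": "sunny", "windy": "no", "play": "no"},
    {"outlook": "sunny", "windy": "yes", "play": "no"},
    {"outlook": "overcast", "windy": "no", "play": "yes"},
    {"outlook": "rainy", "windy": "no", "play": "yes"},
    {"outlook": "rainy", "windy": "yes", "play": "no"},
    {"outlook": "overcast", "windy": "yes", "play": "yes"},
    {"outlook": "rainy", "windy": "no", "play": "yes"},
]

prism_rules = prism(rows, ["outlook", "windy"], "play")
for conditions, cls in prism_rules:
    tests = " and ".join(f"{a} = {v}" for a, v in conditions.items())
    print(f"if {tests} then play = {cls}")
```

Unlike 1R, each rule here may test a different combination of attributes, and a rule is extended only as far as needed to cover a single class.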
Trees and Rules
It is also important to understand that rule-based and tree-based classification models are closely related. We can always convert a decision tree to an equivalent set of rules that makes the same predictions, with one rule per leaf node in the tree. In the other direction, it can be harder to convert a set of if-then rules to an equivalent tree. For more details and an example, see Witten and Frank’s first 14 slides in chapter 3.
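The tree-to-rules direction can be sketched as a walk from the root to every leaf, collecting the tests along the way. The nested-dict tree representation, the function name `tree_to_rules`, and the example tree below are assumptions made for this illustration only.

```python
def tree_to_rules(node, path=()):
    """Yield one (conditions, prediction) pair per leaf of the tree."""
    if not isinstance(node, dict):
        # A leaf holds the predicted class; the path so far is the rule body.
        yield list(path), node
        return
    attr = node["attribute"]
    for value, child in node["branches"].items():
        yield from tree_to_rules(child, path + ((attr, value),))


# Hypothetical tree: internal nodes are dicts, leaves are class labels.
tree = {
    "attribute": "outlook",
    "branches": {
        "sunny": "no",
        "overcast": "yes",
        "rainy": {
            "attribute": "windy",
            "branches": {"no": "yes", "yes": "no"},
        },
    },
}

tree_rules = list(tree_to_rules(tree))
for conditions, prediction in tree_rules:
    tests = " and ".join(f"{a} = {v}" for a, v in conditions)
    print(f"if {tests} then {prediction}")
```

The example tree has four leaves, so the conversion produces four rules, one per leaf.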