Rule based learning

Rule-based learning is closely related to decision trees: a tree can always be converted to a set of rules, and a set of rules can be converted to a tree. This section covers three topics: the 1R algorithm, the PRISM algorithm, and converting between rules and trees.

1R Algorithm

1R stands for One Rule: the algorithm learns a set of rules that all test a single attribute, which is equivalent to a 1-level decision tree. Why would you use only 1 rule? It is simple, and it gives a baseline against which more complex models can be compared.

Pseudo-code for 1R:

For each attribute:
	For each value of the attribute, make a rule as follows:
		count how often each class appears
		find the most frequent class
		make the rule assign that class to this attribute-value
	Calculate the error rate of the rules
Choose the rules with the smallest error rate
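
The pseudo-code above can be sketched in Python roughly as follows. This is a minimal sketch, not a reference implementation: the function name `one_r`, the list-of-dicts row format, and the tiny toy dataset are all assumptions made for illustration, and the sketch assumes nominal attributes with no missing values.

```python
from collections import Counter, defaultdict


def one_r(rows, target):
    """Return (attribute, rules, errors) for the best single-attribute rule set.

    rows is a list of dicts (hypothetical format); target is the class column.
    """
    best = None
    attributes = [a for a in rows[0] if a != target]
    for attr in attributes:
        # Count how often each class appears for each value of this attribute.
        counts = defaultdict(Counter)
        for row in rows:
            counts[row[attr]][row[target]] += 1
        # Each attribute value predicts its most frequent class.
        rules = {value: c.most_common(1)[0][0] for value, c in counts.items()}
        # Errors are the instances outside the majority class of their value.
        # Every attribute covers the same rows, so comparing error counts
        # is equivalent to comparing error rates.
        errors = sum(sum(c.values()) - c.most_common(1)[0][1]
                     for c in counts.values())
        if best is None or errors < best[2]:
            best = (attr, rules, errors)
    return best


# Toy data invented for this sketch (not taken from any real dataset).
rows = [
    {"outlook": "sunny", "windy": "no", "play": "no"},
    {"outlook": "sunny", "windy": "yes", "play": "no"},
    {"outlook": "overcast", "windy": "no", "play": "yes"},
    {"outlook": "rainy", "windy": "no", "play": "yes"},
    {"outlook": "rainy", "windy": "yes", "play": "no"},
    {"outlook": "overcast", "windy": "yes", "play": "yes"},
]

attr, rules, errors = one_r(rows, "play")
print(attr, rules, errors)
```

On this toy data the rules for `outlook` misclassify one row while the rules for `windy` misclassify two, so `outlook` is chosen.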

To see a worked example, please read the first 10 slides of Chapter 4 from Data Mining by I. Witten and E. Frank.

Prism

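PRISM is another rule-based learning algorithm. It is a covering algorithm: for each class in turn, it builds a rule by adding attribute-value tests that maximize the rule's accuracy, removes the instances the rule covers, and repeats until the class is covered. Here is a worked example: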
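
As a rough illustration, the covering strategy used by PRISM can be sketched in Python. This is a simplified sketch under stated assumptions, not a definitive implementation: the function name `prism`, the list-of-dicts row format, the tie-breaking order, and the toy dataset are all hypothetical, and attributes are assumed to be nominal with no missing values.

```python
def prism(rows, target):
    """Return a list of (conditions, class) rules.

    conditions is a dict mapping attribute -> required value.
    """
    attributes = [a for a in rows[0] if a != target]
    rules = []
    for cls in sorted({r[target] for r in rows}):
        remaining = list(rows)  # each class starts from the full training set
        while any(r[target] == cls for r in remaining):
            conditions, covered = {}, remaining
            # Add tests until the rule is perfect or no attributes are left.
            while (any(r[target] != cls for r in covered)
                   and len(conditions) < len(attributes)):
                best = None
                for attr in attributes:
                    if attr in conditions:
                        continue
                    for value in sorted({r[attr] for r in covered}):
                        subset = [r for r in covered if r[attr] == value]
                        p = sum(r[target] == cls for r in subset)
                        # Prefer higher accuracy p/t; break ties on coverage p.
                        key = (p / len(subset), p)
                        if best is None or key > best[0]:
                            best = (key, attr, value, subset)
                _, attr, value, covered = best
                conditions[attr] = value
            rules.append((conditions, cls))
            # Remove the instances covered by the new rule.
            remaining = [r for r in remaining
                         if not all(r[a] == v for a, v in conditions.items())]
    return rules


# Toy data invented for this sketch (not taken from any real dataset).
rows = [
    {"outlook": "sunny", "windy": "no", "play": "no"},
    {"outlook": "sunny", "windy": "yes", "play": "no"},
    {"outlook": "overcast", "windy": "no", "play": "yes"},
    {"outlook": "rainy", "windy": "no", "play": "yes"},
    {"outlook": "rainy", "windy": "yes", "play": "no"},
    {"outlook": "overcast", "windy": "yes", "play": "yes"},
]

rules = prism(rows, "play")
for conditions, cls in rules:
    print(conditions, "->", cls)
```

Unlike 1R, each rule here may test several attributes, and rules are grown separately for each class.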

Trees and Rules

It is also important to understand that rule-based and tree-based classification models are closely related. We can always convert a decision tree to an equivalent set of rules that makes the same predictions, with one rule per leaf node in the tree. Going the other way, converting a set of if-then rules to an equivalent tree can be harder. For more details and an example, see the first 14 slides of Chapter 3 from Witten and Frank.
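
The tree-to-rules direction can be illustrated with a short Python sketch. The nested-dict tree representation, the function name `tree_to_rules`, and the example tree are assumptions made for illustration only; real libraries store trees differently.

```python
def tree_to_rules(node, conditions=()):
    """Yield one (conditions, prediction) rule per leaf of a nested-dict tree."""
    if not isinstance(node, dict):
        # A leaf holds the predicted class; the path to it becomes the rule.
        yield list(conditions), node
        return
    for value, child in node["branches"].items():
        test = (node["attribute"], value)
        yield from tree_to_rules(child, conditions + (test,))


# Hypothetical tree: internal nodes are dicts, leaves are class labels.
tree = {
    "attribute": "outlook",
    "branches": {
        "sunny": "no",
        "overcast": "yes",
        "rainy": {"attribute": "windy", "branches": {"yes": "no", "no": "yes"}},
    },
}

rules = list(tree_to_rules(tree))
for conditions, prediction in rules:
    tests = " and ".join(f"{a} = {v}" for a, v in conditions)
    print(f"if {tests} then {prediction}")
```

Each root-to-leaf path becomes one rule, so the tree above with four leaves yields four rules.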

Further reading

ZeroR

OneR

Knowledge Representation

Data Mining Algorithms

Rule Based Learning - February 19, 2015 - Andrew Andrade