Using c4.5

C4.5 is a widely used supervised learning algorithm and one of the classic decision tree induction algorithms, invented by Ross Quinlan. It performs classification tasks and handles both discrete and continuous attributes. C4.5 is of particular interest to some data analysis systems because of its ability to generate rules. Variations of it can be found in most existing data mining systems, e.g. J48 in Weka.
You can download c4.5 release 8 from this page: rulequest. However, this version does not include a batch test (analysis) mode.
So if you need to perform a procedure such as the following:
1 Build a classifier on one dataset
2 Then analyse another dataset using the built classifier
you may be interested in the following version of c4.5. The release offered by Ross Quinlan can only build and test the classifier in a single run; otherwise it analyses the given data set one sample at a time.
c4.5, tssg-kdd version: Download file
How to use it?
Assuming you are using a Linux terminal:
$ tar -xf c45.tar
$ cd c45/src
$ chmod 755 cit
$ cit
$ cd ../run
Now you are ready to run the c4.5 software. To build a classifier:
$ c4.5 -f german   # german.data is the dataset
A classifier called german.tree is then produced. To use it to perform analysis on a data set called corea.data:
$ mclassify german corea   # mclassify classifier-name dataset-name
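The -f option takes a file stem rather than a file name: c4.5 reads the attribute definitions from <stem>.names and the training cases from <stem>.data. If you want to try the commands on your own data, the sketch below writes a minimal pair of such files. The stem "toy", the attribute names and the four cases are made up for illustration and are not part of the german dataset.

```shell
# Hypothetical toy dataset (not from the c4.5 distribution).
# toy.names: the first entry lists the class names, then one line per attribute.
# Text after "|" is a comment; each entry ends with a period.
cat > toy.names <<'EOF'
| class names
play, dont_play.

outlook: sunny, overcast, rain.
temperature: continuous.
humidity: continuous.
windy: true, false.
EOF

# toy.data: one case per line, attribute values in the order declared
# in toy.names, with the class as the last field.
cat > toy.data <<'EOF'
sunny, 85, 85, false, dont_play
overcast, 83, 78, false, play
rain, 70, 96, false, play
rain, 65, 70, true, dont_play
EOF
```

With these two files in the current directory, "c4.5 -f toy" should build a tree in the same way as the german example.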

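If you run the two steps often, they can be wrapped in a small shell function. This is only a sketch: build_and_classify is a hypothetical helper name, and it assumes that c4.5 and mclassify are on your PATH and that mclassify takes its arguments exactly as shown in this post. The check for <stem>.names reflects what c4.5 itself needs; whether mclassify needs further files for the second data set is not verified here.

```shell
# Sketch only: wraps the two commands from this post in one function.
build_and_classify() {
    train="$1"    # file stem of the training set, e.g. german
    target="$2"   # file stem of the data set to analyse, e.g. corea

    # Fail early if an expected input file is missing.
    for f in "$train.names" "$train.data" "$target.data"; do
        if [ ! -f "$f" ]; then
            echo "missing input file: $f" >&2
            return 1
        fi
    done

    c4.5 -f "$train" || return 1     # build the classifier
    mclassify "$train" "$target"     # batch analysis of the second data set
}
```

Usage, from the directory that holds the data files: build_and_classify german corea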