TSSG-KDDG » Supervised learning
http://kddg.tssg.org

Using c4.5
http://kddg.tssg.org/2009/08/31/using-c4-5/
Mon, 31 Aug 2009 16:53:03 +0000

C4.5 is a widely used supervised learning algorithm and one of the classic decision tree induction algorithms, invented by Ross Quinlan. It is a classification algorithm that handles both discrete and continuous attributes. C4.5 is of particular interest to some data analysis systems because it can generate rules. Variations of it can be found in most existing data mining systems, e.g. J48 in Weka.
You can download C4.5 Release 8 from the RuleQuest page. However, that version has no batch test (analysis) mode.
So if you need to perform the following procedure:
1 Build a classifier on one dataset
2 Analyse another dataset using the built classifier
you may be interested in the following version of c4.5. The release offered by Ross Quinlan can only build and test a classifier in a single run; otherwise it analyses the given data set one sample at a time.
c4.5 of tssg-kdd version Download file
How to use it?
Assuming you are using a Linux terminal:
$ tar -xf c45.tar   # unpack the archive
$ cd c45/src
$ chmod 755 cit     # make cit executable
$ ./cit             # run cit from the current directory
$ cd ../run
Now you are ready to run c4.5. To build a classifier:
$ c4.5 -f german   # german.data is the dataset
This produces a classifier called german.tree. To use it to analyse a data set called corea.data:
$ mclassify german corea   # mclassify classifier-name dataset-name
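The two steps above can be wrapped in a small shell function. This is only a sketch: the function name train_and_classify and the C45 and MCLASSIFY variables are made up here for illustration. It assumes both programs are on your PATH and that, as in Quinlan's original release, c4.5 reads <stem>.names alongside <stem>.data.

```shell
#!/bin/sh
# Sketch: build a classifier on one data set, then apply it to another.
# C45 and MCLASSIFY are hypothetical overrides; they default to the
# command names used in this post.
C45=${C45:-c4.5}
MCLASSIFY=${MCLASSIFY:-mclassify}

train_and_classify() {
    train=$1    # file stem of the training set, e.g. german
    target=$2   # file stem of the set to analyse, e.g. corea

    # Assumption: c4.5 needs <stem>.names and <stem>.data,
    # as in the original release.
    for f in "$train.names" "$train.data" "$target.data"; do
        if [ ! -f "$f" ]; then
            echo "missing input file: $f" >&2
            return 1
        fi
    done

    "$C45" -f "$train" || return 1     # step 1: writes <train>.tree
    "$MCLASSIFY" "$train" "$target"    # step 2: classify the second set
}
```

With the files from this post in the current directory, the call would be train_and_classify german corea. Checking the input files first makes a mistyped stem fail before either program runs.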

Incremental learning
http://kddg.tssg.org/2009/08/26/incremental-learning/
Wed, 26 Aug 2009 13:28:23 +0000

In industry, data usually become available gradually, so data
analysis systems need to be able to learn incrementally. Learning
from new data without forgetting prior knowledge is known as
incremental learning. This requirement is a challenge because most
fundamental supervised learning algorithms cannot learn
incrementally. In most such cases the data analysis system rebuilds
its classifier on the new data set alone. Unfortunately, this
normally leads to the phenomenon known as “catastrophic forgetting”:
the previously learned information is lost. The result can be even
worse if the old data are no longer available.
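
A toy sketch in shell may make the difference between rebuilding and updating concrete. It is not C4.5 and not a real incremental learner: the "model" here is just a table of per-class counts, and the function names build_model and update_model are invented for illustration.

```shell
#!/bin/sh
# Toy model: one "label count" line per class.

# Build a model from scratch: read one class label per line on stdin.
build_model() {
    sort | uniq -c | awk '{ print $2, $1 }'
}

# Incremental update: add the counts of a new batch to an existing model.
# $1 = existing model file; the new labels arrive on stdin.
update_model() {
    build_model | cat "$1" - |
        awk '{ n[$1] += $2 } END { for (k in n) print k, n[k] }' | sort
}
```

If the old batch contained the labels good, good, bad and the new batch contains only good, good, then build_model on the new batch alone no longer knows the class bad at all, while update_model keeps its count and adds the new ones. Note that the update needs only the stored model, not the old data.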
