
Class weka.clusterers.ClusterEvaluation

java.lang.Object
   |
   +----weka.clusterers.ClusterEvaluation

public class ClusterEvaluation
extends Object
Class for evaluating clustering models.

Valid options are:

-t
Specify the training file.

-T
Specify the test file to apply clusterer to.

-d
Specify output file.

-l
Specify input file.

-p
Output predictions. Predictions are for the training file if only the training file is specified; otherwise they are for the test file.

-x
Set the number of folds for a cross validation of the training data. Cross validation can only be done for distribution clusterers and will be performed if the test file is missing.
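
As a rough illustration of how these flags combine, the sketch below assembles an options array in plain Java. The class ClusterOptionsSketch and its build method are hypothetical helpers, not part of WEKA, and the file names are placeholders. The sketch assumes a training file is always supplied and passes "-x" only when no test file is given, since cross validation is described as applying only in that case.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of WEKA: assembles the flags listed above.
final class ClusterOptionsSketch {

    // testFile may be null; folds <= 0 means "do not pass -x".
    static String[] build(String trainFile, String testFile, int folds, boolean predictions) {
        if (trainFile == null) {
            throw new IllegalArgumentException("this sketch expects a training file");
        }
        List<String> opts = new ArrayList<>();
        opts.add("-t");
        opts.add(trainFile);
        if (testFile != null) {
            // A test file is given, so no cross validation is requested.
            opts.add("-T");
            opts.add(testFile);
        } else if (folds > 0) {
            opts.add("-x");
            opts.add(Integer.toString(folds));
        }
        if (predictions) {
            opts.add("-p");
        }
        return opts.toArray(new String[0]);
    }

    public static void main(String[] args) {
        // prints: -t train.arff -x 5
        System.out.println(String.join(" ", build("train.arff", null, 5, false)));
    }
}
```

An array built this way is the kind of argument that evaluateClusterer(Clusterer, String[]) is documented to accept.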

Version:
$Revision: 1.7 $
Author:
Mark Hall (mhall@cs.waikato.ac.nz)

Constructor Index

 o ClusterEvaluation()
Constructor.

Method Index

 o clusterResultsToString()
Returns the results of clustering.
 o crossValidateModel(String, Instances, int, String[])
Performs a cross-validation for a distribution clusterer on a set of instances.
 o evaluateClusterer(Clusterer, String[])
Evaluates a clusterer with the options given in an array of strings.
 o evaluateClusterer(Instances)
Evaluate the clusterer on a set of instances.
 o getClusterAssignments()
Return an array of cluster assignments corresponding to the most recent set of instances clustered.
 o main(String[])
Main method for testing this class.
 o setClusterer(Clusterer)
Sets the clusterer.
 o setDoXval(boolean)
Sets whether or not to do cross validation.
 o setFolds(int)
Sets the number of folds to use for cross validation.
 o setSeed(int)
Sets the seed to use for cross validation.

Constructors

 o ClusterEvaluation
 public ClusterEvaluation()
Constructor. Sets defaults for each member variable. Default Clusterer is EM.

Methods

 o setClusterer
 public void setClusterer(Clusterer clusterer)
Sets the clusterer.

Parameters:
clusterer - the clusterer to use
 o setDoXval
 public void setDoXval(boolean x)
Sets whether or not to do cross validation.

Parameters:
x - true if cross validation is to be done
 o setFolds
 public void setFolds(int folds)
Sets the number of folds to use for cross validation.

Parameters:
folds - the number of folds
 o setSeed
 public void setSeed(int s)
Sets the seed to use for cross validation.

Parameters:
s - the seed
 o clusterResultsToString
 public String clusterResultsToString()
Returns the results of clustering.

Returns:
a string detailing the results of clustering a data set
 o getClusterAssignments
 public double[] getClusterAssignments()
Return an array of cluster assignments corresponding to the most recent set of instances clustered.

Returns:
an array of cluster assignments
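
A minimal sketch of one way the returned array might be summarised, assuming each entry is a cluster index stored as a double (the documentation above does not spell out the encoding). The class AssignmentCountsSketch and the sample values are made up for illustration and are not part of WEKA.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper, not part of WEKA.
final class AssignmentCountsSketch {

    // Counts how many instances were assigned to each cluster index.
    // Assumption: every entry is a whole-number cluster index held in a double.
    static Map<Integer, Integer> countPerCluster(double[] assignments) {
        Map<Integer, Integer> counts = new TreeMap<>();
        for (double a : assignments) {
            counts.merge((int) a, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        double[] example = {0.0, 1.0, 1.0, 2.0, 0.0, 1.0}; // made-up values
        // prints: {0=2, 1=3, 2=1}
        System.out.println(countPerCluster(example));
    }
}
```

In real use the array would come from getClusterAssignments() after a call to evaluateClusterer(Instances).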
 o evaluateClusterer
 public void evaluateClusterer(Instances test) throws Exception
Evaluate the clusterer on a set of instances. Calculates clustering statistics and stores cluster assignments for the instances in m_clusterAssignments.

Parameters:
test - the set of instances to cluster
 o evaluateClusterer
 public static String evaluateClusterer(Clusterer clusterer,
                                        String options[]) throws Exception
Evaluates a clusterer with the options given in an array of strings. The string following "-t" is taken as the training file and the string following "-T" as the test file. If the test file is missing, a stratified ten-fold cross-validation is performed (distribution clusterers only). Use "-x" to change the number of folds and "-s" to change the random seed. If the "-p" flag is set, the prediction for each test instance is output. If you provide the name of an object file using "-l", a clusterer will be loaded from the given file. If you provide the name of an object file using "-d", the clusterer built from the training data will be saved to the given file.

Parameters:
clusterer - machine learning clusterer
options - the array of strings containing the options
Returns:
a string describing the results
Throws: Exception
if model could not be evaluated successfully
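
The rule described above for choosing between a test-file evaluation and cross-validation can be sketched in plain Java. This is an illustration of the documented behaviour only, not WEKA's code: the class EvaluationModeSketch and its methods are hypothetical, and the interaction with "-l", "-d", "-s" and "-p" is left out.

```java
// Hypothetical illustration, not part of WEKA.
final class EvaluationModeSketch {

    static final int DEFAULT_FOLDS = 10; // "ten-fold" per the description above

    // Returns the string that follows the given flag, or null if the flag is absent.
    static String valueOf(String flag, String[] options) {
        for (int i = 0; i < options.length - 1; i++) {
            if (options[i].equals(flag)) {
                return options[i + 1];
            }
        }
        return null;
    }

    // Returns 0 when a test file is supplied (no cross-validation),
    // otherwise the number of folds: the "-x" value if present, else ten.
    static int foldsToRun(String[] options) {
        if (valueOf("-T", options) != null) {
            return 0;
        }
        String x = valueOf("-x", options);
        return x == null ? DEFAULT_FOLDS : Integer.parseInt(x);
    }

    public static void main(String[] args) {
        System.out.println(foldsToRun(new String[] {"-t", "train.arff"}));            // prints: 10
        System.out.println(foldsToRun(new String[] {"-t", "train.arff", "-x", "5"})); // prints: 5
    }
}
```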
 o crossValidateModel
 public static String crossValidateModel(String clustererString,
                                         Instances data,
                                         int numFolds,
                                         String options[]) throws Exception
Performs a cross-validation for a distribution clusterer on a set of instances.

Parameters:
clustererString - a string naming the class of the clusterer
data - the data on which the cross-validation is to be performed
numFolds - the number of folds for the cross-validation
options - the options to the clusterer
Returns:
a string containing the cross validated log likelihood
Throws: Exception
if a clusterer could not be generated
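
To make the roles of numFolds and the seed concrete, the sketch below shows one generic way to partition instance indices into folds. It is not WEKA's implementation: the class FoldSplitSketch, the shuffle-then-deal scheme, and the use of java.util.Random are assumptions chosen for illustration. In a cross-validation, each fold would in turn be held out and scored by a clusterer built on the remaining folds.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Hypothetical illustration, not part of WEKA.
final class FoldSplitSketch {

    // Shuffles indices 0..n-1 with the given seed, then deals them
    // round-robin into numFolds folds of near-equal size.
    static List<List<Integer>> split(int n, int numFolds, long seed) {
        if (numFolds < 2 || numFolds > n) {
            throw new IllegalArgumentException("need 2 <= numFolds <= n");
        }
        List<Integer> indices = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            indices.add(i);
        }
        Collections.shuffle(indices, new Random(seed));

        List<List<Integer>> folds = new ArrayList<>();
        for (int f = 0; f < numFolds; f++) {
            folds.add(new ArrayList<>());
        }
        for (int i = 0; i < n; i++) {
            folds.get(i % numFolds).add(indices.get(i));
        }
        return folds;
    }

    public static void main(String[] args) {
        // The same seed always yields the same partition.
        System.out.println(split(10, 3, 1L).equals(split(10, 3, 1L))); // prints: true
    }
}
```

Fixing the seed makes the partition reproducible, which is the usual reason an evaluation exposes a seed setting.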
 o main
 public static void main(String args[])
Main method for testing this class.

Parameters:
args - the options
