
Class weka.filters.SplitDatasetFilter

java.lang.Object
   |
   +----weka.filters.Filter
           |
           +----weka.filters.SplitDatasetFilter

public class SplitDatasetFilter
extends Filter
implements OptionHandler
This filter takes a dataset and outputs a subset of it. If a class attribute is assigned, the dataset is stratified when fold-based splitting is used. Valid options are:

-R inst1,inst2-inst4,...
Specifies the list of instances to select. "first" and "last" are valid indexes. (default: fold-based splitting)

-V
Specifies if inverse of selection is to be output.

-N number of folds
Specifies number of folds dataset is split into (default 10).

-F fold
Specifies which fold is selected. (default 1)

-S seed
Specifies a random number seed for shuffling the dataset. (default 0, don't randomize)
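
As an illustration of how -N, -F and -V interact, the following is a minimal sketch of unstratified fold selection. It is not WEKA's implementation: the class FoldSketch and its methods are hypothetical, stratification by class attribute is omitted, and rejecting a fold larger than the number of folds is an assumption of this sketch.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of WEKA: picks the positions that fall into one fold.
class FoldSketch {

    /** Returns the 0-based positions belonging to the given 1-based fold. */
    static List<Integer> foldIndices(int numInstances, int numFolds, int fold) {
        if (numFolds < 1 || fold < 1 || fold > numFolds) {
            throw new IllegalArgumentException("fold must be in 1.." + numFolds);
        }
        int base = numInstances / numFolds;   // minimum size of every fold
        int extra = numInstances % numFolds;  // the first 'extra' folds get one more instance
        int f = fold - 1;                     // 0-based fold number
        int size = base + (f < extra ? 1 : 0);
        int first = f * base + Math.min(f, extra);
        List<Integer> result = new ArrayList<>();
        for (int i = first; i < first + size; i++) {
            result.add(i);
        }
        return result;
    }

    /** Mirrors the idea behind -V: every position that is NOT in the selection. */
    static List<Integer> invert(int numInstances, List<Integer> selected) {
        List<Integer> result = new ArrayList<>();
        for (int i = 0; i < numInstances; i++) {
            if (!selected.contains(i)) {
                result.add(i);
            }
        }
        return result;
    }
}
```

With 10 instances and 3 folds, this sketch assigns positions 0-3 to fold 1, 4-6 to fold 2 and 7-9 to fold 3. Selecting a fold yields a test-style subset, and inverting that selection yields the matching training-style subset.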

Version:
$Revision: 1.3 $
Author:
Eibe Frank (eibe@cs.waikato.ac.nz)

Constructor Index

 o SplitDatasetFilter()

Method Index

 o batchFinished()
Signify that this batch of input to the filter is finished.
 o getFold()
Gets the fold which is selected.
 o getInstancesIndices()
Gets ranges of instances selected.
 o getInvertSelection()
Gets if selection is to be inverted.
 o getNumFolds()
Gets the number of folds into which the dataset is to be split.
 o getOptions()
Gets the current settings of the filter.
 o getSeed()
Gets the random number seed used for shuffling the dataset.
 o inputFormat(Instances)
Sets the format of the input instances.
 o listOptions()
Gets an enumeration describing the available options.
 o main(String[])
Main method for testing this class.
 o setFold(int)
Selects a fold.
 o setInstancesIndices(String)
Sets the ranges of instances to be selected.
 o setInvertSelection(boolean)
Sets if selection is to be inverted.
 o setNumFolds(int)
Sets the number of folds the dataset is split into.
 o setOptions(String[])
Parses the options for this object.
 o setSeed(long)
Sets the random number seed for shuffling the dataset.

Constructors

 o SplitDatasetFilter
 public SplitDatasetFilter()

Methods

 o listOptions
 public Enumeration listOptions()
Gets an enumeration describing the available options.

Returns:
an enumeration of all the available options
 o setOptions
 public void setOptions(String options[]) throws Exception
Parses the options for this object. Valid options are:

-R inst1,inst2-inst4,...
Specifies the list of instances to select. "first" and "last" are valid indexes. (default: fold-based splitting)

-V
Specifies if inverse of selection is to be output.

-N number of folds
Specifies number of folds dataset is split into (default 10).

-F fold
Specifies which fold is selected. (default 1)

-S seed
Specifies a random number seed for shuffling the dataset. (default 0, no randomizing)

Parameters:
options - the list of options as an array of strings
Throws: Exception
if an option is not supported
 o getOptions
 public String[] getOptions()
Gets the current settings of the filter.

Returns:
an array of strings suitable for passing to setOptions
 o getInstancesIndices
 public String getInstancesIndices()
Gets ranges of instances selected.

Returns:
a string containing a comma-separated list of ranges
 o setInstancesIndices
 public void setInstancesIndices(String rangeList) throws Exception
Sets the ranges of instances to be selected. If the provided string is null, ranges are not used for selecting instances.

Parameters:
rangeList - a string representing the list of instances, e.g. first-3,5,6-last
Throws: Exception
if an invalid range list is supplied
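
To make the range-list syntax concrete, here is a minimal sketch that expands a string such as first-3,5,6-last into individual instance numbers. The class RangeListSketch is hypothetical and is not how WEKA parses ranges. It assumes 1-based indexes, with "first" meaning 1 and "last" meaning the number of instances.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of WEKA: expands a comma-separated range list.
class RangeListSketch {

    /** Converts one token ("first", "last" or a number) to a 1-based index. */
    static int parseIndex(String token, int numInstances) {
        String t = token.trim().toLowerCase();
        if (t.equals("first")) {
            return 1;
        }
        if (t.equals("last")) {
            return numInstances;
        }
        int value = Integer.parseInt(t); // NumberFormatException for malformed tokens
        if (value < 1 || value > numInstances) {
            throw new IllegalArgumentException("index out of range: " + value);
        }
        return value;
    }

    /** Expands e.g. "first-3,5,6-last" into the list of selected 1-based indexes. */
    static List<Integer> expand(String rangeList, int numInstances) {
        List<Integer> result = new ArrayList<>();
        for (String part : rangeList.split(",")) {
            int dash = part.indexOf('-');
            int from;
            int to;
            if (dash < 0) {
                from = parseIndex(part, numInstances); // single index
                to = from;
            } else {
                from = parseIndex(part.substring(0, dash), numInstances);
                to = parseIndex(part.substring(dash + 1), numInstances);
            }
            if (from > to) {
                throw new IllegalArgumentException("invalid range: " + part.trim());
            }
            for (int i = from; i <= to; i++) {
                result.add(i);
            }
        }
        return result;
    }
}
```

For a dataset of 8 instances, expanding first-3,5,6-last in this sketch selects instances 1, 2, 3, 5, 6, 7 and 8.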
 o getInvertSelection
 public boolean getInvertSelection()
Gets if selection is to be inverted.

Returns:
true if the selection is to be inverted
 o setInvertSelection
 public void setInvertSelection(boolean inverse)
Sets if selection is to be inverted.

Parameters:
inverse - true if inversion is to be performed
 o getNumFolds
 public int getNumFolds()
Gets the number of folds into which the dataset is to be split.

Returns:
the number of folds the dataset is to be split into.
 o setNumFolds
 public void setNumFolds(int numFolds) throws Exception
Sets the number of folds the dataset is split into. If the number of folds is zero, the dataset is not split into folds.

Parameters:
numFolds - number of folds dataset is to be split into
Throws: Exception
if number of folds is negative
 o getFold
 public int getFold()
Gets the fold which is selected.

Returns:
the fold which is selected
 o setFold
 public void setFold(int fold) throws Exception
Selects a fold.

Parameters:
fold - the fold to be selected.
Throws: Exception
if fold's index is smaller than 1
 o getSeed
 public long getSeed()
Gets the random number seed used for shuffling the dataset.

Returns:
the random number seed
 o setSeed
 public void setSeed(long seed) throws Exception
Sets the random number seed for shuffling the dataset. If the seed is 0, shuffling is not performed.

Parameters:
seed - the random number seed
Throws: Exception
if the seed is smaller than 0
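
The seed convention described for -S (0 means no shuffling, negative values are rejected) can be sketched as follows. The class SeedSketch is hypothetical, and its use of java.util.Random with Collections.shuffle is an assumption of this sketch, not a statement about how the filter shuffles internally.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Hypothetical helper, not part of WEKA: seed-controlled shuffling of an ordering.
class SeedSketch {

    /** Returns a copy of the ordering, shuffled only when the seed is non-zero. */
    static List<Integer> maybeShuffle(List<Integer> order, long seed) {
        if (seed < 0) {
            throw new IllegalArgumentException("seed must not be negative");
        }
        List<Integer> copy = new ArrayList<>(order);
        if (seed != 0) {
            // The same seed always produces the same permutation.
            Collections.shuffle(copy, new Random(seed));
        }
        return copy;
    }
}
```

Because the permutation depends only on the seed, repeating a split with the same non-zero seed reproduces the same subset, while a seed of 0 keeps the original instance order.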
 o inputFormat
 public boolean inputFormat(Instances instanceInfo) throws Exception
Sets the format of the input instances.

Parameters:
instanceInfo - an Instances object containing the input instance structure (any instances contained in the object are ignored - only the structure is required).
Returns:
true because outputFormat can be collected immediately
Throws: Exception
if the input format can't be set successfully
Overrides:
inputFormat in class Filter
 o batchFinished
 public boolean batchFinished() throws Exception
Signify that this batch of input to the filter is finished. output() may now be called to retrieve the filtered instances.

Returns:
true if there are instances pending output
Throws: Exception
if no input structure has been defined
Overrides:
batchFinished in class Filter
 o main
 public static void main(String argv[])
Main method for testing this class.

Parameters:
argv - should contain arguments to the filter: use -h for help
