ABLE, Version 1.1b

com.ibm.able.beans.bayes
Class NaiveBayes

java.lang.Object
  |
  +--com.ibm.able.beans.bayes.NaiveBayes

public class NaiveBayes
extends java.lang.Object
implements java.io.Serializable

See Also:
Serialized Form

Constructor Summary
NaiveBayes()
           
NaiveBayes(int ncls, int nftr, short[] nval, double m, double[] cpriors, double[][][] ppriors)
          Constructs a NaiveBayes with the explicitly specified parameters.
 
Method Summary
 void buildHypothesis(int[][] data, int[] labels, int ninst, int ncls, int nftr, short[] nval)
          Build a hypothesis using explicit parameters. This function learns a naive Bayes model given a set of labeled instances.
 double classify(int[][] data, int[] labels, int ninst, boolean[] selectedFeatures)
          Classify a record.
 int classifyExample(int[] instance, boolean[] selectedFeatures)
          This function selects the maximum-likelihood class label given a data instance.
 double[] findClassProbability(int[] instance, boolean[] selectedFeatures)
          This function returns the posterior probability distribution over class labels, given a data instance, using Bayes' rule.
 double getAccuracy()
           
 double getAvgLikelihood()
           
 double getAvgLogLikelihood()
           
 double[] getClassPriors()
           
 double[] getClassProb()
           
 int[][] getConfusionMatrix()
           
 double[][][] getCPT()
           
 double[] getEqSampleSize()
           
 int getNClasses()
           
 int getNFeatures()
           
 short[] getNFValues()
           
 void initializeNB(int ncls, int nftr, short[] nval, double eqss, double[] cpriors, double[][][] ppriors)
          Internal function that implements the class construction with an explicit list of parameters.
 double likelihood(int[] instance, int classlabel)
          Compute the likelihood of an instance given a class label
 void setClassPriors(double[] cpriors)
           
 void setCPT(double[][][] cptpriors)
           
 void setNClasses(int ncls)
           
 void setNFeatures(int nftr)
           
 void setNFValues(short[] nfv)
           
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

NaiveBayes

public NaiveBayes()

NaiveBayes

public NaiveBayes(int ncls,
                  int nftr,
                  short[] nval,
                  double m,
                  double[] cpriors,
                  double[][][] ppriors)
Constructs a NaiveBayes with the explicitly specified parameters.
Input:
ncls - number of class labels
nftr - number of features
nval - number of values per each feature (assuming nominal, i.e. discrete finite-valued, features)
m - equivalent sample size
cpriors - prior probability distribution over class labels
ppriors - prior estimates of the probabilities P(f|C) (used for Bayesian parameter estimation with the equivalent sample size method)

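As an illustration of the constructor arguments, the following minimal sketch builds uniform `cpriors` and `ppriors` arrays. The helper class and method names are hypothetical, and the `[class][feature][value]` layout of `ppriors` is an assumption; this page does not document the index order.

```java
import java.util.Arrays;

// Hypothetical helper, not part of ABLE: builds uniform prior arrays.
final class NaiveBayesParamsSketch {

    /** Uniform prior distribution over ncls class labels. */
    static double[] uniformClassPriors(int ncls) {
        double[] cpriors = new double[ncls];
        Arrays.fill(cpriors, 1.0 / ncls);
        return cpriors;
    }

    /**
     * Uniform prior estimates of P(f = a | C = c).
     * Assumed layout: ppriors[class][feature][value].
     */
    static double[][][] uniformFeaturePriors(int ncls, int nftr, short[] nval) {
        double[][][] ppriors = new double[ncls][nftr][];
        for (int c = 0; c < ncls; c++) {
            for (int f = 0; f < nftr; f++) {
                ppriors[c][f] = new double[nval[f]];
                Arrays.fill(ppriors[c][f], 1.0 / nval[f]);
            }
        }
        return ppriors;
    }

    public static void main(String[] args) {
        short[] nval = {2, 3}; // feature 0 has 2 values, feature 1 has 3
        System.out.println(Arrays.toString(uniformClassPriors(2)));
        System.out.println(Arrays.deepToString(uniformFeaturePriors(2, 2, nval)));
    }
}
```

If the assumed layout matches the library, arrays built this way could be passed to the six-argument constructor together with `ncls`, `nftr`, `nval`, and an equivalent sample size `m`.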
Method Detail

initializeNB

public void initializeNB(int ncls,
                         int nftr,
                         short[] nval,
                         double eqss,
                         double[] cpriors,
                         double[][][] ppriors)
Internal function that implements the class construction with an explicit list of parameters.
Input:
ncls - number of class labels
nftr - number of features
nval - number of values per each feature (assuming nominal, i.e. discrete finite-valued, features)
eqss - equivalent sample sizes for each class (by default, each class was seen at least once)
cpriors - prior probability distribution over class labels
ppriors - prior estimates of the probabilities P(f|C) (used for Bayesian parameter estimation with the equivalent sample size method)

getNClasses

public int getNClasses()

getNFeatures

public int getNFeatures()

getNFValues

public short[] getNFValues()

getEqSampleSize

public double[] getEqSampleSize()

getCPT

public double[][][] getCPT()

getClassPriors

public double[] getClassPriors()

getAvgLikelihood

public double getAvgLikelihood()

getAvgLogLikelihood

public double getAvgLogLikelihood()

getAccuracy

public double getAccuracy()

getClassProb

public double[] getClassProb()

getConfusionMatrix

public int[][] getConfusionMatrix()

setNClasses

public void setNClasses(int ncls)

setNFeatures

public void setNFeatures(int nftr)

setNFValues

public void setNFValues(short[] nfv)

setCPT

public void setCPT(double[][][] cptpriors)

setClassPriors

public void setClassPriors(double[] cpriors)

buildHypothesis

public void buildHypothesis(int[][] data,
                            int[] labels,
                            int ninst,
                            int ncls,
                            int nftr,
                            short[] nval)
Build a hypothesis using explicit parameters. This function learns a naive Bayes model given a set of labeled instances. Given class value C=c, for each attribute (feature) f, it computes an estimate of the parameter P(f|C=c) using the m-estimate of probability, also called the equivalent sample size method (see Mitchell, Machine Learning):

    P(f=a|C=c) = (n + mp)/(N + m),

where N is the total number of instances for which C=c, n is the number of these instances for which also f=a, p is a prior estimate of the probability P(f=a|C=c), and m is a constant called the equivalent sample size. The m-estimate can be interpreted as an empirical probability estimate (i.e. a frequency) assuming m additional "prior" instances distributed according to p, combined with the "current" N instances. When there is no prior information about p, we assume a uniform distribution over all possible values for each feature, i.e. p=1/nFValues.
Input:
data - training data in a table format where each row represents an instance, and each column represents an attribute (feature)
labels - the class label of each instance
ninst - the number of training instances
ncls - number of class labels
nftr - number of features
nval - number of values per each feature (assuming nominal, i.e. discrete finite-valued, features)
We always assume that the class labels and the feature indexes start from 0.
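The m-estimate described above can be sketched for a single feature as follows. This is an illustrative reimplementation under stated assumptions, not the ABLE source: the class and method names are hypothetical, and the result table is indexed `[class][value]`.

```java
import java.util.Arrays;

// Hypothetical sketch of the m-estimate (equivalent sample size) method.
final class MEstimateSketch {

    /** m-estimate (n + m*p) / (N + m); totalN plays the role of N. */
    static double mEstimate(int n, int totalN, double m, double p) {
        return (n + m * p) / (totalN + m);
    }

    /**
     * Estimates P(f = a | C = c) for one feature column of the data table,
     * using the uniform prior p = 1/nValues. Result is indexed [class][value].
     */
    static double[][] estimateFeature(int[][] data, int[] labels, int feature,
                                      int ncls, int nValues, double m) {
        int[] classCount = new int[ncls];      // N for each class c
        int[][] joint = new int[ncls][nValues]; // n for each (c, a)
        for (int i = 0; i < data.length; i++) {
            classCount[labels[i]]++;
            joint[labels[i]][data[i][feature]]++;
        }
        double p = 1.0 / nValues;
        double[][] cpt = new double[ncls][nValues];
        for (int c = 0; c < ncls; c++) {
            for (int a = 0; a < nValues; a++) {
                cpt[c][a] = mEstimate(joint[c][a], classCount[c], m, p);
            }
        }
        return cpt;
    }

    public static void main(String[] args) {
        int[][] data = {{0}, {0}, {1}, {0}}; // four instances, one binary feature
        int[] labels = {0, 0, 0, 1};
        System.out.println(Arrays.deepToString(
                estimateFeature(data, labels, 0, 2, 2, 2.0)));
    }
}
```

With m > 0 every estimate stays strictly positive even when a feature value never co-occurs with a class, so a single unseen value does not force the likelihood of a whole instance to zero.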

classify

public double classify(int[][] data,
                       int[] labels,
                       int ninst,
                       boolean[] selectedFeatures)
Classify a record.
Input:
data - training data in a table format where each row represents an instance, and each column represents an attribute (feature)
labels - the class label
ninst - the number of training instances
selectedFeatures - boolean array specifying selected features (by default null: all features included)
We always assume that the class labels and the feature indexes start from 0.

classifyExample

public int classifyExample(int[] instance,
                           boolean[] selectedFeatures)
This function selects the maximum-likelihood class label given a data instance.
Input:
instance - feature vector (discrete finite feature values represented by integers)
selectedFeatures - boolean array specifying selected features (by default null: all features included)
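A minimal sketch of maximum-likelihood label selection is given below. It is not the ABLE implementation: the class name, the explicit `cpt` argument with an assumed `[class][feature][value]` layout, the use of log-likelihoods, and the tie-breaking toward the lowest class index are all assumptions made for illustration.

```java
// Hypothetical sketch: pick the class whose likelihood for the instance is largest.
final class ClassifyExampleSketch {

    /**
     * Returns the index of the class maximizing the product of
     * cpt[c][f][instance[f]] over the selected features.
     * A null selectedFeatures array means all features are used.
     */
    static int classifyExample(double[][][] cpt, int[] instance,
                               boolean[] selectedFeatures) {
        int best = 0;
        double bestScore = Double.NEGATIVE_INFINITY;
        for (int c = 0; c < cpt.length; c++) {
            // Sum of logs instead of a product, to limit underflow with many features.
            double logLik = 0.0;
            for (int f = 0; f < instance.length; f++) {
                if (selectedFeatures == null || selectedFeatures[f]) {
                    logLik += Math.log(cpt[c][f][instance[f]]);
                }
            }
            if (logLik > bestScore) { // strict comparison: ties keep the lower index
                bestScore = logLik;
                best = c;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        double[][][] cpt = {
            {{0.5, 0.5}, {0.25, 0.75}},   // class 0
            {{0.125, 0.875}, {0.5, 0.5}}  // class 1
        };
        System.out.println(classifyExample(cpt, new int[] {0, 1}, null));
    }
}
```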

findClassProbability

public double[] findClassProbability(int[] instance,
                                     boolean[] selectedFeatures)
This function returns the posterior probability distribution over class labels, given a data instance, using Bayes' rule as follows:

    P(class|instance) = P(instance|class) P(class) / sum_i P(instance|class_i) P(class_i)

Input:
instance - feature vector (discrete finite feature values represented by integers)
selectedFeatures - boolean array specifying selected features (by default null: all features included)
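The Bayes-rule computation can be sketched as follows. The class name and the explicit `cpt` and `classPriors` arguments are hypothetical, the `[class][feature][value]` layout is an assumption, and the sketch assumes at least one class has a non-zero joint probability so that the normalizing sum is positive.

```java
import java.util.Arrays;

// Hypothetical sketch of the posterior over class labels via Bayes' rule.
final class PosteriorSketch {

    /**
     * Returns P(class | instance) for every class.
     * A null selectedFeatures array means all features are used.
     */
    static double[] posterior(double[][][] cpt, double[] classPriors,
                              int[] instance, boolean[] selectedFeatures) {
        int ncls = classPriors.length;
        double[] post = new double[ncls];
        double sum = 0.0;
        for (int c = 0; c < ncls; c++) {
            // Numerator: P(instance | class) * P(class), with the naive
            // independence assumption turning the likelihood into a product.
            double joint = classPriors[c];
            for (int f = 0; f < instance.length; f++) {
                if (selectedFeatures == null || selectedFeatures[f]) {
                    joint *= cpt[c][f][instance[f]];
                }
            }
            post[c] = joint;
            sum += joint; // denominator: sum over all classes
        }
        for (int c = 0; c < ncls; c++) {
            post[c] /= sum;
        }
        return post;
    }

    public static void main(String[] args) {
        double[][][] cpt = {
            {{0.5, 0.5}, {0.25, 0.75}},   // class 0
            {{0.125, 0.875}, {0.5, 0.5}}  // class 1
        };
        double[] priors = {0.5, 0.5};
        System.out.println(Arrays.toString(
                posterior(cpt, priors, new int[] {0, 1}, null)));
    }
}
```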

likelihood

public double likelihood(int[] instance,
                         int classlabel)
Compute the likelihood of an instance given a class label.
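Under the naive Bayes independence assumption, the likelihood of an instance given a class is the product of the per-feature conditional probabilities. The sketch below illustrates this; the class name and the explicit `cpt` argument (assumed layout `[class][feature][value]`) are hypothetical and not taken from the ABLE source.

```java
// Hypothetical sketch: P(instance | class) as a product over features.
final class LikelihoodSketch {

    /** Multiplies cpt[classLabel][f][instance[f]] over all features f. */
    static double likelihood(double[][][] cpt, int[] instance, int classLabel) {
        double result = 1.0;
        for (int f = 0; f < instance.length; f++) {
            result *= cpt[classLabel][f][instance[f]];
        }
        return result;
    }

    public static void main(String[] args) {
        double[][][] cpt = {
            {{0.5, 0.5}, {0.25, 0.75}},   // class 0
            {{0.125, 0.875}, {0.5, 0.5}}  // class 1
        };
        int[] instance = {0, 1};
        System.out.println(likelihood(cpt, instance, 0));
        System.out.println(likelihood(cpt, instance, 1));
    }
}
```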

ABLE: Produced by Joe, Don, and Jeff who say, 'Thanks for your support.'