Stacked Generalization

“Stacking generates the members of the stacking ensemble using several learning algorithms and subsequently uses another algorithm to learn how to combine their outputs.” In other words, it takes the classification results of several classifiers and learns how to combine them into a final prediction.

Stacking is most commonly implemented with logistic regression. Suppose there are K classifiers and L output labels, and let \(x_{kc}\) denote the score that classifier k assigns to class c. Then the stacked generalization is this logistic model:

\(P(y=c \mid x) = \frac{1}{1 + \exp\left( - \sum_{k=1}^{K} w_{kc} x_{kc} - b_c \right)}\)
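
As a minimal illustration of this formula (a sketch with made-up weights, not the package's internal code), the stacked score for each class can be computed as:

>>> import numpy as np
>>> K, L = 2, 3                      # two intermediate classifiers, three class labels
>>> x = np.array([[1.0, 0.0, 0.0],   # scores from classifier 1, one column per label
...               [0.9, 0.1, 0.0]])  # scores from classifier 2
>>> w = np.random.randn(K, L)        # learned weights w_kc (made up here)
>>> b = np.random.randn(L)           # learned biases b_c (made up here)
>>> prob = 1.0 / (1.0 + np.exp(-(np.sum(w * x, axis=0) + b)))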

Here we demonstrate stacking with two classifiers.

Import the package, and use the subject keywords dataset as the training data.

>>> import shorttext
>>> subdict = shorttext.data.subjectkeywords()

Train a C-LSTM model.

>>> wvmodel = shorttext.utils.load_word2vec_model('/path/to/GoogleNews-vectors-negative300.bin.gz')
>>> clstm_nnet = shorttext.classifiers.frameworks.CLSTMWordEmbed(len(subdict))
>>> clstm_classifier = shorttext.classifiers.VarNNEmbeddedVecClassifier(wvmodel)
>>> clstm_classifier.train(subdict, clstm_nnet)

A test of its classification:

>>> clstm_classifier.score('linear algebra')
{'mathematics': 1.0, 'physics': 3.3643366e-10, 'theology': 1.0713742e-13}
>>> clstm_classifier.score('topological soliton')
{'mathematics': 2.0036438e-11, 'physics': 1.0, 'theology': 4.4903334e-14}

Next, we train an SVM with topic vectors as the input features. The topic model is LDA with 128 topics.

>>> # train the LDA topic model
>>> lda128 = shorttext.classifiers.LDAModeler()
>>> lda128.train(subdict, 128)
>>> # train the SVM classifier
>>> from sklearn.svm import SVC
>>> lda128_svm_classifier = shorttext.classifiers.TopicVectorSkLearnClassifier(lda128, SVC())
>>> lda128_svm_classifier.train(subdict)

A test of its classification:

>>> lda128_svm_classifier.score('linear algebra')
{'mathematics': 1.0, 'physics': 0.0, 'theology': 0.0}
>>> lda128_svm_classifier.score('topological soliton')
{'mathematics': 0.0, 'physics': 1.0, 'theology': 0.0}

Then we can implement the stacked generalization using logistic regression by calling:

>>> stacker = shorttext.stack.LogisticStackedGeneralization(intermediate_classifiers={'clstm': clstm_classifier, 'lda128': lda128_svm_classifier})
>>> stacker.train(subdict)

Now the model is ready, and we can perform the stacked classification:

>>> stacker.score('linear algebra')
{'mathematics': 0.55439126, 'physics': 0.036988281, 'theology': 0.039665185}
>>> stacker.score('quantum mechanics')
{'mathematics': 0.059210967, 'physics': 0.55031472, 'theology': 0.04532773}
>>> stacker.score('topological dynamics')
{'mathematics': 0.17244603, 'physics': 0.19720334, 'theology': 0.035309207}
>>> stacker.score('christology')
{'mathematics': 0.094574735, 'physics': 0.053406414, 'theology': 0.3797417}

The stacked generalization can be saved by calling:

>>> stacker.save_compact_model('/path/to/logitmodel.bin')

This saves only the stacked generalization model, not the intermediate classifiers. This design gives users the flexibility to supply their own algorithms, as long as those algorithms have a score() method that outputs results in the same form as the classifiers offered in this package. To load the stacked model, initialize the intermediate classifiers in the same way:

>>> stacker2 = shorttext.stack.LogisticStackedGeneralization(intermediate_classifiers={'clstm': clstm_classifier, 'lda128': lda128_svm_classifier})
>>> stacker2.load_compact_model('/path/to/logitmodel.bin')
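
Because only the score() interface matters, a user-defined intermediate classifier can be plugged in as well. A minimal sketch (the class and its keyword-matching logic are hypothetical, for illustration only):

>>> class KeywordMatchClassifier:
...     def __init__(self, keyword_dict):
...         # e.g. {'mathematics': ['algebra', 'topology'], 'physics': ['soliton']}
...         self.keyword_dict = keyword_dict
...     def score(self, shorttext):
...         # fraction of each label's keywords appearing in the text, as a dict of scores
...         return {label: sum(kw in shorttext for kw in kws) / float(len(kws))
...                 for label, kws in self.keyword_dict.items()}

An instance of such a class could then be passed in the intermediate_classifiers dictionary in place of, or alongside, the classifiers above.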

class shorttext.stack.stacking.LogisticStackedGeneralization(intermediate_classifiers={})

This class implements logistic regression as the stacked generalizer.

It is an intermediate model that takes the results of other classifiers as input features and performs another classification.

This class saves the stacked logistic model, but not the information of the primary model.

The classifiers must have the score() method that takes a string as an input argument.

loadmodel(nameprefix)

Load the model with the given prefix.

Load the model files with the given path prefix. Note that the intermediate classifiers are not loaded, and users are required to load them separately.

Parameters:nameprefix (str) – prefix of the model files
Returns:None
savemodel(nameprefix)

Save the logistic stacked model into files.

Save the stacked model into files. Note that the intermediate classifiers are not saved. Users are advised to save those classifiers separately.

If neither train() nor loadmodel() was run, it will raise ModelNotTrainedException.

Parameters:nameprefix (str) – prefix of the files
Returns:None
Raise:ModelNotTrainedException
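
For instance, to persist and later restore the stacked logistic model with a path prefix (the prefix below is hypothetical):

>>> stacker.savemodel('/path/to/mystacker')
>>> stacker2.loadmodel('/path/to/mystacker')
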
score(shorttext)

Calculate the scores for all the class labels for the given short sentence.

Given a short sentence, calculate the classification scores for all class labels, returned as a dictionary with key being the class labels, and values being the scores. If the short sentence is empty, or if other numerical errors occur, the score will be numpy.nan.

If neither train() nor loadmodel() was run, it will raise ModelNotTrainedException.

Parameters:shorttext (str) – a short sentence
Returns:a dictionary with keys being the class labels, and values being the corresponding classification scores
Return type:dict
train(classdict, optimizer='adam', l2reg=0.01, bias_l2reg=0.01, nb_epoch=1000)

Train the stacked generalization.

Parameters:
  • classdict (dict) – training data
  • optimizer (str) – optimizer to use. Options: sgd, rmsprop, adagrad, adadelta, adam, adamax, nadam. (Default: ‘adam’)
  • l2reg (float) – coefficient for L2 regularization (Default: 0.01)
  • bias_l2reg (float) – coefficient for L2 regularization of the bias term (Default: 0.01)
  • nb_epoch (int) – number of epochs for training (Default: 1000)
Returns:None
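
For example, to train with a non-default optimizer, a smaller regularization coefficient, and fewer epochs, using the parameters documented above:

>>> stacker.train(subdict, optimizer='rmsprop', l2reg=0.001, nb_epoch=500)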

class shorttext.stack.stacking.StackedGeneralization(intermediate_classifiers={})

This is an abstract class for any stacked generalization method. It is an intermediate model that takes the results of other classifiers as input features and performs another classification.

The classifiers must have the score() method that takes a string as an input argument.

More references:

David H. Wolpert, “Stacked Generalization,” Neural Networks 5: 241-259 (1992).

M. Paz Sesmero, Agapito I. Ledezma, Araceli Sanchis, “Generating ensembles of heterogeneous classifiers using Stacked Generalization,” WIREs Data Mining and Knowledge Discovery 5: 21-34 (2015).

add_classifier(name, classifier)

Add a classifier.

Add a classifier to the class. The classifier must have the method score() which takes a string as an input argument.

Parameters:
  • name (str) – name of the classifier, without spaces and any special characters
  • classifier (any class with a method score()) – instance of a classifier, which has a method score() which takes a string as an input argument
Returns:None
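
For example, to register an additional intermediate classifier before training (another_classifier here stands for any object exposing such a score() method):

>>> stacker.add_classifier('extra', another_classifier)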

convert_label_to_buckets(label)

Convert the label into an array of buckets.

Some classification algorithms, especially neural networks, represent the output as a series of buckets, with the entry for the correct label being 1 and all other entries being 0 (one-hot encoding). This method converts the label into the corresponding buckets.

Parameters:label (str) – label
Returns:array of buckets
Return type:numpy.ndarray
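
Illustratively, for the registered labels ['mathematics', 'physics', 'theology'], the conversion behaves like the following one-hot encoding (a sketch, not the package's internal code):

>>> import numpy as np
>>> labels = ['mathematics', 'physics', 'theology']
>>> buckets = np.zeros(len(labels))
>>> buckets[labels.index('physics')] = 1.0   # 1 at the correct label, 0 elsewhere
>>> buckets
array([0., 1., 0.])
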
convert_traindata_matrix(classdict, tobucket=True)

Returns a generator that returns the input matrix and the output labels for training.

Parameters:
  • classdict (dict) – dictionary of the training data
  • tobucket (bool) – whether to convert the label into buckets (Default: True)
Returns:array of input matrix, and output labels
Return type:tuple

delete_classifier(name)

Delete a classifier.

Parameters:name (str) – name of the classifier to be deleted
Returns:None
Raise:KeyError
register_classifiers()

Register the intermediate classifiers.

It must be run before any training.

Returns:None
register_classlabels(labels)

Register output labels.

Given the labels, it assigns an integer index to each label. This indexing is essential for ordering the outputs of the stacked model.

It must be run before any training.

Parameters:labels (list) – list of output labels
Returns:None
score(shorttext, *args, **kwargs)

Calculate the scores for each class label.

Not implemented in this abstract class; NotImplementedException is raised.

Parameters:
  • shorttext (str) – short text to be scored
  • args (dict) – arguments to be parsed
  • kwargs (dict) – arguments to be parsed
Returns:dictionary of scores for all class labels
Return type:dict
Raise:NotImplementedException

train(classdict, *args, **kwargs)

Train the stacked generalization.

Not implemented in this abstract class; NotImplementedException is raised.

Parameters:
  • classdict (dict) – training data
  • args (dict) – arguments to be parsed
  • kwargs (dict) – arguments to be parsed
Returns:None
Raise:NotImplementedException

translate_shorttext_intfeature_matrix(shorttext)

Represent the given short text as the input matrix of the stacking class.

Parameters:shorttext (str) – short text
Returns:input matrix of the stacking class
Return type:numpy.ndarray
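
Conceptually, this matrix gathers one row of per-label scores from each intermediate classifier. A rough sketch of the idea (not the package's actual implementation):

>>> import numpy as np
>>> def intfeature_matrix(shorttext, classifiers, labels):
...     # one row per intermediate classifier, one column per registered label
...     return np.array([[classifiers[name].score(shorttext)[label] for label in labels]
...                      for name in sorted(classifiers)])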

Reference

“Combining the Best of All Worlds,” Everything About Data Analytics, WordPress (2016).

David H. Wolpert, “Stacked Generalization,” Neural Networks 5: 241-259 (1992).

M. Paz Sesmero, Agapito I. Ledezma, Araceli Sanchis, “Generating ensembles of heterogeneous classifiers using Stacked Generalization,” WIREs Data Mining and Knowledge Discovery 5: 21-34 (2015).
