Stacked Generalization

“Stacking generates the members of the stacking ensemble using several learning algorithms and subsequently uses another algorithm to learn how to combine their outputs.” In other words, it takes the classification results of several classifiers as input and learns how to combine them into a final result.

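Stacking is most commonly implemented using logistic regression. Suppose there are K classifiers and l output labels. Writing \(x_{kc}\) for the score that classifier k assigns to label c, with weights \(w_{kc}\) and biases \(b_c\), the stacked generalization is this logistic model: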

\(P ( y=c | x) = \frac{1}{\exp\left( - \sum_{k=1}^{K} w_{kc} x_{kc} + b_c \right) + 1}\)
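
To make the formula concrete, here is a minimal sketch that evaluates it for a single label, with the exponent written exactly as above. The function name `stacked_probability` and the numbers are illustrative assumptions, not part of shorttext, and the library's trained model may differ in detail.

```python
import math


def stacked_probability(scores, weights, bias):
    """Evaluate the logistic formula above for one label c.

    scores  -- x_kc for k = 1..K (score each classifier gives label c)
    weights -- w_kc for k = 1..K
    bias    -- b_c
    """
    # Exponent as written in the formula: -sum_k w_kc * x_kc + b_c
    exponent = -sum(w * x for w, x in zip(weights, scores)) + bias
    return 1.0 / (math.exp(exponent) + 1.0)


# Hypothetical values: two classifiers (K = 2) scoring a single label.
p = stacked_probability(scores=[1.0, 0.9], weights=[2.0, 1.5], bias=0.5)
print(round(p, 4))
```

With all weights and the bias set to zero, the function returns 0.5, the midpoint of the logistic curve.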

Here we demonstrate stacking with two classifiers.

Import the package, and employ the subject dataset as the training dataset.

>>> import shorttext
>>> subdict = shorttext.data.subjectkeywords()

Train a C-LSTM model.

>>> wvmodel = shorttext.utils.load_word2vec_model('/path/to/GoogleNews-vectors-negative300.bin.gz')
>>> clstm_nnet = shorttext.classifiers.frameworks.CLSTMWordEmbed(len(subdict))
>>> clstm_classifier = shorttext.classifiers.VarNNEmbeddedVecClassifier(wvmodel)
>>> clstm_classifier.train(subdict, clstm_nnet)

A test of its classification:

>>> clstm_classifier.score('linear algebra')
{'mathematics': 1.0, 'physics': 3.3643366e-10, 'theology': 1.0713742e-13}
>>> clstm_classifier.score('topological soliton')
{'mathematics': 2.0036438e-11, 'physics': 1.0, 'theology': 4.4903334e-14}

Next, we train an SVM that takes topic vectors as its input. The topic model is LDA with 128 topics.

>>> # train the LDA topic model
>>> lda128 = shorttext.classifiers.LDAModeler()
>>> lda128.train(subdict, 128)
>>> # train the SVM classifier
>>> from sklearn.svm import SVC
>>> lda128_svm_classifier = shorttext.classifiers.TopicVectorSkLearnClassifier(lda128, SVC())
>>> lda128_svm_classifier.train(subdict)

A test of its classification:

>>> lda128_svm_classifier.score('linear algebra')
{'mathematics': 1.0, 'physics': 0.0, 'theology': 0.0}
>>> lda128_svm_classifier.score('topological soliton')
{'mathematics': 0.0, 'physics': 1.0, 'theology': 0.0}

Then we can implement the stacked generalization using logistic regression by calling:

>>> stacker = shorttext.stack.LogisticStackedGeneralization(intermediate_classifiers={'clstm': clstm_classifier, 'lda128': lda128_svm_classifier})
>>> stacker.train(subdict)

The model is now ready, and we can perform the stacked classification:

>>> stacker.score('linear algebra')
{'mathematics': 0.55439126, 'physics': 0.036988281, 'theology': 0.039665185}
>>> stacker.score('quantum mechanics')
{'mathematics': 0.059210967, 'physics': 0.55031472, 'theology': 0.04532773}
>>> stacker.score('topological dynamics')
{'mathematics': 0.17244603, 'physics': 0.19720334, 'theology': 0.035309207}
>>> stacker.score('christology')
{'mathematics': 0.094574735, 'physics': 0.053406414, 'theology': 0.3797417}
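
The scores above do not sum to one, so turning a score dictionary into a single predicted label is left to the caller. The helper below is a hypothetical sketch (`top_label` and its `threshold` parameter are not part of shorttext) that picks the highest-scoring label and optionally rejects weak predictions.

```python
def top_label(scores, threshold=0.0):
    """Return the highest-scoring label, or None if its score is below threshold."""
    label = max(scores, key=scores.get)
    return label if scores[label] >= threshold else None


# Scores copied from the stacker.score('christology') output above.
scores = {'mathematics': 0.094574735, 'physics': 0.053406414, 'theology': 0.3797417}

print(top_label(scores))
print(top_label(scores, threshold=0.5))
```

A threshold is useful for inputs such as 'topological dynamics' above, where no label receives a clearly dominant score.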

The stacked generalization can be saved by calling:

>>> stacker.save_compact_model('/path/to/logitmodel.bin')

This saves only the stacked generalization model, not the intermediate classifiers. This gives users the flexibility to supply their own algorithms, as long as they provide a score function whose output has the same form as that of the classifiers offered in this package. To load the model, initialize the stacker with the same intermediate classifiers and then load the saved file:

>>> stacker2 = shorttext.stack.LogisticStackedGeneralization(intermediate_classifiers={'clstm': clstm_classifier, 'lda128': lda128_svm_classifier})
>>> stacker2.load_compact_model('/path/to/logitmodel.bin')
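
As an illustration of supplying your own algorithm, the sketch below defines a toy classifier whose `score` method takes a string and returns a dictionary mapping labels to floats, the same form as the outputs shown earlier. The class `KeywordScorer` and its keyword lists are hypothetical. Whether a custom class must also subclass `AbstractScorer` is not stated here, so treat this as a sketch of the interface only.

```python
class KeywordScorer:
    """Hypothetical intermediate classifier based on keyword matching."""

    def __init__(self, keywords):
        # keywords: dict mapping each label to an iterable of lowercase words
        self.keywords = {label: set(words) for label, words in keywords.items()}

    def score(self, shorttext):
        """Return {label: fraction of tokens that are keywords for that label}."""
        tokens = shorttext.lower().split()
        if not tokens:
            return {label: 0.0 for label in self.keywords}
        return {
            label: sum(token in words for token in tokens) / len(tokens)
            for label, words in self.keywords.items()
        }


scorer = KeywordScorer({'mathematics': ['algebra'], 'physics': ['quantum']})
print(scorer.score('linear algebra'))
```

Such an object could then be passed under its own name in the `intermediate_classifiers` dictionary alongside the package's classifiers.
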

class shorttext.stack.stacking.StackedGeneralization(intermediate_classifiers: dict[str, AbstractScorer] | None = None)

Bases: ABC

Abstract base class for stacked generalization.

An intermediate model that takes output from other classifiers as input features and performs another level of classification.

The classifiers must have the score() method that takes a string as input.

Reference:

David H. Wolpert, “Stacked Generalization,” Neural Netw 5: 241-259 (1992).

M. Paz Sesmero et al., “Generating ensembles of heterogeneous classifiers using Stacked Generalization,” WIREs Data Mining and Knowledge Discovery 5: 21-34 (2015).

__init__(intermediate_classifiers: dict[str, AbstractScorer] | None = None)

Initialize the stacking class.

Args:

intermediate_classifiers: Dictionary mapping names to classifier instances.

register_classifiers() → None

Register the intermediate classifiers.

Must be called before training.

register_classlabels(labels: list[str]) → None

Register output labels.

Args:

labels: List of output class labels.

Must be called before training.

add_classifier(name: str, classifier: AbstractScorer) → None

Add a classifier to the stack.

Args:

name: Name for the classifier (no spaces or special characters).

classifier: Classifier instance with a score() method.

delete_classifier(name: str) → None

Delete a classifier from the stack.

Args:

name: Name of the classifier to delete.

Raises:

KeyError: If classifier name not found.

translate_shorttext_intfeature_matrix(shorttext: str) → Annotated[ndarray[tuple[Any, ...], dtype[float64]], '2D Array']

Convert short text to feature matrix for stacking.

Args:

shorttext: Input text.

Returns:

Feature matrix of shape (n_classifiers, n_labels).
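
The idea behind this conversion can be sketched in plain Python: each intermediate classifier contributes one row, and each label one column. The names `FixedScorer` and `feature_matrix` are hypothetical, the row order (sorted classifier names) is an assumption, and the library itself returns a NumPy array rather than nested lists.

```python
class FixedScorer:
    """Stand-in for a trained classifier: always returns the same scores."""

    def __init__(self, scores):
        self._scores = scores

    def score(self, shorttext):
        return dict(self._scores)


def feature_matrix(shorttext, classifiers, labels):
    """Build an (n_classifiers x n_labels) matrix of scores as nested lists."""
    names = sorted(classifiers)  # assumed row order, for reproducibility
    return [
        [float(classifiers[name].score(shorttext).get(label, 0.0)) for label in labels]
        for name in names
    ]


labels = ['mathematics', 'physics', 'theology']
classifiers = {
    # Scores copied from the 'linear algebra' outputs shown earlier.
    'clstm': FixedScorer({'mathematics': 1.0, 'physics': 3.3643366e-10, 'theology': 1.0713742e-13}),
    'lda128': FixedScorer({'mathematics': 1.0, 'physics': 0.0, 'theology': 0.0}),
}
matrix = feature_matrix('linear algebra', classifiers, labels)
print(len(matrix), len(matrix[0]))
```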

convert_label_to_buckets(label: str) → Annotated[ndarray[tuple[Any, ...], dtype[int64]], '1D Array']

Convert label to one-hot bucket representation.

Args:

label: Class label.

Returns:

One-hot array with 1 at the label’s position.
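
A one-hot representation of this kind can be sketched as follows. The function `label_to_buckets` is a hypothetical stand-in that takes the label list explicitly and returns a plain list, whereas the method above uses the registered labels and returns a NumPy array.

```python
def label_to_buckets(label, labels):
    """Return a one-hot list with 1 at the position of label in labels."""
    position = labels.index(label)  # raises ValueError for an unknown label
    return [1 if i == position else 0 for i in range(len(labels))]


labels = ['mathematics', 'physics', 'theology']
print(label_to_buckets('physics', labels))
```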

convert_traindata_matrix(classdict: dict[str, list[str]], tobucket: bool = True) → Generator[tuple[Annotated[ndarray[tuple[Any, ...], dtype[float64]], '2D Array'], Annotated[ndarray[tuple[Any, ...], dtype[int64]], '1D Array']], None, None]

Yield training data matrices.

Args:

classdict: Training data dictionary.

tobucket: Whether to convert labels to buckets. Default: True.

Yields:

Tuples of (feature_matrix, label_array).

abstractmethod train(classdict: dict[str, list[str]], *args, **kwargs) → None

Train the stacked generalization model.

Args:

classdict: Training data.

*args: Additional arguments.

**kwargs: Additional keyword arguments.

Raises:

NotImplementedError: Abstract method.

abstractmethod score(shorttext: str, *args, **kwargs) → dict[str, float]

Calculate classification scores for all labels.

Args:

shorttext: Input text.

*args: Additional arguments.

**kwargs: Additional keyword arguments.

Returns:

Dictionary mapping class labels to scores.

Raises:

NotImplementedError: Abstract method.

class shorttext.stack.stacking.LogisticStackedGeneralization(intermediate_classifiers: dict[str, AbstractScorer] | None = None)

Bases: StackedGeneralization, CompactIOMachine

Stacked generalization using logistic regression.

Uses a neural network with a sigmoid output to combine the predictions of the intermediate classifiers.

Note:

Saves the stacked model but not the intermediate classifiers.

train(classdict: dict[str, list[str]], optimizer: Literal['sgd', 'rmsprop', 'adagrad', 'adadelta', 'adam', 'adamax', 'nadam'] = 'adam', l2reg: float = 0.01, bias_l2reg: float = 0.01, nb_epoch: int = 1000) → None

Train the stacked generalization model.

Args:

classdict: Training data.

optimizer: Optimizer for training. Options: sgd, rmsprop, adagrad, adadelta, adam, adamax, nadam. Default: adam.

l2reg: L2 regularization coefficient. Default: 0.01.

bias_l2reg: L2 regularization for bias. Default: 0.01.

nb_epoch: Number of training epochs. Default: 1000.

score(shorttext: str) → dict[str, float]

Calculate classification scores for all labels.

Args:

shorttext: Input text.

Returns:

Dictionary mapping class labels to scores.

Raises:

ModelNotTrainedException: If model not trained.

savemodel(nameprefix: str) → None

Save the stacked model to files.

Note: Intermediate classifiers are not saved. Save them separately.

Args:

nameprefix: Prefix for output files.

Raises:

ModelNotTrainedException: If model not trained.

loadmodel(nameprefix: str) → None

Load the stacked model from files.

Note: Intermediate classifiers are not loaded. Load them separately.

Args:

nameprefix: Prefix for input files.

Reference

“Combining the Best of All Worlds,” Everything About Data Analytics, WordPress (2016). [WordPress]

David H. Wolpert, “Stacked Generalization,” Neural Netw 5: 241-259 (1992).

M. Paz Sesmero, Agapito I. Ledezma, Araceli Sanchis, “Generating ensembles of heterogeneous classifiers using Stacked Generalization,” WIREs Data Mining and Knowledge Discovery 5: 21-34 (2015).
