Word Embedding Models in API
Many word-embedding models take a few minutes to load, so it is desirable to load such a model into memory once and keep it there. This API was developed for that purpose.
Model Preloading
To preload the model, use the provided WordEmbedAPI script. In a command-line shell or terminal, type:
`
> WordEmbedAPI /path/to/GoogleNews-vectors-negative300.bin.gz
`
After a few minutes, the model will be loaded.
For details about using WordEmbedAPI, please refer to Console Scripts.
Class for Preloaded Model
After the model is loaded, it can be accessed through RESTfulKeyedVectors and used like any other word-embedding model:
`
>>> import shorttext
>>> wmodel = shorttext.utils.wordembed.RESTfulKeyedVectors('http://localhost', port='5000')
`
The resulting object can be used like any other gensim KeyedVectors instance.
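The preloading pattern can be illustrated with a small standard-library sketch: a server process holds the vectors in memory, and a thin client forwards each call over HTTP. This is not the shorttext implementation. The names ToyVectorHandler and ToyRESTVectors, the /get_vector endpoint, the JSON field names, and the two-dimensional vectors are all assumptions made for illustration.

```python
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.request import ProxyHandler, Request, build_opener

# Toy stand-in for a model that is slow to load (hypothetical data).
TOY_VECTORS = {"king": [1.0, 0.0], "queen": [0.8, 0.6]}


class ToyVectorHandler(BaseHTTPRequestHandler):
    """Serves vectors that already sit in the server's memory."""

    def do_POST(self):
        # Hypothetical endpoint and JSON field names, chosen for this sketch.
        if self.path != "/get_vector":
            self.send_error(404)
            return
        length = int(self.headers.get("Content-Length", 0))
        token = json.loads(self.rfile.read(length)).get("token")
        body = json.dumps({"vector": TOY_VECTORS.get(token)}).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo output quiet


class ToyRESTVectors:
    """Client that forwards get_vector() calls to the toy server."""

    def __init__(self, url, port="5000"):
        self.base_url = f"{url}:{port}"
        # Ignore any system proxy settings, since the server is local.
        self.opener = build_opener(ProxyHandler({}))

    def get_vector(self, entity):
        request = Request(
            self.base_url + "/get_vector",
            data=json.dumps({"token": entity}).encode("utf-8"),
            headers={"Content-Type": "application/json"},
        )
        with self.opener.open(request) as response:
            vector = json.load(response)["vector"]
        if vector is None:
            raise KeyError(entity)
        return vector


# Start the server once (port 0 asks the OS for a free port), then query it.
server = HTTPServer(("127.0.0.1", 0), ToyVectorHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
client = ToyRESTVectors("http://127.0.0.1", port=str(server.server_port))
print(client.get_vector("queen"))
server.shutdown()
server.server_close()
```

The expensive step (building TOY_VECTORS here, loading a large binary file in practice) happens once in the server, while any number of short-lived client processes can query it without paying the load time again.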
class shorttext.utils.wordembed.RESTfulKeyedVectors(url, port='5000')
Client for connecting to the API that serves the word-embedding vectors preloaded by WordEmbedAPI.
This class inherits from gensim.models.keyedvectors.KeyedVectors.

closer_than(entity1, entity2)
Parameters:
- entity1 (str) – word 1
- entity2 (str) – word 2
Returns: list of words
Return type: list

distance(entity1, entity2)
Parameters:
- entity1 (str) – word 1
- entity2 (str) – word 2
Returns: distance between two words
Return type: float

distances(entity1, other_entities=())
Parameters:
- entity1 (str) – word
- other_entities (list) – list of words
Returns: list of distances between entity1 and each word in other_entities
Return type: list

get_vector(entity)
Parameters:
- entity (str) – word
Returns: word vector of the given word
Return type: numpy.ndarray

most_similar(**kwargs)
Parameters:
- kwargs –
Returns:

most_similar_to_given(entity1, entities_list)
Parameters:
- entity1 (str) – word
- entities_list (list) – list of words
Returns: list of similarities between the given word and each word in entities_list
Return type: list

rank(entity1, entity2)
Parameters:
- entity1 (str) – word 1
- entity2 (str) – word 2
Returns: rank
Return type: int

save(fname_or_handle, **kwargs)
Parameters:
- fname_or_handle –
- kwargs –
Returns:

similarity(entity1, entity2)
Parameters:
- entity1 (str) – word 1
- entity2 (str) – word 2
Returns: similarity between two words
Return type: float
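
To make the relationship between similarity, distance, distances, closer_than, and rank concrete, the following is a minimal pure-Python sketch on toy data. It assumes the served model behaves like gensim's KeyedVectors, where similarity is cosine similarity, distance is one minus that similarity, and rank is one plus the number of words returned by closer_than. The vectors and the VECTORS dictionary are hypothetical, and nothing here contacts a running API.

```python
import math

# Toy 2-D vectors (hypothetical data, not taken from a real model).
VECTORS = {
    "cat": [1.0, 0.0],
    "dog": [0.8, 0.6],
    "car": [0.0, 1.0],
    "bus": [-1.0, 0.0],
}


def _norm(vector):
    """Euclidean length of a vector."""
    return math.sqrt(sum(x * x for x in vector))


def similarity(entity1, entity2):
    """Cosine similarity between the vectors of two words."""
    a, b = VECTORS[entity1], VECTORS[entity2]
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (_norm(a) * _norm(b))


def distance(entity1, entity2):
    """Cosine distance: one minus the cosine similarity."""
    return 1.0 - similarity(entity1, entity2)


def distances(entity1, other_entities=()):
    """Distances from entity1 to each word in other_entities."""
    return [distance(entity1, other) for other in other_entities]


def closer_than(entity1, entity2):
    """Words that are closer to entity1 than entity2 is."""
    limit = distance(entity1, entity2)
    return [
        word
        for word in VECTORS
        if word != entity1 and distance(entity1, word) < limit
    ]


def rank(entity1, entity2):
    """Position of entity2 among the neighbours of entity1 (1 = nearest)."""
    return len(closer_than(entity1, entity2)) + 1


print(similarity("cat", "dog"))
print(distances("cat", ["dog", "car", "bus"]))
print(closer_than("cat", "car"))
print(rank("cat", "car"))
```

In this toy data, "dog" is the only word nearer to "cat" than "car" is, so closer_than returns a one-element list and "car" has rank 2.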