public class Wordspace extends Object implements WordspaceI
DenseVector
NOTE: to speed up computation and reduce memory usage, vectors are associated with the MD5 hashes of words rather than with the words themselves. In rare cases this can cause two distinct words to collide on the same hash. If a collision occurs while the wordspace is being loaded, a WARNING message is issued.
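The note above can be illustrated with a minimal sketch of a store keyed by MD5 digests. `Md5KeyedStore`, `md5Hex`, `add`, and `get` are hypothetical names invented for this illustration and are not part of the Wordspace API. The sketch also assumes that each word appears only once in the input, so a digest that is already present is treated as a collision.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for illustration; not the actual Wordspace implementation.
final class Md5KeyedStore {
    // Maps the hex MD5 digest of a word to its vector values.
    private final Map<String, double[]> vectors = new HashMap<>();

    // Returns the lowercase hex MD5 digest of the UTF-8 bytes of the word.
    static String md5Hex(String word) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            byte[] digest = md.digest(word.getBytes(StandardCharsets.UTF_8));
            StringBuilder sb = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                sb.append(String.format("%02x", b & 0xff));
            }
            return sb.toString();
        } catch (NoSuchAlgorithmException e) {
            // MD5 is one of the algorithms every Java platform must provide.
            throw new IllegalStateException(e);
        }
    }

    // Stores the vector and returns true if the digest was already present
    // (assumption: words are unique in the input, so this signals a collision).
    boolean add(String word, double[] vector) {
        String key = md5Hex(word);
        boolean alreadyPresent = vectors.containsKey(key);
        if (alreadyPresent) {
            System.err.println("WARNING: digest already present for word '" + word + "'");
        }
        vectors.put(key, vector);
        return alreadyPresent;
    }

    // Returns the stored vector, or null if the word's digest is unknown.
    double[] get(String word) {
        return vectors.get(md5Hex(word));
    }

    public static void main(String[] args) {
        Md5KeyedStore store = new Md5KeyedStore();
        store.add("dog::n", new double[] {2.1, 4.1, 1.4, 2.3, 0.9});
        System.out.println(md5Hex("dog::n").length()); // 32 hex characters
        System.out.println(store.get("cat::n") == null); // true: never added
    }
}
```

Only the digest is kept as the key, so two different words that hash to the same value cannot be told apart after loading; this is why the warning can only be raised at load time.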
Constructor and Description |
---|
Wordspace() |
Wordspace(String matrixPath) |
Modifier and Type | Method and Description |
---|---|
void | addWordVector(String word, Vector vector): Stores the vector associated with a word. |
char[][] | getDictionaryDanilo(): Returns the complete set of words in the vocabulary (words having an associated vector in this wordspace). |
String | getMatrixPath() |
Vector | getVector(String word): Returns the vector associated with the given word. |
void | setMatrixPath(String matrixPath): Sets the path of the file where the word vectors are stored and loads them. |
public Wordspace()
public Wordspace(String matrixPath) throws IOException
Throws: IOException
public void addWordVector(String word, Vector vector)
WordspaceI
getVector
Specified by: addWordVector in interface WordspaceI
Parameters:
word - the word
vector - the vector associated with word
public Vector getVector(String word)
WordspaceI
Specified by: getVector in interface WordspaceI
Parameters:
word - the word whose corresponding vector must be retrieved
Returns:
the vector associated with word, or null if word is not in the vocabulary of this wordspace
public char[][] getDictionaryDanilo()
WordspaceI
Specified by: getDictionaryDanilo in interface WordspaceI
public String getMatrixPath()
public void setMatrixPath(String matrixPath) throws IOException
Sets the path of the file where the word vectors are stored and loads them. The expected format is:
number_of_vectors space_dimensionality
word_i [TAB] 1.0 [TAB] 0 [TAB] vector values comma separated
Example:
3 5
dog::n [TAB] 1.0 [TAB] 0 [TAB] 2.1,4.1,1.4,2.3,0.9
cat::n [TAB] 1.0 [TAB] 0 [TAB] 3.2,4.3,1.2,2.2,0.8
mouse::n [TAB] 1.0 [TAB] 0 [TAB] 2.4,4.4,2.4,1.3,0.92
Parameters:
matrixPath - the matrixPath to set
Throws:
IOException
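The file format described above could be read along the following lines. This is only a sketch under stated assumptions, not the loader used by `setMatrixPath`: the class `MatrixFormatSketch` and its `parse` method are hypothetical, `[TAB]` is taken to mean a literal tab character, and the second and third fields (`1.0` and `0` in the example) are skipped because their meaning is not documented here.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical reader for the documented matrix format; for illustration only.
final class MatrixFormatSketch {

    // Parses "count dim" followed by lines of word TAB x TAB y TAB v1,v2,...
    static Map<String, double[]> parse(BufferedReader reader) throws IOException {
        String header = reader.readLine();
        if (header == null) {
            throw new IOException("empty input");
        }
        String[] sizes = header.trim().split("\\s+");
        if (sizes.length != 2) {
            throw new IOException("bad header: " + header);
        }
        int count = Integer.parseInt(sizes[0]);
        int dim = Integer.parseInt(sizes[1]);

        Map<String, double[]> vectors = new LinkedHashMap<>();
        String line;
        while ((line = reader.readLine()) != null) {
            if (line.trim().isEmpty()) {
                continue; // tolerate blank lines
            }
            String[] fields = line.split("\t");
            if (fields.length != 4) {
                throw new IOException("expected 4 tab-separated fields: " + line);
            }
            // fields[1] and fields[2] are ignored in this sketch.
            String[] values = fields[3].trim().split(",");
            if (values.length != dim) {
                throw new IOException("expected " + dim + " values: " + line);
            }
            double[] vector = new double[dim];
            for (int i = 0; i < dim; i++) {
                vector[i] = Double.parseDouble(values[i].trim());
            }
            vectors.put(fields[0].trim(), vector);
        }
        if (vectors.size() != count) {
            throw new IOException("header declares " + count + " vectors, found " + vectors.size());
        }
        return vectors;
    }

    public static void main(String[] args) throws IOException {
        String sample = "3 5\n"
                + "dog::n\t1.0\t0\t2.1,4.1,1.4,2.3,0.9\n"
                + "cat::n\t1.0\t0\t3.2,4.3,1.2,2.2,0.8\n"
                + "mouse::n\t1.0\t0\t2.4,4.4,2.4,1.3,0.92\n";
        Map<String, double[]> vectors = parse(new BufferedReader(new StringReader(sample)));
        System.out.println(vectors.keySet()); // [dog::n, cat::n, mouse::n]
    }
}
```

Checking the declared count and dimensionality against the data catches truncated or misaligned files early, at the cost of rejecting inputs a more lenient reader might accept.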
Copyright © 2015 Semantic Analytics Group @ Uniroma2. All rights reserved.