tools/embeddings.lua
embeddings.lua options:
-h [<boolean>] (default: false)
This help.

-md [<boolean>] (default: false)
Dump help in Markdown format.

-config <string> (default: '')
Load options from this file.

-save_config <string> (default: '')
Save options to this file.

Data options
-dict_file <string> (required)
Path to the dictionary file produced by preprocess.lua.

-embed_file <string> (default: '')
Path to the embedding file. Ignored if -lang is used.

-save_data <string> (required)
Output file path/label.

-save_unknown_dict <string> (default: '')
Path to a file in which to save vocabulary entries not found in the embeddings.

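To illustrate how these paths relate, the following minimal Python sketch mimics the matching step conceptually: it takes dictionary lines and embedding lines, then separates the tokens that have a pretrained vector from those that do not (the kind of list -save_unknown_dict is meant to hold). The file layouts assumed here (a token followed by its index on each dictionary line, and GloVe-style text lines of the form `word v1 v2 ...`) and all function names are assumptions made for illustration. This is not the code of embeddings.lua.

```python
def read_dict_tokens(lines):
    """Return the tokens from dictionary lines (assumed layout: '<token> <index>')."""
    tokens = []
    for line in lines:
        fields = line.split()
        if fields:  # skip blank lines
            tokens.append(fields[0])
    return tokens


def read_text_embeddings(lines):
    """Parse GloVe-style text lines 'word v1 v2 ...' into {word: [floats]}."""
    vectors = {}
    for line in lines:
        fields = line.rstrip().split(" ")
        if len(fields) < 2:  # skip blank or malformed lines
            continue
        vectors[fields[0]] = [float(x) for x in fields[1:]]
    return vectors


def split_known_unknown(tokens, vectors):
    """Split tokens into those with a pretrained vector and those without."""
    known = [t for t in tokens if t in vectors]
    unknown = [t for t in tokens if t not in vectors]
    return known, unknown


# Hypothetical in-memory stand-ins for the -dict_file and -embed_file contents.
dict_lines = ["<unk> 1", "the 2", "cat 3", "zyzzyva 4"]
embed_lines = ["the 0.1 0.2", "cat 0.3 0.4"]

tokens = read_dict_tokens(dict_lines)
vectors = read_text_embeddings(embed_lines)
known, unknown = split_known_unknown(tokens, vectors)
print(known)    # tokens that would receive a pretrained vector
print(unknown)  # tokens that would be reported as unknown
```

In this toy input, `the` and `cat` are matched, while `<unk>` and `zyzzyva` end up in the unknown list.
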
Embedding options
-lang <string> (default: '')
Wikipedia language code used to autoload embeddings.

-embed_type <string> (accepted: word2vec-bin, word2vec-txt, glove; default: word2vec-bin)
Origin of the embeddings file. Ignored if -lang is used.

-normalize [<boolean>] (default: true)
Whether to normalize the word vectors.

-approximate [<boolean>] (default: false)
If set, also look for variants (case, joiner annotation) when matching dictionary entries against word embeddings.

-report_every <number> (default: 100000)
Print stats every this many lines read from the embedding file.

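As a rough illustration of what -normalize and -approximate describe, the Python sketch below scales a vector to unit length and falls back to case and joiner variants when a token has no exact match. It assumes that normalization means L2 (unit-length) scaling and that the joiner marker is the character `￭`. Both are assumptions, as are the helper names, and the actual lookup order in embeddings.lua may differ.

```python
import math

JOINER = "￭"  # assumed joiner marker attached to tokens by the tokenizer


def l2_normalize(vec):
    """Scale a vector to unit L2 norm; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        return list(vec)
    return [x / norm for x in vec]


def candidate_variants(token):
    """List the token followed by its lowercase and joiner-stripped variants."""
    stripped = token.strip(JOINER)
    candidates = [token, token.lower(), stripped, stripped.lower()]
    unique = []
    for candidate in candidates:
        if candidate and candidate not in unique:
            unique.append(candidate)
    return unique


def lookup(token, vectors, approximate=False):
    """Return the vector for a token, trying variants only if approximate is set."""
    keys = candidate_variants(token) if approximate else [token]
    for key in keys:
        if key in vectors:
            return vectors[key]
    return None


# Hypothetical one-word embedding table with a normalized vector.
vectors = {"hello": l2_normalize([3.0, 4.0])}

exact = lookup("Hello" + JOINER, vectors)                    # no exact match
approx = lookup("Hello" + JOINER, vectors, approximate=True)  # matches "hello"
print(exact, approx)
```

With exact matching, the token `Hello￭` is not found. With the approximate fallback, it resolves to the vector stored under `hello`.
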
Logger options
-log_file <string> (default: '')
Output logs to a file under this path instead of stdout. If the file name ends with json, output structured JSON.

-disable_logs [<boolean>] (default: false)
If set, output nothing.

-log_level <string> (accepted: DEBUG, INFO, WARNING, ERROR, NONE; default: INFO)
Output logs at this level and above.