tools/embeddings.lua
embeddings.lua options:
-h [<boolean>] (default: false)
This help.
-md [<boolean>] (default: false)
Dump help in Markdown format.
-config <string> (default: '')
Load options from this file.
-save_config <string> (default: '')
Save options to this file.
Data options
-dict_file <string> (required)
Path to the dictionary file generated by preprocess.lua.
-embed_file <string> (default: '')
Path to the embedding file. Ignored if -lang is used.
-save_data <string> (required)
Output file path/label.
-save_unknown_dict <string> (default: '')
Path to a file in which to save the vocabulary entries not found in the embeddings.
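
As a rough illustration of how the data options fit together, the following Python sketch assembles an argument list for the script. The helper name `build_embeddings_command`, the `th` launcher, and all file paths are assumptions made for this example; only the option names come from the list above.

```python
def build_embeddings_command(dict_file, save_data, embed_file="",
                             save_unknown_dict=""):
    """Assemble an argument list for tools/embeddings.lua (illustrative only)."""
    # -dict_file and -save_data are marked as required in the option list.
    if not dict_file or not save_data:
        raise ValueError("-dict_file and -save_data are required")
    # Assumption: the script is launched with Torch's `th` from the repository root.
    cmd = ["th", "tools/embeddings.lua",
           "-dict_file", dict_file,
           "-save_data", save_data]
    # Optional arguments are only appended when a value is supplied.
    if embed_file:
        cmd += ["-embed_file", embed_file]
    if save_unknown_dict:
        cmd += ["-save_unknown_dict", save_unknown_dict]
    return cmd


cmd = build_embeddings_command(
    dict_file="data/demo.src.dict",          # hypothetical path
    save_data="data/demo-src-emb",           # hypothetical output label
    embed_file="data/vectors.bin",           # hypothetical path
    save_unknown_dict="data/unknown.dict",   # hypothetical path
)
print(" ".join(cmd))
```

The resulting list could be passed to `subprocess.run(cmd)` on a machine where Torch and the tool are installed.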
Embedding options
-lang <string> (default: '')
Wikipedia language code for which to autoload embeddings.
-embed_type <string> (accepted: word2vec-bin, word2vec-txt, glove; default: word2vec-bin)
Embeddings file origin. Ignored if -lang is used.
-normalize [<boolean>] (default: true)
Whether to normalize the word vectors.
-approximate [<boolean>] (default: false)
If set, will also look for variants (case, joiner annotation) to match dictionary entries with word embeddings.
-report_every <number> (default: 100000)
Print statistics each time this many lines have been read from the embedding file.
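
To make the -normalize and -approximate descriptions more concrete, here is a minimal Python sketch of what such steps might look like. It is not the script's actual implementation: the use of the L2 norm, the exact set and order of variants tried, and the joiner marker "￭" are all assumptions, and the function names are hypothetical.

```python
import math

# Assumption: the joiner annotation is the marker "￭" (U+FFED).
JOINER = "\uffed"


def l2_normalize(vector):
    """Scale a vector to unit length (assumes -normalize means L2 normalization)."""
    norm = math.sqrt(sum(x * x for x in vector))
    if norm == 0.0:
        return list(vector)  # leave all-zero vectors unchanged
    return [x / norm for x in vector]


def candidate_forms(word):
    """Return the exact form first, then case and joiner variants, without duplicates."""
    stripped = word.replace(JOINER, "")
    forms = []
    for form in (word, word.lower(), stripped, stripped.lower()):
        if form and form not in forms:
            forms.append(form)
    return forms


def lookup(word, embeddings, approximate=False):
    """Find a vector for a dictionary word, optionally trying variants."""
    forms = candidate_forms(word) if approximate else [word]
    for form in forms:
        if form in embeddings:
            return embeddings[form]
    return None  # the word would be counted as unknown


# Toy embedding table with made-up values.
embeddings = {"hello": [3.0, 4.0]}
print(lookup("Hello" + JOINER, embeddings))                    # no exact match
print(lookup("Hello" + JOINER, embeddings, approximate=True))  # matches "hello"
print(l2_normalize(embeddings["hello"]))
```

Words for which `lookup` returns nothing correspond to the entries that -save_unknown_dict would record.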
Logger options
-log_file <string> (default: '')
Output logs to a file under this path instead of stdout. If the file name ends with json, output structured JSON.
-disable_logs [<boolean>] (default: false)
If set, output nothing.
-log_level <string> (accepted: DEBUG, INFO, WARNING, ERROR, NONE; default: INFO)
Output logs at this level and above.
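
The interaction between the logger options can be sketched as follows. This is an illustrative Python model, not the tool's logger: it assumes the accepted values of -log_level are listed in increasing order of severity, and the function names are hypothetical.

```python
# Assumption: the accepted values are ordered from least to most severe.
LEVELS = ["DEBUG", "INFO", "WARNING", "ERROR", "NONE"]


def should_log(message_level, log_level="INFO", disable_logs=False):
    """Return True if a message at message_level would be output."""
    if disable_logs or log_level == "NONE":
        return False
    # "This level and above": compare positions in the severity ordering.
    return LEVELS.index(message_level) >= LEVELS.index(log_level)


def uses_json(log_file):
    """Structured JSON output is selected when the file name ends with json."""
    return log_file.endswith("json")


print(should_log("DEBUG"))                        # below the default INFO level
print(should_log("ERROR", log_level="WARNING"))   # at or above WARNING
print(uses_json("logs/embeddings.json"))          # hypothetical path
```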