Extracts and cleans text from a Wikipedia database dump and stores output in a number of files of similar size in a given directory. Each file will contain several documents in the default format.

If the program is invoked with the --json flag, then each file will contain several documents formatted as JSON objects, one per line.

The program performs template expansion by preprocessing the whole dump. A cache is kept of parsed templates (only useful for repeated extractions).

Output settings:
  directory for extracted files (or '-' for dumping to stdout)
  maximum bytes per output file (default 1M)

Options:
  -a, --article     analyze a file containing a single article (debug option)
  -q, --quiet       suppress reporting progress info
  --html            produce HTML output, subsumes --links
  --json            write output in JSON format instead of the default format
  -c, --compress    compress output files using bzip
  -h, --help        show this help message and exit
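
Because the JSON output holds one object per line, the extracted files can be read back with a few lines of Python. The sketch below is only an illustration: the function name `iter_documents` is hypothetical, it assumes compressed files carry a `.bz2` suffix, and it makes no assumption about which keys each JSON object contains.

```python
import bz2
import json
from pathlib import Path
from typing import Any, Dict, Iterator


def iter_documents(output_dir: str) -> Iterator[Dict[str, Any]]:
    """Yield one parsed JSON object per non-empty line of every file under output_dir."""
    # Walk the output directory recursively, in a stable (sorted) order.
    for path in sorted(Path(output_dir).rglob("*")):
        if not path.is_file():
            continue
        # Assumption: files written with compression end in ".bz2";
        # everything else is treated as plain UTF-8 text.
        opener = bz2.open if path.suffix == ".bz2" else open
        with opener(path, "rt", encoding="utf-8") as handle:
            for line in handle:
                line = line.strip()
                if line:  # skip blank lines between documents
                    yield json.loads(line)
```

A caller would loop over `iter_documents("extracted")`, where `extracted` stands in for whatever output directory was chosen, and inspect the keys of the first object to see which fields are present.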