Added new Tool #3, a PDF text indexer, to the complete SOTDS Suite!
Tool #03 SOTDS PDF Indexer
by Stone Oakvalley Studios
http://www.stone-oakvalley-studios.com/0004_01_dashboard_index.php
post@stone-oakvalley-studios.com
Introduction:
-------------
The tool extracts all available text from the PDFs and processes it through the SOTDS Constructor. This lets users create their own SOTDS .DAT data files and compile a ready-to-use database for the SOTDS Searcher/Suite. Note that the PDFs must already contain OCR'd text, as SOTDS does not perform OCR.
The tool has neither command line support nor a GUI, only path requesters, message requesters and input requesters. The actual processing is performed automatically by the SOTDS Constructor tool.
Instructions:
-------------
1: Choose a path that contains PDF files.
2: Choose "single word" or "sentence mode".
If "single word", every single word on every page will be indexed.
If "sentence mode", the user can define a "sentence" of 2 to a maximum of 16 words.
3: Choose a name for the database.
Progress is shown in a command line/console window. Once indexing is done, the SOTDS Searcher launches automatically.
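The difference between the two modes in step 2 can be illustrated with a short Python sketch. This is not the tool's own code: the function names, the tokenizing rule, and the reading of a "sentence" as a run of N consecutive words (a sliding window) are all assumptions made for illustration.

```python
import re


def tokenize(text):
    """Split text into lowercase words (assumed rule: letters and digits only)."""
    return re.findall(r"[A-Za-z0-9]+", text.lower())


def index_entries(text, words_per_entry=1):
    """Return index entries for one page of text.

    words_per_entry=1 mimics "single word" mode.
    words_per_entry=2..16 mimics "sentence mode", assumed here to mean
    every run of that many consecutive words.
    """
    if not 1 <= words_per_entry <= 16:
        raise ValueError("words_per_entry must be between 1 and 16")
    words = tokenize(text)
    # Slide a window of the requested width across the word list.
    return [
        " ".join(words[i:i + words_per_entry])
        for i in range(len(words) - words_per_entry + 1)
    ]


if __name__ == "__main__":
    page = "The quick brown fox"
    print(index_entries(page))      # single word mode
    print(index_entries(page, 2))   # sentence mode, 2 words
```

Under this reading, longer "sentences" give fewer but more specific entries per page, and a page with fewer words than the chosen length yields no entries at all.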
Requirements:
-------------
- The PDF files must already contain OCR'd text for this tool to be of any use. This tool does not do the OCR for you!
- This tool uses "pdfinfo.exe" and "pdftotext.exe", which are part of the "xpdfbin-win-3.03.zip" package.
###################################################################################
The Xpdf software and documentation are copyright 1996-2011 Glyph & Cog, LLC.
Email: derekn@foolabs.com - WWW: http://www.foolabs.com/xpdf/
The PDF data structures, operators, and specification are copyright 1985-2006 Adobe Systems Inc.
###################################################################################
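To check in advance whether a PDF contains extractable text, the two Xpdf programs can be run by hand or from a small script. The Python sketch below is only an illustration under stated assumptions: how the SOTDS PDF Indexer itself calls these programs is not documented here, the helper names are hypothetical, and both executables are assumed to be on the PATH. It relies on the "Pages:" line printed by pdfinfo and on the pdftotext options -f, -l and -enc, with "-" as the output file to send the text to standard output.

```python
import re
import subprocess


def parse_page_count(pdfinfo_output):
    """Read the page count from the 'Pages:' line of pdfinfo output (0 if absent)."""
    match = re.search(r"^Pages:\s+(\d+)", pdfinfo_output, re.MULTILINE)
    return int(match.group(1)) if match else 0


def pdftotext_command(pdf_path, page, exe="pdftotext.exe"):
    """Build the command that extracts a single page as UTF-8 text to stdout."""
    return [exe, "-enc", "UTF-8", "-f", str(page), "-l", str(page), pdf_path, "-"]


def extract_pages(pdf_path, pdfinfo="pdfinfo.exe", pdftotext="pdftotext.exe"):
    """Yield (page_number, text) for every page of the PDF."""
    info = subprocess.run(
        [pdfinfo, pdf_path],
        capture_output=True, text=True, errors="replace", check=True,
    )
    for page in range(1, parse_page_count(info.stdout) + 1):
        result = subprocess.run(
            pdftotext_command(pdf_path, page, pdftotext),
            capture_output=True, encoding="utf-8", errors="replace", check=True,
        )
        yield page, result.stdout


if __name__ == "__main__":
    import sys
    # Usage: python check_pdf.py some_file.pdf
    for number, text in extract_pages(sys.argv[1]):
        print(number, len(text.split()), "words")
```

If every page reports zero words, the PDF most likely holds scanned images only and needs OCR before indexing.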