TET PDF IFilter supports all relevant flavors of PDF input.
In addition to Western text, TET PDF IFilter fully supports Chinese, Japanese, and Korean (CJK) text. All CJK encodings are recognized, and both horizontal and vertical writing modes are supported. Automatic detection of the locale ID (language and region identifier) of the text improves the results of Microsoft’s word breaking and stemming algorithms, which is especially important for East Asian text.
Right-to-left languages such as Hebrew and Arabic are also supported. Contextual character forms are normalized and the text is delivered in logical order.
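To illustrate what normalizing contextual character forms means, here is a minimal Python sketch using only the standard library. It is not TET PDF IFilter's API or implementation; the function name `normalize_contextual_forms` is hypothetical, and Unicode NFKC normalization is used here as an assumed stand-in for the product's own processing. Reordering text into logical order is not covered by this sketch.

```python
import unicodedata


def normalize_contextual_forms(text: str) -> str:
    """Map Arabic presentation forms to their base letters.

    Assumption: NFKC normalization stands in for the product's own
    normalization step, which is not documented in this text.
    """
    return unicodedata.normalize("NFKC", text)


# U+FEDF is ARABIC LETTER LAM INITIAL FORM, a contextual (positional) glyph
# variant. Its base letter is U+0644 ARABIC LETTER LAM.
sample = "\uFEDF"
normalized = normalize_contextual_forms(sample)
print(f"U+{ord(sample):04X} -> U+{ord(normalized):04X}")
```

A search index built on base letters can match a word no matter which positional glyph variants the PDF happened to use.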
TET PDF IFilter treats PDF documents as containers that can hold far more information than plain pages, and it indexes all relevant items in PDF documents.
The advanced metadata implementation in TET PDF IFilter supports the Windows property system for metadata. It indexes XMP metadata as well as standard or custom document info entries. Metadata indexing can be configured on several levels.
TET PDF IFilter optionally integrates metadata into the full text index. As a result, even full text search engines without metadata support (e.g. SQL Server) can search for metadata.
TET PDF IFilter supports various Unicode postprocessing steps which can be used to improve the extracted text.
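As a rough illustration of what Unicode postprocessing of extracted text can look like, the following Python sketch shows two common steps. The text above does not name the individual steps, so ligature decomposition and diacritic removal are assumed examples, and the helper names `decompose_ligatures` and `strip_diacritics` are hypothetical. The sketch uses only Python's standard `unicodedata` module and does not reflect TET PDF IFilter's actual options.

```python
import unicodedata


def decompose_ligatures(text: str) -> str:
    """Replace compatibility characters such as the 'fi' ligature (U+FB01)
    with their plain equivalents via NFKC normalization."""
    return unicodedata.normalize("NFKC", text)


def strip_diacritics(text: str) -> str:
    """Remove combining marks after canonical decomposition (NFD),
    so that an accented letter becomes its unaccented base letter."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


# "\uFB01le" is the word "file" written with the single-glyph fi ligature.
ligature_result = decompose_ligatures("\uFB01le")
# "caf\u00E9" is "cafe" with an acute accent on the final letter.
diacritic_result = strip_diacritics("caf\u00E9")
print(ligature_result, diacritic_result)
```

Steps like these let a query for a plain word find text that a PDF stores with ligature glyphs or accented characters. Whether to apply them depends on the language and on how exact the matching should be.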