BizReport : Research archives : October 13, 2015

Top 3 tips to improve document search

Document search is gaining steam as more businesses move their data and information online. It can help ensure that company statements are on target and that good information is available to teams scattered across the country. Our expert offers her top 3 tips for a solid document search strategy.

by Kristina Knight

Search Tip #1: In search requests containing two or more Boolean connectors, use parentheses for grouping.

"The key Boolean connectors are: and, or, not, w/, pre/. Most people are familiar with and, or, not. W/ finds a word or phrase within X words of another word or phrase: preponderance w/15 evidence. Pre/ finds a word or phrase within X words before another word or phrase: preponderance pre/8 evidence," said Elizabeth Thede, Director of Sales at dtSearch. "When a search request has more than one Boolean connector, use parentheses to clarify. Without such clarification, Germany w/3 France or Italy could be either (Germany w/3 France) or Italy or Germany w/3 (France or Italy). Likewise, alphabet or noodle and not soup has a very different meaning as alphabet or (noodle and not soup) than (alphabet or noodle) and not soup. The parentheses rule has an exception for search requests containing a series of terms linked only by or connectors, or a series of terms linked only by and connectors. See also search for list of words. But the safer way to proceed is to use parentheses any time you see two or more connectors."
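To see why the grouping matters, the two readings of each ambiguous request can be evaluated against a sample sentence. The following is a minimal Python sketch, not dtSearch code: the helper names (has, within, before) are hypothetical, the tokenizer is deliberately crude, and measuring distance as the difference between word positions is an assumption about how w/ and pre/ count words.

```python
import re

def tokens(text):
    # Crude lowercase word tokenizer; real search engines tokenize differently.
    return re.findall(r"[a-z0-9']+", text.lower())

def has(text, word):
    # True if the word appears anywhere in the text.
    return word.lower() in tokens(text)

def positions(text, word):
    # Word positions (0-based) at which the word occurs.
    return [i for i, t in enumerate(tokens(text)) if t == word.lower()]

def within(text, a, b, n):
    # Assumed reading of "a w/n b": a occurs within n words of b, in either order.
    return any(abs(i - j) <= n
               for i in positions(text, a) for j in positions(text, b))

def before(text, a, b, n):
    # Assumed reading of "a pre/n b": a occurs at most n words before b.
    return any(0 < j - i <= n
               for i in positions(text, a) for j in positions(text, b))

soup_doc = "Alphabet soup is on the menu today"

# alphabet or (noodle and not soup)
soup_reading_1 = has(soup_doc, "alphabet") or (
    has(soup_doc, "noodle") and not has(soup_doc, "soup"))
# (alphabet or noodle) and not soup
soup_reading_2 = (has(soup_doc, "alphabet") or has(soup_doc, "noodle")) and not has(
    soup_doc, "soup")
print(soup_reading_1, soup_reading_2)  # True False

italy_doc = "Italy exported more wine this year"

# (Germany w/3 France) or Italy
italy_reading_1 = within(italy_doc, "germany", "france", 3) or has(italy_doc, "italy")
# Germany w/3 (France or Italy)
italy_reading_2 = (within(italy_doc, "germany", "france", 3)
                   or within(italy_doc, "germany", "italy", 3))
print(italy_reading_1, italy_reading_2)  # True False
```

In both cases the same sentence matches under one grouping and fails under the other, which is the ambiguity the parentheses remove.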

Search Tip #2: In all searches, use quotation marks around phrases.

"This tip is particularly important when a phrase includes one of the and, or, not Boolean connectors. For example, the phrase clear and convincing evidence includes the connector and. To search for the whole phrase as a phrase, use quotation marks: ("clear and convincing evidence" and not "preponderance of the evidence") w/55 verdict," said Thede.
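The effect of the quotation marks can be illustrated with a small query splitter. This is a rough Python sketch under stated assumptions, not how dtSearch parses requests: split_query and its token labels are hypothetical names, and the connector list is limited to the five connectors named in this article.

```python
import re

CONNECTORS = {"and", "or", "not"}

def split_query(query):
    # Break a request into quoted phrases, parentheses, and single words.
    parts = re.findall(r'"[^"]*"|[()]|[^\s()"]+', query)
    labeled = []
    for part in parts:
        if part.startswith('"'):
            # Everything inside quotation marks stays together as one phrase.
            labeled.append(("PHRASE", part.strip('"')))
        elif part in ("(", ")"):
            labeled.append(("PAREN", part))
        elif part.lower() in CONNECTORS or re.fullmatch(r"(w|pre)/\d+", part.lower()):
            labeled.append(("CONNECTOR", part.lower()))
        else:
            labeled.append(("WORD", part))
    return labeled

unquoted = split_query("clear and convincing evidence")
quoted = split_query('"clear and convincing evidence"')

print(unquoted)
# [('WORD', 'clear'), ('CONNECTOR', 'and'), ('WORD', 'convincing'), ('WORD', 'evidence')]
print(quoted)
# [('PHRASE', 'clear and convincing evidence')]

full = split_query(
    '("clear and convincing evidence" and not "preponderance of the evidence") w/55 verdict'
)
for label, value in full:
    print(label, value)
```

Without the quotation marks, the word and inside the phrase is read as a connector joining clear with convincing evidence. With them, the whole phrase is treated as a single search term.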

Search Tip #3: Store OCR (optical character recognition) output in "searchable image" PDF format.

"If you are working with paper documents or images containing text, use a program like Adobe Acrobat to OCR into the 'searchable image' (or 'image with hidden text') PDF format. This format preserves the complete original scanned image, while adding the OCR'ed text for search engines," said Thede. "With a 'searchable image' PDF, dtSearch can use Adobe Reader to display the full original document or other image, including handwritten notes, drawings and the like. At the same time, dtSearch can, through its Adobe Reader hit-highlighting plug-in, highlight hits 'beneath' the image of the document. In this way, 'searchable image' PDF becomes as close as you can get in the OCR world to having your cake and eating it too."


Tags: Boolean connectors, business tips, document search, dtSearch, search tips


Copyright © 1999- BizReport. All rights reserved.
Republication or redistribution of BizReport content is expressly prohibited without the prior written consent.
BizReport shall not be liable for any errors in the content, or for any actions taken in reliance thereon.