Scanning Tax Documents: High-Quality Input Yields High-Quality Results
Posted by Ed Jennings on Thu, Jan 28, 2010
If you’re going to go paperless, start by creating high-quality PDFs of your scanned documents.
For best results, scan original forms in black and white, at 600 dpi, in duplex. It’s as simple as that. Whether you’re scanning documents for archiving, scanning before preparation, or using tax document automation software to organize and extract data for you, these four requirements are at the core of document quality.
Here’s why:
- Original Forms - Every time a document is copied, the image quality is degraded. The further removed you get from the original document, the “noisier” the document becomes. Noisy documents are hard to read.
- Black and White - Color and grayscale may look more readable to the human eye, but black-and-white scans ultimately produce clearer images.
- 600 dpi - If you’re using tax document automation software, higher image resolution improves the accuracy of document organization and data extraction. Because the software is “reading” the documents and extracting the relevant data, the scanned images you submit should be as clear as possible: high-quality input generates high-quality results, and low-quality input lowers the accuracy of form recognition and data extraction. Submitting higher-resolution documents can also provide up to 50% faster turnaround.
- Duplex - Duplex simply means scanning both sides of the page at once. Scan both sides and you won’t miss any important information on the backs of pages.
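The four requirements above can be turned into a quick pre-scan checklist. The sketch below is only an illustration: the `ScanSettings` record and its field names are hypothetical, not part of any scanner driver or tax software, and treating anything under 600 dpi as a problem is an assumption drawn from the recommendation above.

```python
from dataclasses import dataclass


# Hypothetical settings record; the field names are illustrative only and
# do not come from any real scanner driver or tax product.
@dataclass
class ScanSettings:
    from_original: bool  # True if scanning the original form, not a copy
    color_mode: str      # "bw", "grayscale", or "color"
    dpi: int             # scan resolution in dots per inch
    duplex: bool         # True if both sides of each page are scanned


def check_settings(s: ScanSettings) -> list[str]:
    """Return a list of suggested fixes; an empty list means all four points are met."""
    problems = []
    if not s.from_original:
        problems.append("scan the original form, not a copy")
    if s.color_mode != "bw":
        problems.append("use black and white instead of " + s.color_mode)
    if s.dpi < 600:  # assumed threshold, taken from the 600 dpi recommendation
        problems.append(f"raise resolution from {s.dpi} to 600 dpi")
    if not s.duplex:
        problems.append("enable duplex to capture both sides")
    return problems


if __name__ == "__main__":
    for fix in check_settings(ScanSettings(True, "grayscale", 300, False)):
        print("-", fix)
```

In practice you would fill in the record from whatever your scanner software reports for a job and correct anything the check lists before scanning the full batch.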
Get these four points down and you’re pretty much covered. If you want to take it a step further, particularly if you’re using software to organize and extract data from the scanned documents, here are some more tips for getting the highest-quality images:
- Give Scanned Documents the “Readability Test”: If you have to strain to read a document, software will likely struggle too. Noisy, hard-to-read documents make it difficult for the software to read field labels and the data in those fields. If software can’t read the data, it can’t extract it and populate the tax return.
- Look out for Distortion: Scan the Document at Its Original Size. For best results, the size of the scanned document should correspond roughly to the size of the original document. If a document is greatly reduced or enlarged when scanned, the scanned image is more likely to be distorted. Distortion can also result from scanning documents that are folded or crumpled, or that were caught in a paper jam while being scanned.
- Scan Each Document to Its Own Page. If more than one document is scanned to the same page, only the dominant form, as determined by the software, will be bookmarked. The less dominant form(s) will not be visible in the bookmarks, because only one bookmark is generated per page. For scan-and-populate users, data will not be extracted from any of the forms on that page.
- Avoid Submitting Documents with Faint or Faded Text. Not all original tax documents are suitable for processing by tax document automation software. Going back to the “Readability Test,” a document with faint or faded text is hard to read and therefore may not be classified properly. If a document cannot be classified, its data cannot be extracted by scan-and-populate software.
- Avoid Black Backgrounds, Ink Bleeding, and Smudging. Black backgrounds can be created by leaving the tray cover open during single-page flatbed scanning. Black backgrounds, ink bleeding, and smudging all count as “noise” and can slow processing.
- Avoid Submitting Clipped or Cut Forms. Clipped or cut forms may be missing important data that the software needs to identify and classify the form, and to extract the tax data.
- Scan Multi-Page Documents Together. Multi-page documents, like brokerage statements and K-1s, should be submitted in logical order where possible. Most software will not reassemble a multi-page document that has been scattered throughout an input PDF.
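The original-size tip above comes down to simple arithmetic: a page's physical size is its pixel dimensions divided by the scan resolution. The sketch below shows one way to check it. The function names, the US Letter default (8.5 x 11 inches), and the 5% tolerance are all assumptions made for illustration, not values taken from any tax software.

```python
def scanned_size_inches(width_px: int, height_px: int, dpi: int) -> tuple[float, float]:
    """Convert pixel dimensions to physical size in inches at the given resolution."""
    return width_px / dpi, height_px / dpi


def is_roughly_original_size(
    width_px: int,
    height_px: int,
    dpi: int,
    original_in: tuple[float, float] = (8.5, 11.0),  # assumed: US Letter page
    tolerance: float = 0.05,                         # assumed: 5% slack per side
) -> bool:
    """True if the scanned page is within the tolerance of the original's size."""
    w, h = scanned_size_inches(width_px, height_px, dpi)
    ow, oh = original_in
    return abs(w - ow) <= tolerance * ow and abs(h - oh) <= tolerance * oh


if __name__ == "__main__":
    # A Letter page scanned at 600 dpi is 5100 x 6600 pixels (8.5 * 600, 11 * 600).
    print(is_roughly_original_size(5100, 6600, 600))  # True
    # The same page shrunk to half size no longer matches the original.
    print(is_roughly_original_size(2550, 3300, 600))  # False
```

A page that fails this check was probably reduced or enlarged during scanning and is worth rescanning before it is submitted.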
Now that you know what to look for, it will be easy to spot documents that won’t make the cut before being processed by tax document automation software. Download our free tutorial to learn more about best practices for scanning tax documents.