The Data Company


Maximizing Your Data Repository Searches: Coding Choices

In the "Paper Chase" article, we discussed the basic elements of document management and touched on indexing and coding. It is through your indexing and coding system that your document repository turns into a useful and powerful tool.

Scanning and coding documents offers obvious cost savings over handling paper. With paper, you incur repeated costs each time you need to copy the document set for trial team members, consultants, opposing counsel, and so forth. Add the labor cost of keeping each set updated as new documents are added, and your costs skyrocket quickly.

In terms of cost, you have several choices to make as you contemplate a digital repository. Simply scanning your documents into images is only a slight improvement over paper. You don't have to lug around boxes, but you don't yet have an efficient way to access and search those files.

To make your file set manageable, you need to understand quickly what's in each file and be able to jump immediately to the most valuable information. Each file needs an electronic "label" or "code" so the computer can sort and organize the library. Coding is organized into three levels, from basic to complex.

Coding Level Choices

Level 1 Coding

In this basic coding work, we can organize the digital files by referencing "load" files: basic instructions about a document that assist in organizing and sorting scanned images. We can attach load files to document sets for you. Without these load files, only the last person to work with a file will know where it's stored, and the file can't be included in a searchable database. (Ever try to find a file on someone else's computer? Next to impossible, isn't it?)

In Level 1 Coding, we simply fill in some basic fields about each document such as document date, type, author, recipient, or other simple identifiers. These fields are how you search through the database.

The benefit to you is obvious. You could ask for all of the documents from a specific author, on or around a specific date, to a specific recipient. Voilà! All documents meeting those criteria are presented in a list, and you select the ones you want to see. You can review each one, or further refine your selections to find the exact document you want.
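To make the idea concrete, here is a minimal sketch of a Level 1 search in Python. It is an illustration only, not the software we use: the `CodedDocument` record, the `search` function, the seven-day "on or around" window, and the sample data are all assumptions made up for this example.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class CodedDocument:
    """One scanned document with its Level 1 fields (hypothetical layout)."""
    doc_id: str
    doc_date: date
    doc_type: str
    author: str
    recipient: str


def search(documents, author=None, recipient=None, around=None, window_days=7):
    """Return documents that match every criterion that was supplied."""
    results = []
    for doc in documents:
        if author is not None and doc.author != author:
            continue
        if recipient is not None and doc.recipient != recipient:
            continue
        # "On or around" a date: accept anything within window_days of it.
        if around is not None and abs((doc.doc_date - around).days) > window_days:
            continue
        results.append(doc)
    return results


# Invented sample repository.
repository = [
    CodedDocument("DOC-0001", date(2003, 5, 1), "memo", "J. Smith", "A. Jones"),
    CodedDocument("DOC-0002", date(2003, 9, 15), "letter", "J. Smith", "A. Jones"),
    CodedDocument("DOC-0003", date(2003, 5, 2), "memo", "R. Lee", "A. Jones"),
]

hits = search(repository, author="J. Smith", recipient="A. Jones",
              around=date(2003, 5, 3))
print([doc.doc_id for doc in hits])  # ['DOC-0001']
```

Only the first document survives all three filters. Dropping the date criterion would widen the list to both documents from that author, which is the "refine your selections" step in reverse.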

Level 2 Coding

At this level, we link a more elaborate set of key words to your scanned images. You need highly experienced and trained coders who can review every single document and link each one to helpful and intuitive key terms, issues, mentions and phrases. We work directly with our clients at the start of a project to set the criteria and standards for search terms. Only those manually entered key words and phrases are available later as search terms.

The additional coding makes your database even more valuable. Searching in a Level 2 database will retrieve every coded instance of the term or phrase, and then take you to its context. Gone are the days of sending a paralegal to the warehouse for hours or days to pull out every piece of paper with a particular chemical name!
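One common way to picture Level 2 coding is as an index from each key term to the documents a coder linked it to. The sketch below assumes that model; the function names, document IDs, and terms are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def build_term_index(coded_terms):
    """Map each manually coded key term to the set of documents carrying it."""
    index = defaultdict(set)
    for doc_id, terms in coded_terms.items():
        for term in terms:
            index[term.lower()].add(doc_id)
    return index


def lookup(index, term):
    """Return the sorted document IDs coded with the given term."""
    return sorted(index.get(term.lower(), set()))


# Key terms as a coder might have entered them (invented data).
coded_terms = {
    "DOC-0001": ["benzene", "safety inspection"],
    "DOC-0002": ["Benzene", "shipping schedule"],
    "DOC-0003": ["shipping schedule"],
}

index = build_term_index(coded_terms)
print(lookup(index, "benzene"))  # ['DOC-0001', 'DOC-0002']
print(lookup(index, "toluene"))  # []
```

The second lookup returns nothing even if "toluene" appears somewhere on a scanned page, because no coder entered it. That is why the criteria for search terms are set with you at the start of the project.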

Level 3 Coding

Level 3 Coding creates the most complex and complete database search capabilities. As your documents are scanned, they are also processed by optical character recognition (OCR) software, which identifies each number and letter shape on the page (as opposed to creating a static image of the page). The OCR program converts this data into a text file so that the search engine can recognize each word.
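Once OCR has produced a text file for each page, any word on the page can be searched, not just the coded terms. The sketch below assumes the OCR step has already run and that its output is available as plain strings keyed by document and page; the `find_in_context` helper and the sample text are hypothetical.

```python
import re


def find_in_context(ocr_text, phrase, width=30):
    """Return each occurrence of phrase with up to `width` characters of context."""
    snippets = []
    for match in re.finditer(re.escape(phrase), ocr_text, flags=re.IGNORECASE):
        start = max(0, match.start() - width)
        end = min(len(ocr_text), match.end() + width)
        snippets.append(ocr_text[start:end])
    return snippets


# OCR output keyed by (document ID, page number); invented sample text.
pages = {
    ("DOC-0001", 1): "The inspection found traces of benzene near the loading dock.",
    ("DOC-0001", 2): "No further action was recommended at this time.",
}

for (doc_id, page), text in pages.items():
    for snippet in find_in_context(text, "benzene"):
        print(doc_id, page, snippet)
```

A simple scan like this finds only exact character matches, so a letter that OCR misreads on a blurry page produces a missed hit.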

Because OCR is seeking letter and number shapes, the cleaner and crisper your originals, the more accurate the text file. If your documents are blurry, or are third- or fourth-generation photocopies or faxes, the OCR accuracy rate begins to drop.

The benefit here is complete mastery over all content, associations, references and relationships of all people, places and descriptions in your database. The likelihood of discovering the trends you suspect is very high, provided the story can be told by the documents you have in the database.

Documents versus Pages

Most exhibits will be multi-page documents. Although they can range in length from one page to hundreds, standard scanning and coding projects estimate three pages per document. At that rate, a database of 100,000 documents should contain roughly 300,000 pages of information.

Depending on resolution, each document image will take up varying amounts of hard drive space. The higher the resolution for scanning (meaning that the image is much clearer, has good detail for OCR and can later be enlarged as an exhibit for court), then the more hard drive space your database will use. Quality vs. efficiency. It's always a give and take arrangement.
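For a rough feel of the trade-off, the sketch below combines the three-pages-per-document estimate with the raw size of a scanned page. It assumes letter-size pages (8.5 by 11 inches) stored as uncompressed black-and-white images at one bit per pixel; real image formats compress well, so treat the results as an upper bound. The function names are made up for this example.

```python
def estimate_pages(document_count, pages_per_document=3):
    """Estimate total pages from a document count."""
    return document_count * pages_per_document


def uncompressed_page_bytes(dpi, width_in=8.5, height_in=11.0, bits_per_pixel=1):
    """Raw size of one scanned page before any compression (assumed page size)."""
    pixels = round(width_in * dpi) * round(height_in * dpi)
    return pixels * bits_per_pixel // 8


pages = estimate_pages(100_000)
for dpi in (200, 300, 600):
    total_gb = pages * uncompressed_page_bytes(dpi) / 1e9
    print(f"{dpi} dpi: about {total_gb:,.0f} GB uncompressed for {pages:,} pages")
```

Doubling the resolution quadruples the pixel count, so a move from 300 dpi to 600 dpi costs about four times the raw storage.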

Budget Considerations

Document management service providers such as The Data Company all take various aspects of your project into consideration when providing budget estimates. The following items are among the many factors that The Data Company reviews when providing a customized quote for your projects. If you are interested in receiving a quote, please contact us today at sales@thedataco.com or 800-331-3874 and we'd be happy to provide you with more information on pricing.

Scanning Projects

The nature and format of your documents will weigh very heavily in your cost and time estimates. How "clean" are your documents? Naturally, the more we must "touch" each item, the more expensive and the longer the project will take.

  • Light litigation — very few staples and paper-clips — the documents are basically ready to auto-feed
  • Medium litigation — several staples and clips or the documents are bound with tabs or organized by file folder
  • Heavy litigation — many staples, binder clips and sticky notes to remove then reattach before and after scanning
  • Hands on glass — complex jobs consisting of documents that cannot be broken apart, specific pages from many books, and other items that require our scanners to manually place each item to be scanned on the glass platen

Hardware and Software Hosting

To create your custom repository, we must take time to establish a secure server location, criteria, accessibility and preferences, which would all be covered by a one-time case setup fee. Other budget considerations would include the following factors:

  • Individual user set-up — establish passwords and privileges
  • Software lease — based on number of simultaneous users
  • Hosting space — based on the number of pages and the file storage space
  • Level 1 coding — based on a "per field, per document" estimate
  • Level 2 coding — based on a "per field, per page" estimate
  • Level 3 coding — based on a per page OCR rate
  • Creating load files for pre-scanned sets of images — based on an hourly rate
  • Blowbacks — printing sets of images from your database based on a per page rate
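The three coding bases above can be combined into a simple estimate, sketched below. Every rate in this example is a placeholder invented for illustration and is not The Data Company's pricing; the `Rates` record and `coding_estimate` function are likewise hypothetical. Amounts are kept in whole cents to avoid rounding surprises.

```python
from dataclasses import dataclass


@dataclass
class Rates:
    """Placeholder rates in cents; NOT actual pricing."""
    level1_per_field_per_document: int
    level2_per_field_per_page: int
    level3_ocr_per_page: int


def coding_estimate(documents, pages, level1_fields, level2_fields, rates):
    """Return the coding cost in cents for each level, plus the total."""
    level1 = documents * level1_fields * rates.level1_per_field_per_document
    level2 = pages * level2_fields * rates.level2_per_field_per_page
    level3 = pages * rates.level3_ocr_per_page
    return {
        "level1": level1,
        "level2": level2,
        "level3": level3,
        "total": level1 + level2 + level3,
    }


placeholder_rates = Rates(
    level1_per_field_per_document=10,  # invented figure
    level2_per_field_per_page=5,       # invented figure
    level3_ocr_per_page=2,             # invented figure
)

estimate = coding_estimate(documents=1_000, pages=3_000,
                           level1_fields=5, level2_fields=2,
                           rates=placeholder_rates)
for item, cents in estimate.items():
    print(f"{item}: ${cents / 100:,.2f}")
```

Note that Level 1 scales with the number of documents while Levels 2 and 3 scale with the number of pages, so long documents raise the cost of the higher levels much faster than the basic one.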


 
 
  © 2004 The Data Company, Inc.