By Jean-François Legault

With less than 30 percent of all information ever appearing as ink on paper, the “paper trail” often turns out to be a “bitstream.” The sheer volume of data held by organizations makes it clear that electronically stored information plays an essential part in litigation today. Once the information has been preserved, what’s next? Well, it would make no sense for anyone to read through all of upper management’s e-mails or review all the documents stored on an organization’s network. The solution? Applying search terms to the electronically stored information to identify responsive files and documents.

Successful searches of electronic data must produce information that is useful not only in what it tells you but also in a volume that can be reviewed. The most efficient way to achieve this is by constructing a list of terms that can be used to search through digital evidence to identify the most relevant documents for review.

Selecting search terms may seem easy enough: pick terms that describe what we are looking for and search whatever electronic documents we’ve recovered. But careful selection is critical unless you want to review responsive yet irrelevant documents. Here are some elements to take into account when thinking about searching electronically stored information:

  • Determine what is to be searched: emails, documents, deleted files? Careful determination will reduce the number of hits to review and allow you to focus on what matters. Keep in mind that a focused search may provide you with focused results but may also prevent you from finding critical elements if the scope is too narrow.
  • Be careful of generic terms: they will likely produce a large volume of irrelevant documents that must still be reviewed to determine relevance. For example, the term “confidential” may be critical to the review, but the organization may be including an automatically generated disclaimer at the bottom of all its e-mails that contains the sentence, “The content of this message is CONFIDENTIAL.”
  • Be mindful of language: what may be targeted in English may be generic in French. Also, think of building your list of search terms in English and French (and any other language you think appropriate).
  • Short words may produce a tall amount of work: short words, such as abbreviations, might produce thousands of search hits. These terms might be contained in random text patterns such as those contained in remnants of deleted documents or binary system files found on the computer system.
  • Be wary of “embedded” words: short words may be contained in others. For example, if we’re searching for the word “car” as part of a scheme involving the use of company or rental cars, the term would flag documents containing “North Carolina,” “South Carolina,” “carriage,” “carries,” “Carnegie Hall,” and thousands more.
  • Corporate culture: organizations make up their own language. Organizations have words derived from internal acronyms or inside language that only employees might use to describe elements specific to the organization.
  • Some words may mean nothing to you, but they mean something to the computer. For example, if you are searching for any documents relating to the “Atlantic” region of Canada, you should keep in mind that any system file containing a reference to the “Atlantic” time zone will be identified, which could mean a lot of useless files to review.
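
To illustrate the “embedded” words problem described above, here is a minimal sketch in Python that compares a plain substring search with a whole-word search. The function names and the sample documents are hypothetical and made up for illustration; real review platforms have their own search syntax, so treat this as a sketch of the idea rather than a description of any particular tool.

```python
import re


def substring_hits(term, documents):
    """Naive search: flags any document containing the term anywhere."""
    needle = term.lower()
    return [doc for doc in documents if needle in doc.lower()]


def whole_word_hits(term, documents):
    """Whole-word search: the term must sit between word boundaries."""
    pattern = re.compile(r"\b" + re.escape(term) + r"\b", re.IGNORECASE)
    return [doc for doc in documents if pattern.search(doc)]


# Hypothetical sample documents, invented for this example.
documents = [
    "Please return the rental car by Friday.",
    "Our North Carolina office is closed.",
    "She carries the files to Carnegie Hall.",
]

naive = substring_hits("car", documents)     # flags all three documents
precise = whole_word_hits("car", documents)  # flags only the rental car message

print(len(naive), "substring hits;", len(precise), "whole-word hits")
```

Keep in mind that a whole-word search has its own blind spot: it would miss “cars” unless that variant is added to the list, which is one more reason to test terms against a sample of the data before running them across everything.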

In closing, it should be noted that the keyword selection process should be a joint effort by those involved in the case. This ensures that adequate terms are selected and that they meet the objectives of all involved.

Jean-François Legault is a Senior Manager with Deloitte & Touche’s Forensic & Dispute Services practice in Montreal. He specializes in computer forensic investigations, the prevention and detection of computer-based fraud as well as intellectual property infringement. As such, he has been recognized as an expert witness in computer forensics in both civil proceedings and in labour arbitration cases. He is a member of the Association of Certified Fraud Examiners’ Faculty and has a Bachelor of Business Administration and a Masters of Science in Management with a specialization in Management Information Systems. He currently holds CISSP/ISSAP/ISSMP, CISA, CISM and ITIL designations.

This article first appeared on the www.slaw.ca website on January 16, 2010 and is reprinted with the permission of the author.