Tuesday, January 12, 2010

Approaches For Triaging Foreign-Language Documents

Posted by: Joe Thorpe January 12, 2010

One of the many complications of litigation involving international parties is dealing with large volumes of foreign-language documents. Typical approaches include asking one's international client for translation support, adding bilingual reviewers to the case team, using machine translation (MT) to translate all of the documents, and outsourcing documents for full translation.

In this post, I will discuss the advantages and limitations of each of these approaches and add a few more options for your consideration.

In an earlier post, I referred to cross-lingual concept searching and categorization. This critical process should be run before any translation or review in order to reduce the volume of documents (and the costs associated with that effort).

Asking the client to provide staff for foreign-language document review and translation support: this is a very good option if your international client has staff to spare. The client's employees will already have some understanding of their employer's products and services, the industry-specific nomenclature, and perhaps the issues in question. At least some of these employees will need to be bilingual, with a good command of English, in order to communicate well with the case team. These people are unlikely to be trained in US law, so their role would be limited to helping identify potentially responsive or relevant documents for the US case team to evaluate.

If the international client cannot provide any (or enough) staff for this function, you may want to consider outsourcing. Bilingual and native speakers can be made available either on site or by remote access to work with the case team. When working remotely, these people are usually billed by the quarter hour and can be used cost-effectively.

Using MT to Translate All of the Documents: the efficacy of machine translation is determined by a wide range of factors. Generally speaking, machine translations of European languages into English are far more accurate than machine translations of Asian and Middle Eastern languages into English. If the documents are converted directly from native text (created on a computer with a word processor, spreadsheet, presentation software, etc.), the results will be much more readable than if they were scanned documents first converted by OCR. Scans of handwritten documents cannot be processed by MT.

Documents translated by machine will never be confused with documents originally written in English. Sentence structure, grammar and word usage simply will not be right, and that is before mentioning the idiomatic problems, which abound. That said, as often as not the reader will get the gist of what is being said in the document. That is certainly useful in deciding which documents can be eliminated from the review and which require further treatment.

Post-edited MT: this can range from lightly post-edited to fully post-edited, at a cost of $0.04 to $0.10 per word in my experience. Light post-editing helps tremendously in getting the context of a document, and fully post-edited MT is hard to distinguish from text originally written in English.

Abstracts: these are summaries that range from a simple title and a one-line description, at a cost of approximately $5 per document, to fuller summaries costing $10 to $15 per document. They are particularly useful where documents are handwritten or otherwise not good candidates for MT.

Human Translation: this is by far the most expensive approach (costs usually range from $0.25 to $0.35 per word) and, given the options above, should be reserved for documents expected to be used as evidence.
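To make the price differences concrete, here is a minimal Python sketch that applies the rates quoted above to a document set. The rates come from this post; everything else (the `Option` class, the function names, and the example volume of 1,000 documents averaging 400 words each) is a hypothetical illustration, not data from a real matter.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Option:
    """One pricing option; rates are stored in whole cents to avoid rounding drift."""
    name: str
    low_cents: int
    high_cents: int
    per: str  # "word" or "document"


# Rates as quoted in the post above.
OPTIONS = [
    Option("Post-edited MT", 4, 10, "word"),        # $0.04 to $0.10 per word
    Option("Abstracts", 500, 1500, "document"),     # about $5 to $15 per document
    Option("Human translation", 25, 35, "word"),    # $0.25 to $0.35 per word
]


def estimate(option, documents, words_per_document):
    """Return the (low, high) cost in dollars for one option."""
    if option.per == "word":
        units = documents * words_per_document
    else:
        units = documents
    return units * option.low_cents / 100, units * option.high_cents / 100


def cost_table(documents, words_per_document):
    """Return {option name: (low, high)} for every option."""
    return {o.name: estimate(o, documents, words_per_document) for o in OPTIONS}


if __name__ == "__main__":
    # Hypothetical volume: 1,000 documents averaging 400 words each.
    for name, (low, high) in cost_table(1000, 400).items():
        print(f"{name:<18} ${low:>9,.0f} to ${high:>9,.0f}")
```

For that assumed volume, the sketch gives $16,000 to $40,000 for post-edited MT, $5,000 to $15,000 for abstracts, and $100,000 to $140,000 for human translation, which illustrates why reducing the document count before translation matters so much.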