
Use of Computer-Assisted Coding Is Endorsed to Comb Through Huge Number of Documents

by digitsadmin2
Mon, Mar 26th 2012 02:00 pm
Mark Hamblett

New York Law Journal
February 29, 2012

Moore v. Publicis Groupe & MSL Group, 11 Civ. 1279

The use of predictive, computer-assisted coding in electronic discovery for large data-rich cases has taken a leap forward in a decision by Southern District Magistrate Judge Andrew J. Peck.

Magistrate Judge Peck has released what is believed to be the first judicial opinion endorsing the use of computer-assisted coding, whereby tools using sophisticated algorithms enable a computer to recognize relevance among a mountain of documents.

The judge explained the technique in a 26-page opinion on Feb. 24 in Da Silva Moore v. Publicis Groupe & MSL Group, 11 Civ. 1279, a case involving five female plaintiffs who are suing advertising conglomerate Publicis Groupe and its U.S. public relations subsidiary, MSL GROUP, for systemic company-wide discrimination, pregnancy discrimination and a "glass ceiling" that limits women to entry-level positions.

Quoting from his October article ("Search Forward: Will manual document review and keyword searches be replaced by computer-assisted coding?") in Law Technology News, a Law Journal affiliate, Magistrate Judge Peck said in his opinion that the use of computer-assisted coding is meant as an alternative to manual review of voluminous documents by junior law firm staffers.

In the computer-assisted method, a senior lawyer or a small team at the firm reviews and codes a "seed set" of documents. The computer then identifies properties of those documents that it uses to code other documents.
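As an illustration only, the workflow can be sketched in a few lines of Python: the software learns from reviewer-coded documents, and its predictions on each new batch are compared with the reviewer's coding until the two agree often enough. The article does not say which algorithm the parties' software uses, so the tiny naive Bayes classifier below is a stand-in. The names SeedSetClassifier and train_until_stable, the sample e-mails and the 90 percent agreement threshold are all hypothetical.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (deliberately simplistic)."""
    return text.lower().split()


class SeedSetClassifier:
    """Toy multinomial naive Bayes; a stand-in for real predictive coding software."""

    def __init__(self):
        self.word_counts = {True: Counter(), False: Counter()}
        self.doc_counts = {True: 0, False: 0}

    def train(self, text, relevant):
        """Record one reviewer-coded document."""
        self.word_counts[relevant].update(tokenize(text))
        self.doc_counts[relevant] += 1

    def predict(self, text):
        """Return True if the document looks relevant, False otherwise."""
        vocab = set(self.word_counts[True]) | set(self.word_counts[False])
        total_docs = sum(self.doc_counts.values())
        scores = {}
        for label in (True, False):
            # Smoothed log prior plus smoothed log likelihood of each word.
            score = math.log((self.doc_counts[label] + 1) / (total_docs + 2))
            total_words = sum(self.word_counts[label].values())
            for word in tokenize(text):
                count = self.word_counts[label][word]
                score += math.log((count + 1) / (total_words + len(vocab) + 1))
            scores[label] = score
        return scores[True] > scores[False]


def train_until_stable(classifier, coded_batches, target_agreement=0.9):
    """Predict each reviewer-coded batch, then learn from it.

    Stops once predictions agree with the reviewer's coding on at least
    target_agreement of a batch (a hypothetical threshold).
    Returns the agreement rate observed for each batch.
    """
    history = []
    for batch in coded_batches:
        hits = sum(classifier.predict(text) == label for text, label in batch)
        agreement = hits / len(batch)
        history.append(agreement)
        for text, label in batch:
            classifier.train(text, label)
        if agreement >= target_agreement:
            break
    return history


# Hypothetical reviewer-coded e-mails: (text, relevant?)
seed_set = [
    ("promotion denied after maternity leave", True),
    ("lunch menu for friday", False),
    ("pay raise and promotion review", True),
    ("parking garage closed friday", False),
]
later_sample = [
    ("promotion review delayed", True),
    ("friday lunch order", False),
]

model = SeedSetClassifier()
print(train_until_stable(model, [seed_set, later_sample]))  # [0.5, 1.0]
```

The untrained model codes everything as not relevant, so it matches the reviewer on only half of the first batch. After learning from those four documents it matches the reviewer on both documents in the second batch, and the loop stops.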

"As the senior reviewer continues to code more sample documents, the computer predicts the reviewer's coding. (Or the computer codes some documents and asks the senior reviewer for feedback.)," Magistrate Judge Peck said. "When the system's predictions and the reviewer's coding sufficiently coincide, the system has learned to make confident predictions for the remaining documents."

"Typically," he said, "the senior lawyer (or team) needs to review only a few thousand documents to train the computer."

In a case involving some three million e-mails, where "linear manual review is simply too expensive," the judge said the parties in Da Silva Moore agreed to use the technology but fought over a protocol for searching electronically stored information (ESI), including which custodians' e-mails should be searched. They also disagreed with MSL's proposal to review and produce only the top 40,000 documents.

At a Jan. 4 conference with the parties, Magistrate Judge Peck rejected the proposal as a "pig in a poke" and told MSL that "if stopping at 40,000 is going to leave a tremendous number of likely highly responsive documents unproduced," the proposed cutoff "doesn't work."

The parties agreed on specific ESI sources and agreed to use a 95 percent "confidence level" to draw a random sample of the e-mail collection. That sample of 2,399 documents is to be used to identify relevant documents for the "seed set" that will train the predictive coding software.
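A sample of roughly that size is what a standard sample-size formula yields for a collection this large. The sketch below uses Cochran's formula with a finite population correction. The 95 percent confidence level and the collection of about three million e-mails come from the article; the margin of error of plus or minus 2 percent and the worst-case proportion of 0.5 are assumptions, since the article states neither. The function name sample_size is hypothetical.

```python
import math
from statistics import NormalDist


def sample_size(population, confidence=0.95, margin=0.02, proportion=0.5):
    """Sample size for estimating a proportion in a finite population.

    margin and proportion are assumed values, not figures from the article.
    """
    # Two-sided z-score for the requested confidence level (about 1.96 for 95%).
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    # Cochran's formula for an effectively infinite population.
    n0 = z * z * proportion * (1 - proportion) / margin ** 2
    # Finite population correction.
    n = n0 / (1 + (n0 - 1) / population)
    return math.ceil(n)


print(sample_size(3_000_000))  # 2399
```

Under these assumptions the result matches the 2,399 documents reported. The required sample barely grows with the size of the collection, which is why a few thousand documents can stand in for millions.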

The "seed set" was augmented by MSL coding certain documents through "judgmental sampling" and by MSL reviewing keyword searches with Boolean connectors such as "training and Da Silva Moore" or "promotion and Da Silva Moore" and then coding the top 50 hits from those searches.
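A keyword search with an "and" connector, capped at a fixed number of hits, might look like the following sketch. The article does not say how the parties' software ranked hits, so ranking by how often the terms appear is an assumption. The function name boolean_and_search and the sample documents are hypothetical, and the matching is plain case-insensitive substring matching.

```python
def boolean_and_search(documents, terms, top_n=50):
    """Return ids of up to top_n documents that contain every term.

    documents maps a document id to its text. Hits are ranked by total
    term frequency (an assumed ranking rule), with ties broken by id.
    """
    lowered_terms = [term.lower() for term in terms]
    hits = []
    for doc_id, text in documents.items():
        lowered_text = text.lower()
        counts = [lowered_text.count(term) for term in lowered_terms]
        if all(counts):  # "and": every term must appear at least once
            hits.append((sum(counts), doc_id))
    hits.sort(key=lambda hit: (-hit[0], hit[1]))
    return [doc_id for _, doc_id in hits[:top_n]]


# Hypothetical documents for illustration.
emails = {
    "a": "Training plan for Da Silva Moore. Training starts Monday.",
    "b": "Da Silva Moore asked about promotion.",
    "c": "Training budget approved.",
    "d": "Da Silva Moore training request",
}
print(boolean_and_search(emails, ["training", "Da Silva Moore"]))  # ['a', 'd']
```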

MSL agreed to provide all those documents, excepting privileged ones, to the plaintiffs for their review of MSL's relevance coding. Plaintiffs will provide their own keywords for the review and coding of an additional 4,000 documents.

"All of this review to create the seed set was done by senior attorneys, not paralegals, staff attorneys or junior associates," Magistrate Judge Peck said.

He said the parties got to work on stabilizing the training of the software, with the judge reminding the lawyers at a Feb. 8 conference that "the idea is not to make this perfect, it's not going to be perfect. The idea is to make it significantly better than the alternatives without nearly as much cost."

'Fifty-fold Savings'

An estimate of the potential savings of technology-assisted review, he said, was provided in a study by Professor Gordon Cormack of the University of Waterloo and Maura Grossman, litigation counsel at Wachtell, Lipton, Rosen & Katz, who is co-chair of the E-Discovery Working Group of the state Unified Court System and was named to a discovery subcommittee in the Southern District by Judge Shira Scheindlin.

Mr. Cormack and Ms. Grossman wrote that "technology-assisted reviews require, on average, human review of only 1.9% of the documents, a fifty-fold savings over exhaustive manual review."
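The "fifty-fold" figure follows directly from the 1.9 percent: reviewing 1.9 percent of a collection is about one fifty-third of the work of reviewing all of it. The short calculation below applies that fraction to a collection of three million documents purely for scale. The study's average is not a prediction for Da Silva Moore or any other particular case, and the function name review_effort is hypothetical.

```python
def review_effort(total_documents, review_fraction=0.019):
    """Return (documents needing human review, savings factor versus full review)."""
    reviewed = round(total_documents * review_fraction)
    savings_factor = 1 / review_fraction
    return reviewed, savings_factor


reviewed, factor = review_effort(3_000_000)
print(reviewed)          # 57000
print(round(factor, 1))  # 52.6
```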

The plaintiffs in Da Silva Moore had several objections, including that there was no way to know that MSL's predictive coding approach produced accurate results, but Magistrate Judge Peck dismissed that concern as premature.

"The issues regarding relevance standards might be significant if MSL's proposal was not totally transparent," he said. "Here, however, plaintiffs will see how MSL has coded every email used in the seed set (both relevant and non-relevant), and the court is available to quickly resolve any issues."

The judge said the decision to use computer-assisted review was easy because the parties agreed to it.

"The court recognizes that computer-assisted review is not a magic, Staples-Easy-Button solution appropriate for all cases," he said. "The technology exists and should be used when appropriate, but it is not a case of machine replacing humans: it is the process used and the interaction of man and machine that the courts need to examine."

Magistrate Judge Peck said that lawyers have in the past turned to keyword searches to cull e-mail, and, while they have a place in computer-assisted searches, they can be unproductive by themselves.

He said lawyers sometimes choose keywords in a way equivalent to a game of "Go Fish," where "the requesting party guesses which keywords might produce evidence to support its case without having much, if any, knowledge of the responding party's 'cards.'"

Another problem with keywords, he said, is that they are often ineffective and over-inclusive, producing far too many irrelevant documents.

Magistrate Judge Peck said cooperation among the parties is critical to making computer-assisted review work, but there were several other lessons from the Da Silva Moore case, including that "it is unlikely that courts will be able to determine or approve a party's proposal as to when review and production can stop until the computer-assisted software has been trained and the results are quality-control verified."

"Only at that point can the parties and the Court see where there is a clear drop off from highly relevant to marginally relevant to not likely to be relevant documents," he said. 

The magistrate judge also said that "staging discovery by starting with the most likely to be relevant sources (including custodians), without prejudice to the requesting party seeking more after conclusion of that first stage review, is a way to control discovery costs."

Magistrate Judge Peck also said it is helpful for parties to disclose, early on, who their key custodians are and how they propose to search for records, an approach he said makes it more likely the other side will agree to that approach, at least in the first phase, where there is no prejudice.

Finally, the judge said it was very helpful that the parties' vendors appeared and spoke at conferences where development of an ESI protocol was discussed.

"At e-discovery programs," he said, "this is sometimes jokingly referred to as 'bring your geek to court day.'"

He said lawyers should realize that computer-assisted review is a tool that "should be seriously considered" for use in large-data-volume cases where it may save the producing party, or even both parties, "significant amounts of legal fees in document review."

"Counsel no longer have to worry about being the 'first' or 'guinea pig' for judicial acceptance of computer-assisted review," he said.

Janette Wipper, Deepika Bains and Siham Nurhussein of Sanford Wittels & Heisler in San Francisco represent the plaintiffs.

Brett M. Anders, Victoria Woodin Chavey and Jeffrey W. Brecher of Jackson Lewis in Melville represent MSL Group.

1st Department Adopts 'Zubulake' on Bearing Costs in Discovery

Brendan Pierson
New York Law Journal
February 29, 2012


The cost of finding and producing both electronically stored information and physical documents in response to discovery requests must initially fall on the party responding to the request, though courts may shift that cost at their discretion, a unanimous panel of the Appellate Division, First Department, has ruled.

The Feb. 28 decision in U.S. Bank National Association v. GreenPoint Mortgage Funding Inc., 600352/09, signed by Justice Rolando T. Acosta, is the second decision by the First Department this year adopting e-discovery standards set forth by Southern District Judge Shira Scheindlin in 2003 in Zubulake v. UBS Warburg LLC, 220 FRD 212.

In the first such decision, Voom HD Holdings v. EchoStar Satellite LLC, 600292/08, the appellate court adopted the Zubulake standard in the context of spoliation of electronic data (NYLJ, Feb. 1). In the Feb. 28 ruling, Justice Acosta wrote that the court was "persuaded that Zubulake should be the rule in this Department."

Justices David B. Saxe, John W. Sweeny Jr., Leland G. DeGrasse and Sheila Abdus-Salaam concurred.

The panel's ruling reversed two decisions by Manhattan Supreme Court Justice Bernard J. Fried.

The lawsuit was filed in the trial court in February 2009 by U.S. Bank National Association against GreenPoint Mortgage Funding Inc., a now-defunct lender that specialized in issuing mortgages to people with little or no documentation of income and assets and sold notes backed by those mortgages to investors. U.S. Bank is the indenture trustee for the holders of mortgage-backed notes issued by GreenPoint in 2005 and 2006, initially valued at $1.83 billion.

Two years after the notes were sold, $530 million worth of the underlying mortgage loans had either been written off as worthless or were severely delinquent, according to U.S. Bank's lawsuit. The bank alleges that GreenPoint breached the warranties it made when it issued the notes, and that it failed to honor its agreement to cure losses in the notes' value.

Along with its complaint, U.S. Bank served GreenPoint with a request for discovery. Instead of responding to that request, GreenPoint sent a letter to the court in April 2009 asking for a ruling on whether its production of requested materials should be conditioned on U.S. Bank agreeing to pay the cost of production. In December 2009, GreenPoint filed a motion for a protective order requiring, among other things, that both parties pay for the cost of discovery they requested of the other party.

In April 2010, Justice Fried rejected GreenPoint's proposed protective order, saying it went too far in requiring a party requesting discovery to pay the legal fees incurred by the opposing party in determining whether or not requested materials were privileged.

However, the judge said that he would not disturb "the well-settled rule" that a party requesting discovery bears the cost of production. U.S. Bank, seeking a clearly adverse order from which it could appeal, asked for an order clarifying that the parties would have to bear each other's production costs, excluding attorney fees. Justice Fried issued such an order in October 2010, and U.S. Bank appealed. The case was argued before the First Department on Nov. 25, 2011.

In overturning the lower court, Justice Acosta began by noting that the increasing importance of electronic discovery, which can be very costly, had made the question of which party bore the cost more pressing.

New York's Civil Practice Law and Rules are silent on the subject, he wrote, as are the rules of the Commercial Division for Supreme Court in Nassau County, "previously recognized by this Court as the most sophisticated rules concerning discovery."

The courts that have addressed the issue "have not done so with one voice," the judge wrote.

'Most Practical Framework'

Justice Acosta said the First Department was now adopting the standard put forth under Zubulake, under which a party producing information must initially bear the cost, though it can ask the court to shift the cost to the requesting party under certain circumstances—for example, if a request proves unduly burdensome.

"We are now persuaded that the courts adopting the Zubulake standard are moving discovery, in all contexts, in the proper direction," Justice Acosta wrote.

"Zubulake presents the most practical framework for allocating all costs in discovery, including document production and searching for, retrieving and producing" electronically stored information.

The panel rejected GreenPoint's argument that the court should adopt the opposite standard on the grounds that it encourages parties to "self-regulate" their discovery requests.

"First, requiring the producing party to bear its own cost of discovery, including the searching, retrieving and producing of [electronically stored information], supports 'the strong public policy favoring resolving disputes on their merits,'" Justice Acosta wrote, quoting Zubulake. "The alternative of having the requestor pay 'may ultimately deter the filing of potentially meritorious claims' particularly in circumstances where the requesting party is an individual."

"Finally, the adoption of the Zubulake standard is consistent with the long-standing rule in New York that the expenses incurred in connection with disclosure are to be paid by the respective producing parties and said expenses may be taxed as disbursements by the prevailing litigant," the judge wrote.

He said the proper course of action for GreenPoint would be to move to strike discovery requests that it believes are unduly burdensome, and, if it does not succeed, ask the court to shift costs to U.S. Bank.

Constance M. Boland of Nixon Peabody, who represented U.S. Bank, could not immediately be reached for comment.

James A. Murphy of Murphy & McGonigle, lead counsel for GreenPoint, said, "While we felt that the New York rule was superior, we are certainly accustomed to dealing with the Zubulake approach in our federal cases."

GreenPoint was also represented by Michael T. Conway of LeClairRyan and Matthew P. Previn of Buckley Sandler.

Mark A. Berman of Ganfer & Shore, a Law Journal columnist who is not involved with this litigation, said the decision "gives clarity so that you're better able to advise clients whose obligation it is in the first instance for e-discovery costs. When the client comes in and you have a dispute, they would like to know, who's going to pay this? Until now, you often said the law was unclear."

