Keeping Current with eDiscovery Search Trends

HaystackID Blog | December 2, 2025

Editor’s Note: Courts continue to shape the evolving balance between emerging technologies and long-standing practices in eDiscovery. Recent rulings in Mosaic LLM, Tecfidera, and Soqui collectively highlight a critical reality: keyword searches, inclusive email threading, and contextual message production remain deeply embedded in discovery workflows—even as AI gains traction. In Mosaic, the court reaffirmed the importance of relevance and proportionality in keyword disputes, limiting overbroad search demands. Tecfidera clarified that inclusive email productions must still meet usability standards under Rule 34, especially when opposing parties rely on metadata for analysis. Meanwhile, Soqui reinforced that context matters in chat-based communications, requiring the production of surrounding messages for clarity and completeness.

Together, these decisions offer actionable guidance for legal teams navigating complex data environments. For professionals in cybersecurity, information governance, and litigation support, staying current on these developments is essential for defensible discovery strategies and effective advocacy.

By Phil Favro, Contributing Author for HaystackID

The movement toward Artificial Intelligence for eDiscovery search seems inexorable for many lawyers and their clients. While no case offers judicial imprimatur regarding the use of AI for eDiscovery search, anecdotal reports confirm that many parties are using AI to identify discoverable information before producing it to opposing parties. Whether AI presently offers a cost-effective, enhanced search tool over traditional search methodologies is yet to be decided. What is not debatable, though, is that AI search is here to stay, and continuing technological improvements seem certain to enhance its functionality.

Despite the growing prevalence of AI, many parties and their counsel continue to use traditional methodologies to handle eDiscovery search. Indeed, orders addressing search issues are replete with discussions on the nature, propriety, and extent of search term usage, along with the importance of working collaboratively with litigation adversaries on developing search terms. All of these issues were present in In re Mosaic LLM Litigation, where a court recently resolved various search term disputes between the parties in the context of an AI copyright infringement lawsuit.[1] Mosaic and other recent cases suggest that the use of search terms shows no signs of abating, even with the advent of AI for search.

Beyond search terms, courts still regularly adjudicate other key search and production issues that have little to do with AI. Recent decisions on the production of inclusive email strings (In re Tecfidera Antitrust Litigation) and contextual messages from collaboration tool chat strings (Soqui v. England Logistics, Inc.) are instructive on those respective issues and on the processes parties are using to handle ESI productions. Staying abreast of these ESI search developments is helpful, particularly for counsel developing workflows and advocating on disputed issues.

Mosaic and Search Terms

The search term dispute in Mosaic is of a kind that has played out in countless cases over the past 25 years. In this multidistrict litigation arising from the defendants’ use of the plaintiffs’ copyrighted works in connection with the training of the defendants’ AI technologies, the defendants sought an order compelling the plaintiffs to run searches for discoverable documents using 99 search terms. These 99 terms would be in addition to 30 terms on which the parties had already agreed and in response to which the plaintiffs had already made document productions.

The Defendants’ Arguments

The defendants argued that their proposed terms sought relevant and proportional documents that would support their copyright infringement defenses. In particular, search terms one through 27 sought “documents regarding Plaintiffs’ use of, and commentary about, third-party AI tools,” while terms 28 through 99 “concern licensing agreements for AI training data and related commentary.” In response, the plaintiffs argued that terms one through 27 sought irrelevant information while terms 28 through 99 requested documents that were disproportionate to the needs of the case.

The Court’s Findings and Holding

Regarding terms one to 27, the court agreed with the plaintiffs and found that the information the defendants sought was irrelevant to the defendants’ fair use defense. The court reasoned that “documents regarding Plaintiffs’ use of or commentary about other generative AI models” were unrelated to the nature and extent of the defendants’ use of plaintiffs’ copyrighted works. In addition, the defendants’ search terms did not properly target documents that would reflect “market harm” to the plaintiffs from the defendants’ use of their copyrighted works. As a result, the court denied the defendants’ motion with respect to terms one through 27.

For terms 28 to 99, while tacitly acknowledging that they requested relevant information, the court found the terms were overly broad and the documents they sought were disproportionate under the circumstances. The plaintiffs had already produced some of the information in response to the 30 agreed-upon terms. Moreover, much of the information the terms sought was unrelated to the licensing agreements at issue and would obligate the plaintiffs to search through “tens of thousands of unresponsive and irrelevant documents.”

Nevertheless, the court rejected the plaintiffs’ proposal that they be allowed to run their own search terms to identify documents regarding the licensing agreements in question. Instead, the court ordered plaintiffs to produce documents in response to one search term—“(Licens! OR agree! OR collab! OR partner! OR permiss!) w/10 data! AND train! OR ‘training data’ OR dataset!)”—and allowed the defendants to develop 10 additional search terms to identify discoverable documents regarding “AI training data and related commentary.” The court also ordered the parties to meet and confer regarding any disputes over the defendants’ proposed terms and directed the plaintiffs to share with the defendants hit count metrics to substantiate any objections to the proposed terms.
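To make the hit count exercise concrete, the following Python sketch shows one way a party might test a proximity term and tabulate hit counts before a meet and confer. It is an illustration only, not the method used in Mosaic or by any review platform. The function names, the sample documents, and the two sample terms are hypothetical. The sketch assumes that “!” expands a word root and that “w/10” means within ten words in either direction, and it does not attempt to reproduce the full court-ordered term, whose grouping is not clear from the quotation above.

```python
import re

def tokens(text):
    """Lowercase word tokens; punctuation is dropped."""
    return re.findall(r"[a-z0-9]+", text.lower())

def stem_positions(toks, stems):
    """Indexes of tokens that begin with any stem (assumed meaning of '!')."""
    return [i for i, t in enumerate(toks) if any(t.startswith(s) for s in stems)]

def within(toks, left_stems, right_stems, distance=10):
    """True if a left-stem token is within `distance` words of a right-stem token."""
    left = stem_positions(toks, left_stems)
    right = stem_positions(toks, right_stems)
    return any(abs(i - j) <= distance for i in left for j in right)

def hit_report(docs, terms):
    """Per term: documents that hit, and documents that hit on that term alone."""
    hits = {name: {doc_id for doc_id, text in docs.items() if test(tokens(text))}
            for name, test in terms.items()}
    report = {}
    for name, ids in hits.items():
        others = set().union(*(v for k, v in hits.items() if k != name))
        report[name] = {"hits": len(ids), "unique_hits": len(ids - others)}
    return report

# Hypothetical sample documents, invented for illustration.
docs = {
    "DOC-1": "We signed a licensing agreement covering the training dataset last week.",
    "DOC-2": "Please renew the software license for the design team.",
    "DOC-3": "The partnership was announced today. " + "filler " * 15
             + "The dataset is unrelated.",
}

LEFT = ["licens", "agree", "collab", "partner", "permiss"]
terms = {
    # Simplified proximity term: license-type words within 10 words of dataset!.
    "narrow": lambda t: within(t, LEFT, ["dataset"], distance=10),
    # Overbroad comparison term: any word beginning with "licens".
    "broad": lambda t: bool(stem_positions(t, ["licens"])),
}

report = hit_report(docs, terms)
for name, counts in report.items():
    print(name, counts)
```

The unique hit figure is often the more persuasive number in a dispute because it isolates the documents a contested term adds beyond the terms already agreed upon. Actual proximity and root-expander behavior varies by platform, so any counts shared with an adversary should come from the review tool itself.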

Takeaways from Mosaic

Mosaic does not offer any groundbreaking lessons on the use of search terms for discovery. Instead, it makes the same points that courts and eDiscovery cognoscenti have spotlighted for many years: adopt quality assurance measures, take reasonable positions on the issues, and consider cooperation and compromise. Perhaps most importantly, Mosaic makes clear that search terms are still widely used in discovery, even in cases involving claims over the use of AI. If Mosaic is a bellwether, then there should be little expectation that search terms will be fully supplanted by AI anytime in the near future.

Tecfidera and Email Threading

Parties have argued for years over the propriety of email threading. Producing parties typically maintain that producing only inclusive emails (the messages that contain all earlier content in a thread) reduces review and production costs. Requesting parties counter that withholding the non-inclusive emails could deprive them of key details such as metadata or recipient information. These arguments played out in the Tecfidera case as the parties to that antitrust litigation differed over whether they should be able to use email threading to produce only inclusive emails in discovery.[2] The defendant argued that producing inclusive email threads would reduce its review and production costs. In contrast, the plaintiffs expressed concern that an inclusive email production would impair the documents’ searchability, particularly the plaintiffs’ “data visualization and automated timeline tools.”
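The metadata concern is easier to see with a small example. The Python sketch below models a four-message thread and identifies which emails an inclusive-only production would keep and which would be suppressed. It is a simplified illustration, not a description of how any threading tool works. It assumes every reply quotes its parent in full and ignores attachments, so the inclusive emails are simply the messages no one replied to. The Email class, the field names, and the sample addresses are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class Email:
    msg_id: str
    parent_id: Optional[str]   # message this one replies to, if any
    sender: str
    recipients: tuple
    sent: str                  # ISO-style timestamp kept as text for simplicity

def inclusive_ids(emails):
    """IDs of emails that no other email replies to (assumes full quoting)."""
    parents = {e.parent_id for e in emails if e.parent_id is not None}
    return {e.msg_id for e in emails if e.msg_id not in parents}

def suppressed_metadata(emails):
    """Emails whose own metadata records drop out of an inclusive-only production."""
    keep = inclusive_ids(emails)
    return [e for e in emails if e.msg_id not in keep]

# Hypothetical thread: m1 is the original, m2 and m4 reply to m1, m3 replies to m2.
thread = [
    Email("m1", None, "ana@example.com", ("raj@example.com",), "2024-03-01T09:00"),
    Email("m2", "m1", "raj@example.com",
          ("ana@example.com", "lee@example.com"), "2024-03-01T09:30"),
    Email("m3", "m2", "ana@example.com", ("raj@example.com",), "2024-03-01T10:15"),
    Email("m4", "m1", "raj@example.com", ("kim@example.com",), "2024-03-01T11:00"),
]

print("Produced:", sorted(inclusive_ids(thread)))
for e in suppressed_metadata(thread):
    print("Suppressed:", e.msg_id, e.sent, e.recipients)
```

In this example the suppressed message m2 is the only one addressed to lee@example.com. That recipient may still appear in the quoted text of m3, but not as a structured metadata field that a timeline or visualization tool could read.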

The Court’s Findings and Holding

The court sided with the plaintiffs, finding the defendant’s proposal did not meet the requirements of Federal Rule of Civil Procedure 34(b)(2)(E)(ii). The court determined that the defendant’s proposal—even if it included a customized field to account for non-inclusive metadata—would not disclose emails in the form in which they were ordinarily maintained. Nor would such a production be “reasonably usable” since it would prevent plaintiffs from using key features for searching and analyzing the emails. Finally, the court found that the defendant did not support its argument that email threading would be proportional. The defendant did not offer specific cost, hour, or resource metrics that would bolster its position that it would “incur increased costs in hosting, review, and production” if it did not limit its production to inclusive email strings. Without that information, the court was unable to evaluate whether the defendant’s “burden outweighs the benefit Plaintiffs expect from availability of the additional metadata.”

Takeaways from Tecfidera

While the court did not decide whether producing inclusive email strings is appropriate in discovery, Tecfidera remains instructive on email threading. In particular, Tecfidera highlights the expediency of reaching agreements with adversaries on the use of email threading and—if such an agreement cannot be reached—the importance of substantiating its use in motion practice. On this latter point, courts have repeatedly emphasized the need to support proportionality arguments with hard information on cost, hour, and resource metrics. Parties who neglect to do so should expect courts to deny their requests for relief.

Soqui and Contextual Messages

The advent and widespread use of collaboration tools like Google Chat, Slack, and Microsoft Teams have led to disputes about the nature and extent of responsive messages that producing parties must turn over in litigation. While the law is far from settled on what producing parties must disclose beyond the precise messages reflecting relevant information, certain cases, such as Lubrizol Corp. v. Int’l Bus. Machines Corp., have held that producing parties must divulge messages surrounding relevant chats for contextual purposes.[3] In Soqui v. England Logistics, Inc., the parties disagreed over this same issue, with the court reaching the same result as Lubrizol.[4]

The Defendant’s Arguments

The plaintiff in Soqui sought an order compelling the defendant to produce certain Microsoft Teams messages that were not included with its production of discoverable Teams messages. The defendant produced only those Teams messages that hit on its search terms and declined to turn over other messages surrounding the produced chats. When it produced the messages, the defendant did so in “chronological order,” but not in conversation threads. In response to the plaintiff’s motion, the defendant argued that the production of surrounding or contextual messages would be unduly burdensome because it would be too time-consuming to parse through, identify, and then produce the contextual messages surrounding the produced chats.

The Court’s Findings and Holding

The court disagreed, finding that the defendant’s production of Teams chats devoid of the surrounding messages was not reasonably usable and that the production of this information was generally proportional. The court reasoned that “[r]esponsive documents include messages containing the entire conversation regarding these topics—not just isolated, individual messages containing a search term (but otherwise entirely out of context).” The court adopted the Lubrizol chat message production protocol and ordered the defendant to produce “the entirety of any Teams conversation containing twenty or fewer total messages that has at least one responsive message” and “the ten messages preceding or following any responsive Teams message in a Teams conversation containing more than twenty total messages.”
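The production rule the court ordered lends itself to a short worked example. The Python sketch below selects which messages in a single conversation would be produced under that rule. It is offered as an illustration under stated assumptions, not as the court’s or any vendor’s implementation. The function name and parameters are hypothetical, messages are identified by zero-based position in the conversation, and the phrase “preceding or following” is read here to mean ten messages on each side of a responsive message.

```python
def messages_to_produce(total_messages, responsive, window=10, short_limit=20):
    """Return sorted zero-based positions of the messages to produce.

    total_messages: number of messages in the conversation
    responsive:     positions of messages that hit on a search term
                    (assumed to fall within the conversation)
    """
    responsive = sorted(set(responsive))
    if not responsive:
        return []                              # no responsive message, nothing to produce
    if total_messages <= short_limit:
        return list(range(total_messages))     # short conversation: produce all of it
    keep = set()
    for idx in responsive:
        start = max(0, idx - window)                    # up to ten messages before
        stop = min(total_messages, idx + window + 1)    # the hit and up to ten after
        keep.update(range(start, stop))
    return sorted(keep)

# A 15-message conversation with one hit is produced in full.
short_chat = messages_to_produce(15, [3])

# A 50-message conversation with a hit at position 25 yields positions 15 through 35.
long_chat = messages_to_produce(50, [25])

# Overlapping windows around nearby hits are merged rather than duplicated.
merged = messages_to_produce(100, [30, 35])

print(len(short_chat), long_chat[0], long_chat[-1], len(merged))
```

Merging overlapping windows keeps a run of nearby hits from producing the same message twice, and clamping at the start and end of the conversation handles hits near either boundary.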

Takeaways from Soqui

Soqui reflects the reluctance some courts feel toward preventing the discovery of contextual chat messages. While such messages may arguably be beyond the scope of discovery under Federal Rule of Civil Procedure 26(b)(1), courts have nonetheless observed that it is “a rare document that contains only relevant information; and irrelevant information within an otherwise relevant document may provide context necessary to understand the relevant information.”[5] Against this backdrop, Soqui highlights the need for producing parties to proactively consider the issue and determine how to respond to discovery requests for such information. Producing parties may wish to reach accommodations with requesting parties on the production of contextual chat messages to better ensure production obligations are reasonable and reciprocal.

About Phil Favro

Phil Favro is the founder of Favro Law PLLC, where he counsels clients on ESI, AI, and discovery issues and serves as a special master, mediator, and expert witness. Phil is nationally recognized for his expertise on ESI, discovery, and information governance, with courts acknowledging his credentials. See, e.g., Oakley v. MSG Networks, Inc., No. 17-CV-6903 (RJS), 2025 WL 2061665 (S.D.N.Y. July 23, 2025). This background makes Phil particularly well-suited to counsel clients and advise courts on information-related issues. As a special master, Phil is acclaimed for his collaborative approach, working with parties to find stipulated solutions to complex issues. For disputes that require adjudication, he is renowned for the clarity and vigor of his written dispositions, which are available on legal search engines.

About HaystackID®

HaystackID® solves complex data challenges related to legal, compliance, regulatory, and cyber requirements. Core offerings include Global Advisory, Cybersecurity, Core Intelligence AI™, and ReviewRight® Global Managed Review, supported by its unified CoreFlex™ service interface. Recognized globally by industry leaders, including Chambers, Gartner, IDC, and Legaltech News, HaystackID helps corporations and legal practices manage data gravity, where information demands action, and workflow gravity, where critical requirements demand coordinated expertise, delivering innovative solutions with a continual focus on security, privacy, and integrity. Learn more at HaystackID.com.

Assisted by GAI and LLM technologies.

SOURCE: HaystackID

[1] In re Mosaic LLM Litig., No. 24-CV-01451-CRB (LJC), 2025 WL 3078831 (N.D. Cal. Nov. 4, 2025).

[2] In re Tecfidera Antitrust Litig., No. 24 CV 7387, 2025 WL 2734539 (N.D. Ill. Sept. 25, 2025).

[3] Lubrizol Corp. v. Int’l Bus. Machines Corp., No. 1:21-CV-00870-DAR, 2023 WL 3453643 (N.D. Ohio May 15, 2023).

[4] Soqui v. England Logistics, Inc., No. 2:24-CV-00261, 2025 WL 3080571 (D. Utah Nov. 4, 2025).

[5] Yellow Rose Prods., Inc. v. Pandora Media, LLC, No. 2:22-CV-809-MCS-MAR, 2024 WL 661162 (C.D. Cal. Jan. 24, 2024).
