Don’t Rush Past Relevance: Assessing the Discoverability of AI Prompts and Outputs
Much of the buzz about artificial intelligence (AI) in law has focused on its utility as a discovery tool rather than a potential source of discovery. While AI’s impact on discovery processes, such as reviewing and coding documents, generating privilege logs, and summarizing documents, is undeniable, AI interactions themselves – inputs and outputs – are poised to reshape the discovery landscape.
This article examines whether and when such interactions may be discoverable in litigation. Drawing on analogous cases involving discovery of internet searches and browser history, the authors offer practical guidance for assessing the cost, burden, and privacy implications of preservation and collection.
Background: The growing reliance by organizations and their employees on AI tools such as Microsoft Copilot and Google Gemini has generated massive new data streams that could become targets of discovery in litigation. But is the discoverability of a custodian’s AI prompts – or the outputs generated – a foregone conclusion?
Despite the novelty of AI, the bedrock principles of relevance and proportionality still apply. Case law involving search history and internet activity is instructive in this context. At the same time, AI introduces new complexities, especially around user privacy, that parties and courts should carefully consider when assessing the discoverability of AI inputs or outputs.
Key Considerations in Assessing the Discoverability of AI Sources
Relevance: The introduction of AI into the discovery universe does not alter the threshold inquiry into relevance. Put simply, the relevance of AI inputs or outputs turns on whether they tend to make a fact of consequence more or less probable than it would be without the evidence. Take, for example, a custodian who has prompted an AI tool to revise or edit sections of a relevant document. Is an opposing party entitled to know exactly which edits the AI tool suggested in response, and which ones the author accepted or rejected? Is the only relevant version of the document the final one reflecting the custodian’s accepted edits? Or does the opposing party have a good argument that the edits the AI suggested, and/or the ones the author accepted or rejected, are themselves relevant?
A party considering whether to preserve, collect, or produce a custodian’s interactions with an AI tool should consider whether the use of AI itself is directly at issue, and therefore, relevant. For example, if an employee or student files a lawsuit challenging a disciplinary action for alleged improper or unauthorized use of AI tools, evidence concerning that party’s use of AI would likely be relevant to the responding party’s defense. Similarly, the use of AI may be relevant in malpractice or negligence cases involving claims of unreasonable or improper reliance on AI, or in cases involving a defense premised on a justifiable reliance on AI. And even where the use of AI is not directly at issue, interactions with an AI tool may tend to prove or disprove an alleged state of mind when an author’s or custodian’s subjective understanding or knowledge is at issue. Further, in a fraud case, AI interactions may reveal the author’s knowledge of the truth or falsity of an allegedly fraudulent statement: perhaps the author rejected a clarifying suggestion that would have corrected the misstatement or specifically instructed the AI tool to avoid mentioning certain material facts. Or the author may have cited an external source to the AI tool, or provided an explanation to the AI tool, indicating that they reasonably believed the statement was true.
That said, AI interactions often are not relevant. For example, a case may hinge on proving a party had actual notice of a fact appearing in a document, rendering the author’s interactions with AI in creating the document irrelevant. In most corporate civil cases, an employee’s AI interactions will not pass even the broad relevance standards of the Federal Rules.
These principles mirror those that courts routinely apply in cases involving the discovery of web searches and browser history. In those cases, courts assess whether internet activity is directly at issue in the case, or whether it tends to show a party’s state of mind or subjective knowledge. For example, in Helget v. City of Hays, No. 13-2228-KHV-KGG, 2014 WL 1308893, at *3 (D. Kan. Mar. 31, 2014), the court found that the defendant employer had a duty to preserve internet usage logs because plaintiff’s unauthorized computer use was a ground for termination, which put the internet history of plaintiff and other employees at issue. Similarly, in Nacco Materials Handling Grp., Inc. v. Lilly Co., 278 F.R.D. 395, 398 (W.D. Tenn. 2011), the court found that the defendant had a duty to preserve internet history at the time plaintiff served the complaint—which alleged a violation of the Computer Fraud and Abuse Act—because, at that moment, the defendant “knew or should have known that electronic evidence residing in its computers would be relevant to the litigation.” Id. at 403.[1] Courts in criminal cases also have found searches and browser history relevant and admissible as evidence of a defendant’s state of mind or motive. See, e.g., United States v. Segui, No. 19-CR-188(KAM), 2019 WL 8587291, at *10 (E.D.N.Y. Dec. 2, 2019) (evidence of defendant’s Google searches that “mirror or relate directly to the very specific threats conveyed” in a threatening email “is relevant to whether Mr. Segui specifically intended to convey a threat to harm [the victim]” and “therefore, meets the relatively low bar of relevance regarding Mr. Segui's intent, motive, knowledge, and absence of mistake.”); State v. McGrath, 169 Idaho 656, 665, 501 P.3d 346, 355 (2021) (search history concerning stepfather-stepdaughter pornography was relevant to show defendant’s motive to sexually abuse his stepdaughter).
In most civil litigation, however, courts do not require preservation or production of internet search history. See Marshall v. Dentfirst, P.C., 313 F.R.D. 691, 696 (N.D. Ga. 2016) (denying motion for sanctions and finding no duty to preserve search history in employment discrimination case, even where defendant cited plaintiff’s online shopping as a basis for her termination, where defendant had no document retention policy regarding internet browsing history and did not retain browsing history on a company-wide server); see also Marten Transp., Ltd. v. Plattform Advert., Inc., No. 14-CV-2464-JWL-TJJ, 2016 WL 492743, at *4–5 (D. Kan. Feb. 8, 2016) (declining to issue sanctions where a party deleted internet browsing history because it “did not know or have reason to know” the history would be relevant at the time of deletion, as the allegation that the allegedly infringing post had been made by the party’s employee was not raised until a year after the deletion).
Proportionality: Proportionality is a critical lens for evaluating discovery of AI interactions. Specifically, parties should consider whether the cost and effort of preserving, collecting, and producing any potentially relevant AI interactions are reasonable and proportional to the needs of the case. “Due to the ever-increasing volume of electronically stored information and the multitude of devices that generate such information, perfection in preserving all relevant electronically stored information is often impossible... This rule recognizes that ‘reasonable steps’ to preserve suffice; it does not call for perfection.” Fed. R. Civ. P. 37(e) Advisory Committee’s Notes to 2015 Amendment; see also Marten Transp., 2016 WL 492743, at *4. As companies increasingly adopt AI chats, the volume of such interactions – and the associated burden of preserving, collecting, reviewing, and producing them – will often eclipse any associated benefit. Parties may therefore rely on evidence regarding the volume of AI data, as well as any technical limitations on preservation, collection, and/or production, to argue that discovery of AI interactions is disproportional to the needs of the case.
Privacy Considerations: Users are increasingly relying on AI chatbots for advice and guidance on personal matters such as travel, mental health, and intimate relationships. The personal conversations users often have with AI chatbots place privacy concerns at the forefront of the AI discoverability analysis – arguably even more so than in cases involving Google searches, which employees can conduct just as easily on their personal devices as on their work devices. At least for the time being, however, employees often lack access on their personal devices to AI tools of the same caliber as those available on their work devices, and may therefore be more likely to use the latter for personal purposes. The likely presence of personal and private information is another reason why the burdens of preserving AI chats may outweigh any benefit. See Robert Keeling and Ray Mangum, The Burden of Privacy in Discovery, Judicature, Vol. 105, No. 2 (2021).
Looking Ahead
Companies should consider the following factors in evaluating their obligation to preserve or collect AI data in a particular case:
- Relevance: Is the AI use itself directly at issue in a claim or defense in the case? Could the court view AI interactions as relevant to state-of-mind issues such as knowledge, motive, or bad faith?
- Reasonableness, Proportionality, and Burden: Is preservation or collection of AI data technically feasible? Is the likelihood of discovering relevant evidence from AI sources proportional to the needs of the case, considering the costs and burdens associated with preservation, collection, or production? Or would imposing such an obligation amount to an unattainable standard of perfection?
- Privacy: Are custodians or employees likely to use company AI for personal purposes, thereby placing their privacy interests at risk? Similar to internet search history, would the burden and intrusion of collecting and reviewing such AI interactions outweigh the likely benefit of the information?
Conclusion
AI is both a powerful discovery tool and an emerging discovery target. While the technology is new, the legal framework for assessing discoverability remains grounded in relevance, proportionality, and privacy. When it comes to discoverability, AI prompts and outputs should be evaluated like internet search history – subject to careful, case-specific analysis rather than blanket preservation or production. Companies should thus analyze their AI discovery obligations through this lens and be proactive in addressing these issues in both discovery negotiations and motion practice.
For additional information on this topic, please contact Robert Keeling, Rana Dawson, or Emma Hall.
The views expressed in this article are those of the authors and do not necessarily represent the views of the Firm or any of its clients.