
From the Ground Up: Managing Data Challenges with AI in Construction Projects

Editor’s Note: The construction industry generates enormous amounts of complex data, making effective data management a significant challenge—especially when litigation or regulatory compliance comes into play. In this article, HaystackID® explores how AI-driven tools, such as HaystackID’s Core Intelligence AI™, enhance how construction firms handle eDiscovery. With insights from industry experts, this article provides readers with innovative strategies for managing sensitive information, reducing costs, and maintaining compliance with regulatory standards like the Cybersecurity Maturity Model Certification (CMMC). Sam Morgan, Director of Legal Solutions at HaystackID, aptly noted that success starts with a well-vetted blueprint that turns data complexity into a manageable process. Read the full article to learn how to build your eDiscovery blueprint.
By HaystackID Staff
Firms face a towering challenge when managing complex construction data, and an inability to manage the chaos could result in a collapse. Project management platforms, CAD drawings, BIM models, and email communications create a sprawling digital landscape that demands advanced solutions to ensure cost control, data security, and legal defensibility.
Legal events in construction are complicated. Whether it’s a subpoena, a government inquiry, or a workplace incident, each event involves massive volumes of diverse data. During a recent HaystackID® webcast, “Protect Sensitive Data and Control Costs: An eDiscovery Blueprint for the Construction Industry,” a panel of experts shared how construction firms can handle these nuanced challenges using advanced tools and strategies.
From Hard Hats to Hard Drives: Handling Complex Data in a Regulated Industry
Every construction project, from planning and design to execution and completion, leaves a digital footprint. Sources include project management software, subcontractor communications, and even video footage from construction sites. Wearable devices, like smart helmets or wristbands, can provide critical data on worker locations, activities, and safety incidents, offering key evidence in disputes related to timelines, site conditions, or labor claims.
“In construction, we have enormous data sizes, and it goes back to the project owner who conceptualizes and begins things years before they ever go to design, engineering gets involved, and ground gets broken,” said Mark MacDonald, CEDS, Vice President of Enterprise and Strategic Accounts at HaystackID, during the webcast. “It typically starts way before we ever consider litigation or a legal event that might be tied to one of these projects, whether it’s a subpoena, a government inquiry, it could be an injury on the work site.”
Managing and protecting all these different data types takes concerted effort, coordination, and resources, especially when factoring in regulatory compliance like the Cybersecurity Maturity Model Certification (CMMC). Finalized in October 2024, CMMC 2.0 establishes cybersecurity standards for companies in the defense industrial base (DIB) that handle federal contract information (FCI) and controlled unclassified information (CUI). CMMC is a tiered framework with three cybersecurity maturity levels, ranging from basic cyber hygiene to advanced practices that protect sensitive national security data.
“If you’re a builder or an attorney representing builders with federal contracts and critical infrastructure projects, especially with government agencies, then you’ve probably heard of CMMC,” said MacDonald. “There are certain types of data you must be cautious of and ensure that your vendors and law firms are compliant. If there are designs for any of the [data] that come under this, like shipping, power plants, or a naval yard, whatever it might be, that is encrypted at rest. The vendors and the sites you use to process and review data are locked down to the nth degree.”
Beyond regulatory compliance, construction data is also highly sensitive. Firms must carefully manage proprietary information about designs, financial records, and privileged documents to maintain confidentiality and avoid litigation risk.
Building Order from Data Chaos with GenAI
AI offers a way to tame this data chaos. AI-driven tools like technology-assisted review (TAR) and generative AI (GenAI) offer significant advantages, helping firms quickly identify and protect sensitive information while reducing costs.
“GenAI allows us to automate first-level review with accuracy and deeper insights. It’s not just about identifying relevant documents; it’s about classifying data more effectively,” noted Young Yu, Vice President of Advanced Analytics and Strategic Solutions at HaystackID, during the webcast.
For example, legal teams can eliminate thousands of irrelevant documents upfront by running a domain analysis that flags emails from sender domains unlikely to matter to the case, such as Amazon or social media platforms.
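A domain analysis of this kind can be sketched in a few lines of Python. This is a minimal illustration, not HaystackID's method: the exclusion list, the message structure (a dict with a "from" field), and the function names are all assumptions made for the example.

```python
from collections import Counter
from email.utils import parseaddr

# Hypothetical exclusion list; a real one would be agreed with counsel.
EXCLUDED_DOMAINS = {"amazon.com", "facebookmail.com", "linkedin.com"}


def sender_domain(from_header):
    """Return the lowercased sender domain of a From header, or '' if none."""
    _, address = parseaddr(from_header)
    if "@" not in address:
        return ""
    return address.rpartition("@")[2].lower()


def is_excluded(domain, excluded=EXCLUDED_DOMAINS):
    """True if the domain is on the list or is a subdomain of a listed domain."""
    return any(domain == d or domain.endswith("." + d) for d in excluded)


def domain_counts(messages):
    """Count messages per sender domain so reviewers can see what dominates."""
    return Counter(sender_domain(m["from"]) for m in messages)


def cull_by_domain(messages, excluded=EXCLUDED_DOMAINS):
    """Split messages into (kept, culled) based on the sender domain."""
    kept, culled = [], []
    for m in messages:
        target = culled if is_excluded(sender_domain(m["from"]), excluded) else kept
        target.append(m)
    return kept, culled
```

Reviewing the output of domain_counts before culling lets the team confirm that a high-volume domain really is noise rather than, say, a supplier whose order confirmations bear on a delivery dispute.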
“It’s a very similar approach. When you’re testing your prompt criteria, you want examples of documents responsive to each of your issues or your buckets,” explained Yu, who added that firms should also test against non-responsive documents.
“You don’t want to lean too far the other way. The validation at the end, especially for GenAI where precedent hasn’t been set, the larger that validation sample, the better,” Yu said.
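One way to see why a larger validation sample helps is to look at the margin of error on a measured rate. The sketch below uses the standard normal approximation for a proportion at 95% confidence; it is a simplified illustration rather than a validation protocol described in the webcast, and the 90% rate and the sample sizes are assumed values.

```python
import math


def margin_of_error(p, n, z=1.96):
    """Approximate 95% margin of error for a proportion p measured on n documents.

    Uses the normal approximation, which is only reasonable when n is not tiny
    and p is not very close to 0 or 1.
    """
    return z * math.sqrt(p * (1 - p) / n)


# Assumed example: an observed rate of 90% at three sample sizes.
for n in (100, 500, 2000):
    print(f"n={n}: +/- {margin_of_error(0.90, n):.3f}")
```

Under these assumptions, the uncertainty shrinks from roughly six percentage points at 100 documents to under three at 500. For recall specifically, n is the number of truly responsive documents in the sample, not the total sample size, so low-richness datasets need larger samples to reach the same certainty.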
Yu and Esther Birnbaum, Executive Vice President of Legal Data Intelligence at HaystackID, recently collaborated on a high-stakes document review project in the financial services sector to test how GenAI could perform compared to traditional human-led review, starting with a dataset of 500 documents that human reviewers had already coded through active learning. This human-coded set served as their baseline, providing a “ground truth” to measure GenAI’s performance.
In the initial pass, they kept it simple, using plain-language prompts and copying issue tags directly from the review protocol. GenAI achieved 85% recall and between 90% and 92% precision without extensive prompt engineering. In other words, with minimal setup, GenAI found 85% of the documents that human reviewers had marked responsive, and 90% to 92% of the documents it flagged were ones the reviewers had also marked responsive. Recognizing areas where recall was lower for certain issues, they refined the prompts to make them more specific. This targeted adjustment improved recall to 90% in the second pass while maintaining nearly the same level of precision. For the final pass, they combined the best versions of each prompt and tested the same 500-document sample. The final pass yielded 92% recall and 90% precision, all accomplished in about four and a half hours of billable time.
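For readers less familiar with these metrics, the comparison against a human-coded baseline can be sketched as follows. This is an illustrative calculation only: the document IDs and coding decisions are made up, and the function is a generic precision and recall computation, not the tooling used in the project described above.

```python
def precision_recall(human, model):
    """Compare model calls against human coding treated as ground truth.

    human, model: dicts mapping document ID -> True if coded responsive.
    Documents missing from `model` are treated as not responsive.
    """
    tp = sum(1 for d, resp in human.items() if resp and model.get(d, False))
    fp = sum(1 for d, resp in human.items() if not resp and model.get(d, False))
    fn = sum(1 for d, resp in human.items() if resp and not model.get(d, False))
    # Precision: of the documents the model flagged, how many were truly responsive.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Recall: of the truly responsive documents, how many the model found.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Made-up sample: reviewers marked documents 1-5 responsive out of 10;
# the model flagged documents 1-4 and, incorrectly, document 6.
human_coding = {doc_id: doc_id <= 5 for doc_id in range(1, 11)}
model_coding = {doc_id: doc_id in {1, 2, 3, 4, 6} for doc_id in range(1, 11)}

precision, recall = precision_recall(human_coding, model_coding)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Running the same calculation per issue tag, rather than only on the overall responsive call, is what reveals which prompts need refinement between passes.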
“The results we saw were incredible. This was not in any way an easy review. And I think my response after we ran the full set was, ‘I never want to do a human first-level review again.’ I felt pretty strongly about our results,” Birnbaum said.
Firms can use tools like HaystackID’s GenAI-powered Core Intelligence AI™ to help manage their complex data environments. By automating key tasks such as document classification, privilege review, and PII/PHI identification, Core Intelligence AI reduces the need for manual review, cutting costs while improving accuracy. For example, a construction firm handling a large-scale government project could leverage Core Intelligence AI’s automated summaries and contextual analysis to quickly identify relevant documents from vast datasets, ensuring compliance with CMMC standards and accelerating responses to government inquiries or litigation events.
AI in eDiscovery is still a relatively new concept, but the potential is enormous, and interest is growing fast. An April 2024 Thomson Reuters survey found that 43% of legal professionals said they planned to use GenAI within the next three years. Using these tools could mean substantial savings, faster resolutions, and more defensible legal processes in the construction industry.
The key to success lies in the expertise behind the tools, and in the case of tools like ChatGPT, your output is only as good as your prompt.
“Anyone can buy a Formula One Ferrari, but that does not make you Mario Andretti. Just because the tools exist, you still need the expertise to drive them, train them, and teach others how to use them,” said MacDonald.
Partnering for Success in Construction eDiscovery
Managing data in the construction industry requires more than just advanced technology—it demands a trusted partner with deep expertise and tailored solutions. HaystackID’s Construction Discovery Solutions leverage AI-driven tools like Core Intelligence AI, advanced forensic collection services, and secure cloud platforms to help legal teams streamline their eDiscovery workflows, reduce costs, and improve compliance. Whether handling privilege reviews, identifying sensitive information, or navigating regulatory requirements like CMMC, HaystackID’s solutions offer a reliable path to defensibility and efficiency.
“With construction discovery or any type of complex eDiscovery, you will have data size and volume challenges. That’s why firms need a well-vetted, thought-out blueprint that will be their planning document going forward [throughout the discovery process],” said Sam Morgan, Director of Legal Solutions at HaystackID, during the webcast.
With the right partner and tools, construction firms can confidently face even the most complex data challenges and litigation demands. Learn how we can help your firm build an eDiscovery blueprint by booking time with our experts.
About HaystackID®
HaystackID® specializes in solving complex data challenges related to legal, compliance, regulatory, and cyber events. Core offerings include Global Advisory, Data Discovery Intelligence, the HaystackID Core® Platform, and AI-enhanced Global Managed Review powered by ReviewRight®. Recognized globally by industry leaders like Chambers, Gartner, IDC, and Legaltech News, HaystackID prioritizes security, privacy, and integrity in its innovative solutions for leading companies and legal practices worldwide.
Assisted by GAI and LLM technologies.
SOURCE: HaystackID