[Webcast Transcript] Hatch-Waxman Matters and eDiscovery: Turbo-Charging Pharma Collections and Reviews

Editor’s Note: On September 16, 2020, HaystackID shared an educational webcast designed to inform and update legal and data discovery professionals on the complexities of eDiscovery support in pharmaceutical industry matters through the lens of the Hatch-Waxman Act. While the full recorded presentation is available for on-demand viewing via the HaystackID website, provided below is a transcript of the presentation as well as a PDF version of the accompanying slides for your review and use.

Hatch-Waxman Matters and eDiscovery: Turbo-Charging Pharma Collections and Reviews

Navigating Hatch-Waxman legislation can be complex and challenging from legal, regulatory, and eDiscovery perspectives. The stakes are high for both brand name and generic pharmaceutical manufacturers as timing and ability to act swiftly in application submissions and responses many times mean the difference between market success or undesired outcomes.

In this presentation, expert eDiscovery technologists and authorities will share information, insight, and proven best practices for planning and supporting time-sensitive pharmaceutical collections and reviews so Hatch-Waxman requirements are your ally and not your adversary on the road to legal and business success.

Webcast Highlights

+ NDA and ANDA Processes Through the Lens of Hatch-Waxman
+ eCTD Filing Format Overview for FDA (NDA/ANDA Submissions)
+ Information Governance and Collections Under Hatch-Waxman
+ Dealing with Proprietary Data Types and Document Management Systems at Life Sciences Companies
+ Streamlining the Understanding of Specific Medical Abbreviations and Terminology
+ Best Practices and Proprietary Technology for Document Review in Pharmaceutical Litigation

Presenting Experts

+ Michael Sarlo, EnCE, CBE, CCLO, RCA, CCPA – Michael is a Partner and Sr. EVP of eDiscovery and Digital Forensics for HaystackID.

+ John Wilson, ACE, AME, CBE – As CISO and President of Forensics at HaystackID, John is a certified forensic examiner, licensed private investigator, and infotech veteran with more than two decades of experience.

+ Albert Barsocchini, Esq. – As Director of Strategic Consulting for NightOwl Global, Albert brings more than 25 years of legal and technology experience in discovery, digital investigations, and compliance.

+ Vazantha Meyers, Esq. – As VP of Managed Review for HaystackID, Vazantha has extensive experience in advising and helping customers achieve their legal document review objectives.

Presentation Transcript


Hello, and I hope you’re having a great week. My name is Rob Robinson. On behalf of the entire team at HaystackID, I’d like to thank you for attending today’s webcast titled Hatch-Waxman Matters and eDiscovery, Turbo-Charging Pharma Collections and Reviews. Today’s webcast is part of HaystackID’s monthly series of educational presentations conducted on the BrightTALK network and designed to ensure listeners are proactively prepared to achieve their computer forensics, eDiscovery, and legal review objectives during investigations and litigation. Our expert presenters for today’s webcast include four of the industry’s foremost subject matter experts and authorities on eDiscovery, all with extensive experience in pharmaceutical matters.

Our first presenter that I’d like to introduce you to is Michael Sarlo. Michael is a Partner and Senior Executive Vice President of eDiscovery and Digital Forensics for HaystackID. In this role, Michael facilitates all operations related to eDiscovery, digital forensics, and litigation strategy both in the US and abroad for HaystackID.

Our second presenter is digital forensics and cybersecurity expert John Wilson. As Chief Information Security Officer and President of Forensics at HaystackID, John is a certified forensic examiner, licensed private investigator, and information technology veteran with more than two decades of experience working with the US government and both public and private companies.

Our next presenting expert, Vazantha Meyers, serves as Vice President of Discovery for HaystackID. Vazantha has extensive experience in advising and helping customers achieve their legal document review objectives. She’s recognized as an expert in all aspects of traditional and technology-assisted review. Additionally, Vazantha graduated from Purdue University and obtained her JD from Valparaiso University School of Law.

Our final presenting expert is Albert Barsocchini. As Director of Strategic Consulting for NightOwl Global, newly merged with HaystackID, Albert brings more than 25 years of legal and technology experience in discovery, digital investigations, and compliance to his work supporting clients in all things eDiscovery. 

Today’s presentation will be recorded and provided for future viewing, and a copy of the presentation materials is available for all attendees. In fact, you can access those materials directly beneath the presentation viewing window on your screen by selecting the Attachments tab on the far left of the toolbar beneath the viewing window. A recorded version of this presentation will also be available directly from the HaystackID and BrightTALK network websites upon completion of today’s presentation, and a full transcript will be available via the HaystackID blog. At this time, with no further ado, I’d like to turn the microphone over to our expert presenters, led by Mike Sarlo, for their comments and considerations on the Hatch-Waxman Matters and eDiscovery presentation. Mike?

Michael Sarlo

Thanks for the introduction, Rob, and thank you all for joining our monthly webinar series. We’re going to be covering a broad array of topics around pharmaceutical litigation in general and the data types involved, in particular Electronic Common Technical Documents (eCTDs), which we’ll learn more about. We’re going to start out by looking at Hatch-Waxman as a whole and the new drug application (NDA) and ANDA processes related to Hatch-Waxman. We’re going to get into those eCTDs and why they are important for pharmaceutical-related matters on a global scale. Then we’re going to talk more about information governance and strategies around building a data map, which here is as much a fact map as a data map. These matters have very long timelines when you look at the overall lifecycle of an original patent on a new drug going through a regulatory process, then actually hitting the market, and then having that patent expire. We’ll learn more about that, then we’re going to get into some of the nitty-gritty of how we enhance document reviews at HaystackID for pharmaceutical matters and scientific matters in general, and then finish off with some best practices and a brief overview of our proprietary testing mechanism and placement platform, ReviewRight.

So, without further ado, I’m going to kick it off to Albert. 

Albert Barsocchini

Thank you very much, Michael. So, I’m going to start off with a 30,000-foot view of Hatch-Waxman, and I always like to start off with a caveat any time I’m talking about pharma-related matters. Pharma involves very complex processes and complex, very nuanced laws, especially Hatch-Waxman. So, my goal today is really just to give you the basic things you need to know about Hatch-Waxman, and it’s very interesting. In fact, in 1984, generic drugs accounted for 19% of retail prescriptions, and in 2018, they accounted for 90%, and that’s because of Hatch-Waxman. In a recent report, the President’s Cancer Panel found that the US generic drug market saved the US healthcare system an estimated $253 billion overall in 2018, including $10 billion in savings for cancer drugs. So, Hatch-Waxman really has been very important to the generic drug market and to us, the public, for being able to get drugs at an affordable price.

So, how did Hatch-Waxman start? It started with a case called Roche v. Bolar. Roche made a drug, a sleeping pill called Dalmane. I don’t know if anybody’s taken it; I haven’t. Anyway, it was very popular, and it made them literally millions and billions of dollars. Normally, a brand has a certain patent term, and what a generic drug company likes to do is make a bioequivalent and time it so that, at the termination of the patent, the generic can start marketing its product. So, in this case, Bolar started its research and development before the Roche patent expired, and because of that, they were per se infringing on the Roche patent, and so a lawsuit ensued and Bolar lost.

Now, a couple of terms that I think are important, and I’m going to throw them out now just because there are so many nuanced pharma terms. One pair is branded biologic and biosimilar generic, and the other is branded synthetic and bioequivalent generic. Branded drugs are either synthetic, meaning they’re made from a chemical process, or biologic, meaning they’re made from a living source. We’re going to be talking today about synthetics, and what is important is that synthetic branded drugs can be exactly replicated into more affordable generic versions, bioequivalents. Because biologics involve large complex molecules made from living sources, they can’t be exactly replicated, and that’s where biosimilars come in. So, today, we’re going to focus just on the bioequivalents, on synthetic drugs. Just as another point, the Biosimilar Act, signed into law by President Obama around 2010, I think, is another law very similar to Hatch-Waxman.

So, anyway, because of the Roche case, we came out in 1984 with the Hatch-Waxman Act. The reason they wanted this was that, since a generic company could not start research and development until after a patent expired, this in essence gave the new drug applicant additional years of patent protection, which means millions of more dollars. Congress came in, thought this wasn’t fair, and decided to allow generic companies to start the research and development process before the patent expired. This prevented the original patent holder from getting more years on the patent, and it also allowed generics to get on the market quicker and get to the public at cheaper prices. It’s really trying to strike a balance, as you can see, between the original patented pharmaceutical formulations and the new generic versions. It’s a delicate balance, but they seem to have achieved it, given that generics are now so prevalent in the market.

And one thing about this act: generic drug companies are not required to conduct their own independent clinical trials to prove safety and efficacy but can instead rely on the research of the pioneer pharmaceutical companies, and they can start development before the original patent expires. So, that’s already a head start because they don’t have to produce their own data, they can rely on the data of the original patent holder, and that allowed this explosion in the patent process for generic drugs.

So, one of the important areas that is part of this whole act is the so-called “Orange Book”. Before you can have an abbreviated new drug application, called an ANDA, for approving that generic drug, you must first have a new drug application, or NDA. With the NDA, a pioneering brand-name drug company seeking to manufacture a new drug must prepare and file the application and have its drug approved by the FDA. Additionally, as part of this new drug application process, the pioneering drug company submits the information on the new drug’s safety and efficacy [obtained] from the trials. Now, the NDA applicant must also identify all patents that could reasonably be asserted if a person not licensed by the owner engaged in the manufacture, use, or sale of the drug, and the patents covering approved drugs, or uses thereof, are published in what’s called the “Orange Book”. So, a generic company will go to this “Orange Book”, which is like a pharma bible, to see what patents are in effect, and this helps them target certain patented drugs they want to create a generic version of. It’s a very important starting point, and this process can start while the original patented drug hasn’t even gone to market.

And so, you can see things start to heat up pretty early, and one of the things that we notice in this whole process is that when a patent is filed, the clock ticks on the patent, and so it may be another six years before that patent goes to market, and so because of that, there is a… it can be very unfair, and so there’s a lot of extensions that occur for the patent holder. 

Now, what happens in this particular situation with an ANDA is that we’re going to have a Paragraph IV certification. Briefly, in making a Paragraph IV certification, the generic drugmaker says the patent is at least one of the following: invalid, not infringed, or unenforceable. That’s really the Reader’s Digest version of the Paragraph IV certification; after that, the story gets much more complicated and adversarial, and that’s why I always give the warning that this is a very complex dance that’s occurring with Hatch-Waxman. But the ANDA really is a very, I would say, important piece of this whole puzzle. Once the ANDA information is put together, it’s filed using what’s called the Electronic Common Technical Document, or eCTD, which is a standard format for submitting applications, amendments, supplements, and reports, and we’re going to talk about this a little later on in the presentation. It’s very similar to electronic court filings, but there’s a lot more to it, and it’s part of the process right from the start.

Now, as for the patent owner, a pharma patent is good for about 20 years from filing, and the Hatch-Waxman Act gave patent extensions to name-brand drug companies to account for delays in the approval process. That takes into account the fact that, as pointed out earlier, when the patent is filed, the drug is still in research and development, and it may be another six years before it reaches the market. Realizing that, they decided the 20-year patent can be extended for up to another five years, and there are also other extensions that can occur during this time. With that, the patent owner is also concerned about these generic drug companies, so they’re always looking over their shoulder and looking for where there may be threats to their patent. So, we have the ANDA, we have the certification, it’s published, and then the patent owner has a certain amount of time, within 45 days of receiving notice of the Paragraph IV certification, to file their infringement action. Once the patent owner files that action, there’s a 30-month period that protects the patent owner from the harm that could otherwise ensue from the FDA granting marketing approval to the potentially infringing product.

But that’s really where the race begins, and it’s very important to realize that during this race, there could be other generic drug applicants that want to get in on it. They want to get in on it for a very specific reason: if their certification is granted, they get a 180-day exclusivity, which means that they can go to market with their generic product, and in markets like Europe and other countries, this exclusivity can be worth hundreds of millions of dollars. So, you’re going to have this 45-day period in which the original patent holder files their response, and then everything gets locked down for 30 months. There’s a lot of information that has to be exchanged, from all the data during the research process to all these certifications, and so it’s a very compressed time period.
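Editor’s Note: The 45-day and 30-month windows described above lend themselves to a quick date calculation when planning a response. The following is a minimal sketch in Python, not legal advice and not a HaystackID tool. The function names are hypothetical, and counting the 30-month period from the date the notice was received is an assumption made for illustration; counsel should confirm the actual trigger dates in any given matter.

```python
import calendar
from datetime import date, timedelta


def add_months(start: date, months: int) -> date:
    """Return the date `months` calendar months after `start`.

    The day is clamped to the last day of the target month,
    so Aug 31 plus 6 months lands on Feb 28 (or Feb 29 in a leap year).
    """
    month_index = start.month - 1 + months
    year = start.year + month_index // 12
    month = month_index % 12 + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(start.day, last_day))


def paragraph_iv_dates(notice_received: date) -> dict:
    """Rough planning dates keyed off receipt of a Paragraph IV notice."""
    return {
        # 45 days to file the infringement action, per the discussion above.
        "suit_deadline": notice_received + timedelta(days=45),
        # Assumption for illustration: the 30 months run from notice receipt.
        "stay_ends": add_months(notice_received, 30),
    }


if __name__ == "__main__":
    for label, value in paragraph_iv_dates(date(2020, 1, 15)).items():
        print(label, value.isoformat())
```

A calendar like this is mainly useful for working backward: once the two dates are known, collection and review milestones can be scheduled against them.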

And what Michael is going to show in these next couple of slides is that a compressed time period means you have to have your ducks in a row: you have to have robust collection planning, and you have to have legal review teams using the latest technology and trying to digest this patent information, which has a lot of terms that can be very difficult to assimilate for anybody who’s not familiar with patent litigation. HaystackID has been through a lot of this, so we have a good, solid basis and understanding of this whole process, and a very interesting process that we specifically designed for Hatch-Waxman.

So, without further ado, I’m going to hand this over to Michael, and he’s going to go through it just to show you some of those compressed timelines and then get into the whole electronic filing process. Michael?

Michael Sarlo

Thanks for that. Appreciate it. Thank you, Albert. That was a great overview. So, as Albert mentioned, really the timeline and lifecycle of a new drug is incredibly long. The drug discovery itself, finding a compound that may have some clinical efficacy, can take anywhere from three to six years, and at the same time you’re doing testing and preparing to file an IND, which is an investigational new drug application, so it’s a lengthy process from an administrative standpoint. Really, as we get toward litigation, the lifecycle of litigation oftentimes begins at year zero. If an IND is approved, you’ll get into Phase I, II, and III clinical studies. At that point, assuming you’re meeting your target metrics for the IND and the study’s end goals, you can choose to submit an NDA, and the review of an NDA can take quite some time, often years. At the end of that process, the FDA might come back and say, well, we actually want some more information and want you to go do this or do that, which is usually pretty devastating for organizations. It really can add years to the timeframe. If they do accept it, then you’re at a point where it’s approved and you can start to go to market. The marketing process is highly regulated, there are specific verticals you can market to, and marketing materials would actually oftentimes be appended to an NDA.

So, right here alone, we have several different data points that might all be relevant for a Hatch-Waxman matter. On the flip side, a generic manufacturer has a much shorter timeframe, and they’re much less invested from a time standpoint. Typically speaking, they’re looking at a couple of years to develop something and do some testing, they file an ANDA, and then there’s this marketing period where they get 18 to 36 months before the marketplace becomes so crowded just due to so many generics, and at that point, usually, they move on or there’s this big stockpile.

All this is important because, as we start to talk about these different applications and abbreviations, and since most people on this presentation are here for eDiscovery purposes, it’s important to understand the mechanisms of how this data is organized. It started out with what’s called the Common Technical Document format, which is a set of specifications for an application dossier for the registration of medicines, designed to be used across Europe, Japan, and the United States. This was the paper format version. There are many other countries that also adhere to the modern eCTD, the electronic Common Technical Document. The goal here is to streamline the regulatory approval process so that a single application can adhere to many different regulatory requirements. These cost a lot of money, millions of dollars, to put together. You’re talking tens of thousands of pages, and these have a long lifecycle. Before January 1, 2008, there was more of a scanning format for submissions to the FDA, and at that point, they actually mandated a certain format, which became the eCTD format for these submissions.

These are broken up into five different modules, and we’ll get into that, but the prevalence and rise of the eCTD format really began in 2008, and as you can see in the graphic on the right here, they became highly prevalent around 2017/2018, to the point that’s really all there is. That’s because as of 2017, the FDA required that NDAs all be in eCTD format. The same thing goes for ANDAs and BLAs, and then for INDs in 2018 – that actually got pushed a little bit, but we don’t need to get into that here. What’s important is that all subsequent submissions to these applications, including any amendments, supplements, and reports, need to be in digital format. This is important because of a common strategy. Say I’m a large pharmaceutical company trying to get all the value I possibly can out of my invention, this drug that we’ve spent probably hundreds of millions of dollars bringing to market and that could be making us billions of dollars. The strategy is oftentimes to go through more of these NDA-like processes for off-label uses, or for new populations that were outside the original study groups that the drug was approved for. This is where it becomes incredibly complex. There’s this concept of exclusivity around novel treatments relating to use of a previous compound, and this is one of the major components of the Hatch-Waxman dance: how big pharma has found many different mechanisms to extend these patents beyond their term life.

It’s also important to note that Trial Master Files, which hold all of your trial data, human clinical trials, all that stuff, would actually get appended to these files. Just in general, when you think about how fast we’re approving vaccines for coronavirus, you can see why there’s concern that our system isn’t doing due diligence when you realize that the lifecycle of any normal drug is oftentimes 15 years. We commonly handle Trial Master Files the same way as an eCTD package, but there is actually a new format that international standards are trying to move to, the electronic Trial Master File, with more defined specifications regarding what the structure looks like.

What an eCTD is, is a collection of files. When we think eDiscovery and we do a production in today’s world, it’s usually a Concordance load file, and you get an Opticon and a DAT file. You have to think about the eCTD very much in the same way. There’s an XML transform file; think about that more like your DATs, your load files. This is going to have basically all of the metadata. It’s going to contain all the structure of the application. It’s going to have more metadata about folders. It’s also going to track additions and changes, and when documents were removed from any eCTD, and this is very important. There’s a whole industry that services creating these. It’s very much a niche industry like eDiscovery: everything related to drug development from a technology standpoint has very similar functions that almost cross-correlate to eDiscovery. You have your folks who are supporting the scientists as they build out these applications, and one thing is that these platforms are calibrated, and they’re calibrated by a third party. Timing and timestamps matter, as far as when something was touched, when it was looked at, and when it was deleted, so that metadata can be incredibly important. Outside the context of Hatch-Waxman, think about a shareholder lawsuit against executives at a pharmaceutical company who have been accused of having access to a failed trial’s results prior to the general public. You see these accusations quite a bit at small pharma companies: they dump some shares and there’s an investigation, and you can see now why this type of information about who accessed what, and when something was added or removed, might be important.
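Editor’s Note: To make the load-file analogy concrete, the sketch below reads a small, simplified XML file and lists each document entry with the operation recorded for it. This is an illustration only. The sample XML is invented for this note, the `submission`, `m2`, and `m5` element names are hypothetical, and the `leaf` entries with `operation`, `modified-file`, and `xlink:href` attributes are assumed to resemble what an eCTD XML file carries. A real submission is considerably more deeply nested and should be checked against the published specification.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes namespaced attributes in {namespace}name form.
XLINK_HREF = "{http://www.w3.org/1999/xlink}href"

# Invented, heavily simplified sample; not taken from a real submission.
SAMPLE_BACKBONE = """<submission xmlns:xlink="http://www.w3.org/1999/xlink">
  <m2>
    <leaf ID="a1" operation="new" xlink:href="m2/summary.pdf">
      <title>Summary</title>
    </leaf>
  </m2>
  <m5>
    <leaf ID="b7" operation="replace" modified-file="../0000/index.xml#b1"
          xlink:href="m5/study-report.pdf">
      <title>Study Report</title>
    </leaf>
    <leaf ID="b9" operation="delete" modified-file="../0000/index.xml#b2">
      <title>Withdrawn Listing</title>
    </leaf>
  </m5>
</submission>
"""


def read_leaves(xml_text: str) -> list[dict]:
    """Flatten every <leaf> entry into one row, similar to a load-file record."""
    root = ET.fromstring(xml_text)
    rows = []
    for module in root:  # top-level children stand in for the modules
        for leaf in module.iter("leaf"):
            rows.append({
                "module": module.tag,
                "id": leaf.get("ID"),
                "operation": leaf.get("operation"),
                "path": leaf.get(XLINK_HREF),  # None when no file is attached
                "modifies": leaf.get("modified-file"),
                "title": leaf.findtext("title"),
            })
    return rows


if __name__ == "__main__":
    for row in read_leaves(SAMPLE_BACKBONE):
        print(row["module"], row["operation"], row["path"], row["title"])
```

Each row plays the role of a line in a DAT file: one document, its location, and the metadata describing what happened to it.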

The same thing goes for trial data itself. It’s highly audited, who accessed it, when. That type of data is really highly confidential, even to the company that is conducting the trial. It’s usually a third party that’s handling that, and so all this history is in there, and we have metadata about each module, and you’ll see here on the right-hand side, we have a structure here. 

It looks pretty basic. There are folders, there are files. There are also more stylesheet files, schema files that are similar to XML that will more control the formatting and should be thought of as extended metadata. Likewise, we’re also going to see files and folders, PDFs, Word docs, scientific data, big databases like Tableau, things like that. So, as you start getting into all of the extra stuff that goes with an application, these can become massive, and this is usually something that spans both paper sources and digital sources, so it’s really important to basically work on these to parse them appropriately for eDiscovery purposes. 

It should be something you’re on the lookout for. If you ever see these modules, these little “Ms”, in a folder structure that you get from your client, you should stop and say, wait a minute, this looks like it has some structure, what is this, and you’ll see it’s an eCTD. They’re interlinked in nature: you can have a paper file that was just scanned and thrown in a folder alongside a digital file, and then all of these additions and changes. These filings also go back and forth between the regulators and the organization that’s putting through an application. So, they might submit something, and the regulators say, okay, we want to see more of this or that, or we want more information here. They add it to the existing eCTD. In that way, you also get a separate revision history that oftentimes wraps around the discourse between the regulator and the drug company. HaystackID deals with these often and is first to market in eDiscovery with a solution to view, parse, review, and produce eCTDs or files from eCTDs right out of Relativity, and we’d be happy to do a demo for anybody. Just shoot us an email. It’s highly useful and has been really impactful in several large cases for us where we dealt with a lot of NDAs or INDs.
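Editor’s Note: The advice to watch for the little “Ms” in a client’s folder structure can be turned into a simple screening step at intake. The sketch below is a hypothetical helper, not part of any HaystackID product. It assumes module folders are named `m1` through `m5`, matching the five modules mentioned earlier, and it flags any parent folder that contains at least two of them. The paths shown are invented.

```python
import re
from pathlib import PurePosixPath

# Assumption: module folders are literally named m1 through m5.
MODULE_DIR = re.compile(r"^m[1-5]$", re.IGNORECASE)

# Invented example paths, as they might appear in a collection manifest.
SAMPLE_PATHS = [
    "client/NDA123/0000/m1/us/cover.pdf",
    "client/NDA123/0000/m2/summary.pdf",
    "client/NDA123/0000/index.xml",
    "client/misc/m1/notes.txt",
]


def find_ectd_roots(paths: list[str], min_modules: int = 2) -> dict[str, set[str]]:
    """Map each parent folder to the module-style subfolders seen beneath it.

    Only parents with at least `min_modules` distinct module folders are kept,
    which filters out a stray folder that merely happens to be called "m1".
    """
    hits: dict[str, set[str]] = {}
    for raw in paths:
        # Normalize Windows separators so both path styles are handled.
        parts = PurePosixPath(raw.replace("\\", "/")).parts
        for i, part in enumerate(parts[:-1]):  # skip the file name itself
            if MODULE_DIR.match(part):
                parent = str(PurePosixPath(*parts[:i])) if i else "."
                hits.setdefault(parent, set()).add(part.lower())
    return {parent: mods for parent, mods in hits.items() if len(mods) >= min_modules}


if __name__ == "__main__":
    for parent, modules in find_ectd_roots(SAMPLE_PATHS).items():
        print(parent, sorted(modules))
```

A hit from a check like this is only a prompt to stop and ask the client what the folder is; it does not confirm that the data is a complete or valid eCTD.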

We’ll say one thing, too, here is it’s important to realize that many different organizations may be a part of this process. 

So, now, here’s a screenshot as well for you. You see a little Relativity tree over here where we break out and parse everything. We also give you full metadata, both for your eDiscovery files, your PDFs, your Word docs, all of that, that may not be contained in the eCTD. So, this is important to note too. You can’t just load this as a load file and then not actually process the data. The data needs to be processed and it needs to be linked at the same time. And here in this application, a really unique feature is your ability to sort, filter, and search based on revisions and changes. So, if we have a case where we’re just interested in the final eCTD that resulted in an approval, we can get right to that, maybe cutting out 50% of the application. If we have a case where we’re interested in the actual approval process and the application process, then we can start to look at that and look at anything that was deleted, anything that was changed – a highly useful tool.
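Editor’s Note: The idea of filtering down to the final state of an application can be illustrated with a short sketch. This is not how the HaystackID application works internally; it is a simplified model written for this note. It assumes each document entry is a plain record with hypothetical fields `sequence`, `key`, `operation`, `path`, and `modifies`, where `modifies` names the earlier entry being replaced or deleted, and that sequence numbers are zero-padded so they sort correctly as text.

```python
# Invented records for illustration; field names are hypothetical.
SAMPLE_LEAVES = [
    {"sequence": "0000", "key": "0000#a1", "operation": "new",
     "path": "m2/summary.pdf", "modifies": None},
    {"sequence": "0000", "key": "0000#b1", "operation": "new",
     "path": "m5/report-v1.pdf", "modifies": None},
    {"sequence": "0001", "key": "0001#b7", "operation": "replace",
     "path": "m5/report-v2.pdf", "modifies": "0000#b1"},
    {"sequence": "0002", "key": "0002#c3", "operation": "delete",
     "path": None, "modifies": "0000#a1"},
]


def current_view(leaves: list[dict]) -> dict[str, dict]:
    """Apply the recorded operations in sequence order and return what is still live."""
    live: dict[str, dict] = {}
    # sorted() is stable, so entries keep their original order within a sequence.
    for leaf in sorted(leaves, key=lambda r: r["sequence"]):
        op = leaf["operation"]
        if op in ("replace", "delete"):
            live.pop(leaf["modifies"], None)  # retire the earlier version
        if op in ("new", "replace", "append"):
            live[leaf["key"]] = leaf
    return live


def retired(leaves: list[dict]) -> list[dict]:
    """Entries that were added at some point but are absent from the final view."""
    live = current_view(leaves)
    return [r for r in leaves if r["path"] and r["key"] not in live]


if __name__ == "__main__":
    print("final:", [r["path"] for r in current_view(SAMPLE_LEAVES).values()])
    print("changed or removed:", [r["path"] for r in retired(SAMPLE_LEAVES)])
```

The two functions correspond to the two review scenarios described above: `current_view` for a case concerned only with what was approved, and `retired` for a case concerned with what changed along the way.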

Right, I’m going to kick it off to my colleague, John Wilson. I probably will jump in and cut him off a few times as well, because that’s what I do, then we’re going to talk more about information governance for these matters that have an incredibly long lifecycle, like legal hold and just preparing to respond to a Paragraph IV notice as more of a large pharmaceutical organization. 

John Wilson

Thanks, Mike. So, as Mike just said, there is a significant timeline involved with these projects, and the other side of the coin is you have a short time fuse for actually responding to requests and doing the appropriate activities. So, those two things are fighting each other because you’ve got this long history of information that you’ve got to deal with, and so, as soon as you receive the Paragraph IV acknowledgment letter, you should definitely trigger your legal hold process. There are very short timeframes for receiving and acknowledging that letter, and the opposing side typically has 45 days to decide if they’re going to sue or get involved.

So, again, short timeframes, a lot of data, and data that spans a lot of different systems because you’re talking about a lot of historical information. The pharmaceutical companies need to be prepared to challenge all their generic manufacturers ahead of the patent expirations, if that is their prerogative, because if you wait until it’s filed, you’re going to have a hard time getting it all together in that short order. With the INDs and the NDAs, again, you have 20 years on the patent, and the time since the original work was done and the IND and the NDA were filed can be over 15 years. You’ve got to deal with paper documents, and you’ve got to deal with lab notebooks and digital documents across a lot of different spectrums. A lot of the information may not even be documents. A lot of it may be logging data from your clinical trials that’s in a database system, and lab notebooks that are actual physical notebooks, and they’re very fragile and you can have hundreds and hundreds of them. So, how do you identify them and find them? Where are they located? How do you get them all brought into your legal hold? There are a lot of challenges around that.

So, be prepared. Preparedness is certainly the key here. Also, because you’re talking about a lot of disparate data types, how do you parse all that properly into a review so that you can actually find the information you need and action your review? You’ve got to do a lot of preparation, and you’ve got to plan out and create a data map. There are a lot of historical data systems involved here, typically, so you’ve got to really understand your fact timeline in relation to your data maps. So, lab notebooks, how were they kept 15-20 years ago, how are they kept today? Clinical trials, how is that data stored? Is it in a database? Is it in log sheets or is it in a ticker tape that’s been clipped and put into the lab notebooks? Understanding all of those different aspects is why the timeline becomes really important. You’ve got to be able to tie that whole timeline back to all the different data sources at the relevant timeframes.
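Editor’s Note: Tying a fact timeline back to data sources at the relevant timeframes is, at its simplest, a date-range lookup. The sketch below shows one way to model it. Everything here is hypothetical: the class names, the source names, and all of the dates are invented for illustration and do not describe any real matter or any HaystackID system.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class DataSource:
    name: str
    medium: str        # e.g. "paper", "dms", "database"
    in_use_from: date
    in_use_to: date


@dataclass(frozen=True)
class Milestone:
    label: str
    when: date


# Invented sources and milestones for illustration only.
SOURCES = [
    DataSource("Paper lab notebooks", "paper", date(2000, 1, 1), date(2009, 12, 31)),
    DataSource("Legacy DMS", "dms", date(2004, 1, 1), date(2012, 12, 31)),
    DataSource("Clinical trial database", "database", date(2008, 1, 1), date(2020, 12, 31)),
]
MILESTONES = [
    Milestone("IND filed", date(2005, 3, 1)),
    Milestone("NDA filed", date(2011, 6, 15)),
]


def sources_for(milestone: Milestone, sources: list[DataSource]) -> list[str]:
    """Names of the sources that were in use on the milestone date."""
    return [s.name for s in sources if s.in_use_from <= milestone.when <= s.in_use_to]


if __name__ == "__main__":
    for m in MILESTONES:
        print(m.label, m.when.isoformat(), "->", sources_for(m, SOURCES))
```

In practice a milestone usually implicates a window of time rather than a single day, so a fuller version would compare date ranges on both sides; the single-date lookup is kept here for clarity.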

So, always assume you’re going to have a mix of paper and digital when you’re dealing with these requests, because so much of the data is so much older and the timelines go far back. It’s really important that you identify who your key players in the drug development are and the key milestones within the timeline, because those are your benchmark points through your process: when did you go to clinical trials? When did you file your IND? When did you file your NDA? All of those key milestones are going to be really important, because you may have a lot of key people you have to deal with who may no longer be around, because these things happened 15 or 20 years ago. So it’s about understanding who those individuals are, who the inventors are, what files they may have, how you’re going to track those, and how you’re going to get those produced for your requests.

Also, in a lot of these matters, a smaller pharmaceutical company may have gone out and used five, six, 10 other companies that were supporting distribution or packaging, all sorts of different aspects relative to that pharmaceutical, so how are you going to get the information from those companies. What if they don’t exist anymore? Do you have retention of your own information around it? There are a lot of moving parts. Really, that fact timeline data map becomes really critical to make sure that you’ve addressed all of that. 

Then there are the lab notebooks. Not only are they, a lot of times, paper, they can be very fragile. You have a lot of information. Sometimes it’s old logs off thermal printers that have been cut out and pasted into the lab notebooks. Sometimes those lab notebooks are on rice paper and very thin and fragile, so you need to understand how those are all going to be handled, that they have to be handled with care, how you’re going to get them, and how you’re going to get them all scanned. It can be very challenging to actually scan a lot of that content.

Michael Sarlo

Let me actually say one thing too, which is that some organizations will not let those lab notebooks out of their sight. They’re considered the absolute crown jewels, like [hyperbaric states], and big pharmaceutical companies have a strong line and track on this stuff, so they are managing it. So if you’re a third party, you’re a law firm, you’re a vendor, you may be under some heavy constraints as it relates to getting access to those lab notebooks, scanning them, or even taking photos. As John said, usually they’re very old. Then there is actually having to track down, in some cases, people who kept their own notes, and these can be dead people, given how long these go on for.

Just something to keep in mind there. Go ahead, John. 

John Wilson

Then the last part is document management systems. Pharmaceutical and health sciences companies have used document management systems for a long time. A lot of those document management systems are very dated. Some of them have been updated, but you may have to span five different document management systems, because the information may be across all of them, and you need to understand how each specific system functions, how you’re going to get the data, and how you’re going to correlate the data and load it into a review. Very frequently, they’re not typical data repositories. They are very frequently specialized systems that house all that data.

So, really just driving home the last point, collection planning becomes very critical to support these investigations, and you can wind up with all sorts of data types. A lot of them don’t get thought about until it’s too late to properly address them, like voicemail and faxes and things of that nature, or items that are in other document management or document control systems within the organization that are more data-driven, where it becomes much harder to find your relevant sources in a typical review-type format.

Also, backup tapes, do you have to go into archives? Do you have to get into backup tapes for some of the data, because that may be the only place some of it’s stored, or offsite storage facilities like an Iron Mountain or places of that nature where you’ve got to go into a warehouse with 8,000 boxes and find the six boxes for this particular product. How are you going to get those documents? How are you going to get them scanned? How are you going to get them identified when you’ve got a 45-day window and you’ve got 8,000 boxes that you need six of? All of those things have to go into the larger-scale collection plan and data map to help support these investigations. 

Really, the last comment is, keeping in mind, a lot of these investigations are global. You have a company that was doing R&D here in the US and they might have been doing manufacturing in India or Norway or Germany, a lot of different places. They may have been doing clinical trials somewhere else, so you’ve got to take into consideration all these global locations and global access points for all of this data. 

From there, I will turn it back over to Mike and the rest of the team. 

Michael Sarlo

Thanks, John. Really, the name of the game here is don’t get caught unawares. Just have a strong sense of where data is, what relates to drugs that might be expiring. HaystackID with our information governance offering does a lot of work in this domain to help organizations organize all of their fringe data and really building out a data retrieval plan, when we start to get historical documents, like [inaudible] long timelines that we’re preparing for. 

I’m going to kick it off to Vazantha Meyers, Vee for short, who is going to talk about all of the document review magic that we bring to every [support opportunity].

Vazantha Meyers

Thank you, Mike. So, let me set the stage before I go into the next few slides. Mike and Albert and John have described the process, and all of that information from the timeline to the terms that are being used, to what were the goals that were being accomplished, the data sources and the milestones, and the key players have to be conveyed to a team so that they can then take that data and categorize it. 

So, all of what they’ve talked about has to be taught to the team, and usually that’s done through training sessions and a protocol that the reviewers can reference in order to make decisions on a document. The other thing that we’re asking reviewers to do is understand the data. What documents are they looking at and what’s in the document?

So, one of the things that we understand about these particular Hatch-Waxman reviews and pharmaceutical reviews, in general, is that they contain a lot of medical terms and abbreviations that’s difficult for the industry. A lot of the drugs have long names, the protocols have long names, the projects have long names, and in order to efficiently communicate about those drugs, processes, and protocols, internally/externally, medical terms are used, and abbreviations are used across the board, internally and externally. 

One of the things that is important for a reviewer to do, in addition to understanding the process in terms of the goal of the process and the timeline and the key players, is understanding those terms in the documents. They cannot make a coding decision if they don’t understand the words that are coming out of the mouth, to quote a movie phrase. So, they have to understand the words on the paper, and so we want to make sure that that is being taught to the reviewers, and we also want to make sure that we’re being accountable for this timeframe and that we can do this teaching. So, we want to streamline that process. 

One of the ways that we can do that is through a few of the things I’m going to talk about on this next slide. So, one of the things that we do is make sure that, in addition to the review protocol, the bible of the review (this is how the drug was developed, here are the timelines, the key players, the milestones, all of the information you know about the particular process in which the drug was developed), we also share background information with the reviewers. That background information will be the terminology, the key phrases, the abbreviations, the project code names, etc. that we know about. A lot of times, that is shared information that comes from the client or the counsel, and it’s given to the reviewer. The other thing that we can do is take that shared resource, meaning the background information that’s available to the review team, and create a library. So, that library is everything that we’ve talked about in terms of terms, abbreviations, protocol names, project names, code names, etc., and then we make that available not just on the particular project, but across several reviews for that same client, so it’s a library of terms that the reviewers have access to for every project that they work on for pharmaceutical clients, including these Hatch-Waxman reviews that have very truncated timelines.

The other thing that we do in terms of making sure that we’re taking advantage of best knowledge is that we create client teams. So, the same way that we have taken shared resources and created a library that can go across particular reviews for pharmaceutical clients, we take client teams, with review managers, key reviewers, and first-level reviewers who have worked with the client, and we put them on projects with the same clients, so that they can take the knowledge that they gained on the first few projects they worked on and carry it through to the latest project. They’re continuously building their information and sharing that information, which means team members go across projects, sometimes even with new counsel. And that’s a way of sharing information, sort of a library of review teams, for lack of a better way of phrasing that.

The other thing that is available is public sources. There are public sources out there that have information about medical terms and abbreviations that are sort of common in the industry. I will also encourage folks if they’re using that [inaudible] the one thing that we found, and this is true for every single thing that’s listed on this slide, is that these are living organisms, meaning you have background information, you have these libraries, and you have this vested team, but they are always learning new information as they’re going through the documents, and then they’re feeding that information back into the resources. Meaning, if I have some background information that has protocol names or medical terms or abbreviations and I go through the documents and I learn a few, I want to make sure I’m giving that information back to whoever created that shared resource, so they can update it. The same with the library: if I’m updating the shared resource, I want to make sure I’m updating the library. And the client team, and we’re going to talk about this a little bit later, client teams are always learning more information, and they need to share that amongst themselves and also take that into the next review. The same with public resources: if you find that there’s something lacking in a public resource, please inform them and build that resource, because it benefits all of us.
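
As a rough illustration of a shared term library that reviewers feed back into, here is a minimal Python sketch. The `TermLibrary` class and its method names are hypothetical and are not a description of any HaystackID tool; the client name is a placeholder, and the abbreviations are common industry examples.

```python
from collections import defaultdict


class TermLibrary:
    """Hypothetical shared glossary, kept per client and reused across reviews."""

    def __init__(self):
        # client -> {TERM (upper-cased) -> meaning}
        self._terms = defaultdict(dict)

    def add(self, client, term, meaning):
        """Record a term; return True if it was new or changed the stored meaning."""
        key = term.strip().upper()
        changed = self._terms[client].get(key) != meaning
        self._terms[client][key] = meaning
        return changed

    def lookup(self, client, term):
        """Return the stored meaning, or None if the term is not yet known."""
        return self._terms[client].get(term.strip().upper())

    def merge_issue_log(self, client, entries):
        """Feed terms learned during a review back in; return the ones that changed."""
        return [term for term, meaning in entries.items() if self.add(client, term, meaning)]


library = TermLibrary()
library.add("Client A", "API", "active pharmaceutical ingredient")

# Terms a review team logged while going through documents (illustrative).
learned = {
    "API": "active pharmaceutical ingredient",
    "PK": "pharmacokinetics",
}
print(library.merge_issue_log("Client A", learned))
```

Returning only the terms that changed makes it easy to tell the wider team what is new after each review.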

The other thing that happens in terms of a review, and I know you guys are familiar with this in terms of the day-to-day [inaudible] and communication with the review is that reviewers have a lot of questions, or they’re finding information as they go through the documents and we’ve talked about giving that back to those resources, but also we want to make sure that the reviewers are able to ask about that information in real-time. So, we use a chat room, and this is a secure chat room, but it allows the reviewers to ask questions to their whole team in real-time, meaning I have this information, I think this might be an acronym that will affect all of what we’re reviewing, can I get some clarification, can I inform you guys of this information in real-time. Everyone sees it, the QC reviewers, and the project managers, and the team leads can opine on that, they can escalate those questions, and get information back to the team in real-time. It’s really important, especially for fast-moving reviews, that reviewers are able to ask questions and get answers in real-time or give information and validate their understanding in real-time. And so, the chat room allows us to do that. 

And so, now having said that, all the information that’s pertinent that needs to go to the library, go to these other shared resources, or even to these public resources, it sort of needs to be documented and it needs to be [inaudible] issue logs documentation of anything that we think is impactful to the review. All of the terminology, the medical terms, the validations, the understandings, the clarifications that impact how reviewers categorize documents. We then do categorize that information in the issue log, particular to that review, and then we share that information and update our resources, these living things, these living resources I talked about, after the fact.

So, I’ve talked about… before I get into the next few slides, I’ve talked about these client teams, so one of the things that’s important for all reviews, but particularly reviews that have this need to understand the background information, is that we select the team appropriately. So, I’m going to talk a little bit about the selection of teams, generally, and then specifically for these particular types of review.

So, one of the things that we have at HaystackID is our proprietary ReviewRight software, which gives us the ability to gather a ton of information about reviewers and then match each reviewer to the project that is best suited for them, or at least match the project to the reviewers that are best suited for it. We do this through a qualification process, an identification process, a training process, and then a ratings and certification process.

In terms of qualification, we test the reviewers and we give them a 15-part test that goes across the review, issue coding, [prevalence review], and what we’re looking for is to see which is the best reviewer, who is going to sit up in this top right quadrant in terms of speed and accuracy and recall, who are the best reviewers technically. That doesn’t tell us if they’re better on this particular project, but it does tell us who has the best skills in terms of a reviewer. So, that’s the first assessment that we make on a reviewer.

The second thing that we’re doing is looking to see what their background qualifications are, so we ask them questions about what reviews they’ve worked on, how many reviews they’ve worked on, and what foreign language skills they have, whether fluent, reading, or native, etc. We want to know what practice areas they’ve worked in, what tools they’ve worked on, and in particular what their scientific and educational background is. What have they worked on outside of the legal field? We collect all of that information during the onboarding process. We want to be sure that we are selecting reviewers who are suitable for these Hatch-Waxman reviews. The list that I have here, which you can see on the slide, is what we are looking for in reviewers, and it is a list in ranked order.

First, we want to see what reviewers – if we’re selecting them for this particular type of review – do you have experience on Hatch-Waxman reviews. Do you have experience with this particular pharmaceutical client? Have you worked on projects with them before and are you familiar with their data, terms, and terminologies that they use in their data and communicating? Do you have experience in this industry? So, maybe you haven’t worked with this client specifically, but have you worked with other pharmaceutical clients similar to the one that we’re staffing for. Do you have patent experience? Do you understand the process, the timeline etc, the terminologies used and even that process? Then lastly, do you have at least a science or a chemistry background? 

A lot of times, reviewers will have all of these or some of these, but this is for me the [inaudible], and this is what we’re looking for and we collect that information during the onboarding process, so that we can match the reviewer to the project at hand when we’re staffing, which is particularly important, because like we talked about earlier, it’s very specific in terms of the terminology, the abbreviations, the processes being used and we’re assessing. We want to make sure that reviewers can look at a document and understand what they’re looking at. 
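
The ranked criteria just described could be sketched as a simple sort. This is a minimal Python illustration that assumes strict priority ordering, where a higher-ranked criterion always outweighs everything below it; that weighting, the `Reviewer` fields, and the sample names are assumptions for illustration only.

```python
from dataclasses import dataclass


@dataclass
class Reviewer:
    """Hypothetical onboarding record; fields follow the ranked list in order."""
    name: str
    hatch_waxman: bool = False  # prior Hatch-Waxman reviews
    client: bool = False        # prior projects for this client
    pharma: bool = False        # prior work for similar pharmaceutical clients
    patent: bool = False        # patent experience
    science: bool = False       # science or chemistry background


def rank(reviewers):
    """Order reviewers so that higher-priority criteria dominate lower ones."""
    def key(r):
        return (r.hatch_waxman, r.client, r.pharma, r.patent, r.science)
    # Tuples compare element by element, so the first criterion outweighs the rest.
    return sorted(reviewers, key=key, reverse=True)


pool = [
    Reviewer("Reviewer A", pharma=True, patent=True, science=True),
    Reviewer("Reviewer B", hatch_waxman=True),
    Reviewer("Reviewer C", client=True, science=True),
]
print([r.name for r in rank(pool)])
```

A weighted score would be a reasonable alternative if several lower-ranked qualifications together should be able to outweigh one higher-ranked qualification.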

And then I’m not going to go through this slide in depth, but we do a background check. Security is also very key. And we have some security information about our environment, so since we’re talking about reviewers, we do a background check. We do a general background check. We look to make sure their license is verified and we do a conflict of interest screening, so we check whether or not they have a conflict of interest based on the employment information they’ve given us, and we also ask the reviewer to attest that they don’t have a conflict based on the parties of a particular project that we’re working on, and that’s for every project that we work on.

So, the other goal… and I have five minutes, so I’m going to go pretty fast, so that I won’t hold you guys up. But the overall goal for a managed review project is to get through the documents in a timely manner; efficiently, meaning you’re not going to cost the client any unnecessary money; accurately, so you won’t make a mistake; and then defensibly, so that you’re doing it according to prescribed standards.

One of the things that we do is we want to optimize the workflow. We want to reduce the review count and then we want to optimize the workflow. Reducing the review count is interesting when it comes to Hatch-Waxman reviews, because there’s targeted pools, so we’re looking at rich data sets. There’s not a whole lot to cull [inaudible], but typically they have, and this is true for a lot of the pharmaceutical projects, a higher responsive rate. So, they’re targeted pools, we understand what drug we’re looking at, this isn’t a data dump. And so, we have a higher review rate and a lower cull rate, and we want to go through the process and make sure that we’re optimizing the workflow.

So, how do you do that? It’s typical for a lot of reviews, so you want to make sure that you’re analyzing your search terms and that you are testing them, and that can be done pre-linear review or pre-analytical review, whichever one you’re using, and then there’s this decision on whether or not to use analytical review or linear review.

Now, I found that with pharmaceutical clients, it’s a mixed set of data, and that data works well with certain workflows. For instance, spreadsheets and image files don’t really work that well with TAR, whether 2.0 or 1.0, so continuous active learning or predictive coding. But the other documents, like emails and regular Word documents, do work well with TAR. What we’ve done for other clients is we’ve split that data set, so the data that works well with TAR goes through that process, and then we pull out the data that doesn’t work well with TAR and put it through more of a linear process. The idea is that we’re optimizing the workflow for the data that we have, as opposed to making a decision for the overall project, so we’re being adaptive, and that’s what you kind of are going to have to do with the data that we’re getting. We use custom de-duping, and we make sure that we are culling out non-responsive documents as we identify them, either by similar documents or filenames, or we know that we have a newsletter that’s coming in and we want to make sure we cull that out, even though it wasn’t culled out at the search term level. We want to make sure we’re doing single instance review of search term hits, and we’re using propagation. Particularly with redaction, most folks who have been involved with managed review know that redaction can slow down the review and increase costs, so we want to make sure that we’re using the methodologies available to reduce that cost and clean up the review, and propagation happens to be one of them, as well as negotiating the use of example redaction documents.
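
The split workflow described here can be sketched as a simple routing step. This minimal Python example assumes file extension is a good enough proxy for whether a document suits TAR; the `LINEAR_EXTENSIONS` set and the `route` function are illustrative assumptions, and a real matter would likely also look at extracted text volume.

```python
from pathlib import PurePath

# Assumption: extensions treated as poor candidates for text-based TAR.
LINEAR_EXTENSIONS = {".xls", ".xlsx", ".csv", ".jpg", ".jpeg", ".png", ".tif", ".tiff"}


def route(filenames):
    """Split documents into a TAR queue and a linear-review queue by extension."""
    queues = {"tar": [], "linear": []}
    for name in filenames:
        ext = PurePath(name).suffix.lower()
        queues["linear" if ext in LINEAR_EXTENSIONS else "tar"].append(name)
    return queues


# Illustrative filenames only.
batch = ["stability_summary.msg", "assay_results.XLSX", "protocol_v3.docx", "notebook_p12.tif"]
print(route(batch))
```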

Then there’s quality control, which is key for every review that you’re working on. So, I’m going to go through this, again, pretty quickly. We have a gauge analysis, and this is similar to what we’ve talked about in terms of testing reviewers as they come into our system. We test them as they come onto a review, and this allows us to give the reviewers the same set of documents across the board. Say we have 10 reviewers; all 10 reviewers are looking at the same 50 documents. Outside counsel is looking at the same 50 documents, as is someone in-house who is managing the review and has been a part of the QC process. They can look at those same documents too. Everyone is coding those documents at the same time, and what that allows us to do is test understanding and instruction.

We give the documents back [inaudible] for the reviewer and we get information about how well they do in terms of coding the documents and how well we do in terms of instructing them about how to code the documents. The solution to any low score is retraining, rewriting the protocol, or replacing reviewers, etc. So, we want to know that information upfront because it sets us off at the right pace, with everyone in the same place on the review. How it circles back to these particular reviews is that we’re on a fast timeline and we want to make sure that we’re catching any issues upfront. So, it might be a day that you have to spend on this gauge analysis, but it saves you so much time and additional QC down the road, because you’re making sure everyone is on the same page and that all of the instructions that should be given to the team are given to the team, so it’s a really good [inaudible] go forward.
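
One way to picture how a gauge analysis is scored is the minimal Python sketch below. It assumes one person’s coding (for example, outside counsel’s) is treated as the reference set; that assumption, the function names, and the sample codes are all hypothetical.

```python
def gauge_scores(reference, reviewer_codes):
    """Return each reviewer's share of documents coded the same as the reference.

    reference: {doc_id: code}
    reviewer_codes: {reviewer: {doc_id: code}}
    """
    scores = {}
    for reviewer, codes in reviewer_codes.items():
        matches = sum(1 for doc, code in reference.items() if codes.get(doc) == code)
        scores[reviewer] = matches / len(reference)
    return scores


def contested_documents(reference, reviewer_codes, threshold=0.5):
    """Documents most reviewers got wrong, which may point to unclear instructions."""
    flagged = []
    for doc, code in reference.items():
        wrong = sum(1 for codes in reviewer_codes.values() if codes.get(doc) != code)
        if wrong / len(reviewer_codes) > threshold:
            flagged.append(doc)
    return flagged


# Illustrative coding: "R" = responsive, "NR" = not responsive.
reference = {"D1": "R", "D2": "NR", "D3": "R", "D4": "NR"}
team = {
    "reviewer_1": {"D1": "R", "D2": "NR", "D3": "R", "D4": "NR"},
    "reviewer_2": {"D1": "R", "D2": "NR", "D3": "NR", "D4": "NR"},
    "reviewer_3": {"D1": "R", "D2": "NR", "D3": "NR", "D4": "R"},
}
print(gauge_scores(reference, team))
print(contested_documents(reference, team))
```

A low score for one reviewer suggests retraining or replacing that reviewer, while a document that most of the team miscoded suggests the protocol itself needs rewriting.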

We do traditional sampling and targeted QC. Sampling is looking at a percentage of what the reviewers have coded, looking for mistakes, and then the targeted QC would be [inaudible] in the data set and cleaning them up, and that should be a typical part of most reviews.
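
Traditional sampling can be sketched in a few lines of Python. This is a minimal illustration that assumes a flat sampling rate per reviewer; the `qc_sample` function, the default rate, and the document IDs are hypothetical.

```python
import math
import random


def qc_sample(coded_by_reviewer, rate=0.1, seed=None):
    """Draw a random sample of each reviewer's coded documents for QC."""
    rng = random.Random(seed)
    sample = {}
    for reviewer, docs in coded_by_reviewer.items():
        # Round up so that every reviewer with any documents gets sampled.
        k = min(len(docs), math.ceil(len(docs) * rate))
        sample[reviewer] = rng.sample(list(docs), k)
    return sample


# Illustrative document IDs only.
coded = {
    "reviewer_1": [f"A{i}" for i in range(40)],
    "reviewer_2": [f"B{i}" for i in range(5)],
}
picked = qc_sample(coded, rate=0.25, seed=7)
print({reviewer: len(docs) for reviewer, docs in picked.items()})
```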

The other thing that we do, which is a quality control tool, is event handlers, so event handlers prevent reviewers from making obvious mistakes. For instance, if I know I have a responsive document and every responsive document has to have a privilege coding or issue coding or a confidentiality coding, the event handler will trigger if the reviewer tries to save that document without making some of the necessary coding. So, if it has a responsive coding, the event handler will not let the reviewer save that document until they make a privilege call or the confidentiality call or issue call. Event handlers eliminate mistakes that we would otherwise have to find later. However, for all of the systems that we can’t control, cleaning up the bottom is really important, so we want to make sure that we’re doing clean-ups and [inaudible] and conformity and consistency searches. One of the tools talked about already, if you know you have a mistake that you found with sampling or someone has told you about a mistake that you are aware of, you want to make sure that you’re going through and finding those mistakes as [inaudible] the data set so that mistake doesn’t exist. We also want to make sure that the documents are coded consistently, and consistency in redactions and in privilege coding is very important, so you can check that in several ways. You can do hash searches, you can look for near dupes, and we can look for similar text and similar filenames, to make sure to clean up those documents.
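
The two checks described here, blocking a save when required coding is missing and searching for duplicates that were coded inconsistently, can be sketched in Python as follows. This is not the API of any review platform: the field names, `missing_fields`, and `inconsistent_groups` are hypothetical, and hashing the text only catches exact duplicates, so near dupes would need a separate similarity technique.

```python
import hashlib
from collections import defaultdict

# Assumption: fields a responsive document must have before it can be saved.
REQUIRED_WHEN_RESPONSIVE = ("privilege", "confidentiality", "issue")


def missing_fields(coding):
    """Event-handler-style check: list required fields left blank on save."""
    if coding.get("responsive") != "yes":
        return []
    return [field for field in REQUIRED_WHEN_RESPONSIVE if not coding.get(field)]


def inconsistent_groups(documents):
    """Group documents by a hash of their text and report groups coded differently.

    documents: iterable of (doc_id, text, privilege_code) tuples.
    """
    groups = defaultdict(list)
    for doc_id, text, code in documents:
        digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
        groups[digest].append((doc_id, code))
    return [
        sorted(doc_id for doc_id, _ in members)
        for members in groups.values()
        if len({code for _, code in members}) > 1
    ]


# A save attempt that should be blocked until the blanks are filled in.
print(missing_fields({"responsive": "yes", "privilege": "not privileged"}))

# Two exact duplicates with different privilege calls (illustrative data).
docs = [
    ("D1", "Draft response to the agency", "privileged"),
    ("D2", "Draft response to the agency", "not privileged"),
    ("D3", "Batch record summary", "not privileged"),
]
print(inconsistent_groups(docs))
```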

This has to be proactive and continuous. So, proactive in that you’re making sure that you are aware of mistakes that can happen with the event handlers, you’re looking at making sure everyone is on the same page in terms of the coding, and then you’re continuously looking for mistakes and [inaudible] to process. It has to happen in real-time, because we just don’t have time to clean it up after the review is over. And so, it’s really important on all reviews, it’s particularly important [inaudible] that we process that because we just don’t have the time to go back and fix it later. It’s a truncated timeline. 

With that, I apologize for breezing through these slides, but if you have any questions, please let us know. I will turn this back over to Mike Sarlo. 

Michael Sarlo

Thanks for that, Vee, really appreciate it, and I know all of our clients do as well. We have a question here and thank you all for joining. Here we go. “What is the best way to collect and especially produce the regulatory data? I assume this means the eCTD files, the NDAs, and those things. This has caused some issues in the past with respect to pages and pages of blank sheets when producing these types of documents.”

First, that would be to understand if there’s an active eCTD management system behind the organization’s firewall or if they’ve used a cloud solution, if it is a newer matter where maybe the whole thing is digital. At that point, you would want to handle it just like any kind of unknown repository. We would test and triage it and get a repeatable outcome as we export data out from it, and audit it to make sure it’s the way that we think it should be.

If these are just historical files that are sitting on a CD somewhere, that can be a process where we can scan for blank pages and things like that using some custom scripts based on pixel content or file size and look for those, but I would say that, typically speaking, you’re going to want to go back to whoever gave you the data and understand where it came from and how it was gathered, or bring in an expert company like HaystackID to work with you. 
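
As a rough idea of what a pixel-content check for blank pages might look like, here is a minimal Python sketch. It assumes each page has already been rendered to 8-bit grayscale pixel values by some imaging step that is not shown; the function names and both thresholds are hypothetical and would need tuning against real scans.

```python
def is_blank_page(pixels, white_level=245, max_ink_ratio=0.002):
    """Treat a page as blank when almost no pixel is darker than white_level.

    pixels: flat sequence of 8-bit grayscale values (0 = black, 255 = white).
    """
    if not pixels:
        return True
    dark = sum(1 for p in pixels if p < white_level)
    return dark / len(pixels) <= max_ink_ratio


def blank_page_numbers(pages, **kwargs):
    """Return the 1-based numbers of pages that look blank."""
    return [n for n, px in enumerate(pages, start=1) if is_blank_page(px, **kwargs)]


# Synthetic pages: fully white, 10% black text, and white with light scanner speckle.
pages = [
    [255] * 10000,
    [255] * 9000 + [0] * 1000,
    [255] * 9990 + [200] * 10,
]
print(blank_page_numbers(pages))
```

The small tolerance for dark pixels is there so that scanner speckle on an otherwise empty sheet does not cause the page to be treated as content.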

It doesn’t cost a lot to do this right, but there can be so many systems involved and so many points of handoff, so to speak, from an eCTD becoming relevant to a matter to somebody making a call internally to somebody else and yada-yada-yada, that it’s important to really audit that process so that you know that you have everything.

Then as far as producing it, it can then be uploaded through our tool in Relativity, where it can be acted on and tested and converted and thrown out like any regular production document. I’ve seen organizations try to produce the entire file. We’ve had them come to us with these types of issues.

So, once we get the eCTD, handling the production is really easy. 

Any other questions? 

Great, well, thank you all for joining us today. We look forward to having you guys every month. We see a lot of the same names and faces, so we really appreciate the support. I will hand it back to Rob Robinson to close out. Any questions that pop up, please feel free to email us. You have access to these slides. We also post these on our learning section on our website. 

Go ahead, Rob. Thank you, guys. 


Rob Robinson

Thank you so much, Mike, we appreciate it. Thank you, John, Vee, and Albert for the excellent information and insight. We also want to thank each of you who took time out of your schedule to attend today. We know how valuable that time is, and we don’t take for granted you sharing it with us, so we appreciate that.

Additionally, we hope you have an opportunity to attend our next monthly webcast, which is scheduled for Wednesday, 14 October, at 12 p.m. Eastern Time. It will be on the topic of the Dynamics of Antitrust Investigations, and that presentation, which will again be led by Michael, will include some recent updates on FTC and DOJ practices and procedures regarding Second Requests, so please take the opportunity to attend. You can find a detailed description of that on our website and also register there.

Again, thank you for attending. Have a great rest of the day and this formally concludes today’s webcast. 


HAYSTACKID – Hatch-Waxman and eDiscovery Webinar – 091620- Final

Learn More About HaystackID Electronic Common Technical Document (eCTD) Support

HaystackID eCTD Compliance Review Module – 091620 Update