Artificial intelligence (AI) is tackling one of the courtroom’s most demanding tasks: content analysis. Cost-effectively managing and analyzing large volumes of unstructured evidentiary media is now a reality, and what once took hundreds, if not thousands, of human hours can now be accomplished with machine learning.
In the past, human transcription offered major benefits, chief among them quality assurance. Today, machine transcription technologies are opening new business opportunities for everyone from broadcasters to government officials to legal professionals who can benefit from textual access to online video and audio content. The power now available to legal offices is clearly visible in the recent trend of high-dollar settlements involving the Telephone Consumer Protection Act (TCPA).
In just four years, the number of TCPA lawsuits in federal court alone swelled from 1,000 to 5,000. Because the TCPA provides for damages of up to $1,500 per violation with no cap, the potential exposure in a related class action can swiftly escalate into millions of dollars in judgments and, in turn, litigation expenses, which stem largely from the review of thousands of hours of evidentiary footage.
In a recent case, defendants and their agents were alleged to have violated the TCPA by contacting customers using automated telephone dialing systems and automated or pre-recorded voice messages without obtaining prior consent. The defense engaged a Technology Assisted Review (TAR) partner to analyze the 33,000 hours of audio potentially relevant to the case.
Reviewing those hours of audio manually was projected to be extremely expensive, with estimates ranging from $4.9 million to $16.5 million or more. To rein in these costs, the defense deployed artificial intelligence engines such as transcription and audio/video (A/V) fingerprinting.
Transcription engines convert spoken audio and video recordings into readable text. They are built and trained to recognize different languages, dialects, and topics. Machine transcription is the most established class of cognitive engine, with a relatively long history and significant use across industries such as call centers and legal discovery.
Word-for-word transcription has evolved into Natural Language Processing (NLP), which adds contextual relevance to increase accuracy; accuracy is expected to approach 100% in the next few years. Machine transcription of multiple languages, dialects, and accents has made remarkable strides in recent years, capped by Microsoft’s announcement late last year that its technology can transcribe speech better than a human can, as reported by Business Insider.
Manual documentation takes a fair amount of time and is vulnerable to subjective interpretation of the discussion and next steps. Furthermore, notes are often stored locally in a decentralized fashion, making them inaccessible to the rest of the organization. Recent advances have made recording and documenting conversations, with full context, far more accurate, efficient, and comprehensive.
Voice signal processing and acoustic technologies reduce transcription errors, making machine transcription viable for capturing audio data. Advances in speaker diarization and NLP help identify speakers and capture the essential context of a call, yielding more accurate information and saving significant time and effort.
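As a loose illustration of what diarized output enables, consider a transcript whose segments are already labeled by speaker. The segment format below is hypothetical, not any particular vendor’s API; it simply shows how speaker labels let you ask who said what:

```python
# Hypothetical diarized transcript: (speaker, start_sec, end_sec, text) tuples.
segments = [
    ("agent",  0.0,  4.2, "Thank you for calling, this call may be recorded."),
    ("caller", 4.2,  7.9, "I never agreed to receive automated calls."),
    ("agent",  7.9, 12.5, "Let me check whether we have consent on file."),
]

def group_by_speaker(segments):
    """Collect each speaker's utterances into one searchable string."""
    grouped = {}
    for speaker, _start, _end, text in segments:
        grouped.setdefault(speaker, []).append(text)
    return {spk: " ".join(lines) for spk, lines in grouped.items()}

def speakers_mentioning(segments, keyword):
    """Return the speakers whose utterances contain the keyword."""
    grouped = group_by_speaker(segments)
    return sorted(s for s, txt in grouped.items() if keyword.lower() in txt.lower())

print(speakers_mentioning(segments, "consent"))  # only the agent mentions consent
```

In a TCPA matter, this kind of query makes it possible to isolate, say, every call in which an agent discussed consent, without listening to the full recording.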
Companies can capture and store the source audio file from conference-call systems or local microphones, and associate the resulting full transcript in a software platform that enables textual search, similar to the process that most public companies use for earnings calls and investor presentations.
The effectiveness of call documentation can be significantly improved using technologies like transcription, speaker separation, and NLP, and should be adopted by any organization seeking to improve on internal record-keeping and communication.
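The textual-search step described above can be sketched with a minimal inverted index over stored transcripts. The call IDs, timestamps, and text here are invented for illustration; a production system would sit on top of a real search platform, but the mapping from word to (call, timestamp) is the core idea:

```python
from collections import defaultdict

# Hypothetical transcript store keyed by call ID; each entry pairs a timestamp
# (seconds into the recording) with a line of transcribed text.
transcripts = {
    "call-001": [(12.0, "we obtained prior express consent"),
                 (47.5, "the dialer placed the call automatically")],
    "call-002": [(3.2, "please remove me from your list")],
}

def build_index(transcripts):
    """Map each word to the (call_id, timestamp) positions where it occurs."""
    index = defaultdict(list)
    for call_id, entries in transcripts.items():
        for ts, text in entries:
            for word in text.lower().split():
                index[word].append((call_id, ts))
    return index

index = build_index(transcripts)
print(index["consent"])  # jump straight to the moment consent is discussed
```

Because each hit carries a timestamp, a reviewer can seek directly to the relevant moment in the source audio instead of scrubbing through the whole file.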
Similarly, audio/video fingerprinting engines generate a condensed digital summary, deterministically derived from a reference clip, that can be used to quickly locate similar items across multiple media files. One of the best-known examples of audio fingerprinting, also known as acoustic fingerprinting, is Shazam, which identifies songs. Fingerprinting can also be applied to video segments, for example to identify television commercials that run in unmeasured web video.
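The “condensed digital summary” idea can be sketched in a few lines. This is a toy, not a real acoustic fingerprinting algorithm (production systems such as Shazam’s hash spectral features, not raw samples): it quantizes a signal, hashes short windows of it, and compares fingerprints by set overlap.

```python
import hashlib

def fingerprint(samples, window=4):
    """Condense a signal into a set of window hashes (a toy fingerprint).
    Coarse quantization makes the hashes robust to small amplitude noise."""
    quantized = [round(s, 1) for s in samples]
    hashes = set()
    for i in range(len(quantized) - window + 1):
        chunk = ",".join(map(str, quantized[i:i + window]))
        hashes.add(hashlib.sha1(chunk.encode()).hexdigest()[:12])
    return hashes

def similarity(fp_a, fp_b):
    """Jaccard overlap between two fingerprints: 1.0 means identical content."""
    return len(fp_a & fp_b) / len(fp_a | fp_b)

reference = [0.11, 0.52, 0.33, 0.90, 0.41, 0.27]  # known reference clip
candidate = [0.12, 0.53, 0.31, 0.91, 0.42, 0.26]  # same clip, slight noise
unrelated = [0.80, 0.10, 0.65, 0.05, 0.70, 0.95]  # different material

print(similarity(fingerprint(reference), fingerprint(candidate)))  # high
print(similarity(fingerprint(reference), fingerprint(unrelated)))  # low
```

The point is the workflow: fingerprint one reference recording once, then scan thousands of files for matches without ever replaying the audio.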
Putting AI engines like those listed above to use drastically decreased legal review hours. The engines listened to and analyzed 33,000 hours of audio recordings and narrowed the relevant material to 140 hours, saving the defendant millions and reducing review hours by more than 99%.
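The 99% figure follows directly from the case numbers above:

```python
total_hours = 33_000     # audio potentially relevant to the case
reviewed_hours = 140     # hours remaining after AI triage

reduction = 1 - reviewed_hours / total_hours
print(f"{reduction:.1%}")  # roughly 99.6%, consistent with the 99% cited
```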
JPMorgan, as a separate example, is using AI to boost productivity. According to a recent article in Futurism, lawyers and loan officers spend a total of 360,000 hours each year on tasks such as interpreting commercial-loan agreements. With the help of AI, the company has managed to cut this time to mere seconds.
In the banking industry alone, compliance and regulation cost $270 billion a year and account for 10% of operating costs. Each case can take several hours to review, monopolizing employee productivity. AI can not only help weed out false positives; its ability to streamline the review process frees employees to focus on critical business initiatives.
As more legal agencies see AI’s time- and money-saving capabilities, it will become an increasingly valuable tool. Cognitive engines will more commonly augment compliance tasks, streamline processes, and analyze case document data quickly and efficiently so that legal professionals can spend their time on more critical work.