By some estimates, more than 10 billion emojis are sent every day across electronic messaging platforms. With the use of chat and mobile platforms only increasing, there is growing attention on the volume of emojis being created and their long-term impact. So, what do eDiscovery professionals need to know about these symbols and how they affect the discovery process?
What is the difference between emoticons and emojis?
Emojis are small cartoon images that are interpreted and supported at the discretion of each application developer. Emoticons are the predecessor of emojis. They are traditional textual representations of an image or emotion expressed by punctuation marks, letters and numbers and typically read sideways to create a pictorial display of emotion.
While emoticons may seem simpler than emojis, both continue to grow in number and variation, and some programs automatically render certain emoticons as emojis, transforming “:)” into “😊” even when the writer didn’t intend it.
What makes emojis complicated?
The Unicode Consortium, the standards body that makes it possible for software programs to recognize and display text characters uniformly, acknowledges 3,664 different emojis today. But that number includes multiple versions of the same image with variations in gender and skin tone. A hamburger will generally be universally recognized even though, in real life, a McDonald’s hamburger looks different from a Burger King hamburger; likewise, a hamburger emoji on an Apple device will look different from one on a Google device. Emojis are like fonts in that some, such as Arial, are broadly supported and look the same in multiple programs and across both Windows and Apple ecosystems, while others (Helvetica, for example) are not. Your beautiful presentation created on a Mac using Helvetica probably does not look the same when presented on a PC. Throw in print or browser considerations while struggling to maintain formatting, and you start to see why Arial is such a popular font.
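To see why one emoji can count as several, it helps to look at the underlying code points. The sketch below uses Python’s standard unicodedata module to show that a skin-tone variant is not a separate character at all, but the base emoji followed by a modifier code point:

```python
import unicodedata

# A skin-tone variant is not a separate character: it is the base emoji
# followed by a Fitzpatrick skin-tone modifier code point.
base = "\U0001F44D"              # 👍 thumbs up
modifier = "\U0001F3FD"          # medium skin tone modifier
variant = base + modifier        # 👍🏽 typically rendered as one glyph

print(unicodedata.name(base))      # THUMBS UP SIGN
print(unicodedata.name(modifier))  # EMOJI MODIFIER FITZPATRICK TYPE-4
print(len(variant))                # 2 -- two code points, one visible emoji
```

Searching or counting such a variant therefore depends on whether a tool treats it as one emoji or two code points.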
Unlike fonts, however, you have little to no control over the emoji set used and how it is supported. Every application chooses which set of emoji images it supports, with many creating their own subset of proprietary emojis. Further complicating the situation, the operating system of the device itself may also choose which emoji set it supports and displays, and that set is not always compatible with the one used by the application. Much work has been done by the Unicode Consortium to standardize and translate emojis across systems, but any friend group with a mix of iPhone and Android users has probably experienced the curious incident of the little blank rectangle “”, which can also contain a question mark, indicating the receiving device does not support the emoji sent.
In day-to-day use, the discrepancies among devices, operating systems and applications in how they depict emojis are a non-issue. But in discovery, there are numerous considerations when attempting to isolate and identify emojis in the data as it moves through the stages of the Electronic Discovery Reference Model (EDRM).
Chat platforms like Slack and Teams give users creative ways to interact with emojis, not only by easily including them in messages but also by using them to react to messages. Slack users can additionally create their own emojis by uploading an image and giving it a name; once created, any other user in that Slack instance can employ that emoji as well. While custom emojis follow the same shortcode textual representation that many programs support, there is no reference guide for them like Emojipedia.org, where recognized shortcodes can be looked up.
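The practical consequence is that a standard shortcode can be resolved to a Unicode character, but a workspace-specific custom shortcode cannot. This minimal sketch (the mapping table and the hypothetical :team-logo: shortcode are illustrative, not drawn from any real lookup source) shows what a processing tool faces:

```python
# Illustrative sketch: standard shortcodes map to Unicode code points,
# but a workspace's custom shortcode has no Unicode equivalent to map to.
STANDARD = {":thumbsup:": "\U0001F44D", ":smile:": "\U0001F604"}

def resolve(shortcode: str) -> str:
    # Custom emojis (e.g. a hypothetical :team-logo:) fall through
    # unresolved, because no public reference for them exists.
    return STANDARD.get(shortcode, shortcode)

print(resolve(":thumbsup:"))   # 👍
print(resolve(":team-logo:"))  # ':team-logo:' stays as bare text
```

A reviewer seeing the raw text may only ever see :team-logo:, with no way to recover the image it represented unless it was collected from Slack directly.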
Another popular collaboration platform, Microsoft Teams, has proprietary emojis that are also not acknowledged by Unicode.org or Emojipedia.org and have little to no documented references outside of Microsoft’s ecosystem. In addition, Microsoft provides the ability for creators to customize existing emojis. In these instances, retrieving the visual backup of how the proprietary or custom emoji appeared in the application’s user interface might not be supported by the technologies used during discovery to collect, process and review.
Aside from new and custom emojis constantly being added, existing emojis can be deprecated or evolve over time to the extent that their meaning changes. For instance, the emoji commonly displayed as a water pistol on major platforms today originally resembled a real firearm. Though it has been defined by the same Unicode value and shortcode “:gun:” all along, Apple and Facebook once rendered it as a revolver and Google as a blunderbuss, and that visual difference between the water pistol and the revolver could be the smoking gun that skews the context of a conversation. Chances are that if your technology provider displays the emoji as an image, it maintains only one version of that image, and previous or alternate versions will not be available.
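The gun emoji illustrates the gap between the stored data and what a custodian actually saw. The code point in the message never changed; only the vendors’ artwork did, as this short check with Python’s standard unicodedata module shows:

```python
import unicodedata

# The code point and shortcode never changed; only vendors' artwork did.
gun = "\U0001F52B"   # :gun: -- today a water pistol on major platforms

print(hex(ord(gun)))          # 0x1f52b -- the stable Unicode value
print(unicodedata.name(gun))  # PISTOL -- the official name still reflects
                              # the original firearm design
```

Collected data records only U+1F52B; which image the sender and recipient saw depends on their devices and software versions at the time, and is not preserved in the text itself.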
Another important consideration is how emojis are preserved, exported and processed in the first place. Slack and Microsoft Teams use very different internal representations for emojis within their own software, and the source output is not always easily transformed into something user-friendly, particularly if you are using different technology providers throughout the EDRM process.
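As one concrete illustration, a Slack-style export stores a message’s emoji as a shortcode inside the text, while a reaction lives in a separate structure that never appears in the message body at all. The record below is a simplified sketch (field names abridged from Slack’s JSON export; treat the exact shape as an assumption):

```python
import json

# Simplified, illustrative Slack-style export record.
record = json.loads("""
{
  "text": "Shipped! :rocket:",
  "reactions": [{"name": "thumbsup", "count": 2}]
}
""")

# The in-message emoji is just text; the reaction is separate metadata
# that a processing tool must surface deliberately or it will be missed.
print(record["text"])                  # Shipped! :rocket:
print(record["reactions"][0]["name"])  # thumbsup
```

A review platform that renders only the text field would show the reaction to no one, even though it may be exactly the acknowledgment a case turns on.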
It is also unlikely that one platform will be able to handle both the Slack versions and the Microsoft versions of emojis and display them in the same way, while also making them easily searchable. If you do need to search the emoji by text, can that search be isolated to emojis only or will searching for “smile” result in not only the textual representation of the emoji but also the use of the word “smile” throughout the rest of the documents?
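The search problem described above can be sketched with a simple pattern match. This hypothetical example separates emoji hits from ordinary word hits; the character class covers common emoji blocks and is illustrative, not an exhaustive Unicode emoji property test:

```python
import re

# Illustrative sketch: isolate emoji characters from ordinary word hits.
# The class below covers common emoji blocks only, as an assumption --
# a production tool would need the full Unicode emoji property data.
EMOJI = re.compile("[\U0001F300-\U0001FAFF\u2600-\u27BF]")

text = "Nice work, smile more! \U0001F604"

print(EMOJI.findall(text))             # ['😄'] -- the emoji only
print(re.findall(r"\bsmile\b", text))  # ['smile'] -- the word only
```

Without that kind of separation, a keyword search for “smile” returns both the smile emoji’s textual representation and every ordinary use of the word, inflating review volume.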
While seemingly fun and innocuous in casual, or even professional, communications, emojis clearly present challenges for eDiscovery teams. But knowing what you’re dealing with is half the battle.
About the Author
Jessica Lee is a Product Manager at ProSearch who oversees WorkStream, ProSearch’s solution for chat, messaging and collaborative data. Having spent more than 15 years streamlining complicated business processes, she focuses on collaboration between business stakeholders and technical teams to build user-centric products.