This article is the first in a two-part series discussing the importance of information governance practices and the increased need for revised information management practices. Regulatory and legal obligations imposed on corporations to protect the security and privacy of information create increasing information governance challenges. With widespread data volume growth as a norm, the need for automation and reliance on technology to serve a role in governing data is apparent. The article discusses various options organizations consider when trying to create or revise their information governance practices. The first part of the series discusses the need for information governance plans, with the second part of the series addressing the use of rules-based technologies to help classify and manage information.
Managing Technology with People, Process, and Planning
Finding specific business records can pose a challenge for corporations and other business entities. Managing the vast amounts of electronically stored information in the possession of large organizations is a complex task fraught with challenges. An established “Information Governance” (IG) plan, replete with the means to enforce best practices for governing information, is essential for many reasons. Why is information governance an important topic or concern?
Examining why information governance practices are essential requires first defining information governance. Variants of IG’s definition exist, depending on the vertical market(s) in which a business operates. The “Information Governance Initiative” (IGI) proffers the following definition of IG: “The activities and technologies that organizations use to maximize the value of their information while minimizing risk and cost.”
Distinctions exist between IG and Data Governance. IG protocols go beyond just governing data, seeking to improve the business processes for implementing information management and data protection procedures. IG requires that business stakeholders work closely with I.T. professionals. Senior business stakeholders understand the content and use-case purposes of organizational information within their department, while their I.T. counterparts possess greater knowledge regarding the source of information and storage of electronic files. IG practices require planning across departmental lines and employment roles prior to implementation of new organizational technology. Effective IG practices seek not only to identify and protect information and to manage data growth and storage, but also to ensure information retrieval for investigations, FOIA requests, and other business purposes.
Using Technology as Part of an IG Plan
Business organizations are often unique: they have their own processes in place and rely on a myriad of different technologies and systems designed to perform specific functions. At a macro level, the information contained in the various data sources of a corporate environment presents a variety of challenges. The structured and unstructured data possessed by a large business are subject to applicable privacy and security regulations, yet must remain accessible for business purposes when needed. Communication across disparate systems is often impossible, and gaining access to information can be difficult depending on the original data’s source and format. Textual information within the data landscape at the individual file level can serve as a valuable vehicle toward proper IG practices, helping safeguard sensitive information and enforcing disposition of files.
Global organizations face greater challenges, since data transfer across geographical boundaries poses regulatory and data privacy challenges. An accurate data map of a corporation’s I.T. infrastructure, showing the physical location of electronically stored information, is a prerequisite to any IG plan. Yet knowledge of a file’s source location and format is only part of the equation. IG also requires actionable knowledge regarding the information within your environment, including intelligence regarding the specific content of each file.
Professionals involved in records and information management will serve an integral role in creating or updating IG plans. Since organizational needs differ based on business uses and the technology systems that support the business, technologies never provide a true “one-size-fits-all” solution. It is important for an organization to understand where technology can improve its IG plans, and to define which IG tasks the technology addresses. Certain entities have developed information management plans that prove useful in satisfying IG requirements, while other organizations will face the more daunting task of establishing new internal practices. Various entities rely on outside consultancies and advice from professionals with information management certifications to assist them in defining and enforcing their data governance policies. Regulatory and legal obligations affect vertical industries differently, and those with more complex challenges tend to place even greater reliance on automated IG solutions.
IG practices overlap with other aspects of information management. Data Loss Prevention (DLP) is a major concern for businesses. Organizational DLP practices serve as a resource for IG protocols, and IG solutions integrate with DLP technology. The insight provided by IG technology enhances the ability to safeguard data and protect against loss. Deletion of corporate data under records management protocols provides benefits for IG practices. Purging of information requires expertise and input from records management and information governance personnel. IG technology strives to remove the need for data entry in more than one system. Through direct integration with active data source environments, IG technology purges data while providing an audit trail.
Various stakeholders must take action to bring order to a chaotic enterprise environment. Business organizations in the past have implemented policies to help them address “GRC” (Governance, Risk, and Compliance). Those combined GRC policies have been a component of records management policies, and organizations are now reexamining these elements and incorporating them into a more holistic IG approach. Electronic Information Management (EIM) systems provide intelligence and enhance content management, records management, and knowledge management practices.
Existing policies and technologies can provide a solid foundation toward building a more automated approach for managing information, while also reducing risk and cost. Mathematical techniques such as “Latent Semantic Analysis” (LSA) (referred to as “Latent Semantic Indexing” (LSI) in the information retrieval context) assist in organizing information based on patterns detected within natural language. Algorithms based on “Artificial Intelligence” (AI) continue to gain traction in data governance, with both supervised and unsupervised machine learning providing elements of automation for classifying information. As new technology is incorporated to perform IG functions, organizational plans or policies should be documented and approved through applicable audits.
Considerations for an IG Plan
Understanding “Why” an organization’s IG protocols are important merits a discussion that reaches several topics associated with the ability to manage information. Both data mapping and content analysis are techniques that assist business enterprises in implementing IG practices. An organization’s ability to determine what information is in its possession, where that information lives, and how to access it when needed is vital for a variety of business purposes:
- Data Privacy
- Data Security
- Business Intelligence
- Litigation
- Compliance
- Risk Management
A quote provided by Jason R. Baron, a lawyer at Drinker Biddle LLP, further illustrates the importance of Information Governance as follows:
“Information governance is more important today than ever—especially in light of recently imposed data privacy laws and regulations that effectively require corporations to know something about data and legacy data, including where sensitive, personal data is stored. See, for example, both the newly in force General Data Protection Regulation (GDPR), effective as of May 25, 2018, as well as the even more recent enactment of the California Consumer Privacy Act (with an effective date of January 1, 2020). These examples of privacy legislation potentially may end up being a real driver of culture change within organizations—empowering champions of information governance in the C-suite to make the business case for greater corporate awareness and discipline in the face of ever-growing data glut due to social media, the Internet of Things, and the growth of electronic communications generally.”
Three basic types of actors exist in any scenario: 1) The Proactive, 2) The Reactive, and 3) The Inactive. Organizational attitudes toward IG vary, and various factors make specific vertical industries more likely to find themselves in the proactive group. A significant percentage of businesses are reactive in nature. These reactive organizations are trying to expand and improve their IG practices to address various business needs and to avoid sanctions for poor information management practices. Businesses are addressing IG concerns as part of a continued effort to improve their knowledge management capabilities, including their ability to rapidly retrieve information sought by a user.
Multiple factors are giving traction to IG efforts and updated information management practices. Risk management is a current driving force increasing the awareness of IG’s importance. The threat of losing data to a cybersecurity incident compels corporations to examine their IG practices. Theft or inadvertent production of sensitive corporate information is a major concern for organizational stakeholders. Regulatory compliance concerns have moved organizations out of the inactive category, forcing businesses to evaluate and implement solutions aimed at satisfying compliance requirements.
Managing the vast amounts of electronically stored information in an organization’s environment poses many challenges for records management and information governance professionals. However, customizable technology-based solutions for information management can serve as useful resources for establishing IG best practices. Organizations must account for the needs of various departments when forming IG plans, including internal audit, compliance, legal, KM, IT, risk, security, and finance.
The stakeholders within an organization require access to information so they can address different requirements, regulations, and business tasks. IG protocols should provide an easy means of access to the required information regardless of the business need, while also limiting access to only authorized personnel. Information Governance practices ensure that valuable data is accessible to those authorized to use it, while also protecting against information loss. Policies regarding data retention require clear definition for IG enforcement. In addition, IG plans must clearly define responsibilities for implementation and enforcement of document retention schedules. Handling files properly requires consideration of the full information lifecycle, from data creation through storage to purging.
Relying on technology in some capacity is necessary for proper information governance. Available IG technology solutions provide file-level and content-level analysis across an enterprise. Having access to both file-level and content-level attributes across data sources within a company enhances an organization’s ability to fulfill the various data privacy, regulatory, and legal requirements that apply to specific data. Automated classification of information based on the textual content of each file can serve several IG functions.
IG technologies provide various solutions for specific tasks, comprising part of an overall IG strategy and plan. IG solutions allow for installation behind an organization’s firewall and integrate with both active and archival data sources. Web-crawlers index information regarding the file attributes of the source data contained in each disparate system, providing administrators and end-users insight into various metadata associated with each file. Information about the characteristics of each individual file, including the subject matter of the file’s content, enhances the ability to organize the data. Performing search and retrieval functions for data across an enterprise allows end-users to locate all of the specific files they are seeking regardless of the business purpose.
File Analysis
By performing “file analysis,” differing stakeholders accomplish a variety of important business tasks. Organizations use this aspect of an IG plan to help fulfill other related obligations: e-discovery compliance, investigations, FOIA requests, data privacy protection, data breach response, defensible disposition, and risk management. File analysis provides metrics regarding the existing data volume present within a source, or across multiple sources. Using file analysis enables custom report creation providing source data assessment.
File analysis determines custodial data ownership within a source location. In addition, the file level analysis provides details of the native source locations and the number of duplicative files within the source data. Culling down data sets to identify files unrelated to the matter at hand is a benefit provided by file analysis technology. If date range limitations are part of the search protocol, file analysis reduces the data universe of the project, retrieving only the information within the relevant date range.
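The file-level metrics described above can be illustrated with a minimal sketch. It assumes that an earlier indexing pass has already produced simple metadata records; the `FileRecord` type, its fields, and the sample paths are hypothetical and do not come from any particular IG product.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class FileRecord:
    """Hypothetical file-level metadata gathered by an indexing pass."""
    path: str
    custodian: str
    modified: date
    size_bytes: int


def files_per_custodian(records):
    """Count how many files each custodian owns within a source."""
    return Counter(r.custodian for r in records)


def cull_by_date(records, start, end):
    """Keep only files last modified within [start, end], inclusive."""
    return [r for r in records if start <= r.modified <= end]


# Illustrative sample data only.
records = [
    FileRecord("/share/hr/offer.docx", "alice", date(2017, 3, 1), 40_000),
    FileRecord("/share/hr/policy.pdf", "alice", date(2019, 6, 15), 120_000),
    FileRecord("/share/fin/q2.xlsx", "bob", date(2018, 7, 30), 75_000),
]

ownership = files_per_custodian(records)
in_scope = cull_by_date(records, date(2018, 1, 1), date(2018, 12, 31))
total_bytes = sum(r.size_bytes for r in records)
```

Here the date-range cull reduces three files to the single file modified in 2018, which is the kind of reduction in the data universe that a search protocol with date limits would produce.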
IG plans using file analysis technology provide guidance to enforce document retention plans. IG technology enforces document retention schedules based on the specific file attributes and file types. File-level information enhances records retention capabilities and reduces I.T. storage costs by offering single-instance storage options. Detecting exact duplicates at the file level enables IG technology to limit the storage of that file to one occurrence.
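One common way to detect exact duplicates is to compare cryptographic hashes of file content. The sketch below uses SHA-256 for this purpose as an assumption; the article does not say which method any given IG product uses, and the function names and sample paths are hypothetical.

```python
import hashlib
from collections import defaultdict


def content_digest(data: bytes) -> str:
    """Return the SHA-256 hex digest of a file's raw bytes."""
    return hashlib.sha256(data).hexdigest()


def group_exact_duplicates(files):
    """Map each digest to the sorted list of paths sharing that content.

    `files` is a dict of path -> bytes (an in-memory stand-in for a source).
    """
    groups = defaultdict(list)
    for path, data in files.items():
        groups[content_digest(data)].append(path)
    return {digest: sorted(paths) for digest, paths in groups.items()}


def single_instance_plan(files):
    """For each distinct content, keep one path and list the redundant copies."""
    plan = {}
    for paths in group_exact_duplicates(files).values():
        keeper, *redundant = paths
        plan[keeper] = redundant
    return plan


# Illustrative sample data only.
sample = {
    "/archive/a/report.docx": b"Q2 results",
    "/archive/b/report-copy.docx": b"Q2 results",
    "/archive/a/memo.txt": b"Lunch at noon",
}
plan = single_instance_plan(sample)
```

The resulting plan keeps one copy of the report and flags the other as redundant. A real deployment would also need to reconcile retention obligations and access rights before removing any copy.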
Content Analysis
Beyond the file analysis function, IG technologies provide greater visibility into the specifics of any file within the enterprise through the provision of “content analysis” functionality. Having insight into the subject matter of each piece of information containing text through content analysis enables corporations to better organize their data. Subsequent to the content analysis results, IG technologies govern data based on a set of defined rules, applying those rules against the attributes of each file. Files identified as containing certain specific content will require different handling based on the data within that individual file.
Grouping information together based on similar attributes is a component of most existing IG plans. If sensitive personal information is present as part of a file’s content, it is essential for a modern business to know they possess that information. Business entities understand they have obligations to safeguard certain data. Relying on content analysis to identify and organize information that requires special handling simplifies the remainder of the IG process.
Content analysis provides visibility into the subject matter of the file’s message, along with information regarding other file attributes. Using the textual language present within the file assists in the final disposition of that file. Organizations can detect content that poses additional security concerns: personally identifiable information, corporate trade secrets, and privileged or confidential information. Files that contain sensitive content require elevated access rights to view, and restrictions limit user permissions accordingly. Protecting sensitive corporate data assets becomes more manageable through content analysis.
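As a rough illustration of content-level detection, the following sketch applies a few regular-expression rules to a file’s extracted text. The rule names and patterns are hypothetical and deliberately simplistic; production systems typically combine many more signals and validate matches to limit false positives.

```python
import re

# Hypothetical rule set: tag name -> pattern searched for in the file's text.
RULES = {
    "pii:us_ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "pii:email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
    "legal:privileged": re.compile(r"attorney[- ]client privilege", re.IGNORECASE),
}


def tag_content(text):
    """Return the set of tags whose pattern appears anywhere in the text."""
    return {tag for tag, pattern in RULES.items() if pattern.search(text)}


def requires_restricted_access(tags):
    """Treat any PII or legal tag as a signal for tighter permissions."""
    return any(t.startswith(("pii:", "legal:")) for t in tags)


sensitive_tags = tag_content("Employee SSN 123-45-6789, contact jane@example.com")
routine_tags = tag_content("Lunch at noon")
```

A file whose tags satisfy `requires_restricted_access` could then be routed to a more restrictive permission set, while untagged files follow the default handling.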
Enhancing search and retrieval functionality of an organization’s information provides value to all business units. IG solutions provide search capability across an enterprise from one dashboard, improving business functions that require search and retrieval of specific information. Since IG solutions deliver visibility into content across the entire data landscape, regardless of the specific source, searches return more thorough results. Removing the need to run repetitive data searches across separate systems creates efficiency and clear business value. Content analysis results improve the accuracy of search results since they are applied as a “tag” to each file, indicating that the specific file contains an attribute with some meaning to the business organization. Searching for any specific tag will produce the results for all the files previously designated with that same tag.
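Tag-based retrieval across separate systems can be pictured as a single inverted index keyed by tag. The sketch below is an assumption about how such an index might be organized; the `TagIndex` class, source names, and paths are hypothetical.

```python
from collections import defaultdict


class TagIndex:
    """Hypothetical cross-source index mapping each tag to tagged files."""

    def __init__(self):
        self._by_tag = defaultdict(set)

    def add(self, source, path, tags):
        """Record that the file at (source, path) carries the given tags."""
        for tag in tags:
            self._by_tag[tag].add((source, path))

    def search(self, tag):
        """Return every (source, path) pair carrying the tag, sorted."""
        return sorted(self._by_tag.get(tag, set()))


index = TagIndex()
index.add("email-archive", "inbox/0421.msg", {"pii:email", "contract"})
index.add("file-share", "/legal/nda.docx", {"contract"})
index.add("file-share", "/hr/roster.xlsx", {"pii:email"})

contracts = index.search("contract")
```

One query for the `contract` tag returns files from both the email archive and the file share, without a separate search in each system.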
Information Management Practices
Corporations put data governance elements in place and instruct their employees on how to comply with existing policies. Most organizations have established document retention schedules and rely on technology such as “Enterprise Content Management” (ECM) systems or document management systems. Information is stored in folder-level taxonomies, and content analysis is generated to help classify and store it. Organizations rely on archival systems as a component of information management, helping to reduce IT storage costs and ease the strain on active environments.
The presence of “dark data” in the possession and control of organizations is another factor that IG plans must address. Many large businesses have server environments that contain legacy data. Organizations often have only limited knowledge of what data resides in certain sources and no easy means of indexing that information. Shared server environments used to store emails and loose office files from departed employees present a significant challenge for IG professionals. Shared servers may contain large volumes of data requiring migration into another enterprise content management or document archival system. Content analysis provides more control over dark data, reducing the risk it poses. EIM systems deliver some degree of content analysis useful for IG classification purposes.
While reducing risk should be a goal of any IG business plan, the ability to address other legal and regulatory requirements is also essential. Business obligations driven by regulatory or litigation requirements pose challenges that any IG plan must consider. I.T. and legal teams must account for litigation costs associated with information retrieval, and both teams should be involved in implementation plans for any new system.
Corporations have been striving to improve litigation readiness, seeking to reduce data collection and e-discovery processing and hosting costs. The resources in place that assist with gathering information during litigation, and the search and retrieval techniques used in e-discovery practices, can prove useful to an IG plan. Keyword search techniques familiar to e-discovery professionals also serve IG needs. “Technology Assisted Review” (TAR) workflows in e-discovery rely on an element of artificial intelligence to classify information, and these practices can be incorporated into an IG plan to provide further automation and accuracy. Control sets of pre-classified information serve as a “seed-set” used to train an automated classification process and to adjust its recall and precision rates. Training IG systems on seed-sets of information that an organization knows to be accurately categorized results in more accurate disposition of each file.
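Recall and precision can be computed by comparing a classifier’s output against a control set of known labels. The sketch below shows the standard calculation for a single label; the document identifiers and the `sensitive` and `routine` labels are hypothetical.

```python
def precision_recall(predicted, actual, label):
    """Compute precision and recall for one label.

    `predicted` and `actual` are dicts of file id -> label; `actual` is the
    control set of known-correct classifications.
    """
    tp = sum(1 for f, l in predicted.items() if l == label and actual.get(f) == label)
    fp = sum(1 for f, l in predicted.items() if l == label and actual.get(f) != label)
    fn = sum(1 for f, l in actual.items() if l == label and predicted.get(f) != label)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Illustrative control set and classifier output.
control = {"d1": "sensitive", "d2": "sensitive", "d3": "routine",
           "d4": "sensitive", "d5": "routine", "d6": "sensitive"}
predicted = {"d1": "sensitive", "d2": "routine", "d3": "sensitive",
             "d4": "sensitive", "d5": "routine", "d6": "routine"}

precision, recall = precision_recall(predicted, control, "sensitive")
```

In this example the classifier marks three files as sensitive, two of them correctly, and misses two of the four truly sensitive files, so precision is two thirds and recall is one half. Low recall on a sensitive category would suggest that the seed-set needs more examples of that category before the system is relied on for disposition.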
While technology, coupled with enforcement of proper records management policies and procedures, provides a means for effective IG practices, any solution must be customizable and serve to address unique organizational needs. Businesses tailor technology to provide varied information management options to multiple business units within an organization. Customized technology makes it easier for an enterprise to adopt a practical solution for all business units. IG practices that recognize the importance of three key elements (people, process, and technology) are more likely to create efficient workflows for the organization.
Everyone has heard the expressions, “Fight fire with fire,” and, “A good offense makes for a good defense.” Part two in this series will discuss fighting technology with technology, and going on the offensive to take control of organizational information.