BEST PRACTICES GUIDE

Six steps to successful text mining

A guide to building your strategy

Contents

Introduction: There’s more to text mining than “Siri, find the good stuff for me.”
Step 1: Diagnose and define the problem
Step 2: Figure out how text mining can help
Step 3: Communicate the goal and set expectations
Step 4: Make detailed plans
Step 5: Establish and maintain technical parameters
Step 6: Measure output against business metrics
Conclusion

Text mining helps organizations streamline business processes and overcome challenges by gaining insights from their mountains of unstructured textual data. However, as with any data science project, some essential steps must be followed to produce successful results. This Best Practices Guide offers six tips to help organizations get the most out of their text mining projects.

Introduction: There’s more to text mining than “Siri, find the good stuff for me.”

These days, many organizations are challenged by a lack of quick access to the right information to improve decision-making. They know it is somewhere within their terabytes of internal and external data, but are unable to find it quickly and cost-effectively. Number crunching alone will not tell them what they need to know. Often, the answers lie in unstructured data, such as written text.

Text can be especially challenging to analyze, as it reflects the ambiguity, richness and variety of human speech. Moreover, business context can vary widely from one organization to another. For example, “hedge” means entirely different things in landscaping, finance or law. With text mining, a subset of artificial intelligence, organizations can navigate through millions of pages of content, a task that has become far more complex than mere indexing or keyword searching.

Modern, AI-driven content analytics solutions, such as OpenText Magellan Text Mining, have evolved to handle sophisticated processes and evaluate the “aboutness” of text. For anything from a single sentence to a whole collection of documents, they can assess the text’s sentiment and level of subjectivity. They can also evaluate their own accuracy, providing feedback to guide an efficient yet secure workflow process.

Setting up an enterprise text mining project can seem intimidating to first-time users because it is a relatively novel and complex technology. Misunderstandings or misalignments can occur between stakeholders, including executives, IT departments, the specialized linguists and data scientists setting up the project, metadata specialists who, before the advent of text mining applications, would have processed the text manually, and line-of-business managers running the project and using its output.

This guide offers best practices to simplify the process and minimize misalignment, based on OpenText’s experience in building industry-leading, AI-driven analytics solutions that leverage Magellan Text Mining, such as Magellan for Intelligent Recommendations and AI-Enhanced Voice of the Customer Analytics, and in helping our customers derive more value and insight from their enterprise content.

Step 1: Diagnose the problem and define the goals

Evidence of business struggles is typically apparent in an organization’s bottom line. But research and examination are required to trace the problem back to the cause. For example, losing 1.2 million in a quarter could be due to underpricing products or because the company did not effectively target the right customers. Losing 350 customers in a month might be attributed to a competitor releasing a better-performing product or to frustrated buyers not being able to find needed support.

The first step toward using text mining to solve such problems is to match business challenges with potential solutions. For example, an organization may seek to become more competitive by categorizing customer feedback to gain better insight into what customers want, or to improve its indexing team’s productivity by implementing a semi-automated document tagging solution.

The project team needs to go even further by setting goals that are quantitative, measurable and reasonably achievable. For the examples above, the team could set a goal to use six categories of customer feedback, based on communications with the customer service department, or to improve indexing team productivity by at least 20 percent over the last quarter.

Precision: In document discovery projects, refers to finding only relevant documents within a set, with no unwanted documents.

Recall: The counterpart to precision, refers to finding all relevant documents, not overlooking any (even if a few irrelevant ones are also returned).

Step 2: Figure out how text mining can help

Organizations should focus on defining goals that are quantitative and business-focused.

For example, aiming to achieve 90 percent recall with the text mining solution is not a business benefit in itself, since it does not directly solve a business problem. Some business problems may be solved if precision and recall are at 80 percent, while others could remain unsolved even if they reach 95 percent.

Organizations should instead set business-focused goals, such as improving the bottom line, and then establish criteria to help get there.

However, even when organizations know exactly what the business problem is and the changes they want to make, they may not immediately realize how (or even if) text mining can solve it. The proposed solution may not be obvious to organizations unfamiliar with text mining.

The solution? Discuss problems and potential solutions with a trained semantic analyst, for example in the OpenText Semantic Strategy Workshop described at the end of this guide.
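To make the precision and recall figures discussed above concrete, here is a minimal sketch in Python of how the two measures could be computed for one document discovery query. The function name, the document identifiers and the sample sets are hypothetical and chosen only for illustration. This is not how Magellan Text Mining reports its own scores.

```python
def precision_recall(retrieved, relevant):
    """Compute precision and recall for one set of retrieved documents.

    retrieved: identifiers of the documents the system returned
    relevant:  identifiers of the documents a human judged relevant
    """
    retrieved, relevant = set(retrieved), set(relevant)
    correctly_found = len(retrieved & relevant)
    # Precision: share of the returned documents that are relevant.
    precision = correctly_found / len(retrieved) if retrieved else 0.0
    # Recall: share of the relevant documents that were returned.
    recall = correctly_found / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical example: four documents returned, three truly relevant.
retrieved = {"doc1", "doc2", "doc3", "doc4"}
relevant = {"doc1", "doc2", "doc5"}

precision, recall = precision_recall(retrieved, relevant)
print(f"precision={precision:.2f} recall={recall:.2f}")
# prints: precision=0.50 recall=0.67
```

In this example, half of the returned documents are relevant (precision), and two of the three relevant documents were found (recall). Whether such figures are good enough depends on the business goal, as described above.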

Step 3: Communicate goals and set expectations

The reasons for using text mining differ, based on different roles in an organization. Executives and managers are more likely to focus on cost reduction, profit growth and/or productivity improvements. Editorial teams, content managers and metadata experts tend to focus on accuracy.

Organizations should set out to manage expectations at all levels. Desired outcomes should be tied to the business goals and solutions defined in Step 1. To be successful, the project team should define expectations when it begins capturing stakeholder needs. It helps to communicate the value of text mining and how it benefits the organization throughout the project.

Typical misunderstandings and discrepancies in the perceived value of text mining include:

Role: Metadata specialists
Fear: Text mining will replace us.
Reality: Where specialists have been manually adding metadata, text mining is typically deployed to improve productivity and to ensure content is tagged more thoroughly and consistently, not to reduce head count. The benefits of text mining deployments include the reassignment of some specialists to more creative, less repetitive tasks.

Role: Metadata specialists/content managers
Fear: Text mining will never be as insightful and context-sensitive as us.
Reality: No computer system can be as flexible and sophisticated in its understanding as qualified human specialists, but this is not the goal of text mining. Instead, tools such as confidence level thresholds can be used to pick out the most difficult or context-sensitive cases and direct those to experts for manual review.

Role: Metadata specialists/editorial teams
Fear: Text mining will never be accurate enough.
Reality: Linguistic-based text mining solutions can achieve production-level accuracy through intermediate steps of testing early findings against human review and adjusting the parameters as needed.

Understanding common misconceptions and countering them with the benefits of text mining will help project teams set expectations and gain buy-in, which is critical to the success of any AI project. It is important to dispel concerns early through training and education and to make data specialists part of the solution early in the process.

Step 4: Make detailed plans

To ensure success, include relevant teams and individuals at the following milestones:

Integration plan

This stage should include metadata specialists, content managers and IT managers/specialist teams to answer questions about the project, including:
- Business goals
- The key audience
- Technologies used
- New workflows
- Which content should be involved and how it should be prepared to ensure accuracy
- Where/how the metadata will be stored
- Anticipated impact on existing products
- What controlled vocabularies should be trained/automated
- How the text mining solution should be configured

Deployment plan

This stage should include executives, managers, metadata specialists, content managers and IT managers/specialist teams to answer questions about the project, including:
- Details of the web user experience
- Details of the back-office user experience
- New syndication/content billing procedures
- New products to be created
- How to train users on the new workflow

Step 5: Establish and maintain technical parameters for the text mining project

Text mining is not a one-time operation. It improves over time as the context of the content grows or changes or as the organization of that content evolves. Moreover, organizations often use text mining to support ongoing processes, such as searching internal resources or routing forms to the appropriate destinations. It is important for data scientists and content experts to carefully define the key technical parameters of the project, then maintain them during its lifespan.

Strategic maintenance parameters to focus on include:
- Review of controlled vocabularies (e.g. taxonomies, authority files)
- The impact of new content sources, types and products
- The impact of new syndication targets
- The impact of the deployment of new internal/external applications

Note that Magellan Text Mining not only extracts semantic metadata but also evaluates its own accuracy, so users can depend on its self-evaluation mechanisms to inform an efficient yet secure workflow process. Confidence and relevance scores are useful in a business process workflow, as they can help route the automated or semi-automated process. For example, if the text mining system is at least 80 percent confident that it tagged the document correctly, a user might want to automatically push that document along to the next step in the process. When the system marks something with 60 to 80 percent confidence, the user might direct it through another step where reviewers validate the tag. Using these scores offers organizations the flexibility to apply thresholds that reflect their specific precision and recall tolerance levels.

Step 6: Measure output against business metrics

Performance measurement lies at the heart of any improvement. The impact of deploying the text mining solution should be measured, at agreed-upon intervals, against the business problems identified earlier. Depending on the industry and the kind of problem an organization is trying to solve, key business metrics may include:
- Customer satisfaction
- Customer retention
- Product/service quality
- Market growth rate
- Revenue and/or profits
- Process quality and capability
- Productivity (including speed, capacity, number of users/customers served and more)
- Organizational, infrastructure and stakeholder capability improvements
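The confidence-based routing described in Step 5 can be illustrated with a short Python sketch. The 80 percent and 60 percent thresholds come from the example in this guide. Everything else is an assumption made for illustration: the `TaggedDocument` type, the step names, the choice to treat exactly 80 percent as auto-accepted, and the handling of scores below 60 percent, which the guide does not specify. It is not the Magellan Text Mining API.

```python
from dataclasses import dataclass

# Thresholds taken from the example in Step 5. Each organization would tune
# them to its own precision and recall tolerance levels.
AUTO_ACCEPT_THRESHOLD = 0.80
REVIEW_THRESHOLD = 0.60


@dataclass
class TaggedDocument:
    """Hypothetical record for one automatically tagged document."""
    doc_id: str
    tag: str
    confidence: float  # assumed to be a score between 0.0 and 1.0


def route(doc: TaggedDocument) -> str:
    """Return the name of the next workflow step for a tagged document."""
    if not 0.0 <= doc.confidence <= 1.0:
        raise ValueError(f"confidence out of range: {doc.confidence}")
    if doc.confidence >= AUTO_ACCEPT_THRESHOLD:
        # High confidence: push the document along automatically.
        return "auto_accept"
    if doc.confidence >= REVIEW_THRESHOLD:
        # Medium confidence: reviewers validate the tag.
        return "human_review"
    # Assumption: low-confidence documents are tagged manually.
    return "manual_tagging"


# Hypothetical batch of tagged documents.
batch = [
    TaggedDocument("doc-001", "billing", 0.93),
    TaggedDocument("doc-002", "support", 0.71),
    TaggedDocument("doc-003", "legal", 0.42),
]

for doc in batch:
    print(doc.doc_id, route(doc))
```

Raising the auto-accept threshold sends more documents to reviewers and favors precision. Lowering it reduces manual effort at the cost of more tagging errors passing through unreviewed.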

Conclusion

Creating a fruitful text mining project involves more than the actual content analytic algorithms. It requires thinking about the business problem, communicating goals, involving the appropriate stakeholders and finding the right ways to measure success.

To further explore how text mining can unlock value, consider OpenText’s Semantic Strategy Workshop. Participants work with an OpenText computational linguist on site to get an overview of how Magellan works and explore various content challenges that it can address. For information, contact [email protected]

About OpenText

OpenText (NASDAQ: OTEX, TSX: OTEX), The Information Company, enables organizations to gain insight through market-leading information management solutions, on-premises or in the cloud.

© 2021 Open Text. All Rights Reserved. Trademarks owned by Open Text. 08/21, SKU18941
