Note: This article was previously published on LinkedIn on September 24, 2024.
1. Where Does a Knowledge Graph Fit in an Enterprise Data Architecture?
A knowledge graph typically sits above data entry and storage systems and below data consumption and visualization systems. Knowledge graph products can support data entry, store data and visualize data, but their primary function is to organize data from multiple sources and apply context for business purposes: analysis, question answering, search and recommendation. An “enterprise knowledge graph” suggests a full suite of storage, management, editing, querying, modeling and data mapping capabilities with multiple connections to other enterprise systems.
A note: a knowledge graph is, fundamentally, a database. It forms part of what is often called a “semantic layer,” a term of art that describes the purpose of the tool: creating meaning by contextualizing data. It’s fine to use “knowledge graph” and “semantic layer” interchangeably, although purists would separate the technology from the application.
2. What’s the Difference Between a Data Fabric and Data Mesh?
Data Fabrics and Data Meshes are architectural styles for integrating a knowledge graph or semantic layer into data ecosystems. The major difference between them is in how data is accessed: either it is transformed and stored in the knowledge base (Fabric), or it stays in the source systems, with the graph linking to them and performing analysis on the fly (Mesh). There are many shades of overlap between the two approaches. The choice of style does not change the fundamental functionality or purpose of a knowledge graph, although choosing the correct implementation may affect how well it performs as an aid to business processes.
3. What’s the Difference Between a Knowledge Graph and My Other Data Management Tools?
A knowledge graph can be used for master data management, for taxonomy management and as a data catalog, but these functions tend to be supported by purpose-built applications that feed into or are backed by a graph database. Several taxonomy management systems (Semaphore, EVN and PoolParty, for example) sit on top of their own graph database.
All of these systems working together can help the knowledge graph perform its primary contextual function by keeping underlying data correct and consistent. For example, a knowledge graph can use a data catalog to keep track of and understand underlying data sources.
4. What’s the Difference Between a Knowledge Graph and Other Databases?
A knowledge graph is not used to re-represent or establish a common data model for underlying systems, although this is a common misconception, especially among database managers within enterprises. That function is supported by a data catalog or data pipeline, like Databricks or Microsoft Fabric. Rather, the knowledge graph provides an extra layer of data modeling at a more abstract level. The model in a knowledge graph (properly, an ontology) is a representation of a real-life area of knowledge or operation (a “domain”).
Think of a supply chain in which raw materials are sourced and fashioned into components that are combined into a finished product. Multiple suppliers, vendors, facilities and assembly teams contribute at each stage, not to mention inputs from sales, marketing, finance, quality control, inventory management and logistics. It’s a complicated web of interactions and interlocking dependencies that are not readily captured in a relational database. Those relationships between people and stuff and conditions and events are modeled as an ontology. The ontology then provides the context the business needs to answer questions: when will my shipment arrive, what happens if a factory shuts down, which retail outlet needs product now and which can wait?
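To make the idea concrete, here is a minimal Python sketch of how a few supply chain relationships might be written down as subject-predicate-object statements and then used to answer the “what happens if a factory shuts down” question. Every name in it (SteelCo, PlantA, predicates such as usedIn) is invented for illustration, and a plain list stands in for a real graph database and a real ontology.

```python
from collections import defaultdict, deque

# Hypothetical facts; every name here is invented for illustration.
TRIPLES = [
    ("SteelCo", "supplies", "SheetSteel"),
    ("SheetSteel", "usedIn", "Chassis"),
    ("PlantA", "produces", "Chassis"),
    ("Chassis", "usedIn", "Scooter"),
    ("PlantB", "produces", "Motor"),
    ("Motor", "usedIn", "Scooter"),
    ("Scooter", "stockedAt", "StoreEast"),
]

# Assumption: a disruption propagates along these relationships.
DOWNSTREAM = {"supplies", "produces", "usedIn", "stockedAt"}


def affected_by(node, triples=TRIPLES):
    """Return everything downstream of `node`, in breadth-first order."""
    edges = defaultdict(list)
    for subject, predicate, obj in triples:
        if predicate in DOWNSTREAM:
            edges[subject].append(obj)

    seen, queue, order = {node}, deque([node]), []
    while queue:
        current = queue.popleft()
        for nxt in edges[current]:
            if nxt not in seen:
                seen.add(nxt)
                order.append(nxt)
                queue.append(nxt)
    return order


# What is affected if PlantA shuts down?
print(affected_by("PlantA"))  # ['Chassis', 'Scooter', 'StoreEast']
```

The traversal itself is ordinary; what matters is that the relationships are explicit data, so the same question can be asked of any supplier, plant or part.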
5. Can You Provide an Example of the Business Value of a Knowledge Graph?
Questions like those posed above get asked and answered every day in all kinds of organizations. But often they get answered slowly or incorrectly, based on insufficient information or guesswork. Worse, the answers to business questions often rely on data that is completely divorced from all the expensive and elaborate data management controls established with the help of taxonomy, data catalogs, master data management and security systems. Further, the logic behind data-driven decision making is often opaque or relies on rarefied expertise.
Let’s go back to the supply chain for an example: a warehouse manager needs to decide whether to send out a truck that is 80% full or wait for it to fill up. He knows from his inventory software that the expected goods are in stock. But he also knows that one of the forklift drivers called in sick today and that the truck is headed to a destination just at the edge of the delivery window. So, he determines that it will take too long to wait for the rest of the inventory and sends the truck. All of the information used to make that decision is data in some database somewhere—probably several databases: one for inventory, one for personnel, one for delivery management, another for demand tracking. But the synthesis of all that data—the synthesis upon which a material decision was made—is not in any database. Did the warehouse manager make the right call? Maybe. Is it defensible? Maybe. How long would it take to recreate those conditions and review the decision to find out? Likely too long.
A knowledge graph, by modeling the world in which the warehouse manager operates, also allows the enterprise to expose the decision factors for such a circumstance and make them repeatable. Good warehouse managers are hard to find, after all. So, the knowledge graph is not recreating the data schemas from the inventory system or all the underlying data points. Instead, it’s building a logical map of the world in which all the data lives. And that helps the enterprise become leaner and more consistent, save money and continuously improve. Call centers responding to customer requests, order management systems orchestrating vendor capacity, pharmaceutical companies exploring drug pathways, banks assessing financial risk, systems operators monitoring points of failure, marketers predicting consumer behavior and hundreds of other examples rely not just on data, but on connected data. The connections provide the context needed to make a business decision.
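As a rough illustration of what “exposing the decision factors” could look like, the Python sketch below gathers the facts the warehouse manager weighed into one record and applies an explicit rule to them. The field names, the numbers and the rule itself (loading time grows as the forklift crew shrinks) are assumptions made up for this example, not a description of any product or of how a real warehouse should decide.

```python
from dataclasses import dataclass


@dataclass
class DispatchContext:
    load_fraction: float       # delivery management: how full the truck is
    goods_in_stock: bool       # inventory: are the remaining goods available?
    drivers_on_shift: int      # personnel: forklift drivers working today
    drivers_scheduled: int     # personnel: forklift drivers normally on shift
    full_crew_minutes: float   # time to finish loading with a full crew
    slack_minutes: float       # time left before the delivery window is at risk


def decide(ctx):
    """Return ("send" or "wait", list of reasons) for a partly loaded truck."""
    if ctx.load_fraction >= 1.0:
        return "send", ["truck is already full"]
    if not ctx.goods_in_stock:
        return "send", ["remaining goods are not in stock"]
    if ctx.drivers_on_shift == 0:
        return "send", ["no forklift drivers on shift"]
    # Assumed rule: loading time scales inversely with crew size.
    estimate = ctx.full_crew_minutes * ctx.drivers_scheduled / ctx.drivers_on_shift
    reason = (f"loading would take about {estimate:.0f} min; "
              f"slack is {ctx.slack_minutes:.0f} min")
    return ("send" if estimate > ctx.slack_minutes else "wait"), [reason]


# Truck 80% full, goods in stock, one of three drivers out sick, 40 minutes of slack.
today = DispatchContext(0.8, True, 2, 3, 30, 40)
print(decide(today))
# ('send', ['loading would take about 45 min; slack is 40 min'])
```

Because the decision comes back with its reasons attached, it can be reviewed later or rerun with different inputs, which is the repeatability the scenario is after.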
6. How Hard Is It for an Enterprise to Adopt a Knowledge Graph?
For an enterprise to adopt a knowledge graph effectively, it must be willing to commit to some new processes and overcome some institutional inertia. The process of creating an ontology to model an interconnected world is not the same as building a SQL table or an API. The enterprise works with its subject matter experts (SMEs), like the warehouse manager, to understand where data fits together and how it is used. A good warehouse manager will realize that the ontology and knowledge graph will help her understand her world better, make better decisions and game out scenarios beforehand to improve. The knowledge graph is never going to replace a warehouse manager, but it can help her do more and manage a growing inventory without learning a new system.
The enterprise needs specialists to work with the SMEs to convert their knowledge into an ontology, and data experts to then match up specific data points to the ontology model. Enterprises often import this expertise from specialty vendors who provide domain modeling services. Most knowledge graph companies also maintain a staff of such experts to assist in the adoption process. But outside consultants don’t know your data and don’t know your domain better than your SMEs; they are there to facilitate, not replace or dictate an idealized process. Enterprises sometimes balk at acquiring such expertise or embarking on such a modeling process. The good news is that many business functions are very similar at the logical level (an ontology is a logical representation), so an existing ontology model may be readily applied and adapted for a given enterprise without major effort.
Many enterprises find that once they’ve started the ontology-building process in one area, they want to repeat it in a related area, thus growing the power and value of the underlying knowledge graph. At this point, the enterprise may wish to hire one or more ontologists on a permanent basis. It is a good bet that a few of the SMEs who participated at the start can grow into domain modelers, taking on the role of an educator as well as a domain expert.
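The mapping step can be pictured with a small Python sketch: rows from a hypothetical inventory table are matched to terms in an ontology through a declarative mapping. The column names, the Product class and the predicates storedAt and hasQuantity are all assumptions for illustration; real projects typically rely on a mapping language such as the W3C’s R2RML or on a vendor tool rather than hand-written code.

```python
# Hypothetical mapping from one source table to ontology terms.
INVENTORY_MAPPING = {
    "subject_column": "sku",          # column that identifies the thing described
    "class": "Product",               # ontology class each row is an instance of
    "columns": {                      # source column -> ontology predicate
        "warehouse_id": "storedAt",
        "qty": "hasQuantity",
    },
}


def rows_to_triples(rows, mapping):
    """Turn source rows into (subject, predicate, object) statements."""
    triples = []
    for row in rows:
        subject = row[mapping["subject_column"]]
        triples.append((subject, "isA", mapping["class"]))
        for column, predicate in mapping["columns"].items():
            value = row.get(column)
            if value is not None:     # skip empty cells instead of asserting nulls
                triples.append((subject, predicate, value))
    return triples


rows = [
    {"sku": "SKU-1", "warehouse_id": "WH-7", "qty": 12},
    {"sku": "SKU-2", "warehouse_id": None, "qty": 3},
]
for triple in rows_to_triples(rows, INVENTORY_MAPPING):
    print(triple)
```

Keeping the mapping as data rather than burying it in code means the SMEs and modelers can review it without reading a program.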
7. How Is Data Retrieved from a Knowledge Graph/What’s the Query Language?
Knowledge graphs rely on specialized query languages (e.g. SPARQL or Gremlin) that may not be familiar to relational or even NoSQL database managers. These languages do create a learning curve that must be climbed in order to fully adopt a knowledge graph. SPARQL in particular will look familiar to anyone who already knows SQL, while Gremlin expresses queries as step-by-step graph traversals, so the curve is not too steep. Knowledge graph vendors try to make the transition easier by providing graphical query builder interfaces so that business users don’t need to learn a query language at all.
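For a feel of what a graph query expresses, the sketch below shows a SPARQL-style query in a comment and a few lines of plain Python that evaluate the same pattern over a list of statements. The data, the predicate names and the match function are invented for illustration; a real knowledge graph would run the query in its own engine.

```python
# The pattern being evaluated, written SPARQL-style (prefix declarations omitted):
#
#   SELECT ?part ?product WHERE {
#     :PlantA :produces ?part .
#     ?part   :usedIn   ?product .
#   }

TRIPLES = [
    ("PlantA", "produces", "Chassis"),
    ("Chassis", "usedIn", "Scooter"),
    ("PlantB", "produces", "Motor"),
    ("Motor", "usedIn", "Scooter"),
]


def match(patterns, triples, binding=None):
    """Yield variable bindings that satisfy every pattern; '?x' marks a variable."""
    binding = binding or {}
    if not patterns:
        yield dict(binding)
        return
    first, rest = patterns[0], patterns[1:]
    for triple in triples:
        candidate = dict(binding)
        ok = True
        for term, value in zip(first, triple):
            if term.startswith("?"):
                # Bind the variable, or check it against an earlier binding.
                if candidate.setdefault(term, value) != value:
                    ok = False
                    break
            elif term != value:
                ok = False
                break
        if ok:
            yield from match(rest, triples, candidate)


query = [("PlantA", "produces", "?part"), ("?part", "usedIn", "?product")]
print(list(match(query, TRIPLES)))
# [{'?part': 'Chassis', '?product': 'Scooter'}]
```

The shared variable ?part plays the role a join condition plays in SQL, which is part of why the SELECT and WHERE shape feels familiar.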
8. How Is Data from a Knowledge Graph Consumed or Visualized?
Getting data out of a knowledge graph (or filtering data through a knowledge graph in some cases) is only part of the solution. Many business users, including data scientists, business analysts, program managers and executive decision-makers, want to see data in familiar forms, such as spreadsheets, tables or charts. Knowledge graphs lend themselves readily to such visualizations.
Enterprises may choose to build their own custom tools for such purposes, rely on the onboard functionality of knowledge graph platforms or connect the knowledge graph to popular presentation and business intelligence tools. But the greatest value of a knowledge graph comes from its ability to drive analysis through context, ideally eliminating time-consuming, ad hoc data manipulation at the consumption level. Done correctly, the knowledge graph, its ontology and its mappings to databases save time, expose logical relationships and provide repeatability that has been elusive for enterprises since the dawn of databases.
9. How Is ROI Calculated for Knowledge Graph Adoption?
ROI can be difficult to calculate when an enterprise is considering adoption of a knowledge graph. There are documented cases of returns on investment greater than 100%, based on time savings and other metrics. But often the knowledge graph is supporting a function that is not always measured or is measured only anecdotally. For example, many organizations do not measure internal search time or the number of failed searches in which the seeker just gave up. These can be important “hidden costs” in an organization on which a graph can have a huge effect by improving speed and accuracy.
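The search example can be turned into back-of-the-envelope arithmetic. In the Python sketch below, every figure (headcount, searches per day, minutes per search, hourly cost, annual cost of the knowledge graph) is a made-up placeholder rather than a benchmark; the only fixed part is the standard ROI formula, net benefit divided by cost.

```python
def annual_search_cost(employees, searches_per_day, minutes_per_search,
                       hourly_cost, workdays=250):
    """Yearly cost of time spent searching, in the same currency as hourly_cost."""
    hours = employees * searches_per_day * minutes_per_search / 60 * workdays
    return hours * hourly_cost


def roi(annual_benefit, annual_cost):
    """Return on investment as a fraction: 1.5 means 150%."""
    return (annual_benefit - annual_cost) / annual_cost


# All inputs are invented placeholders, not measurements.
before = annual_search_cost(500, 4, 6, 50)   # 6 minutes per search today
after = annual_search_cost(500, 4, 3, 50)    # assumed 3 minutes with the graph
savings = before - after

print(savings)                 # 1250000.0
print(roi(savings, 500_000))   # 1.5
```

The hard part in practice is not the formula but obtaining honest inputs, which is exactly the measurement gap described above.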
Also, knowledge graphs may support a function that was not previously possible and so was never measured. Uptake of services, sales of products and time to market are all common measurements that can be improved with a knowledge graph. But sometimes the actual value becomes obvious only at significant scale or complexity. If a business approaches a knowledge graph adoption decision as acquiring yet another tool for data management, it is not likely to find easily quantifiable returns on that investment. But if it is measuring business outcomes and applying the knowledge graph to a well-defined business problem, the value is far more obvious.