What are Knowledge Graphs?

Knowledge graphs are used to describe people, things, ideas and anything else you can imagine. They capture knowledge in a way that both humans and computers can understand.

For humans it is intuitive to communicate using graphs. Graph objects can store data to describe them (e.g., color, size, URL for more info), or references to other objects in the graph (e.g., mother, agent, doctor). Graphs can be jagged and incomplete, because that's how our brains work when we're trying to understand something new.

The use of knowledge graphs by computers can be quite exciting because they are reliable, flexible and able to scale to high volumes. Facebook and LinkedIn are two giant member graphs. One reason web searches are getting smarter and more consistent is schema.org, an open, shared vocabulary that websites use to describe the things people commonly search for.

This page describes knowledge graphs from two perspectives:

  • What do knowledge graphs look like

  • What can you do with a knowledge graph

Please note that the functionality described on this page is not yet available. Please see our Release plans page for more.

What do knowledge graphs look like

Models

There are several technologies that can be described as knowledge graphs, most notably graph databases and semantic models. Tag uses semantic models, which support automated reasoning.

The term semantic model encompasses more than one thing. At a low level this includes RDF files, which contain data that has additional meaning attached. Each unit of knowledge is stored as a property attached to an entity (graph object). Properties can store data values or references to other known entities.
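To make the idea of "a property attached to an entity" concrete, here is a minimal Python sketch. It is not real RDF syntax and not how Tag stores data; the Statement and Ref types, the entity names (Rover, Bella) and the values are all hypothetical. It only shows that a property value is either plain data or a reference to another known entity.

```python
from collections import defaultdict
from typing import NamedTuple, Union


class Ref(NamedTuple):
    """A reference to another entity in the graph (hypothetical helper type)."""
    name: str


class Statement(NamedTuple):
    """One unit of knowledge: a property attached to an entity."""
    entity: str
    prop: str
    value: Union[str, int, float, Ref]


# Hypothetical sample data: two data values and one reference for Rover.
statements = [
    Statement("Rover", "color", "brown"),
    Statement("Rover", "size", "large"),
    Statement("Rover", "mother", Ref("Bella")),
    Statement("Bella", "color", "black"),
]


def describe(entity, statements):
    """Collect every value stored for an entity, grouped by property name."""
    result = defaultdict(list)
    for s in statements:
        if s.entity == entity:
            result[s.prop].append(s.value)
    return dict(result)


rover = describe("Rover", statements)
# Following a reference leads to another entity's statements.
mother = describe(rover["mother"][0].name, statements)
```

In real RDF, entities and properties are identified by URIs rather than short names; the short names here just keep the sketch readable.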

At a higher level, ontologies are also considered semantic models. Ontologies are built using RDF while extending it with the OWL language. OWL provides a richer way to classify objects and more ways to connect them.

Tag supports both. When you create a knowledge graph in Tag, you are creating an OWL ontology. If you have RDF data from another source (e.g., open data from data.gov), that works too. Both formats share the concept of a class, which is the foundation of a knowledge graph.

 
Classes and properties

Knowledge graphs provide a way to describe people, things or ideas - these are all referred to as entities. The best way to describe an entity is to classify it using known terms.

A class is an object in the graph that describes a type of entity by giving it a name - that's it. It's a name you use to refer to this type of entity when relating it to other graph objects. Class names are unique within a knowledge graph.

 

A class is associated with, or linked to, properties that describe things you know about that type of entity. For example, a Dog class could have several useful properties including name, breed, size and color. It might also be useful to relate Dog information to other entities like dog houses and toys. This is shown as a graph below.

This graph contains three classes: Dog, DogHouse and Frisbee. Properties are used to store data and make connections between classes.

The green rectangles represent data properties. Entities that belong to a class may be expected (not required) to store data values for all data properties.

The blue rectangles represent object properties. Entities that belong to a class may be expected (not required) to store references to other entities for all object properties.
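The Dog example can be sketched as a small data structure. This is an illustration only, assuming Python 3.9 or later. The GraphClass type and the object property names hasDogHouse and hasFrisbee are hypothetical, since the text names the classes and the data properties but not the links between them.

```python
from dataclasses import dataclass, field


@dataclass
class GraphClass:
    """A class is a named type of entity, linked to properties."""
    name: str
    # Data properties hold plain values (name, breed, ...).
    data_properties: list[str] = field(default_factory=list)
    # Object properties map a property name to the class it points at.
    object_properties: dict[str, str] = field(default_factory=dict)


classes = {}


def add_class(cls):
    """Register a class, enforcing unique class names within the graph."""
    if cls.name in classes:
        raise ValueError(f"class name already used: {cls.name}")
    classes[cls.name] = cls


add_class(GraphClass(
    "Dog",
    data_properties=["name", "breed", "size", "color"],
    # Hypothetical object property names, chosen for illustration.
    object_properties={"hasDogHouse": "DogHouse", "hasFrisbee": "Frisbee"},
))
add_class(GraphClass("DogHouse"))
add_class(GraphClass("Frisbee"))
```

Nothing here forces an instance of Dog to supply a value for every property; the class only describes what you can expect to find.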

Instances

Entities that belong to a class are called instances. They are objects in the graph that may belong to one or more classes. For each class that an instance belongs to, you can expect to find data values or object references stored that are described by properties linked to that class.

Data is often incomplete and information may not be stored for all properties. Graphs are like that. You can even refer to entities that are not defined in your knowledge graph. The technology is designed to be resilient in this way, and values flexibility over strict rules of use.

Some instances have names, just like classes. The name must be unique in the knowledge graph, and must be different from all class names. Names allow more than one entity to reference a specific instance.

 

You don't always want names. If each row in a CSV represents an instance, no one wants to name them. It's this dual role of instances that sometimes trips people up - sometimes an instance is an object, and sometimes it's just a row in a CSV. For the computer, either way, it's just a collection of named data values to work with.

Anonymous instances do not have names. They can store data values and object references, but can't be looked up by name in the graph. Both types of instances are shown below.

Rover is a named instance of the Dog class. Its name in the graph (Rover) is unique, and is not the same thing as the name data property. A graph name is a logical name, while a data property that happens to be called "name" is just data and has no other special significance.

The dog house and three frisbees are anonymous. They store useful information but do not deserve the visibility of a named instance in the graph.

In another knowledge graph, named frisbee instances may be useful according to the author's needs.
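The difference between named and anonymous instances can be sketched as follows. This is a hypothetical InstanceStore, not Tag's implementation; the data values are made up, and the "_i" prefix simply follows the label convention this page uses for anonymous instances.

```python
import itertools


class InstanceStore:
    """Holds named instances (found by name) and anonymous ones (not)."""

    def __init__(self, class_names):
        self.class_names = set(class_names)
        self.named = {}       # graph name -> instance
        self.anonymous = []   # instances without a graph name
        self._counter = itertools.count(1)

    def add(self, class_name, values, name=None):
        if class_name not in self.class_names:
            raise KeyError(f"unknown class: {class_name}")
        instance = {"classes": [class_name], "values": dict(values)}
        if name is None:
            # Anonymous: gets a generated label but cannot be looked up by name.
            instance["label"] = f"_i{next(self._counter)}"
            self.anonymous.append(instance)
        else:
            # Named: must be unique and must not collide with a class name.
            if name in self.named or name in self.class_names:
                raise ValueError(f"name already in use: {name}")
            instance["label"] = name
            self.named[name] = instance
        return instance


store = InstanceStore(["Dog", "DogHouse", "Frisbee"])
# The graph name "Rover" is separate from the "name" data value.
rover = store.add("Dog", {"name": "Rover the Great"}, name="Rover")
house = store.add("DogHouse", {"color": "red"})
frisbees = [store.add("Frisbee", {"color": c}) for c in ("red", "green", "blue")]
```

Whether frisbees deserve names is a modeling choice; passing name="Frisbee1" to add would promote one to a named instance.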

URIs

The discussion so far has been a bit misleading. It talks about unique names in a knowledge graph, but that wouldn't be specific enough when combining more than one graph. What if a dog walker has a graph containing Dog, and that is combined with a graph for training seeing eye dogs, whose Dog class probably has different properties?

This is resolved by embracing the URI. A URI is just like a Web URL except it doesn't have to refer to a downloadable resource. URIs are used for identification only and to declare ownership over a knowledge graph, or entity within the graph.

These two classes of Dog have the same local name within their own graphs, but when addressed using URIs you have a combination of graph identifier and the local name. For example, the following two classes can be safely used together in a graph.

http://graph.dogwalker.com/model1/Dog

http://graph.seeingeyedogs.com/model2/Dog

In fact, these classes could be entirely different and share no properties in common. One could be written in English, one in French (although it would probably be named Chien). It's simply coincidence that both classes share the same local name.
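A short sketch shows why the two Dog classes do not collide. The two graph identifiers come from the URIs above; the helper functions and the property lists (walkSchedule, trainingLevel and so on) are hypothetical. The split assumes the local name follows the last "/", which is only one of the conventions used in practice.

```python
# Graph identifiers taken from the two example URIs above.
DOG_WALKER = "http://graph.dogwalker.com/model1/"
GUIDE_DOGS = "http://graph.seeingeyedogs.com/model2/"


def make_uri(graph_id, local_name):
    """Combine a graph identifier and a local name into a full URI."""
    return graph_id + local_name


def split_uri(uri):
    """Split a URI into (graph identifier, local name) at the last '/'."""
    graph_id, _, local_name = uri.rpartition("/")
    return graph_id + "/", local_name


walker_dog = make_uri(DOG_WALKER, "Dog")
guide_dog = make_uri(GUIDE_DOGS, "Dog")

# Both classes live side by side in one combined graph, keyed by full URI.
# The property names are made up for illustration.
combined = {
    walker_dog: ["name", "walkSchedule"],
    guide_dog: ["name", "trainingLevel", "handler"],
}

same_local_name = split_uri(walker_dog)[1] == split_uri(guide_dog)[1]
same_class = walker_dog == guide_dog
```

Here same_local_name is true while same_class is false, which is exactly the situation described above.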

Values

The bulk of the information stored in most knowledge graphs is property values. As discussed earlier, there are two kinds of properties: data and object.

Data properties store values that can be stored as a string of characters. It could be words, numbers, dates, image bytes or anything that can be typed into a simple text editor.

Object properties store URIs which identify entities known to the graph. This is a critical aspect of graph technology, because it allows the computer to efficiently connect multiple entities (hint, it turns URIs into numbers and solves algebra equations with them). This efficiency is what allows giant graphs like Facebook to function so well, even with an enormous volume of instances and properties in the graph.
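One common way to "turn URIs into numbers" is dictionary encoding: each distinct URI is assigned a small integer once, and connections are then stored and compared as integers. The sketch below illustrates that general technique only. It is not a description of how Tag or Facebook work internally, and the base URI, the property names and the instance names are hypothetical.

```python
class UriDictionary:
    """Assigns each distinct URI a small integer ID (dictionary encoding)."""

    def __init__(self):
        self._ids = {}    # URI -> integer ID
        self._uris = []   # integer ID -> URI

    def encode(self, uri):
        """Return the ID for a URI, assigning the next free ID if it is new."""
        if uri not in self._ids:
            self._ids[uri] = len(self._uris)
            self._uris.append(uri)
        return self._ids[uri]

    def decode(self, uri_id):
        """Return the URI that was assigned the given ID."""
        return self._uris[uri_id]


BASE = "http://example.org/dogs/"  # hypothetical graph identifier
uris = UriDictionary()
edges = set()  # each edge is (subject ID, property ID, object ID)


def link(subject, prop, target):
    """Store one object property value as a triple of integers."""
    edges.add((uris.encode(BASE + subject),
               uris.encode(BASE + prop),
               uris.encode(BASE + target)))


def targets(subject, prop):
    """Find the URIs a subject points at through a given property."""
    s, p = uris.encode(BASE + subject), uris.encode(BASE + prop)
    return sorted(uris.decode(o) for (es, ep, o) in edges if es == s and ep == p)


link("Rover", "hasDogHouse", "house1")
link("Rover", "hasFrisbee", "frisbee1")
rover_frisbees = targets("Rover", "hasFrisbee")
```

The scan in targets is linear for brevity; a real graph store would add indexes over the integer triples so that lookups stay fast at high volumes.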

Adding values to the example graph looks like this.

We now have much more detail. Rover has one dog house and one frisbee - both are anonymous instances (labels start with "_i").

He also has a height and two names stored as data properties (which may be different from his graph name, Rover).

All shapes in the graph can be selected much like logic bubbles in the smart content example. This allows the author to change the structure of the graph (classes and properties) or the content of the graph (instances and values). It also allows actions to pivot off graph objects.

What can you do with a knowledge graph

Organize

Knowledge graphs are ideal for organizing information. They can be used to define a model of real world objects that can be explored and queried. They can be used to create data dictionaries or taxonomies with varying levels of formality. They can even incorporate a purely logical layer of organization. This is discussed further below.

Another useful application is to embed the codes used in business processes and connect them to intuitive objects that humans can find. For example, create an instance for each supplier of a key part, and link everything needed to fill out an order for that supplier to its instance. Data properties can store any string, including paths to documents in your file system. Users of the graph could select a supplier using any criteria that make sense, then generate a purchase order with all the right codes filled in using smart content.

You used to need programmers to set up something like this. Now, with a bit of help from Tag, you don't.

 
Queries and actions

The main reason to have a knowledge graph is to ask it questions. In Tag there are two ways to query a model; both hide the syntax and formalities, and just let you point and click.

The result of a query can be a single piece of data or a dataset of rows and columns of data. Datasets can be viewed in a grid, saved to CSV or used to drive an action.
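Tag hides query syntax behind point and click, but the shape of a query result is easy to sketch in code. The example below is a hypothetical stand-in, not Tag's query engine. It assumes instances are held as plain dictionaries, uses made-up supplier data, and shows a dataset of rows and columns being saved as CSV text.

```python
import csv
import io

# Hypothetical instances; Bolt has no delivery time recorded.
instances = [
    {"class": "Supplier", "name": "Acme", "price": 12.5, "delivery_days": 3},
    {"class": "Supplier", "name": "Bolt", "price": 11.0},
    {"class": "Supplier", "name": "Cog", "price": 15.0, "delivery_days": 2},
    {"class": "Part", "name": "Widget"},
]


def query(instances, class_name, columns, where=lambda inst: True):
    """Return one row per matching instance, with a cell for each column."""
    return [
        {c: inst.get(c, "") for c in columns}   # missing values stay empty
        for inst in instances
        if inst.get("class") == class_name and where(inst)
    ]


def to_csv(dataset, columns):
    """Render a dataset as CSV text with a header row."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=columns, lineterminator="\n")
    writer.writeheader()
    writer.writerows(dataset)
    return buf.getvalue()


columns = ["name", "price", "delivery_days"]
dataset = query(instances, "Supplier", columns, where=lambda s: s["price"] < 13)
csv_text = to_csv(dataset, columns)
```

The empty delivery_days cell for Bolt reflects the point made earlier: graph data is often incomplete, and a query should tolerate that rather than fail.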

An example of driving an action was provided in the previous section (generating a purchase order). When the user is selecting a supplier they would presumably be querying the graph for things like pricing, delivery times, reliability, level of satisfaction and more. As you brainstorm about what's important when picking the right supplier, you can store almost anything in the graph including links into other running systems (e.g., to do an online price check) that can be run in a browser.

Note that some actions might require the generation of multiple documents. This can be accomplished using tasks (also called pipelines). In the previous example you could generate a sample purchase order for each supplier, possibly doing some math to calculate the number of units to purchase. The purchasing agent could then review all possible orders and pick the best one, keeping only one file on record for the winning order.

Connect

In many industries there are shared models that are important in day-to-day work. In a growing number of cases, these models are being formalized as ontologies and other digital formats that are compatible with knowledge graphs. Some examples of notable public knowledge graphs include:

 

Online commerce

  • GoodRelations

Web search

  • schema.org

Provenance metadata

  • PROV

Finance

  • FIBO (things of interest in financial applications)

  • FRO (regulatory compliance)

  • ACTUS (interoperable digital financial assets)

Medical

  • FHIR (electronic health records)

  • ICD-10-CM (International Classification of Diseases)

  • RxNorm (clinical drugs)

  • SNOMED (clinical terminology)

  • OGMS (Ontology for General Medical Science)

  • MedDRA (data entry, retrieval, analysis, and display)

  • CPT (Current Procedural Terminology)

Legal

  • LKIF

Insurance

  • IRO

There are many other examples. Useful models don't have to be a specific format as long as they are well organized.

 

For example, the DSM (Diagnostic and Statistical Manual of Mental Disorders) is published as a book and used by millions of healthcare workers. Usually people work with a subset of the diagnoses, and information for these can be copied to a knowledge graph to make it easier to find and apply to their work. (Note: we have an electronic version of the DSM that can be used in some circumstances. Contact us if you think that might be useful for your team.)

Reasoning

One of the main strengths of ontologies over other kinds of knowledge graphs is automated reasoning. There is a certain amount of "default" reasoning provided to make things convenient. Or you can turn on an OWL reasoner to get full power.

The default view of a graph in Tag respects the hierarchical structure of classes. For example, you can define a Mammal class and a Dog subclass. When you create a subclass it inherits all properties that are linked to the parent (superclass). This way a fur property attached to Mammal can be reused by Dog, Cat and many other classes. Put another way, the computer infers that Dog can use the fur property even though it is not directly linked. This loosely describes default reasoning.
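The inheritance described above can be sketched in a few lines. This is an illustration of the idea, not Tag's implementation. It assumes each class has at most one superclass (OWL itself allows several), and the breed property on Dog is added only for the example.

```python
# Each class points at its superclass; None marks the top of the hierarchy.
superclass_of = {"Mammal": None, "Dog": "Mammal", "Cat": "Mammal"}

# Properties linked directly to each class.
linked_properties = {"Mammal": ["fur"], "Dog": ["breed"], "Cat": []}


def usable_properties(class_name):
    """Walk up the hierarchy, collecting inherited properties first."""
    props = []
    current = class_name
    while current is not None:
        # Prepend so that superclass properties appear before subclass ones.
        props = linked_properties.get(current, []) + props
        current = superclass_of.get(current)
    return props


dog_properties = usable_properties("Dog")
cat_properties = usable_properties("Cat")
```

The fur property is linked only to Mammal, yet it appears in the result for both Dog and Cat. That inferred availability is the essence of default reasoning.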

Using an OWL reasoner allows you to infer even more information. A good example of this relates to pizza.

 

There is a well known example ontology about pizza that nicely shows how reasoning can be used. This knowledge graph defines classes for Pizza and many Toppings. Each topping has some descriptive information including a way to tell if it contains meat. After defining many delicious recipes (like Capricciosa, Cajun, CheeseyPizza, Fiorentina, FourSeasons and Giardiniera), each Pizza instance (recipe) knows what toppings it has.

When browsing the pizza graph, the "table of contents" view changes dramatically when OWL reasoning is turned on. You can now browse hierarchies of meaty and vegetarian pizzas. Meaty pizza is defined as any pizza with at least one meat topping. Vegetarian pizza is defined as any pizza without meat or fish. You can now do queries that treat VegetarianPizza as if it's a class referenced by every Pizza instance that has no meat or fish. This provides a very powerful way to organize information that can infer missing pieces of information.
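The two definitions above can be approximated in plain code. The sketch below is not an OWL reasoner. It assumes every pizza's topping list is complete, whereas a real reasoner needs that stated explicitly before it will conclude that a pizza has no meat. The toppings and recipes are made up for illustration and are not taken from the pizza ontology.

```python
# Hypothetical toppings with the descriptive flags the definitions rely on.
toppings = {
    "Mozzarella": {"meat": False, "fish": False},
    "Tomato": {"meat": False, "fish": False},
    "Mushroom": {"meat": False, "fish": False},
    "Ham": {"meat": True, "fish": False},
    "Anchovy": {"meat": False, "fish": True},
}

# Hypothetical recipes: each pizza knows which toppings it has.
pizzas = {
    "CheeseAndTomato": ["Mozzarella", "Tomato"],
    "HamAndMushroom": ["Mozzarella", "Ham", "Mushroom"],
    "AnchovySpecial": ["Tomato", "Anchovy"],
}


def inferred_classes(pizza):
    """Derive class membership from toppings instead of storing it."""
    tops = [toppings[t] for t in pizzas[pizza]]
    classes = {"Pizza"}
    # Meaty: at least one meat topping.
    if any(t["meat"] for t in tops):
        classes.add("MeatyPizza")
    # Vegetarian: no meat and no fish toppings.
    if not any(t["meat"] or t["fish"] for t in tops):
        classes.add("VegetarianPizza")
    return classes


vegetarian = sorted(p for p in pizzas if "VegetarianPizza" in inferred_classes(p))
meaty = sorted(p for p in pizzas if "MeatyPizza" in inferred_classes(p))
```

No recipe mentions VegetarianPizza or MeatyPizza, yet both can be queried as if they were stored. Note that the anchovy pizza lands in neither group, since fish is not meat but still rules out vegetarian.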

Summing up

This discussion has only scratched the surface of what's possible with knowledge graphs. Some topics not yet covered:

  • Storage: how to store large models efficiently and share over a network or the Web

  • Editing: how to link classes, properties and instances - and add data

  • Rule engines: how automated rule engines can be defined using graph entities

  • Machine learning: how graphs can help define a problem space for ML models

We'll be providing more content when the knowledge graph module is released.

nSymbol Technology Inc. is a startup company operating out of Alberta, Canada. 

© 2021 by nSymbol Technology Inc.