Network (knowledge) graphs represent a collection of interlinked entities organized into contexts via linking and semantic metadata. They provide a framework for data integration and analysis, which adds context to the metrics recorded from a networked system. With that context, you can see not only what the data says but also how the entities behind it relate to one another.
Networks Are Everywhere – If you consider the common notion of networks, you’ll realize they are everywhere, both in the virtual and real worlds. The internet is just an extensive network of computing devices. Road systems are also networks in how they intersect with one another.
On a smaller scale, every business has networks of data. Standard business intelligence approaches show only the metrics; network or knowledge graphs open up the possibility of deeper analysis.
3 Main Components of Network Graphs
Knowledge or network graphs consist of three main components: nodes, edges, and edge weights. Let’s define each.
1. Nodes – Tangible and Intangible Entities
The image above depicts a node. In the network or knowledge graph, nodes could represent any tangible entity such as people, places and equipment as well as any intangible entity such as ideas, concepts or topics.
2. Edges – Illustrate Relationships
If two nodes have a relationship, it appears as an edge. The edges of the same node above are now highlighted to show which other nodes it is connected to. An edge can define the specific relationship and explain why the two nodes are connected. Both nodes and edges carry attributes that help describe the relationships in the network.
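As a minimal sketch of how nodes and edges with attributes might be represented, the Python below uses two small dataclasses. The class names, field names, and sample entities (`alice`, `acme`, `graphs`) are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """An entity in the graph, tangible (a person) or intangible (a topic)."""
    node_id: str
    attributes: dict = field(default_factory=dict)


@dataclass
class Edge:
    """A relationship between two nodes, with its own descriptive attributes."""
    source: str
    target: str
    relation: str
    attributes: dict = field(default_factory=dict)


# Hypothetical sample data, not taken from any real dataset.
nodes = {
    "alice": Node("alice", {"kind": "person"}),
    "acme": Node("acme", {"kind": "company"}),
    "graphs": Node("graphs", {"kind": "topic"}),
}
edges = [
    Edge("alice", "acme", "works_at", {"since": 2020}),
    Edge("alice", "graphs", "interested_in"),
]


def edges_of(node_id, edges):
    """Return every edge that touches the given node."""
    return [e for e in edges if node_id in (e.source, e.target)]


# The relationships that the node "alice" takes part in.
alice_relations = [e.relation for e in edges_of("alice", edges)]
print(alice_relations)
```

Keeping attributes on both the node and the edge is what lets a graph say why two entities are connected, not only that they are.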
3. Edge Weights – Signify the Strength of the Relationship
An edge can be given a weight that defines the strength of the relationship between two nodes, and a direction that shows which way the relationship runs. Combining the two gives four types of edges:
- Undirected and unweighted: Signifies that there is a connection, but it has no direction or weight
- Undirected and weighted: Demonstrates a connection with no direction, whose weight records its strength, such as the “number” of connections
- Directed and unweighted: Indicates a connection that runs from one node to the other but carries no weight; the nodes are either connected or not
- Directed and weighted: Denotes a connection that has both a direction and a definable weight
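The four combinations of direction and weight can be sketched with a plain adjacency mapping. This is one possible representation, not a standard one: the function name `build_graph`, its flags, and the sample edges are assumptions made for illustration.

```python
def build_graph(edge_list, directed=False, weighted=False):
    """Build an adjacency mapping of the form {node: {neighbor: weight}}.

    Unweighted graphs store a weight of 1 for every edge.
    Undirected graphs store each edge in both directions.
    """
    adjacency = {}
    for source, target, weight in edge_list:
        w = weight if weighted else 1
        adjacency.setdefault(source, {})[target] = w
        # Make sure the target appears even if it has no outgoing edges.
        adjacency.setdefault(target, {})
        if not directed:
            adjacency[target][source] = w
    return adjacency


# Hypothetical edges as (source, target, weight) tuples.
edge_list = [("A", "B", 3), ("B", "C", 1)]

undirected_unweighted = build_graph(edge_list)
undirected_weighted = build_graph(edge_list, weighted=True)
directed_unweighted = build_graph(edge_list, directed=True)
directed_weighted = build_graph(edge_list, directed=True, weighted=True)

print(directed_weighted)
```

In the directed variants, `"B"` lists `"C"` as a neighbor but `"C"` does not list `"B"`, which is the practical difference direction makes when you later traverse the graph.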
How Is a Knowledge Graph Built?
These graphs usually include datasets from many sources, which can vary in structure. Three elements bring consistency: schemas provide the framework, identities classify the nodes, and context determines the setting in which the information exists. Context is especially useful when data points have multiple meanings (e.g., Apple versus apple).
Most knowledge or network graphs leverage machine learning (ML) and natural language processing (NLP) to build a comprehensive view of nodes, edges, and edge weights through semantic enrichment. When data is ingested, this process allows the graph to identify individual objects and their relationships to one another. Next, there is usually comparison and integration with other datasets that are relevant or similar.
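To make the roles of schema, identity, and context more concrete, here is a deliberately simplified Python sketch. The schema, the keyword lists, and both function names are hypothetical; a production system would typically rely on trained models, not keyword matching.

```python
# Hypothetical schema: the fields each entity type must provide.
SCHEMA = {
    "Company": {"name", "industry"},
    "Fruit": {"name", "color"},
}

# Hypothetical context hints: words that suggest one type over the other.
CONTEXT_HINTS = {
    "Company": {"iphone", "shares", "ceo"},
    "Fruit": {"pie", "orchard", "eat"},
}


def validate(entity_type, record):
    """Check that a record has every field the schema requires for its type."""
    return SCHEMA[entity_type].issubset(record)


def resolve_identity(mention, sentence):
    """Classify an ambiguous mention using the words around it.

    Returns the best-matching entity type, or None when no hint matches.
    """
    words = set(sentence.lower().split()) - {mention.lower()}
    scores = {t: len(words & hints) for t, hints in CONTEXT_HINTS.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


print(resolve_identity("Apple", "Apple shares rose after the iPhone launch"))
print(resolve_identity("apple", "I baked an apple pie"))
```

The same surface word lands in two different identities depending on its context, and the schema then decides which fields a record of that identity must carry.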
Why Use Network Graphs?
In theory, a network can be as simple as a few nodes and edges. However, the more nodes are added to the network, the more complicated the graph becomes. As a result, large networks can be challenging to interpret or analyze.
Some use cases for network graphs include:
- Retail: Knowledge graphs can drive intelligence around upselling and cross-selling strategies, product recommendations based on past buyer behavior, and purchase trends specific to demographics.
- Finance: Banks often use these graphs for anti-money-laundering initiatives. By tracking the flow of money across customers, the graphs can identify noncompliance, serve as a preventative measure against crime, and prompt investigations.
- Healthcare: Medical researchers use knowledge or network graphs to organize and categorize relationships. These graphs can support diagnosis validation and help determine the best treatment plan on an individual basis.
In these situations, obtaining insight can be challenging, which is where network visualization and software come into the picture.
Example: How To Visualize Airport Traffic Data
Starting from the raw data above, it is safe to say that a tabular format conveys little about how this network is structured. So the starting point is to extract and visualize this data as a network whose nodes represent airports and whose edges represent specific flight routes.
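The extraction step can be sketched in a few lines of Python. The route table below is hypothetical: the column names and the placeholder airport codes (`AAA`, `BBB`, and so on) stand in for whatever the real dataset contains, and routes are assumed to be one-directional.

```python
import csv
import io

# Hypothetical route table with placeholder airport codes.
RAW_ROUTES = """source,destination
AAA,BBB
BBB,CCC
AAA,CCC
CCC,DDD
"""


def routes_to_network(csv_text):
    """Turn route rows into a set of airport nodes and an adjacency mapping."""
    nodes = set()
    adjacency = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        src, dst = row["source"], row["destination"]
        nodes.update((src, dst))
        # Each row is treated as a directed edge from source to destination.
        adjacency.setdefault(src, set()).add(dst)
        adjacency.setdefault(dst, set())
    return nodes, adjacency


airports, routes = routes_to_network(RAW_ROUTES)

# The airport with the most outbound routes in this toy table.
busiest = max(routes, key=lambda airport: len(routes[airport]))
print(sorted(airports), busiest)
```

Once the rows are in this form, structural questions such as which airport has the most outbound routes become one-line lookups instead of table scans.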
We can better understand this network by overlaying it on the surface of the earth, so that we can begin to ask questions of the data, such as what the shortest path between two nodes is.
In this case, we select Tampa International (Florida) and Circle City (Alaska), whose shortest path is composed of 5 different flight routes.
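A shortest-path question like this one, measured in number of flight routes, can be answered with a breadth-first search. The sketch below is not the method used to produce the result above; it runs on a hypothetical toy network with placeholder airport codes, and the function name `shortest_path` is an assumption.

```python
from collections import deque


def shortest_path(adjacency, start, goal):
    """Breadth-first search for the path with the fewest edges.

    Returns the list of nodes from start to goal, or None if unreachable.
    """
    if start == goal:
        return [start]
    previous = {start: None}  # Maps each visited node to the node before it.
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for neighbor in adjacency.get(current, ()):
            if neighbor in previous:
                continue
            previous[neighbor] = current
            if neighbor == goal:
                # Walk backwards from the goal to rebuild the path.
                path = [goal]
                while previous[path[-1]] is not None:
                    path.append(previous[path[-1]])
                return path[::-1]
            queue.append(neighbor)
    return None


# Hypothetical directed routes between placeholder airports.
routes = {
    "AAA": ["BBB", "CCC"],
    "BBB": ["DDD"],
    "CCC": ["DDD"],
    "DDD": ["EEE"],
    "EEE": [],
}

path = shortest_path(routes, "AAA", "EEE")
hops = len(path) - 1  # Number of flight routes along the path.
print(path, hops)
```

Breadth-first search treats every route as equal. If edges carried weights such as distance or flight time, a weighted algorithm such as Dijkstra's would be the usual substitute.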
One can imagine applying the same approach to protein data, to understand how effectively certain drug molecules target different disease pathways, or to purchase data, to understand what else customers are likely to buy given a purchase.
Networks like these are everywhere, and through visualizations like these, their behaviour comes to life in ways that drive new questions, hypotheses, and decisions on how to optimize them.
Author: Alessandro Lativ