
Why do Knowledge Graphs, Ontologies and GraphRAG still not work?

  • Writer: Rajasankar Viswanathan
  • Jun 2
  • 5 min read

Tl;dr - Summary of the points. 

  1. Graph databases have been around for a long time.

  2. Adoption rates are low.

  3. Graph databases are used as a relational database without tables.

  4. Getting data into graph databases is still a problem.

  5. Using an ontology to define relationships may work in scientific research, but not in enterprises.

  6. Ontology classifications can’t account for the practically unlimited number of variations and categories.

  7. Extracting relationships from LLMs is fraught with risks. 

  8. If LLMs can extract relationships, then LLMs can work without a graph database, which invalidates the reason for graphs and ontologies.

  9. Even with the data in place, graph databases still don't add much value.

  10. Graph data still suffers from the Ranking problem that plagues GraphRAG.

  11. Knowledge discovery runs into the Curse of Dimensionality: by the 3rd hop, the nodes to be analyzed number in the millions or cover the entire dataset.


With knowledge graphs getting more attention and their usage rising, I decided to pen down my thoughts on why knowledge graphs and graph databases never saw widespread use.


The idea of graphs as a data structure and a database is solid: store and retrieve data along with its connections and relationships, without worrying about tables or a fully fine-grained schema.


Graph databases also came with the promise that they would make the discovery of new facts possible. These expectations and promises never materialized.


The reason is obvious. A database based on true graph theory would be impossible to work with. Graph traversal, even with hundreds of nodes and edges, soon hits a compute problem, because exhaustively analyzing the paths would take forever.
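To get a feel for the compute problem, here is a minimal sketch that counts the simple paths between two nodes. It assumes the worst case of a complete graph, where every node is connected to every other node; real graphs are sparser, so treat the numbers as an upper bound. The function name is mine, for illustration only.

```python
from math import factorial

def simple_paths_in_complete_graph(n):
    """Count simple paths between two fixed nodes of a complete graph with n nodes."""
    inner = n - 2  # nodes other than the two endpoints
    # A path visits an ordered selection of k intermediate nodes, for k = 0..inner.
    return sum(factorial(inner) // factorial(inner - k) for k in range(inner + 1))

for n in (4, 10, 20, 100):
    print(n, simple_paths_in_complete_graph(n))
```

With 4 nodes there are 5 paths to look at; with 10 nodes there are already 109,601. The count grows factorially, which is why exhaustive path analysis stops being practical long before the graph gets large.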


A decade and a half ago, a lot of applications and projects mushroomed to use this new technology.


Does anyone still remember Freebase, whose data went on to feed Google’s Knowledge Graph, or Facebook's graph query tool?

How many people today use Wikidata?


Why did these databases and tools fail? That is a major question we need to confront in the age of LLMs. 


Process of creating graph data (Knowledge graph included)


The steps for creating a graph data structure are simple on paper but very hard to implement in the real world.


Find the entities, find the relationships between them, and store them in the database.
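On paper, those three steps can be sketched in a few lines. The facts below are made up, and the hard parts (finding the entities and the relationships) are assumed to be already done by hand, which is exactly what the real world does not give you.

```python
from collections import defaultdict

# Steps 1 and 2 (assumed done): hypothetical facts as (subject, relation, object) triples.
triples = [
    ("Alice", "works_at", "Acme"),
    ("Acme", "located_in", "Paris"),
    ("Alice", "lives_in", "Paris"),
]

# Step 3: store the triples as an adjacency list keyed by subject.
graph = defaultdict(list)
for subject, relation, obj in triples:
    graph[subject].append((relation, obj))

print(graph["Alice"])
```

Note that nothing in this structure says whether "Paris" is a place or a person. That decision has to be made before the triple is written.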


In the real world, relationships are complex. Even deciding which name is a place and which is a person is difficult. So the industry came up with some workarounds.


Instead of wrestling with one or several impossible problems, the thinking went, it would be better to reduce the scope of the problem so that graph data structures could be deployed.


Frameworks such as the Resource Description Framework (RDF) and query languages such as SPARQL make it easy to design a schema and run queries.
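To show what such a query boils down to, here is a rough sketch of triple pattern matching in plain Python. It is not RDF or SPARQL itself; the data, the `match` helper and the predicate names are hypothetical, and `None` plays the role that a variable plays in a query pattern.

```python
# Hypothetical facts as (subject, predicate, object) triples.
TRIPLES = [
    ("Alice", "works_at", "Acme"),
    ("Bob", "works_at", "Acme"),
    ("Acme", "located_in", "Paris"),
]

def match(triples, s=None, p=None, o=None):
    """Return the triples that fit the pattern; None acts as a wildcard."""
    return [
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]

# Roughly the idea behind: SELECT ?who WHERE { ?who :works_at :Acme }
employees = [s for s, _, _ in match(TRIPLES, p="works_at", o="Acme")]
print(employees)
```

The pattern only works because the predicate names were fixed in advance, which is the schema design step in another form.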


This means that instead of being true graph structures, these databases and products become relational databases without tables.


A relational database is also a connected graph, but in a very limited way. It has tables that are connected to other tables by foreign keys. Instead of connecting each data point, relational databases connect sets of data points held in tables, i.e. the tables are the nodes rather than names or entities.
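The difference in granularity can be sketched with a made-up order database. The table names and rows below are purely illustrative.

```python
# Hypothetical schema: each table lists the tables it references via foreign keys.
schema_graph = {
    "orders": ["customers", "products"],
    "customers": [],
    "products": [],
}

# The same data at the entity level: every row is its own node.
entity_edges = [
    ("order:1", "customer:alice"),
    ("order:1", "product:laptop"),
    ("order:2", "customer:alice"),
    ("order:2", "product:phone"),
    ("order:3", "customer:bob"),
    ("order:3", "product:laptop"),
]

schema_nodes = set(schema_graph)
entity_nodes = {node for edge in entity_edges for node in edge}

print(len(schema_nodes), "table-level nodes")
print(len(entity_nodes), "entity-level nodes")
```

The table-level graph stays at 3 nodes no matter how many rows are added, while the entity-level graph grows with every row.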


With the same kind of schema setup, graph data structures and databases became relational databases without tables. Everything except the tables remains.


Writing SQL-style join queries against graph databases doesn't make any sense. The original idea was to do graph traversal, not join queries. But that, too, is a workaround.
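Here is a small sketch of the two styles side by side, answering the same question: which nodes are two hops away from A? The edge list is made up, and SQLite from the Python standard library stands in for a relational database.

```python
import sqlite3

edges = [("A", "B"), ("B", "C"), ("C", "D"), ("A", "E")]

# Relational style: one self-join per hop.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE edge (src TEXT, dst TEXT)")
conn.executemany("INSERT INTO edge VALUES (?, ?)", edges)
two_hop_sql = conn.execute(
    "SELECT e2.dst FROM edge e1 JOIN edge e2 ON e1.dst = e2.src WHERE e1.src = ?",
    ("A",),
).fetchall()
conn.close()

# Graph style: breadth-first traversal over an adjacency list.
adjacency = {}
for src, dst in edges:
    adjacency.setdefault(src, []).append(dst)

def nodes_at_hop(start, hops):
    """Return the nodes first reached exactly `hops` steps away from `start`."""
    seen = {start}
    frontier = [start]
    for _ in range(hops):
        next_frontier = []
        for node in frontier:
            for neighbor in adjacency.get(node, []):
                if neighbor not in seen:
                    seen.add(neighbor)
                    next_frontier.append(neighbor)
        frontier = next_frontier
    return set(frontier)

print(two_hop_sql, nodes_at_hop("A", 2))
```

Both return C here. The join version needs the query rewritten with another join for every extra hop, while the traversal only needs a different `hops` value.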


Next, ontology was introduced as a workaround. Ontological classifications are used to handle the complexity of the relationships themselves. However, there is a limit to how far variations in the data can be compressed, and trying to do that on real-world data hits a computing and knowledge bottleneck.
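A toy sketch of the limitation, assuming a tiny hand-written class hierarchy. The class names are hypothetical and not taken from any real ontology.

```python
# Hypothetical ontology: each class points to its parent class.
ONTOLOGY = {
    "Person": "Agent",
    "Organization": "Agent",
    "Agent": "Thing",
    "Place": "Thing",
}

def ancestors(cls):
    """Return the chain of parent classes, or None if the class is not in the ontology."""
    if cls not in ONTOLOGY and cls not in ONTOLOGY.values():
        return None
    chain = []
    while cls in ONTOLOGY:
        cls = ONTOLOGY[cls]
        chain.append(cls)
    return chain

print(ancestors("Person"))
print(ancestors("Joint Venture"))
```

"Person" resolves to Agent and then Thing, but "Joint Venture" returns `None` because nobody anticipated it. Every category that real-world data throws up has to be added by someone before the classification can handle it.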


But workarounds are workarounds. They don't make the problem go away. They prevented graphs from gaining wider acceptance, and they made graph data structures practically unusable in many ways.


That is why even major tech companies retired their graph-based products.

Then advances in Natural Language Processing made finding entities and relationships easier. Once-famous algorithms such as Word2vec and GloVe can be seen as attempts to fix this issue too. Google famously tried to find semantically similar pairs using a distance measure between word vectors.
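The distance idea can be sketched with cosine similarity, a common way to compare word vectors. The 3-dimensional vectors below are invented for illustration; real embeddings are learned from text and have far more dimensions.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; closer to 1 means more similar."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Made-up toy vectors, not output from any trained model.
vectors = {
    "paris": [0.9, 0.1, 0.3],
    "london": [0.8, 0.2, 0.3],
    "banana": [0.1, 0.9, 0.2],
}

print(cosine_similarity(vectors["paris"], vectors["london"]))
print(cosine_similarity(vectors["paris"], vectors["banana"]))
```

The first pair scores much higher than the second. The score says the two words are used in similar ways, but it does not name the relationship between them, which is what a graph edge needs.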


But that was still not enough. 


Enter the era of LLMs. 


LLMs may solve some of the issues with graph data structures. However, there is still a problem. The argument for using LLMs to create graph data structures is fraught with issues. If LLMs can find enough context to extract entities and relationships, then the reason for having separate knowledge graphs is invalid. LLMs can do it without graphs.


If LLMs need graphs because they lack context or can't hold enough in memory, then the graphs created by those same LLMs are hallucinated. Either way, graphs with LLMs can’t be justified. The argument for using LLMs to create a graph database that LLMs then rely on, because LLMs would otherwise hallucinate, is a non-starter.


Let us assume that the graphs created by LLMs are perfect. What is next?


All the above issues are still there. The graph can’t reason, and it can't discover.


If that is not possible, how about the new use of graph data, i.e. GraphRAG?


Well, Retrieval Augmented Generation and its so-called enhancement, GraphRAG, still don't solve the problem.


As this article is not about RAG, only GraphRAG is discussed. 


Does GraphRAG serve the purpose of search? No, because there is no ranking score or function in GraphRAG. Search is all about finding the most relevant information. GraphRAG has no method to provide that.


Without that, how is relevance decided?


There is no way around it. Without relevance, LLMs can't be told what is more relevant and what is less relevant, and the whole point of using GraphRAG to fix the LLMs' issues falls apart.
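The missing ranking step can be illustrated with a deliberately naive retrieval sketch. This is my own toy setup with made-up facts, not a description of any particular GraphRAG implementation.

```python
# Hypothetical knowledge graph as (subject, relation, object) triples.
TRIPLES = [
    ("Acme", "founded_in", "1990"),
    ("Acme", "headquartered_in", "Paris"),
    ("Acme", "acquired", "Globex"),
    ("Acme", "sponsors", "Local Chess Club"),
]

def retrieve_facts(entity):
    """Return every triple that mentions the entity, in storage order."""
    return [t for t in TRIPLES if entity in (t[0], t[2])]

def build_context(question, entity):
    """Join all retrieved facts into prompt context; the question is never used to score them."""
    facts = retrieve_facts(entity)
    return "\n".join(f"{s} {p} {o}" for s, p, o in facts)

context_a = build_context("Who did Acme acquire?", "Acme")
context_b = build_context("Where is Acme based?", "Acme")
print(context_a == context_b)
```

Two different questions get exactly the same context, in the same order. The edges record that a connection exists, but nothing in them says which connection matters more for a given question.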


Coming to reasoning: reasoning isn’t always about a single correct answer. Most of the time it is about several probable answers, from which people can choose what is preferable to them.


Reasoning in graph structures can't work because there is no way to select the most probable and most relevant option, nor to list those options for a human reviewer to choose from.


It is not that industry experts don’t know this or haven't tried to solve the ontology problem. Approaches such as Word2vec and GloVe were introduced to solve this problem. Yet here we are, even with those efforts.


Still, the original problem that graph data structures were designed for is not solved.


The premise was that if the data is converted into relationships, reasoning over and querying the data would be easier. That premise is not yet operational.


Manual curation, ontologies and semantic structures were all designed to solve these problems. But the problem is yet to be solved. Either it takes too much time or the nature of the data makes it impossible.


Take the case of new knowledge discovery. The basic problem with using graphs for knowledge discovery is the curse of dimensionality. 


Once the graph traversal goes beyond 2 hops, the number of nodes to consider increases exponentially. This makes finding any meaningful connections impossible.
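A back-of-the-envelope sketch of that growth, assuming every node has about 100 neighbors and the dataset holds 10 million nodes. Both numbers are picked for illustration, and the estimate ignores overlap between neighborhoods, so it is a rough upper bound.

```python
def reachable_estimate(avg_degree, hops, total_nodes):
    """Rough upper bound on nodes reached at a given hop, capped at the dataset size."""
    return min(avg_degree ** hops, total_nodes)

for hops in range(1, 5):
    print(hops, reachable_estimate(100, hops, 10_000_000))
```

Under these assumptions the count goes from 100 at the first hop to 10,000 at the second and 1,000,000 at the third. By the fourth hop the estimate hits the cap, meaning the traversal is effectively looking at the entire dataset.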


We need to imagine an alternative.

 
 
 

© 2025 NaturalText, Inc. All rights reserved.
