Magnifico’s Post

Magnifico reposted this

Tony Seale

The Knowledge Graph Guy

🔵 How Graphs Could Shape the Future of Vector Search 🔵

With ongoing advancements in Large Language Models (LLMs) such as ChatGPT, vector-based search mechanisms are rapidly moving from auxiliary features to core functionality in many platforms. Vector search is now found not only in specialised stores like Pinecone and Weaviate but also in search platforms such as Elasticsearch and databases like MongoDB. Notably, both of these platforms have employed an algorithm called Hierarchical Navigable Small Worlds (HNSW) to deliver efficient vector search. HNSW is a graph-based algorithm; its power lies in its ability to transform continuous embedding vectors into a discrete, layered graph.

🔵 Discrete and Continuous Semantics

Fuzzy matching strategies have traditionally been implemented in conjunction with discrete filters. In search, this is referred to as 'faceting' (think of searching for 'shiny black shoes' on eBay and then selecting a specific brand from a dropdown menu). This hybrid approach has proven effective and is being widely adopted for vector search as well. For example, one might restrict documents based on geographical origin or timeframe and then use vector-based search to gauge sentiment only within that subset.

🔵 A Graph-Based Revolution

Traditional filtering is typically based on tabular (rows in a database) or tree-like (JSON documents) data formats. The landscape changes significantly when the data itself is structured as a graph. When employing HNSW in a graph-based setup, both continuous vectors and discrete facets become vertices in the same graph. This allows for more nuanced relationships and more efficient alignment. Furthermore, the upper layers within HNSW represent a form of compression. With your data in a graph, you can move beyond the classic HNSW node-degree compression algorithms to consider more semantic forms of compression, which take domain-specific ontologies into account. This could prove to be very powerful.
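As a rough illustration of facets and vectors living in one graph, here is a minimal Python sketch. It is an assumption-laden toy, not HNSW and not any vendor's API: the `FacetedGraph` class, the `region:UK`-style facet identifiers, and the two-dimensional embeddings are all hypothetical. Facets and documents are both vertices; a search walks from a facet vertex to its neighbouring documents (the discrete step) and then ranks only that subset by cosine similarity (the continuous step), using brute force rather than a layered index.

```python
import math
from collections import defaultdict


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class FacetedGraph:
    """Toy graph in which facets and documents are both vertices (hypothetical)."""

    def __init__(self):
        self.edges = defaultdict(set)  # vertex id -> neighbouring vertex ids
        self.vectors = {}              # document vertex id -> embedding

    def add_document(self, doc_id, vector, facets):
        # Store the embedding and link the document to each facet vertex.
        self.vectors[doc_id] = vector
        for facet in facets:
            self.edges[facet].add(doc_id)
            self.edges[doc_id].add(facet)

    def search(self, query, facet, k=3):
        # Discrete step: follow edges from the facet vertex to its documents.
        candidates = [v for v in self.edges.get(facet, ()) if v in self.vectors]
        # Continuous step: brute-force cosine ranking within that subset.
        ranked = sorted(
            candidates,
            key=lambda d: cosine(query, self.vectors[d]),
            reverse=True,
        )
        return ranked[:k]


graph = FacetedGraph()
graph.add_document("doc1", [0.9, 0.1], {"region:UK", "year:2023"})
graph.add_document("doc2", [0.1, 0.9], {"region:UK", "year:2022"})
graph.add_document("doc3", [0.8, 0.2], {"region:US", "year:2023"})

results = graph.search([1.0, 0.0], "region:UK", k=2)
print(results)
```

In a real system the brute-force ranking would be replaced by an approximate index such as HNSW; the point here is only that the facet filter becomes an edge traversal rather than a separate table lookup.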
🔵 Key Takeaways for Organisations

I posit that transitioning to graph-based data structures is the next logical step in the evolution of search and knowledge representation. Therefore, my advice to organisations looking to stay ahead in the data management and analytics game is to transition as much of their core data into a graph structure as quickly as possible.

⭕ HNSW: https://lnkd.in/eH7JqEyZ
⭕ Continuous and Discrete: https://lnkd.in/ex8HA_Nj
⭕ Embrace Complexity: https://lnkd.in/ejZikEGp
⭕ Semantic Router: https://lnkd.in/eucZUjrV

Mercia E. Arnold

Quick-study, multifaceted, friendly, lifelong learner, strategic practical orthogonal critical thinker and business model surfer. Economist & Attorney. State Legislative and Fiscal Counsel.

10mo

How does one protect against #bias in #vector based search #algorithms where people may choose to ignore non-zero #crosspartialderivatives in their #LLMs? Should #disclosure of #vector #assumptions be required as #validation of the #LLM as #science rather than #faith? Biases that are #faithbased, in the United States, are permissible, but may be a basis for #impermissible exclusions of the #possible. IMHO, #technology should #facilitatenotfrustrate human creativity as a transport TO information, & NOT an arbiter OF information.

John O'Gorman

Disambiguation Specialist

10mo

Tony - I'm not that comfortable with the phrase 'Discrete and Continuous Semantics', but I think I know where you're headed here. I think it may be more precise to say that 'Discrete' refers to a single token with a semantically equivalent definition or general description (also a 'token' in my view). So a semantically discrete token has meaning. LLMs also (obviously) use uniquely identified (discrete) tokens, but what you describe as 'Continuous' refers to a list of probabilities of 'next tokens', which seems to indicate syntax. So, a token is discrete, while a list of embeddings (let's call it a vector) tells me which discrete, semantically stable tokens are likely to be associated with it. The combination of semantically enriched faceted knowledge graphs and syntactically robust LLMs is indeed a powerful one. McCarthy Tétrault

Victor Grazi

Oracle Java Champion, Pluralsight author Twitter @vgrazi

10mo

I find it easier to express this relationally: select customer, sum(sales) where product=tv group by customer order by sum(sales) desc. Is that just my experience bias?
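The query in the comment above omits a FROM clause and leaves `tv` unquoted, so it would not run as written. A hedged, runnable version using Python's built-in `sqlite3` module might look like the sketch below; the table name `sales_facts`, its columns, and the sample rows are assumptions made purely for illustration and do not come from the comment.

```python
import sqlite3

# In-memory database with a hypothetical sales table (name and rows are assumed).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales_facts (customer TEXT, product TEXT, sales REAL)")
conn.executemany(
    "INSERT INTO sales_facts VALUES (?, ?, ?)",
    [
        ("alice", "tv", 500.0),
        ("bob", "tv", 300.0),
        ("alice", "tv", 250.0),
        ("bob", "radio", 900.0),
    ],
)

# Discrete filter (product = 'tv'), then aggregate and rank per customer.
rows = conn.execute(
    """
    SELECT customer, SUM(sales) AS total_sales
    FROM sales_facts
    WHERE product = ?
    GROUP BY customer
    ORDER BY total_sales DESC
    """,
    ("tv",),
).fetchall()

print(rows)
```

Note that the WHERE clause here plays the same role as a facet in the original post: an exact, discrete filter. What the relational form does not express on its own is the similarity-based ranking that vector search adds.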

Pranab Ghosh

AI Consultant || MIT Alumni || Entrepreneur || Open Source Project Owner || Blogger

10mo

With RAG, the LLM is hardly doing any search, because you are providing all the content to search as context using vector indexing, text indexing, KG, and whatever else. In this case the LLM is working more as an NLP engine, gleaning the answer from the provided context.

David Bergling

Web3 Sweden | wAI3.io | Digital Innovation | Business Development | Business Analyst | Management Consulting | Data-driven Transformation | Web3 | Blockchain |

10mo

Tony Seale, can Graph Structure and Blockchain be aligned in a good way would you say?

Harry Powell

Data science leader with track record of innovation and value creation

10mo

HNSW is a great use case for graph. What architecture are you currently using?

Kingsley Uyi Idehen

Founder & CEO at OpenLink Software | Advancing Data Connectivity, Multi-Model Data Management, and AI Smart Agents | Unifying Disparate Data Silos via Open Standards (SQL, SPARQL, RDF, ODBC, JDBC, HTTP, GraphQL)

10mo

Yep! BTW -- here's a live variant of what you've depicted, based on actual data from an @Apple product page. https://www.openlinksw.com/data/turtle/general/apple-knowledge-graph-manifestation-3.html Fundamentally, the notion of a Semantic Web has stealthily put so much in place to be exploited by this era of LLM-based language processors and code generators 😀 #KnowledgeGraph #SemanticWeb #CDO #CIO #CDAO #CTO #CMO #LinkedData #DataConnectivity

Jon Cooke

(S&L)LM/GenAI & Analytics Data Products | Composable Enterprises using Data Product Pyramid and GenAI | Data Product Workshop podcast co-host

10mo

Tony Seale Doesn’t a data product contain meaning or at least context to be properly consumable? If so, shouldn’t the Ontology be part of that, so on a lower level in your layer map?

Mark Spivey

Helping us all "Figure It Out" (Explore, Describe, Explain), many Differentiations + Integrations at any time .

10mo

what is “discrete” or “continuous” here is not the “semantics” …
