We can push AI reasoning further by uniting vector databases and graph databases. Pinecone and Neo4j come together to make the best of GraphRAG. Read more about it in our latest post by Roie Schwaber-Cohen. https://hubs.ly/Q02N_cZL0
Pinecone’s Post
-
This is a great read! It's been interesting to see the evolution of graph databases from 2018 to now. In the initial wave of generative AI, interest in graphs seemed to slow, simply because attention shifted to LLMs. In reality, the two make each other more powerful, but combining them also brings some of the same LLM challenges (hallucinations, grounding, etc.) into the graph world.
Vectors and Graphs: Better Together | Pinecone
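The GraphRAG pattern the post describes can be sketched in a few lines: a vector search finds the most relevant seed node, then a graph walk pulls in connected context before anything is handed to an LLM. This is a toy, standard-library-only sketch; the in-memory `docs` and `graph` dictionaries are illustrative stand-ins for a Pinecone index and a Neo4j graph, not their real APIs.

```python
# Toy GraphRAG sketch: vector search finds a seed node, a graph walk
# adds connected facts, and the combined context would feed an LLM.
from math import sqrt

# toy embeddings (in practice these come from an embedding model)
docs = {
    "pinecone": [0.9, 0.1],
    "neo4j":    [0.2, 0.9],
    "graphrag": [0.7, 0.7],
}
# toy knowledge graph: node -> related nodes
graph = {
    "graphrag": ["pinecone", "neo4j"],
    "pinecone": ["graphrag"],
    "neo4j":    ["graphrag"],
}

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = sqrt(sum(x * x for x in a))
    nb = sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def graph_rag_context(query_vec, top_k=1, hops=1):
    # Step 1: vector search for the most similar seed node(s)
    seeds = sorted(docs, key=lambda d: cosine(query_vec, docs[d]), reverse=True)[:top_k]
    # Step 2: expand each seed through the graph to pull in related context
    context = set(seeds)
    frontier = list(seeds)
    for _ in range(hops):
        frontier = [n for node in frontier for n in graph.get(node, [])]
        context.update(frontier)
    return sorted(context)

print(graph_rag_context([0.8, 0.6]))
```

The key idea is that the vector store answers "what is most similar?" while the graph answers "what is connected?", and the union of the two is usually a richer retrieval context than either alone.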
-
In part 4 of his series on AI engineering, Chris Samiullah covers open source retrieval-augmented generation (RAG): what it is, why we do it, and how to do it using LlamaIndex and open source models. A great, comprehensive overview: https://lnkd.in/eXvHKv9h
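The retrieve-then-generate loop the series covers can be shown compactly. Below is a minimal sketch using a toy bag-of-words "embedding" so it runs with the standard library alone; in a real pipeline, LlamaIndex, a proper embedding model, and an open-source LLM replace these pieces, and the `chunks` and prompt template are illustrative assumptions.

```python
# Minimal RAG loop: retrieve the most relevant chunks, then stuff
# them into a prompt template for a (hypothetical) local LLM call.
from collections import Counter
from math import sqrt

chunks = [
    "RAG grounds model answers in retrieved documents.",
    "LlamaIndex connects data sources to language models.",
    "Quantization shrinks models for faster inference.",
]

def bow(text):
    # crude tokenizer: lowercase and strip trailing punctuation
    return Counter(w.strip(".?,!").lower() for w in text.split())

def cosine(a, b):
    dot = sum(a[t] * b.get(t, 0) for t in a)
    na = sqrt(sum(v * v for v in a.values()))
    nb = sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb)

def retrieve(query, k=2):
    q = bow(query)
    return sorted(chunks, key=lambda c: cosine(q, bow(c)), reverse=True)[:k]

def build_prompt(query):
    # concatenate retrieved chunks into the context window
    context = "\n".join(retrieve(query))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

print(build_prompt("How does RAG ground answers?"))
```

The grounding benefit comes entirely from the retrieval step: the model is instructed to answer from the retrieved context rather than from its parametric memory.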
-
So much changing so fast! CPU from Intel, GPU from Nvidia, TPU from Google, now LPU from Groq.

> "Groq is achieving a throughput of 877 tokens/s on Llama 3 8B, the highest of any provider by over 2X."
> "Groq offers 284 tokens per second for Llama 3 70B, over 3-11x faster than other providers."

I skimmed the article but did not see whether the benchmark deployed the "pre-trained" foundational text model or the "instruct" model for the chat/dialogue use case. I also did not see which quantization method they used. The Meta blog about Llama 3 lists the different available quantizations (think of quantization like vector compaction): https://lnkd.in/gvgmBKeQ

Quantization can lead to higher throughput at the expense of lower accuracy. If you're curious, the quantization names are explained at https://lnkd.in/gh9iT3um. Some of the main quantization types:
- **q4_0**: Original quant method, 4-bit.
- **q4_k_m**: Uses Q6_K for half of the attention.wv and feed_forward.w2 tensors, else Q4_K.
- **q5_0**: Higher accuracy, higher resource usage and slower inference.
- **q5_k_m**: Uses Q6_K for half of the attention.wv and feed_forward.w2 tensors, else Q5_K.
- **q6_k**: Uses Q8_K for all tensors.
- **q8_0**: Almost indistinguishable from float16. High resource use and slow. Not recommended for most users.

#GenerativeAI #UnstructuredData #Llama3 #Groq Groq The Milvus Project Zilliz
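The accuracy/throughput trade-off behind those quantization names can be seen in a few lines. Here is a toy sketch of symmetric 4-bit quantization: floats are mapped onto a handful of integer levels and back, and some precision is lost in the round trip. The scheme and the example weights are illustrative only, not Groq's or llama.cpp's actual implementation.

```python
# Sketch of symmetric 4-bit quantization: map floats to integer
# levels and back, illustrating the precision lost in exchange for
# smaller weights and higher throughput.

def quantize_4bit(values):
    # scale so the largest magnitude maps to the top positive level (7)
    scale = max(abs(v) for v in values) / 7
    q = [round(v / scale) for v in values]
    return q, scale

def dequantize(q, scale):
    return [x * scale for x in q]

weights = [0.12, -0.53, 0.97, -0.08]
q, scale = quantize_4bit(weights)
restored = dequantize(q, scale)
error = max(abs(a - b) for a, b in zip(weights, restored))
print(q, round(error, 3))
```

Each weight now needs only 4 bits instead of 16 or 32, which is why quantized models move through memory (and therefore generate tokens) faster, at the cost of the small reconstruction error shown above.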
Mind blowing. Groq is serving LLaMA 3 at over 800 tokens per second! That's faster than GPT-4, and it's completely free. https://groq.com. ↓ Check out https://AlphaSignal.ai to get a daily summary of breakthrough models, repos, and papers in AI. Read by 180,000+ engineers and researchers.
-
There are six official UN languages and many more non-UN languages. Can AI handle them all? It's all for money 💰🤑💸. As for the AI hype, consider some of the risks and limitations of AI. AI consumes electricity heavily; in our experience, AI has doubled the electricity bills of a C-level friend of ours. 😔 See a warning about LLMs and AI from the German government: https://lnkd.in/gMDaGDij A newly released UN report said that AI benefits ONLY a small number of states, companies, and individuals, i.e., a few humans are making big money by using AI in ways that harm many people. AI is based on mathematical models, and models must be verified and validated (V&V) before we can trust them. Using mathematical models to simulate a physical process started with the Manhattan Project in WWII for nuclear bomb design. Then, in the 1960s, came C4ISR, a forerunner of today's AI, whose original missions were breaching enemy security, dis-/mis-information, cognitive manipulation, deception, surveillance, detection for kill, etc. AI can do many things, but NOT everything, and at least NOT what we are doing: using our intellectual property (IP), a copyrighted set of multilingual metadata, for data analytics. Without metadata, NO data can be found or retrieved, even by AI. https://lnkd.in/g-aJFnXR
-
-
Very impressed by the speed with which text and images are generated! Generating images with the options is very clever, and my initial tests with the text are positive. We'll need a bit of time to explore the full potential, but well done in any case. Can't wait to test it out in more detail.
-
this is actually decent output at lightning speed
-
Vector databases are changing the way we do search. Semantic retrieval provides more relevant information by embedding queries in latent space and allows AI to interface directly with your data. Ram Sriharsha explains retrieval augmented generation in this week's episode 🎧/🎥: https://buff.ly/490wKQT.
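The point about embedding queries in latent space is that relevance no longer requires shared keywords. The toy below makes that concrete: the query shares no words with either document, yet its hand-assigned "embedding" (a stand-in for a real embedding model's output) still lands on the right one. All vectors and titles here are illustrative assumptions.

```python
# Toy semantic retrieval: a query with zero keyword overlap still
# matches the right document because their embeddings are close.
from math import sqrt

index = {
    "How to reset your password": [0.9, 0.1, 0.0],
    "Quarterly revenue report":   [0.0, 0.2, 0.9],
}
query = "I forgot my login credentials"  # no shared keywords with either title
query_vec = [0.85, 0.15, 0.05]           # assumed embedding for the query

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = sqrt(sum(x * x for x in a))
    nb = sqrt(sum(x * x for x in b))
    return dot / (na * nb)

best = max(index, key=lambda doc: cosine(query_vec, index[doc]))
print(best)
```

A vector database does the same nearest-neighbor lookup, just at scale and with approximate indexes instead of a brute-force `max`.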
-
I have given a couple of versions of this presentation recently - Libraries, AI and the Messy Middle. I am taking a different approach with this one: I have annotated and augmented the presentation so that it is a little more self-standing than my usual, more elliptical PowerPoint decks. https://lnkd.in/g3KTeg9A
Libraries, AI And The Messy Middle
figshare.com
-
GenAI approaches to question-answering and information retrieval are helpful and easy...until you realize that the answers the AI gives you are "dreams" with little-to-no explanation of how they were derived. How do you know if an answer is accurate? Matt Welsh explores this topic on the Aryn blog and discusses how our GenAI-powered document lake, called Luna, makes verifiability and explainability of answers core parts of the system. These capabilities are in closed beta right now, so if you're interested in seeing Luna in action, shoot me a note. Blog: https://lnkd.in/guibWqj8
Building trust by closing the verification gap in AI-powered unstructured analytics
aryn.ai