LlamaIndex is your friendly data assistant for building LLM-based applications. With it, you can easily ingest, manage, and retrieve private and domain-specific data, then query it in natural language.

LlamaIndex is a data framework targeted at large language model (LLM) applications. LLMs such as GPT-4 are pre-trained on massive public data sets and can achieve incredible natural language processing capabilities out of the box. However, without access to your own private data, their usefulness is limited.

LlamaIndex allows you to ingest data from APIs, databases, PDFs, and more via flexible data connectors. This data is compiled into an intermediate representation optimized for LLMs. LlamaIndex then enables natural language queries and conversations over your data via query engines, chat interfaces, and LLM-powered data agents. It lets your LLM access and interpret large-scale private data without retraining the model on new data.

Whether you’re a beginner looking for a simple way to query your data in natural language, or an advanced user looking for deep customization, LlamaIndex has the tools. The high-level API lets you get started in just five lines of code, while the low-level API gives you complete control over data ingestion, indexing, retrieval, and more.

How does LlamaIndex work?

LlamaIndex uses a retrieval-augmented generation (RAG) approach, which combines a large language model with a private knowledge base. It generally consists of two phases: an indexing phase and a query phase.

Indexing phase

During the indexing phase, LlamaIndex efficiently indexes private data into vector embeddings. This step creates a searchable knowledge base specific to your domain. You can input text documents, database records, knowledge graphs, and other data types. Essentially, indexing converts the data into numeric vectors, or embeddings, that capture its semantic meaning, enabling fast similarity searches over the content.

Query phase

During the query phase, the RAG pipeline searches for the information most relevant to the user’s query. This information is then provided to the LLM along with the query so it can generate an accurate response. This process gives the LLM access to current, updated information that was not included in its original training data. The main challenge at this stage is retrieving, organizing, and reasoning over what may be multiple knowledge bases.
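To make the two phases concrete, here is a minimal, self-contained sketch of the retrieve-then-generate flow. The toy embed() function is purely illustrative, not how LlamaIndex computes embeddings; a real pipeline would call an embedding model such as OpenAI's.

import math

# Toy embedding: a character-frequency vector. Stands in for a real
# embedding model purely for illustration.
def embed(text):
    vec = [0.0] * 26
    for ch in text.lower():
        if "a" <= ch <= "z":
            vec[ord(ch) - ord("a")] += 1.0
    return vec

# Cosine similarity between two vectors.
def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

# Indexing phase: embed every document chunk once and store the vectors.
docs = ["Abid graduated in February 2014.", "Abid is a Technical Writer."]
vector_store = [(doc, embed(doc)) for doc in docs]

# Query phase: embed the query, retrieve the most similar chunk, and
# send it to the LLM together with the question.
query = "When did Abid graduate?"
best_doc, _ = max(vector_store, key=lambda item: cosine(item[1], embed(query)))
prompt = f"Context: {best_doc}\n\nQuestion: {query}"
print(prompt)  # this prompt would then be sent to the LLM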

Set up LlamaIndex

Before we dive into the LlamaIndex tutorial and project, we have to install the Python package and set up the API.

We can simply install LlamaIndex using pip.

pip install llama-index

By default, LlamaIndex uses the OpenAI GPT-3 text-davinci-003 model. To use this model, you must set the OPENAI_API_KEY environment variable.

import os

os.environ["OPENAI_API_KEY"] = "INSERT OPENAI KEY"

Also, make sure you have the openai package installed.

pip install openai
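Indexes will use this OpenAI model by default. If you prefer a different one, you can override it. Here is a hedged sketch assuming the pre-0.10 LlamaIndex API used throughout this tutorial (the imports moved in later versions):

from llama_index import ServiceContext
from llama_index.llms import OpenAI

# Build a service context that swaps the default LLM for gpt-3.5-turbo.
service_context = ServiceContext.from_defaults(
    llm=OpenAI(model="gpt-3.5-turbo", temperature=0)
)
# Pass service_context=service_context when building an index to use it.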

Add personal data to the LLM using LlamaIndex

In this section, we will learn how to build a resume reader using LlamaIndex. You can download your resume by going to your LinkedIn profile page, clicking More, and then Save to PDF.

Please note that we use the DataCamp workspace to run the Python code. You can access all relevant code and output in the LlamaIndex: Add Personal Data to LLM workspace.

Before running anything, we must install llama-index, openai, and pypdf. We are installing pypdf so that we can read and convert PDF files.

pip install llama-index openai pypdf

Load data and create index

We have a directory called “Private-Data” which contains only one PDF file. We’ll use SimpleDirectoryReader to read it and then use TreeIndex to convert it to an index.

from llama_index import TreeIndex, SimpleDirectoryReader

# Read every file in the Private-Data directory into Document objects,
# then build a hierarchical tree index over them.
resume = SimpleDirectoryReader("Private-Data").load_data()
new_index = TreeIndex.from_documents(resume)

Run query

Once the data is indexed, you can start asking questions using as_query_engine(). This function lets you ask questions about specific information in the document and receive the corresponding responses, generated with the help of the OpenAI GPT-3 text-davinci-003 model.

Note: You can set up the OpenAI API in the DataCamp workspace by following the Using GPT-3.5 and GPT-4 via the OpenAI API in Python tutorial.

As we can see below, the LLM answers the query accurately: it searches the index and finds the relevant information.

query_engine = new_index.as_query_engine()
response = query_engine.query("When did Abid graduate?")
print(response)
Abid graduated in February 2014.

We can also learn more about the candidate’s certifications. LlamaIndex builds a comprehensive view of the candidate, which is useful for companies screening for specific skills.

response = query_engine.query("What is the name of the certification that Abid received?")
print(response)
Data Scientist Professional

Save and load context

Creating an index is a time-consuming process. We can avoid re-creating the index by persisting the storage context. By default, the following command saves the index store to the ./storage directory.

new_index.storage_context.persist()
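If you want the index saved somewhere else, persist() also accepts a persist_dir argument (the path below is just illustrative):

# Persist the index to a custom directory instead of ./storage.
new_index.storage_context.persist(persist_dir="./resume_storage")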

Once that’s done, we can quickly load the storage context and create an index.

from llama_index import StorageContext, load_index_from_storage

storage_context = StorageContext.from_defaults(persist_dir="./storage")
index = load_index_from_storage(storage_context)

To verify that it is working properly, we will ask the query engine a question about the resume. It looks like we have successfully loaded the context.

query_engine = index.as_query_engine()
response = query_engine.query("What is Abid's job title?")
print(response)
Abid's job title is Technical Writer.

Chatbot

In addition to Q&A, we can also use LlamaIndex to create a personal chatbot. We just need to convert the index into a chat engine with the as_chat_engine() function. We’ll start with a simple question.

chat_engine = index.as_chat_engine()
response = chat_engine.chat("What is the job title of Abid in 2021?")
print(response)
Abid's job title in 2021 is Data Science Consultant.

Then, without providing any additional context, we’ll ask a follow-up question.

response = chat_engine.chat("What else did he do during that time?")
print(response)
In 2021, Abid worked as a Data Science Consultant for Guidepoint, a Writer for Towards Data Science and Towards AI, a Technical Writer for Machine Learning Mastery, an Ambassador for Deepnote, and a Technical Writer for Start It Up.

Clearly, the chat engine is working well.
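The follow-up worked because the chat engine keeps the conversation history and resolves "he" and "that time" against it. If you want to start a fresh conversation, chat engines expose a reset() method; a small hedged example (the exact API can vary across LlamaIndex versions):

# Clear the stored conversation history; the next chat() call starts fresh.
chat_engine.reset()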

After building a language application, the next step is to weigh the pros and cons of using a large language model (LLM) in the cloud versus running one on-premises. This will help you determine which approach best suits your needs.

Use LlamaIndex to build a Wikipedia text-to-speech app

Our next project is an application that answers questions using Wikipedia content and converts the answers to speech. The code source and additional information are available in the DataCamp Workspace below.

Web scraping Wikipedia pages

First, we will scrape the data from the Italy Wikipedia page and save it as an italy_text.txt file in the data folder.

from pathlib import Path

import requests

# Fetch the plain-text extract of the "Italy" article via the MediaWiki API.
response = requests.get(
    "https://en.wikipedia.org/w/api.php",
    params={
        "action": "query",
        "format": "json",
        "titles": "Italy",
        "prop": "extracts",
        # 'exintro': True,
        "explaintext": True,
    },
).json()
page = next(iter(response["query"]["pages"].values()))
italy_text = page["extract"]

# Save the extract to data/italy_text.txt, creating the folder if needed.
data_path = Path("data")
data_path.mkdir(exist_ok=True)

with open("data/italy_text.txt", "w", encoding="utf-8") as fp:
    fp.write(italy_text)

Load data and build index

Next, we need to install the necessary packages. The elevenlabs package lets us easily convert text to speech using its API.

pip install llama-index openai elevenlabs

Using SimpleDirectoryReader, we will load the text file and convert it into a vector store with VectorStoreIndex.

from llama_index import VectorStoreIndex, SimpleDirectoryReader
from llama_index.tts import ElevenLabsTTS
from IPython.display import Audio, Markdown, display

# Load the scraped text and build a vector index over it.
documents = SimpleDirectoryReader("data").load_data()
index = VectorStoreIndex.from_documents(documents)

Run query

Our plan is to ask a general question about the country and receive a response from the LLM through the query engine.

query = "Tell me an interesting fact about the country?"
query_engine = index.as_query_engine()
response = query_engine.query(query)

display(Markdown(f"<b>{query}</b>"))
display(Markdown(f"<p>{response}</p>"))

Text to speech

After that, we will use the llama_index.tts module to access the ElevenLabsTTS API. You need to provide an ElevenLabs API key to enable audio generation. You can get a free API key on the ElevenLabs website.

import os

elevenlabs_key = os.environ["ElevenLabs_key"]
tts = ElevenLabsTTS(api_key=elevenlabs_key)

We will pass the response to the generate_audio function to generate natural speech. To listen to the audio, we will use the Audio function from IPython.display.

audio = tts.generate_audio(str(response))
Audio(audio)
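As a hedged extra step, assuming generate_audio() returns raw audio bytes (which the Audio widget usage suggests), you can also save the speech to a file; the filename is just illustrative:

# Assumption: `audio` holds raw audio bytes.
with open("italy_fact.mp3", "wb") as f:
    f.write(audio)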

This is a simple example. You can combine multiple modules to create your own assistant, like Siri, that answers questions by interpreting your private data. See the LlamaIndex documentation for more information. In addition to LlamaIndex, LangChain also lets you build LLM-based applications. You can read “Introduction to LangChain for Data Engineering and Data Applications” to learn what you can do with LangChain, including examples of the problems it solves and its data use cases.

LlamaIndex use cases

LlamaIndex provides a complete toolkit for building language-based applications. Best of all, you can use the various data loaders and agent tools in Llama Hub to develop complex applications with multiple features. You can use one or more plugin data loaders to connect custom data sources to your LLM, as in the sketch below.
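As a hedged illustration of that workflow, here is how a Llama Hub loader can be pulled in with the pre-0.10 download_loader API (later versions ship loaders as separate packages, and the Wikipedia loader also needs the wikipedia package installed):

from llama_index import download_loader

# Fetch a community data loader from Llama Hub by name at runtime,
# then use it to pull pages straight from Wikipedia.
WikipediaReader = download_loader("WikipediaReader")
documents = WikipediaReader().load_data(pages=["Italy"])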

In short, you can use LlamaIndex to build:

  • Document Q&A
  • Chatbots
  • Agents
  • Structured data
  • Full-stack web applications
  • Private setup

To learn more about these use cases, go to the LlamaIndex documentation.

Conclusion

LlamaIndex provides a powerful toolkit for building retrieval-augmented generation systems that combine the advantages of large language models with custom knowledge bases. It creates a domain-specific store of indexed data and draws on it during inference to provide the LLM with relevant context for generating high-quality responses.

In this tutorial, we learned about LlamaIndex and how it works. We also built a resume reader and a text-to-speech project using just a few lines of Python code. Creating LLM applications with LlamaIndex is simple, and it offers a large ecosystem of plugins, data loaders, and agents.
