In today's cutthroat world of e-commerce, staying ahead of the competition is crucial for businesses aiming to thrive. And one key ingredient to achieving that edge lies in unlocking the secrets hidden within competitor data. However, sifting through vast amounts of data manually can be a painstakingly slow process. That's where the power of artificial intelligence and natural language processing comes into play.
Imagine having a virtual assistant at your fingertips, ready to effortlessly analyze competitor data with a few simple questions. Sounds intriguing, doesn't it? With the advent of AI-powered chatbots, businesses now have the opportunity to leverage these cutting-edge technologies to their advantage. By transforming user queries in plain language into powerful SQL commands, these chatbots can swiftly extract valuable insights from CSV files. This game-changing approach not only simplifies the data extraction process but also accelerates the analysis of competitor data.
In this blog, we're diving deep into the realm of AI-powered chatbots, where we'll help you build your very own. You'll learn how to create a simple yet effective chatbot that can effortlessly translate natural language questions into SQL commands.
Workflow of the Chatbot
Uploading the CSV file: The chatbot is a web application that presents the users with a file uploader interface first. Users can browse and upload any CSV file present on their system.
Asking questions: Once the user has uploaded a CSV file, they are presented with a space to write their questions. Users can ask questions in natural language about the data within the CSV file.
Real-Time Responses: Behind the scenes, the AI-powered model converts the user’s question into a SQL query and scans the CSV file uploaded by the user to generate a response to the SQL query. The response is presented to the user on the chatbot website in natural language. Users can have a natural conversation with the chatbot and ask as many questions as they wish.
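The question-to-query-to-answer loop described above can be sketched without any AI model. The example below is a minimal illustration, not the chatbot's actual implementation: it loads CSV text into an in-memory SQLite table using Python's standard library and runs a SQL query of the kind a language model might produce. The sample data, the table name products, and the helpers load_csv_into_sqlite and question_to_sql are all hypothetical, and question_to_sql is a hard-coded stand-in for the model.

```python
import csv
import io
import sqlite3

# Hypothetical sample data standing in for an uploaded competitor CSV.
SAMPLE_CSV = """product,brand,price
Laptop A,Acme,899.0
Laptop B,Globex,1099.0
Phone C,Acme,499.0
"""


def load_csv_into_sqlite(csv_text, table="products"):
    """Load CSV text into an in-memory SQLite table, one column per header field."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    conn = sqlite3.connect(":memory:")
    columns = ", ".join(f'"{name}"' for name in header)
    conn.execute(f'CREATE TABLE "{table}" ({columns})')
    placeholders = ", ".join("?" for _ in header)
    conn.executemany(f'INSERT INTO "{table}" VALUES ({placeholders})', reader)
    return conn


def question_to_sql(question):
    """Stand-in for the language model: returns a fixed query for one known question."""
    known = {
        "Which brand has the most products?":
            "SELECT brand, COUNT(*) AS n FROM products "
            "GROUP BY brand ORDER BY n DESC LIMIT 1",
    }
    return known[question]


conn = load_csv_into_sqlite(SAMPLE_CSV)
sql = question_to_sql("Which brand has the most products?")
top_brand, count = conn.execute(sql).fetchone()
print(f"{top_brand} has the most products ({count}).")
```

In the real chatbot, the fixed lookup in question_to_sql is replaced by a call to the language model, and the final sentence is also phrased by the model rather than by a format string.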
Sample Working of the Chatbot
The screenshot shows the chatbot's user interface when the user first opens it. It gives the user the option to upload a CSV file of their choice.
Once the user has uploaded the CSV file, an input field appears where the user can enter a question. The chatbot queries the file, generates a response to the question, and displays it. Examples of a few questions asked are shown below.
Similarly, we can ask anything about the data. The AI-powered model will convert the user's questions into SQL commands and respond in real time.
Creating the Chatbot
The user interface for the chatbot is built with the Streamlit framework. The model and the agent that take the user's question, convert it into a SQL query, search the CSV file, and generate the response are built with the LangChain framework. Now, we will go through the step-by-step process of building the chatbot.
Importing Required Libraries
The first step in creating the chatbot is to import the required libraries. Here, we will be importing the following libraries:
Streamlit: An open-source Python framework for building web applications. The user interface of the chatbot is created using this library.
LangChain: A framework for developing applications powered by language models. It helps create conversational agents and provides a common interface to large language models (LLMs), which take a text string as input and return a text string as output. LangChain supports many LLM providers; we will be using OpenAI.
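# Importing libraries
import streamlit as st
# message() comes from the separate streamlit-chat package
from streamlit_chat import message
# Note: newer LangChain releases moved create_csv_agent to the
# langchain_experimental package; this import matches older releases
from langchain.agents import create_csv_agent
from langchain.llms import OpenAI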
Initialization
To work with OpenAI's models, we need an OpenAI API key, which can be generated on the OpenAI website. Once the key is generated, assign it to a variable in your program. The default settings of the web page are then configured using Streamlit's set_page_config function, which here sets the title of the page shown in the browser tab.
# Initialize the API key
openai_api_key = "your api key"
# Set page title
st.set_page_config(page_title="CSV Reader")
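Hard-coding the key in the script makes it easy to leak through version control. One common alternative, sketched below, is to read it from an environment variable. OPENAI_API_KEY is the variable name conventionally used with OpenAI's client library; the helper load_api_key is a hypothetical name introduced for this illustration and is not part of the original script.

```python
import os


def load_api_key(env=None, name="OPENAI_API_KEY"):
    """Return the API key stored under `name`, or raise a clear error if it is missing."""
    env = os.environ if env is None else env
    key = env.get(name, "").strip()
    if not key:
        raise RuntimeError(f"Set the {name} environment variable before starting the app.")
    return key


# Usage with a plain dict standing in for the real environment:
print(load_api_key({"OPENAI_API_KEY": "sk-example"}))
```

In the script, openai_api_key = load_api_key() would then replace the hard-coded string.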
Along with this, a sidebar is created to display a short description of the chatbot, using Streamlit's sidebar object. Inside the sidebar, the company name is first displayed as a header using the header function. Then a few lines of content are added using Streamlit's markdown function.
# Create a sidebar to display information
with st.sidebar:
    st.header("Datahut")
    st.markdown("""
## About
This is a chatbot designed by Datahut to interact with a csv file and extract information from it.
Simplify your data analysis process with our user-friendly chatbot which enables you to quickly retrieve key information from your CSV files.
""")
Building the Model and Interface
The next step is to build the interface where users upload their CSV files and chat with the bot. For this, we use Streamlit's file_uploader function, which prompts users to upload a CSV file. The uploaded file is stored in a variable named user_csv. After the user has uploaded a file, a text input field is displayed by the get_text() function (defined in the next section). Users type a question about their CSV file in this field and press Enter to send it to the bot. The question is stored in a variable named user_input.
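The object returned by the file uploader behaves like a binary file, so it can be inspected before it is handed to the agent. The sketch below reads the header row and then rewinds the file so that later readers still start at the beginning. It is an illustration only: preview_columns is a hypothetical helper, UTF-8 encoding is assumed, and io.BytesIO stands in for the uploaded file.

```python
import csv
import io


def preview_columns(file_obj, encoding="utf-8"):
    """Return the header row of a binary CSV file object, then rewind it."""
    raw = file_obj.read()
    file_obj.seek(0)  # rewind so the next reader starts from the first byte
    reader = csv.reader(io.StringIO(raw.decode(encoding)))
    return next(reader, [])  # empty list if the file has no rows


# io.BytesIO stands in for the object returned by the file uploader.
fake_upload = io.BytesIO(b"product,brand,price\nLaptop A,Acme,899.0\n")
print(preview_columns(fake_upload))
```

Showing the column names next to the question box can help users phrase questions that match the data.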
Next, the OpenAI language model is initialized with the API key and with the temperature parameter set to 0, which controls the randomness of the response. With the temperature set to 0, the model favors its most likely output, so asking the same question repeatedly usually produces the same answer. With higher values such as 1, the model samples more freely, so the answer to the same question can differ from one run to the next. The create_csv_agent function from the LangChain library is used to create the agent. It combines the OpenAI language model with the uploaded CSV file, enabling the agent to understand the context of the questions and generate relevant responses. In LangChain, create_csv_agent loads the CSV into a pandas DataFrame and lets the model write and run pandas code against it, so the query the model generates is pandas code that plays the role of the SQL command described earlier.
Since we are developing a chatbot, we need to keep track of the user's questions and the model's responses and display them in order. For this, we use Streamlit's session_state, in which we store two lists: generated and past. The 'generated' list holds the chatbot's responses, while the 'past' list holds the user's questions.
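To make the effect of temperature concrete, the sketch below applies the standard temperature-scaled softmax to a made-up set of token scores. It illustrates the general technique only: the scores are invented, token_probabilities is a hypothetical helper, and treating a temperature of exactly 0 as "always pick the top-scoring token" is an assumption about how such a setting is commonly handled, not a statement about OpenAI's internals.

```python
import math


def token_probabilities(scores, temperature):
    """Temperature-scaled softmax; temperature 0 is treated as greedy (argmax)."""
    if temperature == 0:
        best = max(range(len(scores)), key=lambda i: scores[i])
        return [1.0 if i == best else 0.0 for i in range(len(scores))]
    scaled = [s / temperature for s in scores]
    peak = max(scaled)  # subtract the max for numerical stability
    weights = [math.exp(s - peak) for s in scaled]
    total = sum(weights)
    return [w / total for w in weights]


scores = [2.0, 1.0, 0.5]  # invented scores for three candidate tokens
for t in (0, 0.5, 1.0):
    probs = token_probabilities(scores, t)
    print(t, [round(p, 3) for p in probs])
```

Lower temperatures concentrate probability on the top-scoring token, which is why a data-lookup chatbot like this one is typically run at or near 0.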
When the user enters a question, it is appended to the session state, and the get_response() function is called. This function generates a response based on the user's input and appends it to the session_state. The past and generated messages are rendered using the Streamlit message() function, creating a seamless back-and-forth between the user and the chatbot.
# Note: get_text() and get_response() (see "Defining Functions")
# must be defined above this point in the actual script.
st.header("CSV Reader")
# File uploader function
user_csv = st.file_uploader("Upload your CSV file", type="csv")
if user_csv is not None:
    # Get the user input
    user_input = get_text()
    # Initialize the OpenAI model
    llm = OpenAI(openai_api_key=openai_api_key, temperature=0)
    # Initialize the agent
    agent = create_csv_agent(llm, user_csv, verbose=True)
    # Initialize the session state with a placeholder exchange at index 0
    if 'generated' not in st.session_state:
        st.session_state['generated'] = ["Yes, you can!"]
    if 'past' not in st.session_state:
        st.session_state['past'] = ["Can I ask anything about my csv file?"]
    if user_input:
        st.session_state.past.append(user_input)
        # Get the chatbot response
        response = get_response(user_input)
        st.session_state.generated.append(response)
    # Displaying the chat (index 0 is the placeholder and is skipped)
    if len(st.session_state['generated']) != 1:
        for i in range(1, len(st.session_state['generated'])):
            message(st.session_state['past'][i], is_user=True, key=str(i) + '_user')
            message(st.session_state['generated'][i], key=str(i))
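The history bookkeeping can be tried outside Streamlit. In the sketch below, a plain dict stands in for st.session_state and a stub function stands in for the agent; init_history, record_turn, render_history, and echo_agent are hypothetical names. It also shows why the display loop starts at index 1: index 0 holds the seeded placeholder exchange, which is never displayed.

```python
def init_history(state):
    """Seed the history once, mirroring the placeholder exchange in the script."""
    state.setdefault("past", ["Can I ask anything about my csv file?"])
    state.setdefault("generated", ["Yes, you can!"])


def record_turn(state, question, answer_fn):
    """Append one question and its answer so both lists stay the same length."""
    state["past"].append(question)
    state["generated"].append(answer_fn(question))


def render_history(state):
    """Return (speaker, text) pairs for every real turn, skipping seeded index 0."""
    lines = []
    for i in range(1, len(state["generated"])):
        lines.append(("user", state["past"][i]))
        lines.append(("bot", state["generated"][i]))
    return lines


def echo_agent(question):
    """Stub standing in for the CSV agent."""
    return f"(answer to: {question})"


state = {}
init_history(state)
record_turn(state, "How many rows are there?", echo_agent)
for speaker, text in render_history(state):
    print(f"{speaker}: {text}")
```

Keeping the two lists the same length is what lets a single index pair each question with its answer.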
Defining Functions
Here, we define two functions: one to get the question from the user and one to generate a response. In the actual script, both must be defined before the interface code that calls them.
The get_text() function gets the user's question. It uses Streamlit's text_input() function for this purpose.
The get_response() function generates the response to the user's question. It takes the question as a parameter and passes it to the run() method of the agent object.
# Function to get user input
def get_text():
    input_text = st.text_input("Enter your question")
    return input_text

# Function to generate response to user question
def get_response(query):
    # Show a spinner while the agent (a module-level variable) runs
    with st.spinner(text="In progress"):
        response = agent.run(query)
    return response
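Agent calls can fail, for example when the model's output cannot be parsed or the API is unreachable. A defensive variant of get_response might catch such errors and return a readable message instead of crashing the page. The sketch below is an illustration under stated assumptions: safe_response takes the agent as a parameter, and FixedAgent and FailingAgent are hypothetical stand-ins that expose the same run() method used above.

```python
def safe_response(agent, query):
    """Run the agent and return its answer, or a friendly message on failure."""
    try:
        return agent.run(query)
    except Exception as exc:  # broad on purpose: any failure becomes a chat message
        return f"Sorry, I could not answer that ({type(exc).__name__})."


class FixedAgent:
    """Hypothetical stand-in that always answers."""

    def run(self, query):
        return "There are 3 rows."


class FailingAgent:
    """Hypothetical stand-in that always fails."""

    def run(self, query):
        raise ValueError("could not parse model output")


print(safe_response(FixedAgent(), "How many rows?"))
print(safe_response(FailingAgent(), "How many rows?"))
```

Passing the agent in explicitly, rather than relying on a module-level variable, also makes the function easier to test in isolation.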
Conclusion
In this blog post, we learned to use Streamlit, LangChain, and the OpenAI language model to create a user-friendly chatbot interface that lets users extract information from CSV files by asking natural language questions. This can reduce the time needed to analyze a CSV file and extract information from it, which helps businesses and organizations work more efficiently. This is just a small step towards harnessing the power of AI and data, and the possibilities are endless.
But why stop here? At DataHut, we specialize in harnessing the power of web data scraping to provide comprehensive insights on competitor data. Our expert team can extract valuable information from the vast expanse of the web, enabling you to stay one step ahead of the competition. Don't miss out on this opportunity to supercharge your business.
Contact DataHut today and unlock the true potential of competitor insights through web data scraping.