talkd/dialog

For programmers who are interested in AI and are deploying RAGs without knowledge of server maintenance, Dialog is an app that simplifies LLM deployments, letting you spend less time coding and more time training your model.

This repository serves an API focused on letting you deploy any LLM you want, based on the structure provided by dialog-lib.

We started as a way to humanize RAGs, but we are expanding toward broader approaches to RAG deployment and maintenance.

For more information, check our documentation!

Running the project

We assume you are familiar with Docker; if you are not, this amazing tutorial will help you. Follow the Quick Start for setup and then run:

docker-compose up

This starts two services:

db: where the PostgreSQL database runs to support chat history and document retrieval for RAG;
dialog: the service with the API.

Quick Start

If you are new to the project and want to get started quickly with some sample data and a simple prompt configuration, follow the steps below:

Clone the repository:

git clone https://github.com/talkdai/dialog.git

Create a .env file based on the .env.sample file:

cp .env.sample .env

Set the OPENAI_API_KEY value in the .env file:

OPENAI_API_KEY=your-openai-api-key

Build and start the services with docker:

docker-compose up --build

Customizing prompts and data

To customize this project, you need to have a .csv file with the knowledge base of your interest and a .toml file with your prompt configuration.

We recommend that you create a folder called data inside this project to store your CSV and TOML files. The data folder is already in the .gitignore file, so you can store your data without worrying about it being pushed to the repository.

`.csv` knowledge base

The knowledge base requires the following columns:

category
subcategory: used to customize the prompt for specific questions
question
content: used to generate the embedding

Example:

category,subcategory,question,content
faq,promotions,loyalty-program,"The company XYZ has a loyalty program when you refer new customers you get a discount on your next purchase, ..."
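
As a quick sanity check before loading, the required columns listed above can be verified with Python's standard csv module. This is a minimal sketch and not part of Dialog itself; the helper name missing_columns and the REQUIRED_COLUMNS constant are hypothetical and exist only to illustrate the check.

```python
import csv
import io

# Columns the knowledge base is expected to have (taken from the list above).
REQUIRED_COLUMNS = ("category", "subcategory", "question", "content")


def missing_columns(csv_file):
    """Return the required columns absent from the CSV header (hypothetical helper)."""
    reader = csv.DictReader(csv_file)
    header = reader.fieldnames or []
    return [name for name in REQUIRED_COLUMNS if name not in header]


# Usage with an in-memory sample; pass open(path, newline="") for a real file.
sample = io.StringIO(
    "category,subcategory,question,content\n"
    'faq,promotions,loyalty-program,"The company XYZ has a loyalty program"\n'
)
print(missing_columns(sample))
```

An empty list means every required column is present; any names returned are the ones to add before starting the service.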

When the dialog service starts, it loads the knowledge base into the database, so make sure the database is up and the paths are correct (see the Environment Variables section). Alternatively, inside the src folder, run make load-data path="<path-to-your-knowledge-base>.csv".

See our documentation for more options for the knowledge base, including embedding more columns together.

`.toml` prompt configuration

The [prompt.header], [prompt.suggested], and [prompt.fallback] fields are mandatory; they are used for processing the conversation and connecting to the LLM.

The [prompt.fallback] field is used when the LLM does not find a compatible embedding in the database; that is, the [prompt.header] is ignored and the [prompt.fallback] is used. Without it, there could be hallucinations about possible answers to questions outside the scope of the embeddings.

With [prompt.fallback], the response is still processed by the LLM. If you need to return a default message when there is no recommended question in the knowledge base, use the [prompt.fallback_not_found_relevant_contents] configuration in the .toml (project configuration).

It is also possible to add information to the prompt for subcategories and to choose some optional LLM parameters, such as temperature (defaults to 0.2) or model_name. See below for an example of a complete configuration:

[model]
temperature = 0.2
model_name = "gpt-3.5-turbo"

[prompt]
header = """You are a service operator called Avelino from XYZ, you are an expert in providing
qualified service to high-end customers. Be brief in your answers, without being long-winded
and objective in your responses. Never say that you are a model (AI), always answer as Avelino.
Be polite and friendly!"""

suggested = """Here is some possible content
that could help the user in a better way."""

fallback = "I'm sorry, I couldn't find a relevant answer for your question."

fallback_not_found_relevant_contents = "I'm sorry, I couldn't find a relevant answer for your question."

[prompt.subcategory.loyalty-program]

header = """The client is interested in the loyalty program, and needs to be responded to in a
salesy way; the loyalty program is our growth strategy."""

Environment Variables

Look at the .env.sample file to see the environment variables needed to run the project. The .csv contains only the knowledge base, the .toml contains some LLM parameters and prompts, and the .env contains the OpenAI token, paths, and some project parameters. We recommend reading the part of our documentation that discusses configuration.
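
A missing or placeholder OpenAI token is an easy mistake to make when copying .env.sample. The following sketch shows one way to detect it from Python; the function name openai_key_is_set is hypothetical, and the check only covers OPENAI_API_KEY, the one variable named in the Quick Start.

```python
import os


def openai_key_is_set(environ=os.environ):
    """Return True when OPENAI_API_KEY holds a non-empty, non-placeholder value."""
    value = environ.get("OPENAI_API_KEY", "").strip()
    # "your-openai-api-key" is the placeholder shown in the Quick Start.
    return bool(value) and value != "your-openai-api-key"


# Usage with an explicit mapping; call with no arguments to check the real environment.
print(openai_key_is_set({"OPENAI_API_KEY": "your-openai-api-key"}))
```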

Maintainers

We are thankful for all the contributions we receive; most are reviewed by our awesome team of maintainers:

made with 💜 by talkd.ai

