Artificial Intelligence (AI) has been making waves across multiple sectors, but its practical applications in business – such as AI-powered Slack bots – are only now on the verge of proper adoption.
One of the most advantageous applications of AI is found in Large Language Models (LLMs) for question-answering. The advent of OpenAI’s ChatGPT and the release of subsequent open source LLM models have revolutionized how we can harness AI for a wide variety of business applications.
For instance, our internal team processed around 4,000 messages per year, averaging ten minutes per message. Through effective AI implementation, we reduced that workload by roughly a third.
But this post is not just another “how to build your QA bot” tutorial. Instead, we want to share the lessons we learned from prompt engineering, testing, and deploying LLMs into a live production environment.
Here are six lessons we can pass on to you:
Lesson 1: You can build a cool demo in a day, but it’s a long road to launch
Slack for communication and Notion for internal documentation are integral parts of our organization, so connecting the two through the ChatGPT APIs was a no-brainer. We stitched together a cool demo in a day, with some really good results – but also some wrong, partially accurate, and even entirely fabricated ones. That’s when the real work began.
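At its core, the demo glued three steps together: retrieve relevant handbook excerpts, pack them into a chat prompt, and send it to the API. Here is a minimal sketch of the prompt-assembly step – the function name and system instruction are our illustration, not the exact production code:

```python
def build_messages(context_chunks: list[str], question: str) -> list[dict]:
    """Assemble a chat-completion payload: retrieved handbook excerpts
    go into the system turn, the employee's question into the user turn."""
    context = "\n\n".join(context_chunks)
    return [
        {
            "role": "system",
            "content": (
                "Answer using only the handbook excerpts below. "
                "If the answer is not there, say so.\n\n" + context
            ),
        },
        {"role": "user", "content": question},
    ]
```

The resulting list can be passed straight to a chat-completion endpoint; grounding the model in retrieved excerpts (rather than letting it answer freely) is what reduces fabricated answers.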
Lesson 2: Garbage in, gibberish out
No matter how hard you work on your prompting game, if your input data is garbage, you can’t expect good results. It’s not that our handbooks and source data were in complete disarray – sometimes all it takes is a missing piece of key information, or an obsolete resource mixed in. You have to work closely with the business to get this part right. Then you need to process the data correctly.
Lesson 3: It’s all about the data
How you organize your data matters – good structure lets you use it as efficiently as possible.
Beyond the quality of the data source, splitting the resources was equally important. Because many parts had been deliberately rewritten, we found it most efficient to split content along the markdown structure: if a document is too long for one context window, we split it at the highest-level heading. We used OpenAI embeddings (Ada) and a local vector database (FAISS). In other settings, overlapping chunks could be more reasonable. Another trick we deployed was maintaining technical pages that are not necessarily user-facing – they simply contain lists of key contacts and information to be fed into the context. Keeping all of this separate makes it easier to track changes and keep the data up to date.
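The heading-based splitting can be sketched like this – a simplified illustration (the real pipeline then embedded each chunk with OpenAI’s Ada model into FAISS, which we omit here):

```python
import re

def split_markdown(text: str, max_chars: int = 2000, level: int = 1) -> list[str]:
    """Recursively split markdown at the highest-level heading until
    each chunk fits the context budget."""
    if len(text) <= max_chars or level > 6:
        return [text]
    # Zero-width split so each chunk keeps its own heading line.
    parts = [p for p in re.split(rf"(?m)^(?={'#' * level} )", text) if p.strip()]
    if len(parts) <= 1:
        # No headings at this level; try one level deeper.
        return split_markdown(text, max_chars, level + 1)
    chunks: list[str] = []
    for part in parts:
        chunks.extend(split_markdown(part, max_chars, level + 1))
    return chunks
```

Splitting at heading boundaries keeps each chunk self-describing (its title travels with it), which makes the retrieved context far more useful than arbitrary fixed-size windows.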
Lesson 4: There are 101 ways to fail
Testing and collecting feedback is a crucial step. It’s easy to overlook the many ways a bot can fail its users.
Each knowledge-bot interaction in Slack could be marked as good or bad, and on top of that, a request for human help could be submitted. But a more rigorous approach was necessary, especially when we ran our first round of evaluations with the operations teams that normally handle these queries themselves. Almost 90% of answers were correct, but the wrong ones were the trickiest – they drew on multiple sources or required additional knowledge about the user. That’s why we incorporated the Slack ID (each user’s unique identifier) to look up additional information about the user. This way we could assume their default office location, department, and other basics to provide more relevant answers.
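The user-enrichment step can look something like this – a sketch where the directory and field names are hypothetical (in production the profile would come from an HR system or the Slack `users.info` API):

```python
# Hypothetical employee directory keyed by Slack user ID.
EMPLOYEE_DIRECTORY = {
    "U123ABC": {"office": "Prague", "department": "Data"},
}

def enrich_question(slack_user_id: str, question: str) -> str:
    """Prepend known user defaults so the model can resolve questions
    like 'where do I park?' without asking follow-ups."""
    profile = EMPLOYEE_DIRECTORY.get(slack_user_id)
    if not profile:
        return question
    context = ", ".join(f"{k}: {v}" for k, v in profile.items())
    return f"User profile ({context})\nQuestion: {question}"
```

With the office and department already in the prompt, location-dependent answers (parking, catering, office hours) no longer need a clarifying round-trip.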
Lesson 5: Don’t work blind – define your output and time-box your experiments
Especially when working with large language models, defining acceptable output quality and error rates is crucial for setting your evaluations and expectations.
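In practice, “defining your output” can be as simple as an acceptance gate run over a batch of human-graded answers. The sketch below is illustrative – the 90% threshold is an assumption, not a universal target:

```python
def evaluate_batch(graded: list[bool], target_accuracy: float = 0.90) -> dict:
    """Summarize a batch of human-graded bot answers and check it
    against the acceptance threshold agreed with the business."""
    accuracy = sum(graded) / len(graded)
    return {
        "n": len(graded),
        "accuracy": round(accuracy, 3),
        "meets_target": accuracy >= target_accuracy,
    }
```

Running a gate like this after each prompt or data change turns “does it feel better?” into a pass/fail signal you can time-box experiments around.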
In our case we jumped straight into iterating because it was so much fun – and failed to take our own medicine when it came to project setup.
Lesson 6: Don’t forget to stay inspired
With technologies like AI, you can’t be afraid to try new things. Experimentation itself is nothing new, but we are now seeing an unprecedented pace of improvement that allows for much more exploration.
For example, our Notion Q&A works quite well, so now we are ready to explore a combination of data sources – connecting our neo4j data warehouse. To do that, we would need to implement fine-grained access management. Another avenue to explore is the capabilities of the Slack bot itself. Not only can it answer questions, but if an employee wants to order extra catering or a parking spot, the solution could be just an API call away.
What are some ways AI can help your business? Let’s discuss your unique use cases.
Our experts will be happy to talk about all things AI.
Start your AI transformation today
We have guides and use cases to help cut through the noise – discover how you can optimize and boost your capabilities.