
Digital Economy Dispatch #153 -- How and Why You Should Build A Generative AI Interface to Your Data

15th October 2023

Over the past few years, I have produced a lot of material on digital transformation and its implications. From books and journal articles to blog posts and emails, I have written a great many words on many topics, published across a diverse set of forums. Creating them has taken a lot of effort, so it is frustrating when they are not used. For anyone interested in reading them, they are scattered across different webpages, available from a variety of publishers, and used in different ways to inform a wide range of other materials. The truth is that much of what I write is not read by many people, and past materials are rarely re-read.

A key problem is how to bring these materials together so that people can easily interrogate and interact with them. At times, people want to read a single article or watch a video and spend some time absorbing it. But these days, more often than not, they want to move quickly across different materials to find the information they need: to synthesize and summarize from a pool of information rather than battle their way through a long article. Instead of having to search around and work through all of the materials, a much more readily accessible approach is needed.

As a result, today many people turn to general-purpose generative AI engines such as ChatGPT and Bard to answer their queries. Ask these tools almost anything, and within seconds they give you a coherent response. The well-documented problem is establishing the source of the answer they produce. Essentially, any response is a mashup of materials from many different places, of unknown pedigree. In some cases that may be fine, but in many cases it is not. What if I want to be more focused and ask something such as “how does Alan Brown define digital transformation?”.

Thankfully, we now have a useful and compelling way to address this need using generative AI tools. I have created a simple generative AI system based on my writing over the last few years. Using it, you can interactively ask questions of these materials. It is available at dispatches.digital.

Have a go! But first, let me tell you more about how and why it was built. 

The Generative AI Approach to Information

Following the successful launch of generative AI engines such as ChatGPT and Bard, a lot of new tools have become available that allow you to build conversational AI capabilities on top of your own data sources. By ingesting the data you provide, the generative AI system answers questions grounded in that information. This has numerous uses across several domains, for example:

· By providing product specifications and support data, users can ask about product details and get answers to the issues they encounter when using them.

· By feeding company data, FAQs, and internal policy information into the generative AI engine, employees can find out about current practices and operating procedures.

· By collecting data on the educational and training materials being used by students, educators can refine curricula for classes, update learning programmes, and personalize educational pathways.

These, and many more scenarios, are now being explored by different teams and organizations. Case studies and commentaries on many of their uses can readily be found.
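
The pattern behind all of these scenarios is the same: your own documents are split into chunks, the chunks most relevant to a question are selected, and the generative AI engine is asked to answer using only that material. As a minimal sketch (not any particular tool's implementation), here is the prompt-assembly step in Python, with naive keyword matching standing in for the more sophisticated retrieval these tools actually use; the documents and helper names are illustrative:

```python
# Sketch of the "answer from your own data" pattern. Illustrative only:
# real tools use embeddings for retrieval rather than this keyword overlap.

documents = {
    "faq.md": "Our refund policy allows returns within 30 days of purchase...",
    "specs.md": "The device supports Wi-Fi 6 and offers a 12-hour battery life...",
}

def score(question: str, text: str) -> int:
    """Count how many words from the question appear in a document."""
    words = set(question.lower().split())
    return sum(1 for w in words if w in text.lower())

def build_prompt(question: str) -> str:
    # Pick the most relevant document and place it in front of the model as context.
    best = max(documents, key=lambda name: score(question, documents[name]))
    return (
        "Answer the question using only the context below.\n\n"
        f"Context ({best}):\n{documents[best]}\n\n"
        f"Question: {question}"
    )

print(build_prompt("What is the refund policy?"))
```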

In addition to these use cases, there is another way to look at generative AI. It can allow you to create your own personal archive, to be interrogated by anyone who has questions about the topics on which you have commented. It allows people to get your responses to their queries rather than more general answers. It is the equivalent of creating a virtual assistant based on your work.

How to Train Your AI Dragon

The first step in bringing together your own generative AI solution is to identify the data sources you can use. If I consider what I have been writing over the last few years, although my interests have varied over that time, overall I have created quite a substantial body of work addressing digital transformation, the application of digital technology to business, and ways to define digital strategies that have impact and value. Bringing this material together, I have the following sources available:

This data can be used to train a chatbot built on a Large Language Model (LLM) such as the one behind ChatGPT. When ingested, the material amounts to about 250,000 tokens. Tokens are the basic units of text that the model uses for processing: models like ChatGPT read text one token at a time, and the context of previous tokens influences the interpretation of the current one.
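
If you are curious how many tokens your own archive amounts to, OpenAI's tiktoken library will count them for you. A small sketch, assuming the archive has been collected into a single text file (the file name is a placeholder):

```python
# Count the tokens in a collected archive using OpenAI's tiktoken library
# (pip install tiktoken). "my_archive.txt" is a placeholder file name.
import tiktoken

encoding = tiktoken.encoding_for_model("gpt-3.5-turbo")

with open("my_archive.txt", encoding="utf-8") as f:
    text = f.read()

tokens = encoding.encode(text)
print(f"{len(tokens)} tokens for {len(text)} characters")
```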

The idea of using data to "train an AI" sounds daunting. However, in the past few months it has become very easy due to the emergence of a variety of supporting tools.

The chatbot was built using ChatNode (see chatnode.ai), which I have found to be an excellent tool for experimenting with a chatbot of your own. It is one of a wide range of tools now appearing that use an underlying generative AI engine to offer easy-to-use services for those interested in applying AI but not wanting to configure their own AI platform. In this case, ChatNode offers a simple interface and a straightforward workflow.

Like other "question-and-answer" style chatbot tools, ChatNode can be used for a variety of purposes. Such tools are most often seen in areas such as customer support, where an appropriately trained chatbot can quickly respond to customer inquiries about products and services. By feeding data such as product specifications, upgrade notices, existing FAQ documents, and customer support emails into the bot, it can provide helpful and timely responses.

However, such chatbots can also be used in many other situations, including as a support for authors and researchers. ChatNode can help researchers distil short responses from large archives of books, articles, papers, and blog posts. Once trained on this data, it can save you time by providing concise and relevant information from this extensive library of data.

The solution uses OpenAI's GPT models to power its chatbot. To train the chatbot, you supply it with data. This can come in several forms, for instance Word documents and PDFs, or data scraped from websites. Once this material is ingested, the chatbot draws on it to generate human-like responses to user queries.
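
As a rough sketch of what that ingestion step involves, the snippet below pulls plain text out of a PDF and a Word document and splits it into chunks ready for retrieval. The file names are placeholders, and tools like ChatNode handle all of this for you behind the scenes:

```python
# Rough sketch of ingestion: extract text from documents and split it into chunks.
# Requires `pip install pypdf python-docx`; the file names below are placeholders.
from pypdf import PdfReader
from docx import Document

def pdf_text(path: str) -> str:
    """Extract plain text from every page of a PDF."""
    return "\n".join(page.extract_text() or "" for page in PdfReader(path).pages)

def docx_text(path: str) -> str:
    """Extract plain text from every paragraph of a Word document."""
    return "\n".join(p.text for p in Document(path).paragraphs)

def chunk(text: str, size: int = 1000) -> list[str]:
    """Split the text into fixed-size pieces for later retrieval."""
    return [text[i:i + size] for i in range(0, len(text), size)]

corpus = chunk(pdf_text("dispatch_archive.pdf") + docx_text("keynote_notes.docx"))
print(f"{len(corpus)} chunks ready for ingestion")
```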

Such tools are also interesting because they allow you to try out different generative AI engines. In ChatNode, you can choose which version of ChatGPT you use: ChatGPT 3.5, ChatGPT 3.5 Turbo, or ChatGPT 4.0. The first is free to use; for the others you need to provide an OpenAI API key (and pay for its use).

When you interact with this chatbot, your messages are sent to the GPT model, which processes the input and generates a response based on the context and information provided. The generated response is then sent back to you through the chatbot interface.
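
Behind that interface, the round trip is essentially a single API call. Here is a minimal sketch using OpenAI's Python library, assuming the relevant excerpts have already been retrieved from the ingested material (the context string is a placeholder), with the model name being the GPT-3.5 / GPT-4 choice mentioned above:

```python
# Minimal sketch of the chatbot round trip: question in, grounded answer out.
# Requires `pip install openai` and an OPENAI_API_KEY environment variable.
from openai import OpenAI

client = OpenAI()

# Placeholder: in a real system this comes from retrieval over the ingested data.
retrieved_context = "Excerpts from Alan Brown's dispatches relevant to this question..."

response = client.chat.completions.create(
    model="gpt-3.5-turbo",  # or "gpt-4", depending on the engine you select
    messages=[
        {"role": "system",
         "content": "Answer using only the context below.\n\n" + retrieved_context},
        {"role": "user",
         "content": "How does Alan Brown define digital transformation?"},
    ],
)

print(response.choices[0].message.content)
```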

A Simple Example

Let’s take a look at a few simple examples of how it works. Remember, what you are doing when you ask a question of these conversational-style AI systems is posing a question to a generative AI tool that is trained on a specific set of data sources. In the case of dispatches.digital you should always imagine that every query you make is preceded with:

“Based on Alan Brown’s writings on digital transformation over the past few years….”

Here is a simple example of a query made to dispatches.digital when using the ChatGPT 3.5 engine:

Q: What are the key themes that are discussed in Alan Brown's articles published in digital economy dispatches?

A: The key themes discussed in Alan Brown's articles published in Digital Economy Dispatches include the impact of digital disruption on business, digital technology directions, and the future of the digital world.

Interestingly, the same query gives an expanded answer with more detail when using ChatGPT 4.0:

A: The key themes discussed in Alan Brown's articles published in Digital Economy Dispatches include:

  • The impact of digital disruption on business: This involves how digital technologies are changing traditional business models and practices.

  • Digital technology directions: This theme explores the trends and developments in digital technologies and their implications for businesses and the economy.

  • The future of our digital world: This theme looks at predictions and speculations about what the digital future might look like.

  • Digital transformation: Many of his articles focus on different aspects of digital transformation, such as AI's role, digital security, and the importance of digital skills.

  • AI and data science: Given his affiliation with the Alan Turing Institute, it's not surprising that AI and data science are recurring themes in his articles.

One more example gives you a sense of how dispatches.digital answers your query based on the data sources on which it was trained. These are not answers drawn from a variety of unknown sources across the web; they are specifically drawn from the materials used to train it.

Q: Why is AI important and what are its strengths and weaknesses?

A: AI is important because it has the potential to revolutionize various industries and improve efficiency, productivity, and decision-making processes. Its strengths lie in its ability to analyze large amounts of data quickly, identify patterns, and make predictions or recommendations. AI can automate repetitive tasks, handle complex calculations, and perform tasks with precision and accuracy. However, AI also has weaknesses. It can be limited by the quality and availability of data, lack of transparency in decision-making processes, and potential biases in algorithms. Additionally, AI may struggle with understanding context, emotions, and subjective human experiences. Privacy and security concerns, ethical dilemmas, and the potential for job displacement are also important considerations when it comes to AI.

Please go and explore for yourself and see what answers you get from querying my last 3 years of work!

A Digital Me

As digital data sources expand, gaining control of information is becoming increasingly difficult. Fortunately, generative AI is bringing new ways for content creators and consumers to access information. We are learning how generative AI can consolidate scattered writings, making them readily accessible. In less than an hour’s work, I was able to use the last 3 years of my output (over 150,000 words) to train an AI engine and produce an invaluable tool for anyone wishing to interact with me about Digital Transformation. Take a look at dispatches.digital. Why not have a go yourself? Whether you work in academia or business, or are a passionate blogger, this technology can transform the way you work.