Addendum to Project Submission

Background

I built the GTM bot in a Custom GPT with the intention that:

  1. I could harness the power of the native natural language model to match client goals to my curated list, whilst

  2. Tightly controlling access to data and coding the user flow through a decision tree, to avoid conversational drift and AI hallucinations (made-up data)

What happened since project submission

I realised that the bot was hallucinating: it was producing responses that looked correct but were subtly altered.

The hallucinations were small (such as changing a word or rephrasing a label), but they had a huge impact, including:

  1. Breaking the user flow, because the database did not recognise values that were never supposed to be there

  2. The Custom GPT actually changed data in the database, causing severe data integrity issues and effectively making the database fail

Why was this missed in testing and feedback?

The bot was responding clearly and appeared to follow the structured logic I’d designed.

The tightness of the controls I had implemented actually made these issues harder to spot: the changes were minor, and every response sounded plausible and so close to the expected output that the bot seemed to be working correctly.

Investigations

Whilst investigating what at first seemed like small issues, I came to realise that:

  1. No amount of controls and coding can stop the GPT from hallucinating.

  2. It is a natural language model, not true AI, so it will confirm it is following your instructions whilst subtly deviating from them if that feels more natural.

Change in Approach

In order to keep to my design principle of only using my curated data, I had to step away from the Custom GPT as the basis of the GTM Bot.

I retained the natural language model element needed for goal matching only (as was always intended).

I had to code the decision tree and data lookups outside of the GPT and only call it for the goal matching.

This introduced a new technology element that was not included in my original submission: a hard-coded user flow in landbot.io, still calling the Airtable database as originally intended.
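
The pattern is easiest to see in a short sketch. The live flow is actually built in landbot.io rather than hand-written code, so the Python below is only an illustration of the principle, assuming a hypothetical backend with placeholder Airtable table and field names ("Goals", "Goal Name") and the OpenAI API standing in for the natural language model: the data lookup is deterministic, the model is asked only to pick a goal from the curated list, and its answer is rejected unless it appears in that list.

  # Illustration only: the real flow is built in landbot.io; names below
  # ("Goals", "Goal Name", appXXXXXXXXXXXXXX) are hypothetical placeholders.
  import os
  import requests
  from openai import OpenAI

  AIRTABLE_TOKEN = os.environ["AIRTABLE_TOKEN"]
  BASE_ID = "appXXXXXXXXXXXXXX"   # placeholder Airtable base id
  GOALS_TABLE = "Goals"           # hypothetical table name

  def fetch_goals() -> list[str]:
      """Deterministic lookup: read the curated goal list directly from Airtable."""
      url = f"https://api.airtable.com/v0/{BASE_ID}/{GOALS_TABLE}"
      headers = {"Authorization": f"Bearer {AIRTABLE_TOKEN}"}
      records = requests.get(url, headers=headers, timeout=10).json()["records"]
      return [r["fields"]["Goal Name"] for r in records]

  def match_goal(user_goal: str, goals: list[str]) -> str | None:
      """The only step that uses the language model: choose one goal from the
      curated list (or NONE). The reply is validated against the list, so a
      hallucinated or rephrased label never enters the flow or the database."""
      client = OpenAI()  # reads OPENAI_API_KEY from the environment
      prompt = (
          "Pick the single closest match to the client's goal from this list, "
          "or answer NONE if nothing fits.\n"
          f"List: {goals}\nClient goal: {user_goal}"
      )
      reply = client.chat.completions.create(
          model="gpt-4o-mini",
          messages=[{"role": "user", "content": prompt}],
      ).choices[0].message.content.strip()
      return reply if reply in goals else None  # reject anything not on the list

Because the lookup and the validation happen outside the model, a subtly altered label is simply discarded rather than breaking the flow or being written back into the database.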

What is the impact?

  1. My original project submission does not accurately describe the technology in use (but the design, principles and user flow remain).

  2. In order to get it working quickly, some of the intended functionality is not present, namely:

    1. A looping function to allow multiple tasks and methods to be selected for the same goal

    2. The use of task categories to limit the size of returned lists

    3. The use of modality and reference links for methods

  3. The extent of subtle data corruption in the database has been difficult to trace and a complete data integrity review is required to ensure quality and to replace any missing data.

Conclusion

The project submission is technically inaccurate, but it still validly describes my project and how it should work.

The delivered bot has less functionality than intended, and its results cannot be relied upon as fully as I had intended.

The Bot works, is available online now, and with additional work will evolve well from this new stable base.

Lessons and Reflections

The key technical learning is the limit of what can be done (and controlled) with current AI.

The subtle errors would have been caught earlier with more rigorous testing, but I give myself some slack for that, as I was focused on the method research, learning and designing something useful, not on delivering an IT project.

I can see weaknesses in the breadth and depth of my references and non-technical research for the project, but I feel I have still met the project's core learning outcomes, demonstrated resilience in the face of adversity, and included pluralistic philosophy and methodology throughout. Importantly, the bot is working and available for use.