Large Language Models have a User Experience (UX) problem.
This screenshot from ChatGPT shows the problem and OpenAI’s attempted solution.
Simply put, most users need help figuring out what to do when faced with a blank chat screen. An empty chat gives us no hints about the possibilities of the technology.
OpenAI tries to solve this problem by suggesting what you can do with ChatGPT. However, there is limited real estate for suggestions on this screen and these recommendations are not personalized.
In this post, I discuss the importance of metaphors and mental models in helping users make sense of new technologies. I discuss a couple of ways of thinking about Large Language Models and end the post with a quick summary of Jeavio’s approach to building LLM-powered applications.
Why GUIs rule the world
For the last 40 years, the GUI, or Graphical User Interface, has been the primary mode of interaction with Personal Computers.
Modern desktop (and touch) interfaces work well because they provide users with a simple mental model for their use. Files, folders, and even the desktop are metaphors that allow users to reason about what to do with their data. Using metaphors enables users to transfer their understanding of how one system works to a novel system. It solves the “cold start” problem of getting new users up to speed.
Chat, or more broadly, Conversational Interfaces, are also metaphors. They make the user feel that they are communicating with someone or something. However, this metaphor only holds some of the time and can be challenging to apply.
Conversational Interfaces Aren’t New
The current experience of using Large Language Models is a throwback to an older era of computing: that of the command line.
While LLMs and chatbots have made “Conversational Interfaces” mainstream, we have been “chatting” with computers for a long time, just through the command lines of Unix shells and the DOS prompt. Before the desktop, we used terse commands like cp, grep, or mv that gave very specific instructions to the computer. Getting comfortable with the command line required either formal or on-the-job training; its users had to learn an entirely new language to do their jobs effectively.
The Missing Metaphor
What we don’t (yet) have is a similarly simple metaphor for using Large Language Models. Techniques like Prompt Engineering, Chain-of-thought reasoning, and Zero-shot learning are closer to using the command line or learning a programming language than to simply chatting with someone.
The challenge is that the language of interaction is English (or other “natural” languages). So, on one hand, users are told to “chat” with the model, and on the other, they must learn complex prompt engineering techniques to be effective. This steep learning curve will become a significant blocker to the widespread adoption of LLM capabilities.
To deploy LLM capabilities, we need to make them accessible and understandable to a broad range of users.
One way of doing this is using examples and metaphors that resonate with intended users.
Another is to build single-purpose applications that focus on a limited set of users and expose a narrow set of LLM capabilities. We will cover this topic in another post.
Let’s start with a metaphor suitable for a common LLM use case: search.
The Dark Library
A Large Language Model is like a vast library shrouded in darkness. Prompt engineering is like shining a light into the dark library.
When chatting with an LLM, you are giving it hints and directions about what to search for in the library. The model will then retrieve the data and present it to you – like an obedient robot librarian who needs a helping hand.
Prompt Engineering techniques like Chain-of-thought (CoT) or Analogical reasoning are ways of shining a light in the dark library. CoT forces the model to break down the information search into smaller steps. Analogical reasoning forces it to search in a sub-section of its knowledge base.
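To make the “shining a light” idea concrete, here is a minimal sketch in Python of how these two techniques reshape a plain question. The prompt wording and helper names are illustrative, not a prescribed implementation:

```python
def plain_prompt(question: str) -> str:
    """A bare question -- searching the library with no light at all."""
    return question

def chain_of_thought_prompt(question: str) -> str:
    """Zero-shot Chain-of-thought: ask the model to break the
    information search into smaller, explicit steps."""
    return f"{question}\nLet's think step by step."

def analogical_prompt(question: str, domain: str) -> str:
    """Analogical prompting: steer the model toward a sub-section of its
    knowledge by asking it to recall related examples first."""
    return (
        f"Recall three solved problems from {domain} that resemble the "
        f"question below, then use them to answer it.\n\n"
        f"Question: {question}"
    )

question = "How many weeks are in 3 years?"
print(chain_of_thought_prompt(question))
```

The model receives the same underlying question in each case; only the framing changes, which is exactly what the metaphor predicts: the prompt directs where the light falls.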
This metaphor can help someone think through how to use an LLM-powered application effectively.
Like a library, the knowledge available to an LLM is finite. At best, the model can tell you that it doesn’t know the answer; at worst, it can make one up when responding to a prompt that falls outside its training data.
Metaphors provide a framework for your users to think about the boundaries of what is possible when working with LLMs.
When building an LLM-powered application, you must think carefully about what metaphors would make sense to your users and how you should communicate them. Simply putting a skin on top of ChatGPT will be a problematic UX strategy.
Metaphors and mental models help users to understand how a new system works. They are not the only tools available, and metaphors can also be problematic, but they are helpful.
Designers, developers, product managers, and executives need to think about how to use metaphors and analogies to explain the value proposition of their applications. The capabilities of the underlying technology may be incredible, but they won’t be used if the intended users do not understand them.
At Jeavio, we have been thinking hard about how to build LLM-powered applications, both for internal use and for our clients. Our approach is a reasonably common one in UX design: focus on the User Persona and their pain points, and surface prompts relevant to the work they are likely to do.
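As a sketch of that persona-first idea, suggested prompts can be keyed to a user persona rather than shown generically on a blank screen. The personas and prompts below are made up for illustration and are not Jeavio’s actual implementation:

```python
# Illustrative mapping from user persona to suggested starter prompts.
# Both the personas and the prompt text are hypothetical examples.
SUGGESTED_PROMPTS = {
    "recruiter": [
        "Summarize this resume in three bullet points.",
        "Draft a screening question for a backend developer role.",
    ],
    "support_agent": [
        "Rewrite this reply in a friendlier tone.",
        "Summarize this ticket thread for escalation.",
    ],
}

def suggestions_for(persona: str) -> list[str]:
    """Surface prompts relevant to the work this persona is likely to do,
    falling back to a generic opener for unknown personas."""
    return SUGGESTED_PROMPTS.get(persona, ["What can you help me with?"])

print(suggestions_for("recruiter")[0])
```

The point of the sketch is the design choice, not the code: instead of one generic list competing for limited screen real estate, each persona sees suggestions tied to their own pain points.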
I will write more about our thoughts on UX design for LLM-powered applications in subsequent posts.