Last week I had the pleasure of moderating a stimulating discussion on the use of Large Language Models (LLMs) for Blue Cloud Ventures. Presenters from 10 varied B2B SaaS companies shared their current and planned use of LLM capabilities. We covered a number of use cases and challenges, and discussed vendors and the broad opportunities that LLMs provide. See the Appendix for more information on the presenters.
This document is a summary of the discussion. I cover:
- The broad Use Cases of LLMs
- Common themes around the use of LLMs
There is a lot of material here, and to keep things somewhat succinct I have taken the liberty of summarizing and editorializing.
The presenters covered a number of LLM use cases, including some already in production. I describe these use cases here, with a summary in the following table.
|Natural Language Processing & Information Retrieval||Summarization, Entity Recognition, Categorization & Classification, Topic Detection, Q&A|
|Software Development||Code Generation, Code Analysis|
|Text Generation||Smart composition, Intelligent Templating|
|Customer Support||Enhanced Documentation, Augmenting Support Agents, Conversational Interfaces|
|Internal Tooling and Efficiency||Knowledge Management, Curated & Secure Access to LLM Capabilities|
Natural Language Processing (NLP)
- Large language models are valuable assets in executing traditional NLP tasks. For example, they are highly effective in Named Entity Recognition (NER), where they can identify and categorize entities in a text into predefined categories.
- Similarly, they excel in information retrieval: they can answer natural-language queries over a provided dataset, identify the main themes in a given text (topic detection), and fetch relevant information guided by contextual understanding (in-context information retrieval).
- LLMs can also classify text into predefined categories based on content, generate concise summaries of long documents or meeting transcripts, and help convert speech into written text (transcription).
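The entity recognition task described above can be sketched as a prompt plus a strict parser for the reply. This is a minimal illustration, not any presenter's implementation: the label set, the JSON reply format, and the `fake_complete` stand-in for a real model call are all assumptions made for the example.

```python
import json
from typing import Callable

# Hypothetical label set; a real application would define its own categories.
ALLOWED_LABELS = {"PERSON", "ORG", "LOCATION"}


def build_ner_prompt(text: str) -> str:
    """Ask for entities as a JSON list so the reply is machine-readable."""
    labels = ", ".join(sorted(ALLOWED_LABELS))
    return (
        "Extract the named entities from the text below.\n"
        f"Use only these labels: {labels}.\n"
        'Reply with a JSON list of objects like {"text": "...", "label": "..."}.\n\n'
        f"Text: {text}"
    )


def parse_entities(reply: str) -> list[dict]:
    """Keep only well-formed entities with an allowed label; drop the rest."""
    try:
        items = json.loads(reply)
    except json.JSONDecodeError:
        return []
    if not isinstance(items, list):
        return []
    return [
        {"text": item["text"], "label": item["label"]}
        for item in items
        if isinstance(item, dict)
        and isinstance(item.get("text"), str)
        and isinstance(item.get("label"), str)
        and item["label"] in ALLOWED_LABELS
    ]


def extract_entities(text: str, complete: Callable[[str], str]) -> list[dict]:
    """`complete` is any function that sends a prompt to a model and returns text."""
    return parse_entities(complete(build_ner_prompt(text)))


def fake_complete(prompt: str) -> str:
    # Stand-in for a real model call; the second entity uses a label that is
    # not allowed, so the parser drops it.
    return (
        '[{"text": "Ada Lovelace", "label": "PERSON"},'
        ' {"text": "London", "label": "CITY"}]'
    )


entities = extract_entities("Ada Lovelace lived in London.", fake_complete)
```

Validating the reply against the predefined categories matters because the model may return labels or formats that were never requested.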
AI-Assisted Software Development
- Code Generation: LLM-powered tools like GitHub Copilot and StarCoder can automatically generate code based on provided specifications or examples.
- Code Analysis: Fine-tuned LLMs can also be used to understand and evaluate the quality of existing code.
Text Generation
- Smart Composition: LLMs also show great potential in text generation tasks. For instance, they can assist in writing tasks by suggesting completions or generating content.
- Intelligent Templating: Generating text based on predefined templates that follows guidelines around style and tone.
Customer Support
- Conversational Interfaces: In the customer support domain, LLMs can provide a conversational interface (i.e. a chatbot) for users, helping to expose product capabilities in an engaging and effective manner.
- Augmenting Customer Support Teams: LLMs help customer success and support teams scale by improving efficiency and providing in-context support. LLM-powered agents can handle the most common requests, with human agents stepping in when necessary.
Internal Tooling and Efficiency
- Pre-Curated Prompts: LLMs can be used to create safe tools that address specific internal use cases by limiting end-users to a set of pre-curated prompts in a monitored environment. Creating secure access points or wrappers around tools like ChatGPT ensures safe and efficient use of these powerful capabilities.
- Common Internal Use Cases: Enforcing document templates, Q&A over internal resources and documentation, “intelligent” FAQs, guided employee onboarding, etc.
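The pre-curated prompt idea above can be sketched as a small registry of templates that end-users fill in but cannot rewrite. This is a hedged sketch: the prompt ids, template wording, and the `render_prompt` helper are hypothetical, chosen only to illustrate the pattern.

```python
from string import Template

# Hypothetical catalogue of approved prompts; users may only pick from these.
CURATED_PROMPTS = {
    "summarize_policy": Template(
        "Summarize the following internal policy in three bullet points:\n$document"
    ),
    "onboarding_faq": Template(
        "Answer this new-hire question using only the handbook excerpt.\n"
        "Question: $question\nHandbook excerpt: $excerpt"
    ),
}


def render_prompt(prompt_id: str, **fields: str) -> str:
    """Fill an approved template; free-form prompts are rejected."""
    if prompt_id not in CURATED_PROMPTS:
        raise KeyError(f"Unknown prompt id: {prompt_id!r}")
    # substitute() raises KeyError when a required field is missing.
    return CURATED_PROMPTS[prompt_id].substitute(**fields)


prompt = render_prompt("summarize_policy", document="Laptops must be encrypted.")
```

Because every request goes through `render_prompt`, a monitoring layer could log the prompt id and fields in one place before anything is sent to a model.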
The presenters covered a range of topics around the use of LLMs. Here we group the topics into a set of broad themes:
Performance Challenges in Large Language Models (LLMs)
LLMs, despite their immense capabilities, face limitations in inference speed due to their high parameter count. Additionally, their extensive energy consumption and financial costs pose significant barriers.
Optimizations for Improved Performance
Several strategies can enhance LLMs’ effectiveness:
- Sparser Models and Distillation: Utilizing models with fewer parameters or training smaller models to mimic larger ones can boost inference performance.
- Pre-Filtering and Cache Use: Calling LLMs only where traditional methods fail, and caching their outputs for reuse, reduces the number of model calls and therefore latency and cost.
- Progressive Summarization: Tools like the summary buffer memory in LangChain, which progressively summarizes older parts of a conversation, can help with the challenge posed by limited context windows. LlamaIndex is another tool that can help when working with large datasets.
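The pre-filtering and caching strategy in the list above might look like the following sketch. It assumes a simple exact-match FAQ lookup as the "traditional method" and an in-memory dictionary as the cache; the `CachedAnswerer` class and the `llm` callable are hypothetical names used for illustration.

```python
import hashlib
from typing import Callable


class CachedAnswerer:
    """Try a cheap lookup first, then a cache, and only then call the model."""

    def __init__(self, llm: Callable[[str], str], faq: dict[str, str]) -> None:
        self.llm = llm
        self.faq = {self._normalize(q): a for q, a in faq.items()}
        self.cache: dict[str, str] = {}
        self.llm_calls = 0

    @staticmethod
    def _normalize(question: str) -> str:
        # Lowercase and collapse whitespace so trivial variations still match.
        return " ".join(question.lower().split())

    def answer(self, question: str) -> str:
        key = self._normalize(question)
        # 1. Pre-filter: a traditional exact-match lookup needs no model call.
        if key in self.faq:
            return self.faq[key]
        # 2. Cache: reuse an earlier model answer to the same question.
        digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
        if digest in self.cache:
            return self.cache[digest]
        # 3. Fall back to the (slow, costly) model and remember the result.
        self.llm_calls += 1
        result = self.llm(question)
        self.cache[digest] = result
        return result


def fake_llm(question: str) -> str:
    # Stand-in for a real model call.
    return f"LLM answer to: {question}"


answerer = CachedAnswerer(fake_llm, {"What are your hours?": "9am to 5pm"})
```

In production the cache would more likely live in a shared store with an expiry policy, and the pre-filter could be a keyword search or a small classifier rather than an exact match.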
User Experience with Large Language Models
LLMs, when aligned with specific use-cases, can significantly boost productivity. However, the introduction of these models should be executed thoughtfully.
- Prompt Engineering and Templating: Prompt engineering, while powerful, can be complex. By providing appropriately templated and curated prompts, the adoption process can be made smoother, especially when working towards a specific set of use-cases.
- Security and Guard Rails: Exposing LLM capabilities to end-users may lead to security risks and unintended destructive operations. It’s crucial to provide adequate guard rails so that customers cannot trigger destructive actions via conversational interfaces. Emerging frameworks such as Guardrails help mediate the output of LLMs.
- Picking the Right Use Cases: Certain use-cases, such as Q&A and Summarization, are comparatively easier to understand and deploy, providing a good starting point for users integrating LLMs into their workflows.
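One way to picture the guard rails described above is a check that sits between the model's proposed action and its execution. The sketch below does not use any particular guardrails framework; the action names, the JSON action format, and the `review_action` function are assumptions for illustration.

```python
import json
from dataclasses import dataclass

# Hypothetical action catalogue for a conversational interface.
SAFE_ACTIONS = {"list_projects", "get_project"}
DESTRUCTIVE_ACTIONS = {"delete_project"}


@dataclass
class Decision:
    allowed: bool
    reason: str


def review_action(model_output: str, user_confirmed: bool = False) -> Decision:
    """Decide whether an action proposed by the model may be executed."""
    try:
        action = json.loads(model_output)
    except json.JSONDecodeError:
        return Decision(False, "output is not valid JSON")
    name = action.get("action") if isinstance(action, dict) else None
    if not isinstance(name, str):
        return Decision(False, "missing action name")
    if name in SAFE_ACTIONS:
        return Decision(True, "safe action")
    if name in DESTRUCTIVE_ACTIONS:
        # Destructive operations require an explicit confirmation step.
        if user_confirmed:
            return Decision(True, "destructive action confirmed by user")
        return Decision(False, "destructive action needs explicit confirmation")
    # Anything not on the allowlist is rejected by default.
    return Decision(False, "action is not on the allowlist")


decision = review_action('{"action": "delete_project"}')
```

Rejecting by default means a new or hallucinated action name cannot run until someone deliberately adds it to the allowlist.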
Infrastructure Considerations for Large Language Models
The infrastructure supporting LLMs involves critical factors like cost, tooling, vendor selection, and operational practices, each having significant implications for successful implementation.
- Cost Implications and Control: Hosting models can be financially demanding, and API providers like OpenAI, while offering excellent capabilities, can also be costly. However, through careful model selection and optimization strategies, these costs can be controlled. Crucial to this is the effective monitoring of prompts and API calls to prevent any rogue processes from generating unexpectedly large bills.
- Tooling: Several tools are available to facilitate LLM-powered application development and deployment. LangChain is a widely used framework for building composable LLM applications, while Streamlit supports rapid prototyping and UI development. Vector DBs like Pinecone, along with established search engines like Elasticsearch and OpenSearch, can efficiently store and search embeddings. In general, the presenters were building LLM infrastructure using Python, with APIs built on top of frameworks like FastAPI and user interfaces built using Streamlit.
- Vendor Selection: While OpenAI and its APIs remain the most popular due to their superior capabilities, other options like Azure and Anthropic’s Claude offer distinct advantages. Azure’s wrappers around OpenAI provide additional data protection and security, and Claude, with its larger context windows, can be beneficial for specific tasks despite its slightly lower performance.
- LLMOps: LLMOps, a discipline dedicated to driving efficient LLM deployment, includes critical tasks like monitoring, prompt versioning, and cost management. These operational practices help ensure smooth, efficient, and cost-effective use of LLMs in various applications. There is still a lot of work to be done in effectively running LLMs in production environments.
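The cost monitoring mentioned in the list above can be sketched as a small tracker that records spend per caller and stops a runaway process once a budget is passed. The `CostMonitor` class is hypothetical, and the per-token price in the example is a placeholder rather than any vendor's actual rate.

```python
from collections import defaultdict


class BudgetExceeded(RuntimeError):
    """Raised when cumulative spend passes the configured budget."""


class CostMonitor:
    def __init__(self, budget_usd: float, usd_per_1k_tokens: float) -> None:
        self.budget_usd = budget_usd
        self.usd_per_1k_tokens = usd_per_1k_tokens
        self.spent_usd = 0.0
        self.by_caller: dict[str, float] = defaultdict(float)

    def record(self, caller: str, prompt_tokens: int, completion_tokens: int) -> float:
        """Record one API call and raise once the budget has been exceeded."""
        total_tokens = prompt_tokens + completion_tokens
        cost = total_tokens / 1000 * self.usd_per_1k_tokens
        self.spent_usd += cost
        self.by_caller[caller] += cost
        if self.spent_usd > self.budget_usd:
            raise BudgetExceeded(
                f"spent ${self.spent_usd:.2f} of ${self.budget_usd:.2f} "
                f"(last caller: {caller})"
            )
        return cost


# Placeholder price and budget, for illustration only.
monitor = CostMonitor(budget_usd=1.0, usd_per_1k_tokens=0.002)
monitor.record("nightly-batch", prompt_tokens=100_000, completion_tokens=50_000)
```

Tracking spend per caller makes it easier to see which job or prompt is responsible when a bill starts to climb.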
Deploying LLM Capabilities
The deployment of LLMs requires a careful and strategic approach, as they often form part of an ensemble ML solution. Given their limited context windows and performance bottlenecks, LLMs must be used judiciously.
- Ensemble Approach and LangChain: Tools like LangChain can be leveraged to integrate LLMs into a chain of tools, capitalizing on their strengths while minimizing their constraints. Using LLMs in conjunction with other existing ML models as an ensemble can also be an effective way of optimizing performance and minimizing costs.
- Access Control and Security: Direct access to tools like ChatGPT could result in substantial data and compliance issues. A viable solution is to create a wrapper around LLM APIs, offering a cost-effective alternative to pricier options like ChatGPT Pro. This not only adds a security layer to prevent data leakage, but also assists users in focusing on prompt building by abstracting model interactions.
- End-User Guidance and Training: Providing templates and guardrails to end-users is a crucial step to prevent misuse and ensure safe use of LLM capabilities. Additionally, training teams to use LLMs effectively can be a challenge, so integrating LLMs into learning and development programs is recommended. Identifying which internal projects are best suited for LLM integration can be difficult, but over time, LLM and AI technologies should become a standard part of the engineering toolkit.
Challenges
- Cost: LLMs can be financially taxing, with both inference and fine-tuning processes posing significant costs.
- Accuracy: LLM responses are probabilistic rather than deterministic, so the same question can produce different answers. In Q&A applications this makes it hard to achieve the desired accuracy levels.
- Performance: LLMs can be slow in inference, and their limited context windows further constrain their performance.
- IP Concerns: Issues around data provenance and code generation using models that leverage restrictively licensed code can raise serious IP concerns.
- Specialized vs General Purpose Models: Finding the right balance or choosing between specialized and general-purpose models can be a challenge depending on the specific use case.
- Security: Prompt injection, the risk of generating insecure code, and the potential for leaking internal data to third-party providers are significant security concerns.
- Product Differentiation: Differentiating your product from established providers such as ChatGPT and others can be a daunting task in the competitive AI market.
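The risk of leaking internal data to third-party providers, noted under Security above, is one reason to route calls through a wrapper. The sketch below redacts obvious identifiers before a prompt leaves the organization. The regular expressions are deliberately simple illustrations (the `sk-` key shape is an assumed example pattern), and the `complete` callable stands in for whatever provider API is in use.

```python
import re
from typing import Callable

# Illustrative patterns only; real deployments need broader detection.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
API_KEY = re.compile(r"\bsk-[A-Za-z0-9]{16,}\b")


def redact(text: str) -> str:
    """Replace email addresses and key-like strings with placeholders."""
    text = EMAIL.sub("[EMAIL]", text)
    return API_KEY.sub("[SECRET]", text)


def safe_complete(prompt: str, complete: Callable[[str], str]) -> str:
    """Send only the redacted prompt to the third-party model."""
    return complete(redact(prompt))


sent_prompts: list[str] = []


def fake_complete(prompt: str) -> str:
    # Stand-in for a provider call; records what would have been sent.
    sent_prompts.append(prompt)
    return "ok"


reply = safe_complete(
    "Email jane.doe@example.com, key sk-abcdefghijklmnop1234", fake_complete
)
```

Redaction does not address prompt injection or insecure generated code, which need their own controls.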
There is tremendous excitement about the potential of Generative AI and Large Language Models. There are a number of use cases where LLMs can improve operational efficiency, save costs, and increase ROI.
However, there are also challenges associated with the use of these models. Some, like poor performance, can be overcome by optimization as well as a better understanding of the underlying technologies. Others, such as intellectual property issues and the probabilistic nature of LLM behavior, are more significant barriers to widespread adoption of these tools.
There is also an emerging set of vendors, libraries, tools, and APIs that will end up becoming the “LLM Stack” (see Appendix). We should see common standards and patterns for the adoption of LLMs emerge in the near future.
I came away (even more) enthusiastic about the potential of Generative AI. Many thanks to all the presenters and to BCV for facilitating this event.
Appendix
Presenters
|Jeavio (Moderator)||Venture Services and Product Development|
|Teleskope||Data Protection and Security Platform|
|Arctic Wolf||Cybersecurity Operations Platform|
|LMS365||Learning Management Platform Built on Microsoft|
|Nylas||API Platform for Email, Calendar, and Contacts|
|Clari||Revenue Operations Platform|
|Druva||Data Resiliency Platform|
|Templafy||Enterprise Content Management Platform|
|Wrike||Collaborative Project Management Platform|
|Applyboard||Study Abroad Application Platform|
|Conductor||Enterprise SEO and Marketing Platform|
The Emerging LLM Stack
|GPT-3.5||OpenAI model providing the best combination of cost and capabilities|
|Claude||Anthropic model providing larger context windows, making it suitable for specific use cases|
|Hugging Face||AI platform for finding and using open-source ML models|
|Azure AI Service||Microsoft’s Azure AI service provides a wrapper for OpenAI (and other) models|
|LangChain||Popular toolkit to build LLM-powered applications|
|Streamlit||Framework to build UIs for ML-powered applications|
|FastAPI||Framework to build performant Python APIs|