Serverless AI Tools: Create LLM Functions in Your Browser


engineering
serverless
javascript

  1. Tool Use
  2. Lams
  3. Use Cases
  4. Future Work
  5. Limitations and Pricing

            Lams are small JavaScript snippets of code, written in the browser and run on our servers.

            The idea of Lams originated from interacting with val.town – a social website to write and deploy TypeScript. I liked the concept of writing small snippets of code that I can use to automate small tasks, e.g., notifying me of emails that contain a specific keyword.

            However, instead of humans interacting with these services, I wanted to give LLMs the ability to invoke functions. This would make my AI more useful by letting it retrieve data and perform actions. Furthermore, I wanted to be able to do all of this from my browser, without having to set up a backend. This is where Lams come in.

            Tool Use

            If you are familiar with OpenAI's Assistant Tools or Anthropic's Tool Use feature, you will recognize the concept. I even shared a tutorial demonstrating the basic implementation of this feature.

            In short, the idea is to let LLMs invoke user-defined functions. This is done by describing what the function does (the expected input and output), and then giving the LLM an opportunity to call the function when answering a user's questions. The possible use cases are endless, and include:

            • Fetching data (e.g., using an API to get the current weather in a given location)
            • Performing actions (e.g., sending an email)
            • Calculating values (e.g., getting the total cost of a shopping cart)
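To make the idea concrete, here is a minimal sketch of how a tool might be described and dispatched. The `tools` registry shape, the `getCartTotal` name, and the `callTool` helper are all assumptions made for illustration; they do not reflect any specific provider's API.

```javascript
// Hypothetical tool registry: each tool has a description, a JSON schema
// for its input, and a handler function. Names and shape are illustrative.
const tools = {
  getCartTotal: {
    description: "Calculates the total cost of a shopping cart.",
    inputSchema: {
      type: "object",
      properties: {
        items: {
          type: "array",
          description: "Cart items, each with a price and a quantity.",
        },
      },
      required: ["items"],
    },
    handler: async ({ items }) =>
      items.reduce((sum, item) => sum + item.price * item.quantity, 0),
  },
};

// The LLM only sees the descriptions and schemas. When it decides to call
// a tool, the application receives a tool name and a JSON input to dispatch.
const callTool = async (name, input) => {
  const tool = tools[name];
  if (!tool) {
    throw new Error(`Unknown tool: ${name}`);
  }
  return tool.handler(input);
};
```

For example, `callTool("getCartTotal", { items: [{ price: 5, quantity: 2 }] })` resolves to `10`.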

            The concept of RAG (Retrieval Augmented Generation) is well known in the AI community. However, tool functionality has traditionally required you to set up your own infrastructure, whereas Lams allow you to define tools directly in the browser.

            Lams

            Lams are defined directly in the browser.

            Figure: Lam editing. Lams are defined in the browser.

            Lams themselves are just JavaScript snippets that are executed on our servers, e.g.,

            export default async (input) => {
              return new Date().toISOString();
            };

            For security reasons, we use e2b.dev to execute Lams in isolation.

            Every Lam exports a function that is invoked with input as its only argument. The input is always a JSON object that conforms to the JSON schema defined when the Lam was created, e.g.,

            {
              "type": "object",
              "properties": {
                "location": {
                  "type": "string",
                  "description": "The location to get the weather for, e.g. New York, US."
                }
              }
            }
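As a sketch of how a Lam might consume input shaped by this schema, the function below validates `input.location` before using it. The `getWeather` name and the hard-coded `forecasts` table are hypothetical stand-ins for a real weather API call; in an actual Lam the function would be the default export.

```javascript
// Hypothetical stand-in for a weather API; a real Lam would fetch live data.
const forecasts = new Map([
  ["New York, US", { temperatureCelsius: 21, conditions: "cloudy" }],
]);

// In a Lam, this function would be the default export.
const getWeather = async (input) => {
  // The schema describes `location` as a string, so guard against bad input.
  if (typeof input?.location !== "string" || input.location.trim() === "") {
    return { error: "A non-empty `location` string is required." };
  }

  const forecast = forecasts.get(input.location);
  if (!forecast) {
    return { error: `No forecast available for ${input.location}.` };
  }

  return { location: input.location, ...forecast };
};
```

Returning an error object instead of throwing is a design choice made for this sketch: it gives the LLM a message it can relay or act on, rather than an opaque failure.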

            Every time a user asks a question, the LLM will evaluate if answering the question would benefit from invoking a Lam. If so, the LLM will call the Lam with the necessary input and construct a response based on the Lam output.

            Figure: Example of Glama using the getCurrentTime Lam to answer a user's question.

            In fact, our implementation of RAG can even chain multiple Lams together to answer complex questions. Furthermore, since RAG is implemented at the application level, you can use Lams with any LLM that is available on our platform (e.g., Claude, GPT-4, etc.).
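The flow described above, including chaining, can be sketched as an application-level loop. Everything here is an assumption for illustration: `fakeModel` is a scripted stand-in for a real LLM, and the `lams` registry, the message shapes, and the `answer` helper are hypothetical rather than Glama's actual implementation.

```javascript
// Hypothetical registry of Lams, keyed by name.
const lams = {
  getCurrentTime: async () => "2024-01-01T12:00:00.000Z", // fixed for the demo
  formatTime: async ({ iso }) => `It is ${iso.slice(11, 16)} UTC.`,
};

// Scripted stand-in for an LLM: it requests Lams until it has what it needs.
const fakeModel = async (messages) => {
  const results = messages.filter((m) => m.role === "lam");
  if (results.length === 0) {
    return { type: "call", name: "getCurrentTime", input: {} };
  }
  if (results.length === 1) {
    return { type: "call", name: "formatTime", input: { iso: results[0].content } };
  }
  return { type: "answer", content: results[1].content };
};

// Application-level loop: keep invoking Lams until the model answers.
const answer = async (model, question, maxSteps = 5) => {
  const messages = [{ role: "user", content: question }];
  for (let step = 0; step < maxSteps; step++) {
    const reply = await model(messages);
    if (reply.type === "answer") {
      return reply.content;
    }
    // Feed the Lam output back so the model can use it in the next step.
    const output = await lams[reply.name](reply.input);
    messages.push({ role: "lam", name: reply.name, content: output });
  }
  throw new Error("Too many Lam calls without an answer.");
};
```

Because the loop depends only on the shape of what `model` returns, the same code works no matter which LLM sits behind that function.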

            Use Cases

            It's been less than a day since this feature was deployed to production, and so far I've mostly been experimenting with it myself. However, I have already found a few neat use cases for it:

            • I use it to query the TimeZone GPT service to accurately answer questions about time.
            • I use it to control lights in my home using Home Assistant.
            • I use it to tell me about the latest posts on Hacker News that mention AI and LLMs.

            I am currently working on implementing a Lam that will talk to https://fal.ai/ to generate images.

            Future Work

            As I have been implementing the first few Lams, I thought the following features would be useful to add:

            • Add ability for Lams to access the conversation history, e.g. to forward the conversation to another service.
            • Add ability to share Lams between workspace users.
            • Add ability to run Lams on a schedule.
            • Allow Lams to generate UI elements.

            While there is no timeline for these features, I am sharing them here in case they spark some interesting ideas.

            I can also imagine a future where the AI itself will be able to create Lams. Maybe this will grow into a Lam marketplace where users can create, share, and monetize their Lams.

            Limitations and Pricing

            Lams today are scoped to individual users. However, I am planning to make it possible to share Lams between workspace users.

            During the rollout, Lams are capped at 30 seconds of execution time. There is no additional cost for using Lams at this time.
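One way a caller could enforce an execution cap like this is to race the work against a timer. This is only a sketch of the general technique, not a description of how the platform enforces its limit; `withTimeout` is a hypothetical helper.

```javascript
// Rejects if `promise` does not settle within `ms` milliseconds.
// Note: this stops waiting for the work, but does not cancel it.
const withTimeout = (promise, ms) => {
  let timer;
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(
      () => reject(new Error(`Timed out after ${ms} ms`)),
      ms
    );
  });
  // Clear the timer either way so it does not keep the process alive.
  return Promise.race([promise, timeout]).finally(() => clearTimeout(timer));
};

// Example: cap a (hypothetical) Lam invocation at 30 seconds.
// const result = await withTimeout(lam(input), 30_000);
```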

            This feature is still in its early stages, but I am eager to see what use cases people will come up with. If you have ideas that you need help implementing, connect with me on Discord.

            Written by Frank Fiegel (@punkpeye)