100 Days of Gen AI: Day 51

Today I’m exploring design patterns for websites that cater to both human and LLM users.

This is something I’ve been noodling on for a while but I haven’t put my thoughts down until today.

Humans vs LLM users

Humans find visual layouts easiest for navigating websites. We care about fonts, buttons, images, text size, and relative positioning between objects. LLMs don’t care about any of these things. Instead they care about structured information that can be easily parsed, with available actions it can take next (API calls).

Also, much of the request/response bandwidth between client/server is dedicated to serving assets such as images, fonts, icons, which are visually pretty, but offer little value for an LLM user. Transmitting as little information as necessary means faster response times, but also fewer demands on compute resources and reduced energy usage. If we’re going to be living in a world where 90% of all internet traffic is from LLM users (and I think we are), then making these interactions efficient for LLM users is important.

As a web developer I also can’t easily distinguish between human and LLM users, which makes it hard to use metrics to understand how each user is interacting with the webpage and cater to improving the experience for each type of user.

Is the goal for LLM users simply API discovery?

I could imagine a world where the LLM just reads an API documentation page, then uses this to determine a sequence of curl commands to run to achieve the goal. This could work, though most API’s have lots of extraneous API commands that are irrelevant to most interactions. An LLM would be more likely to get confused or stumble through a few different commands before finding the right one. A guided flow, with a more limited set of options at each step, would be faster for an LLM to parse, and leave it more likely to arrive at the correct decision from a smaller list of options.

Permissions for agents

This also brings up another consideration around permissions for agentic LLM users. How do we constrain permissions to small set of actions so an agent can’t wreak too much accidental havoc?

For example, when I log into my bank account as a human user, I can:

I might want to delegate an agent the ability to login to my bank account to perform certain actions, but I don’t want to allow it to do all of these things.

Granting agentic users with granular permissions seems a worthwhile investment for web developers. I could see granting a set of base permissions that the agent always has (e.g. login, read account data), but then additional permission scopes are granted just for a single agentic session (e.g. ability to change my address).

This is a new paradigm for consumer websites, which typically only have one type of user with full access to all actions.

Designing a seamless permission-granting interface is important. You need to minimize friction for human users so they are not spending tons of time granting permission scopes to an agentic LLM. Also, they could be interacting via many different modes such as via voice commands.

Two approaches

  1. Convergence: machine-readable content embedded within traditional HTML
  2. Divergence: a separate interaction pattern designed specifically for LLM’s

This is what I mean by convergence:

Design one webpage, but with extra hints for LLM users, like this:

<div class="appointment-selector" data-action="schedule-appointment" data-doctor="Dr. Jane Smith">
  <!-- Human UI elements here -->
</div>

This is what I mean by divergence:

Tailor the experience based on type of user (this could be determined by a header like X-LLM-USER: true or a specific User-Agent used by LLMs).

If human:

If LLM:

Agent-first design

Now imagine that every website is designed for Agent-first interactions. Why even build a human interface at all?

An LLM could dynamically generate a human-readable UI based on the user’s goal. For example, if I want to find my most recent health insurance claim details, instead of navigating through several menus (and lets be honest, in the case of a health insurance website this involves stumbling around for a while until I find the correct page), an agentic user could find the requisite information I care about, then display it on the fly in a custom rendered HTML page.

For example, take a list of hypothetical list of health insurance claims that an API might return:

[
  {
      "claim_id": "CLM002",
      "patient_name": "Jane Smith",
      "date_of_service": "2024-05-10",
      "provider": "City Clinic",
      "claim_amount": 450.00,
      "status": "Processed"
  },
  {
      "claim_id": "CLM004",
      "patient_name": "Jane Smith",
      "date_of_service": "2024-09-05",
      "provider": "Suburban Medical Group",
      "claim_amount": 300.00,
      "status": "Processed"
  },
  {
      "claim_id": "CLM005",
      "patient_name": "Jane Smith",
      "date_of_service": "2025-03-18",
      "provider": "Regional Hospital",
      "claim_amount": 1500.00,
      "status": "Pending"
  }
]

If this is the data your health insurance is ultimately returning, why not just have the LLM generate a visual display on the fly? This could even be rendered client-side if you have a device with a local LLM and sufficient hardware.

For example, this could be the prompt:

Here is my patient data, can you display this as an HTML webpage. Draw my attention to any pending claims and make it beautiful. {insert patient data}

I just popped this prompt into ChatGPT and here’s the page it generated:

Patient Claims Summary

Patient Claims for Jane Smith

Claim ID Date of Service Provider Amount Status
CLM002 2024-05-10 City Clinic $450.00 Processed
CLM004 2024-09-05 Suburban Medical Group $300.00 Processed
CLM005 2025-03-18 Regional Hospital $1500.00 Pending

It’s pretty good!

Is this the future of web design?

This raises whole new questions:

If rendered server-side, then a web developer could monitor quality before it’s sent to the client. But if content is generated client-side, the developer will have no visibility into the user experiernce.

Conclusion

I believe the way we design webpages is ripe for rethinking in the LLM era. In my mind, the key areas for focus are how to represent information for an agent-first world, how to manage permissions for agents, and how to create dynamic web experiences that maintain a high quality bar.