Last week Google released the extensions feature for Gemini, giving it greater access to Google's office suite. I commented on LinkedIn that this seems like a ‘boring but important’ step toward delivering on many of the AI hype claims.
This is why I suspect Google and Microsoft are feeling pretty smug about their longer-term prospects. They can pre-integrate their office suites into their AI products in the form of reusable tools. The feedback loop will be powerful: the more they integrate, the more data they’ll receive and the more edge cases they can handle. Ultimately, the barrier to AI agents might just end up being another boring ‘last mile’ system integration problem.
OpenAI also quietly announced their Structured Outputs feature this month, which I’ve been playing with every day since it shipped on the 6th. It’s a developer-facing feature intended to solve a really common problem: the LLM can consistently give you the right answer, but always in a slightly different format (e.g. JSON, YAML, Markdown, plain text).
Spin of the wheel, which output are we going to get?
Prompt: Extract product feedback from customer reviews, specify the product being reviewed, and categorize the sentiment as positive, negative, or neutral. Single result per output.
JSON?
{
  "product_name": "acme widget",
  "customer_feedback": "Just wasted $150 on this piece of garbage. Was working fine then just stopped turning on one day. No response from customer service either. Save ur money and buy something else. Can't believe I fell for the hype smh",
  "sentiment": "negative"
}
Markdown on Mondays?
# Product Name
acme widget
# Customer Feedback
Just wasted $150 on this piece of garbage. Was working fine then just stopped
turning on one day. No response from customer service either. Save ur money
and buy something else. Can't believe I fell for the hype smh
# Sentiment
negative
YOLO YAML?
- product_name: acme widget
  customer_feedback:
    Just wasted $150 on this piece of garbage. Was working fine then just stopped
    turning on one day. No response from customer service either. Save ur money
    and buy something else. Can't believe I fell for the hype smh
  sentiment: negative
Or a pure “vibe” response?
The acme widget
Broken dreams, wasted cash,
Silent widget, customer's crash.
Hype's allure fades,
As frustration cascades.
Structured Solutions
The structured outputs feature enables developers to specify an output schema that defines exactly how they would like the LLM to respond. If in the past we were trying to parse freeform essays, this feature turns the responses into ‘dot the i, cross the t’ forms that the LLM has to fill out exactly as instructed.
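To make the ‘form’ idea concrete, here is a rough sketch of what a schema for the review-extraction prompt could look like. The field names come from the sample outputs above; the schema name `product_feedback`, the `RESPONSE_FORMAT` wrapper and the `conforms` helper are my own assumptions for illustration. The wrapper reflects my understanding of the request shape OpenAI expects, so check the current docs before relying on it.

```python
import json

# Hypothetical schema for the review-extraction prompt. Field names are taken
# from the sample outputs; the enum mirrors the three sentiment labels.
FEEDBACK_SCHEMA = {
    "type": "object",
    "properties": {
        "product_name": {"type": "string"},
        "customer_feedback": {"type": "string"},
        "sentiment": {"type": "string", "enum": ["positive", "negative", "neutral"]},
    },
    "required": ["product_name", "customer_feedback", "sentiment"],
    "additionalProperties": False,
}

# Assumed request shape for structured outputs; verify against the API docs.
RESPONSE_FORMAT = {
    "type": "json_schema",
    "json_schema": {"name": "product_feedback", "strict": True, "schema": FEEDBACK_SCHEMA},
}


def conforms(raw: str, schema: dict) -> bool:
    """Minimal check for flat, string-only object schemas like the one above."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return False
    # Every required key must be present and no extra keys are allowed.
    if not isinstance(data, dict) or set(data) != set(schema["required"]):
        return False
    for key, spec in schema["properties"].items():
        value = data[key]
        if not isinstance(value, str):
            return False
        if "enum" in spec and value not in spec["enum"]:
            return False
    return True
```

With a schema like this in place, the Markdown, YAML and poem variants would all fail the check, and only the JSON form with one of the three allowed sentiment values would pass.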
Structured outputs make it much easier to connect prompts together. Every prompt becomes a clever little Lego brick that is much easier to ‘snap together’ into increasingly complex prompt chains.
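A minimal sketch of that ‘snapping together’, with the LLM calls replaced by hard-coded stand-ins so it runs offline. Everything here is hypothetical: the `Feedback` and `Ticket` shapes, the `extract_feedback` and `triage` steps, and the triage rule are invented purely to show one prompt's output feeding the next.

```python
import json
from dataclasses import dataclass


@dataclass
class Feedback:
    product_name: str
    customer_feedback: str
    sentiment: str


@dataclass
class Ticket:
    product_name: str
    priority: str


def extract_feedback(review: str) -> str:
    """Stand-in for prompt 1; a real version would call the LLM with a schema."""
    return json.dumps({
        "product_name": "acme widget",
        "customer_feedback": review,
        "sentiment": "negative",
    })


def triage(feedback: Feedback) -> str:
    """Stand-in for prompt 2; it consumes prompt 1's fields directly."""
    priority = "high" if feedback.sentiment == "negative" else "low"
    return json.dumps({"product_name": feedback.product_name, "priority": priority})


def run_chain(review: str) -> Ticket:
    # Each step returns a known shape, so parsing between steps is one line.
    feedback = Feedback(**json.loads(extract_feedback(review)))
    return Ticket(**json.loads(triage(feedback)))
```

The point is the join between the steps: because the first output is guaranteed to have exactly the fields the second step expects, there is no format-sniffing or regex glue between them.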
In fact, the first use case I thought of was using them to build the schemas themselves. I created a simple POC app that takes any arbitrary text file (e.g. CV, product feedback, blog post, meeting transcript), plus an instruction prompt (e.g. “extract CV details”), and generates a structured output schema that can be used to extract that information.
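One way this kind of schema generation could work (a sketch, not the actual POC) is to have the model return a flat list of field descriptors and then assemble the schema in code. The `fields_to_schema` helper and the descriptor keys `name`, `type`, `description` and `enum` are assumptions made up for this example.

```python
def fields_to_schema(name: str, fields: list[dict]) -> dict:
    """Turn a flat list of field descriptors into a strict-style JSON schema."""
    properties = {}
    for field in fields:
        spec = {"type": field["type"], "description": field["description"]}
        # Only emit an enum when the descriptor actually supplies one.
        if field.get("enum"):
            spec["enum"] = field["enum"]
        properties[field["name"]] = spec
    return {
        "name": name,
        "strict": True,
        "schema": {
            "type": "object",
            "properties": properties,
            # Mark every field required and forbid extras.
            "required": list(properties),
            "additionalProperties": False,
        },
    }


# Hypothetical descriptors a model might return for "extract CV details".
cv_fields = [
    {"name": "full_name", "type": "string", "description": "Candidate name"},
    {"name": "years_experience", "type": "integer", "description": "Total years worked"},
]
cv_schema = fields_to_schema("cv_details", cv_fields)
```

Keeping the model's job to a flat list of descriptors means the first prompt can itself use a fixed structured output, and the generated schema can then be passed to a second extraction prompt.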