LLMs for ML

May 14, 2024

This is another follow-up to my earlier posts on how I see LLMs integrating with and accelerating traditional ML workflows. You can read more in those posts: prototyping models and model research.

LLMs have enhanced my typical ML workflow lately, primarily for prototyping models (see prototyping models) and a bit for model research (see model research). But the biggest efficiency gain LLMs have afforded me is actually in cleaning and organizing data. Anyone who uses an LLM regularly knows that LLMs can do a great job of transforming data, but they have been especially helpful as an army of data analysts, labeling and enhancing the data that makes our models better.

At Attentive, we've been able to get more use out of our product catalog data by using LLMs to label and enrich items. A similar service is provided by Refuel, as detailed in their blog post. We're able to use images and descriptions to add tags that items are missing out of the box, which greatly improves the accuracy of other models and products. We're also able to use LLMs to enrich our subscriber details by filling in fields where we don't have 100% coverage.
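To make the tagging idea concrete, here is a minimal sketch of what such an enrichment step could look like. It is not Attentive's pipeline: the tag taxonomy, the `CatalogItem` shape, and the `call_llm` function are all hypothetical, it uses descriptions only (no images), and `fake_llm` is a canned stand-in so the example runs without a model. The one design point it illustrates is validating the model's reply against a fixed taxonomy before merging it into existing data.

```python
import json
from dataclasses import dataclass, field
from typing import Callable

# Hypothetical taxonomy; a real catalog would define its own.
ALLOWED_TAGS = {"apparel", "footwear", "outdoor", "waterproof", "kids"}


@dataclass
class CatalogItem:
    item_id: str
    description: str
    tags: set[str] = field(default_factory=set)


def build_prompt(item: CatalogItem) -> str:
    """Ask for a JSON array of tags drawn only from the allowed taxonomy."""
    return (
        "Label this product with tags from the list "
        f"{sorted(ALLOWED_TAGS)}. Reply with a JSON array of strings only.\n"
        f"Description: {item.description}"
    )


def enrich_item(item: CatalogItem, call_llm: Callable[[str], str]) -> CatalogItem:
    """Add validated LLM-suggested tags to an item, keeping existing tags."""
    raw = call_llm(build_prompt(item))
    try:
        suggested = json.loads(raw)
    except json.JSONDecodeError:
        return item  # unparseable reply: leave the item unchanged
    if not isinstance(suggested, list):
        return item
    # Drop anything outside the taxonomy so invented tags never reach the catalog.
    valid = {t for t in suggested if isinstance(t, str) and t in ALLOWED_TAGS}
    return CatalogItem(item.item_id, item.description, item.tags | valid)


def fake_llm(prompt: str) -> str:
    """Stand-in for a real model call; returns canned replies for the demo."""
    if "rain boots" in prompt:
        return '["footwear", "waterproof", "sparkly"]'
    return "not json"


if __name__ == "__main__":
    boots = CatalogItem("sku-1", "Yellow rain boots for toddlers", {"kids"})
    print(sorted(enrich_item(boots, fake_llm).tags))
```

In practice `call_llm` would wrap whichever model client you use, and the enriched tags would feed downstream models as ordinary features.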

Even among those who don't think AGI is coming soon, there's a lot of belief in what LLMs (AI) can do. Much of the investment at companies is going toward figuring out how LLMs can be applied directly to various problems, but my experience at Attentive has shown me that there might be an alternative opportunity. We've invested quite a bit in growing our ML team because we see how traditional ML can work alongside LLMs to enhance our product.

As Ron Miller argued in the article "Good Old-Fashioned AI Remains Viable in Spite of the Rise of LLMs", I think that the majority of workloads will still be handled by traditional ML rather than LLMs (at least for the time being). LLMs will help juice ML performance by accelerating the labeling, cleaning, and enrichment of data, so I think every company should be figuring out how to ramp up its ML efforts alongside its AI investments.

Siddharth Ramakrishnan
