Prompt-based development is making the machine learning development cycle much faster: Projects that used to take months may now take days. I wrote in an earlier letter that this rapid development is causing developers to do away with test sets.
The speed of prompt-based development is also changing the process of scoping projects. In lieu of careful planning, it’s increasingly viable to throw a lot of projects at the wall to see what sticks, because each throw is inexpensive.
Specifically, if building a system took 6 months, it would make sense for product managers and business teams to plan the process carefully and proceed only if the investment seemed worthwhile. But if building something takes only 1 day, then it makes sense to just build it and see if it succeeds, and discard it if it doesn’t. The low cost of trying an idea also means teams can try out a lot more ideas in parallel.
Say you’re in charge of building a natural language processing system to process inbound customer-service emails, and a teammate wants to track customer sentiment over time. Before the era of large pre-trained text transformers, this project might involve labeling thousands of examples, training and iterating on a model for weeks, and then setting up a custom inference server to make predictions. Given the effort involved, before you started building, you might also want to increase confidence in the investment by having a product manager spend a few days designing the sentiment display dashboard and verifying whether users found it valuable.
But if a proof of concept for this project can be built in a day by prompting a large language model, then, rather than spending days or weeks planning the project, it makes more sense to just build it. Then you can quickly test technical feasibility (by seeing if your system generates accurate labels) and business feasibility (by seeing if the output is valuable to users). If it turns out to be either technically too challenging or unhelpful to users, the feedback can help you improve the concept or discard it.
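To make this concrete, here is a minimal sketch of what such a one-day proof of concept might look like. It assumes a hypothetical `complete` function that sends a prompt to whichever large language model you use and returns its text reply; in the sketch it is replaced by a crude keyword stub so the code runs offline. The prompt wording, the label set, and the sample emails are all invented for illustration.

```python
from typing import Callable, List, Tuple

LABELS = ("positive", "negative", "neutral")

# Hypothetical prompt wording; in practice you would iterate on this.
PROMPT = (
    "Classify the sentiment of this customer-service email as "
    "positive, negative, or neutral. Reply with one word.\n\n"
    "Email: {email}\nSentiment:"
)


def classify(email: str, complete: Callable[[str], str]) -> str:
    """Ask the model for a label and normalize its reply."""
    reply = complete(PROMPT.format(email=email)).strip().lower()
    for label in LABELS:
        if reply.startswith(label):
            return label
    return "neutral"  # fall back when the reply is not a known label


def accuracy(examples: List[Tuple[str, str]], complete: Callable[[str], str]) -> float:
    """Fraction of hand-labeled emails the system labels correctly."""
    correct = sum(classify(email, complete) == label for email, label in examples)
    return correct / len(examples)


def stub_complete(prompt: str) -> str:
    """Stand-in for a real LLM call (an assumption, so the sketch runs offline)."""
    text = prompt.split("Email:", 1)[1].lower()
    if "thank" in text or "great" in text:
        return "Positive"
    if "refund" in text or "broken" in text:
        return "Negative"
    return "Neutral"


# A handful of hand-labeled emails is enough for a first feasibility check.
EXAMPLES = [
    ("Thank you for the quick fix, great support!", "positive"),
    ("My order arrived broken and I want a refund.", "negative"),
    ("What are your opening hours on Sunday?", "neutral"),
    ("The update is great, but the app is broken now.", "negative"),
]

if __name__ == "__main__":
    print(f"accuracy: {accuracy(EXAMPLES, stub_complete):.2f}")
```

Swapping `stub_complete` for a real model call is the only change needed to check technical feasibility on your own emails. Business feasibility still has to be judged by showing the output to users.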
I find this workflow exciting because, in addition to increasing the speed of iteration for individual projects, it significantly increases the volume of ideas we can try. In addition to plotting the sentiment of customer emails, why not experiment with automatically routing emails to the right department, providing a brief summary of each email to managers, clustering emails to help spot trends, and many more creative ideas? Instead of planning and executing one machine learning feature, it’s increasingly possible to build many, quickly check if they look good, ship them to users if so, and get rapid feedback to drive the next step of decision making.
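One way to picture this: each new idea can be little more than a new prompt run through the same small harness. The sketch below is illustrative only. The `complete` function is a hypothetical stand-in for a call to your language model, the task names and prompt texts are made up, and the stub simply echoes which prompt it received so the code runs without a model.

```python
from typing import Callable, Dict

# Each candidate feature is just a prompt template (all wording is hypothetical).
TASKS: Dict[str, str] = {
    "sentiment": (
        "Is this email positive, negative, or neutral? Reply with one word.\n\n"
        "Email: {email}"
    ),
    "route": (
        "Which department should handle this email: billing, shipping, or "
        "technical? Reply with one word.\n\nEmail: {email}"
    ),
    "summary": (
        "Summarize this email in one sentence for a manager.\n\n"
        "Email: {email}"
    ),
}


def run_all(email: str, complete: Callable[[str], str]) -> Dict[str, str]:
    """Run every candidate feature on one email and collect the replies."""
    return {
        name: complete(template.format(email=email)).strip()
        for name, template in TASKS.items()
    }


def echo_stub(prompt: str) -> str:
    """Stand-in for a real LLM call: echoes the instruction line it was given."""
    return f"[reply to: {prompt.splitlines()[0]}]"


if __name__ == "__main__":
    for name, reply in run_all("Where is my package?", echo_stub).items():
        print(f"{name}: {reply}")
```

Trying another idea then means adding one entry to `TASKS`, which is part of why the cost of each experiment is so low.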
One important caveat: As I mentioned in the letter about eliminating test sets, we shouldn’t let the speed of iteration lead us to forgo responsible AI. It’s fantastic that we can ship quick-and-dirty applications. But if there is risk of nontrivial harm such as bias, unfairness, privacy violation, or malevolent uses that outweigh beneficial uses, we have a responsibility to evaluate our systems’ performance carefully and ensure that they’re safe before we deploy them widely.
What ideas do you have for prompt-based applications? If you brainstorm a few different ways such applications could be useful to you or your company, I hope you’ll implement many of them (safely and responsibly) and see if some can add value!
P.S. Today we announced a new short course, LangChain: Chat with Your Data, built in collaboration with Harrison Chase, creator of the open-source LangChain framework. In this course, you’ll learn how to build one of the most-requested LLM-based applications: answering questions based on information in a document or collection of documents. This one-hour course teaches you how to do that using retrieval-augmented generation (RAG). It also covers how to use vector stores and embeddings to retrieve document chunks relevant to a query.