This is the third post in a series of five on AI. In my last post, I gave examples of tasks I’d outsource to AI. How do you outsource them? Through prompt writing – a skill some call prompt engineering. Because large language models (LLMs) like ChatGPT, Claude, and Gemini are based on conversational prompting, it’s easy for anyone to use them. You don’t need to learn a coding language like Python or HTML or a software interface like Excel or Photoshop. You just tell the AI what you want.
Generative AI can produce remarkable results.
In an experiment, researchers found that consultants at Boston Consulting Group produced 40% higher quality work using GPT-4 (via Microsoft Bing) without specialized prompt training and without training the AI on any proprietary data. What mattered was the consultants’ expertise: knowing what to ask and how to evaluate the results.
AI expert Ethan Mollick describes working with large frontier LLMs as like working with a smart intern. Sometimes they’re brilliant. Sometimes they don’t know what they don’t know. AI will even make things up to give you an answer. Mollick and other researchers call this the jagged frontier of AI. In some tasks, AI output is as good as or better than a human’s. In others, it can be worse or wrong.
Their research with Boston Consulting Group found AI can be good at some easy or difficult tasks while being worse at other easy or difficult tasks. Neither the difficulty level nor the type of task is a predictor. One professor’s research found ChatGPT got difficult multiple-choice questions right but got easy questions wrong. Testing and learning based on expert knowledge is the way to know. How do you explore this jagged AI frontier while improving results? With a prompt framework like the one I created below.
First, have a clear understanding of what you want.
Begin with the task and goal. Are you summarizing to learn about a topic for a meeting, generating text or an image for content, looking for suggestions to improve your writing, performing a calculation to save time, or creating something to be published? Defining the task and objective sets the stage for a successful prompt and output.
Second, give AI a perspective or identity as a persona.
LLMs are trained on vast amounts of broad data, which makes them so powerful. This can also produce output that’s too generic or simply not what you want. It helps to give AI a perspective or identity like a persona. Personas are used in marketing to describe a target audience. Persona is also the character an author assumes in a written work.
Third, describe to AI the audience for your output.
Are you writing an email to your boss, creating copy for a social media post, preparing for a talk, or is the output just for you? You know how to adjust what you create based on what’s appropriate for an audience. AI can do a remarkable job at this if you give it the right direction.
Fourth, describe the specific task you want it to complete.
Err on the side of more detail than less. Consider things you know in your mind that you would use in completing the task. It’s like giving the smart intern directions. They’re smart but don’t have the experience and knowledge you do. More complicated tasks can require multiple steps. That’s fine, just tell AI what to do first, second, third, etc.
Fifth, add any additional data it may need.
Some tasks require data such as a spreadsheet of numbers you want to analyze, a document you want summarized, or a specific stat, fact, or measurement. But before uploading proprietary data into an LLM, see my post on legal and ethical AI use. Recent research, Systematic Survey of Prompting Techniques, also suggests adding positive and negative examples – “like this, not like that.”
Sixth, evaluate output based on expectations and expertise.
Sometimes you get back what you want and other times you don’t. Then you need to clarify, ask again, or provide more details and data. Go back to earlier steps and tweak the prompt. Other times you get back something wrong or made up. If clarifying doesn’t work, you may have discovered a task AI is not good at. And sometimes you just want a rough start that you’ll modify, keeping copyright in mind for legal and ethical AI use.
A prompt experiment with and without the framework.
I’ve been testing the framework and it has improved results. In one test, I used GPT-4 via Copilot to see if it could recommend influencers for a specific brand – Saucony running shoes. First, I skipped the framework and asked a simple question.
- “Recommend influencers for 34-55-year-old males who like to run marathons.”
It recommended Cristiano Ronaldo, Leo Messi, and Stanley Tucci. Hopefully, you understand why these are not a good fit. I ran the same prompt again and it recommended Usain Bolt. Bolt is a runner, but he is known for track sprinting, not marathons.
I tried to be more direct, changing the prompt to “34-55-year-old males who run marathons.” For some reason, dropping the “like” started giving me older bodybuilders. I wouldn’t describe marathon runners as “shredded” the way one influencer described himself.
I tried again with “34-54-year-old males known for their involvement in marathons.” This gave me a random list of people including Alex Moe (@themacrobarista) a Starbucks barista. As far as I can tell Moe doesn’t run marathons and his Instagram feed is full of swirling creamer pours.
Finally, I tried the prompt framework.
- “You are a social media manager for Saucony running shoes. (Persona) Your target audience is 34-55-year-old males who run marathons. (Audience) Which influencers would you recommend for Saucony to appeal to and engage this target audience? (Task)”
This prompt gave me better results, including Dorothy Beal (@mileposts), who has run 46 marathons and created the I RUN THIS BODY movement. Her Instagram feed is full of images of running. Copilot still recommended Usain Bolt following the framework, but the other four recommendations were much better than a soccer star, bodybuilder, or barista.
I tried to add data to the prompt with “Limit your suggestions to macro-influencers who have between 100,000 to 1 million followers.” (Data) The response didn’t give suggestions, saying “as an AI, I don’t have access to social media platforms or databases that would allow me to provide a list of specific influencers who meet your criteria.” That’s okay because the more precise prompt gave me more relevant macro-influencers anyway.
As an alternative, I added positive and negative examples. I tried again, adding to the prompt “Don’t provide influencers like Cristiano Ronaldo or Usain Bolt, but more like Dorothy Beal or Dean Karnazes.” (Data) This time I received a list of eight influencers, all of whom would have potential for this brand and audience.
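If you ever want to reuse the framework rather than retype it, the labeled parts of the prompt can be joined by a few lines of Python. This is only a minimal sketch: `build_prompt` is a hypothetical helper name I made up for illustration, and it just assembles the text. It doesn't send anything to Copilot or any other AI tool.

```python
def build_prompt(persona, audience, task, data=None):
    """Join the framework's parts into one prompt string (hypothetical helper)."""
    parts = [
        f"You are {persona}.",                   # persona
        f"Your target audience is {audience}.",  # audience
        task,                                    # task
    ]
    parts.extend(data or [])                     # optional data or examples
    return " ".join(parts)


prompt = build_prompt(
    persona="a social media manager for Saucony running shoes",
    audience="34-55-year-old males who run marathons",
    task=("Which influencers would you recommend for Saucony "
          "to appeal to and engage this target audience?"),
    data=["Don't provide influencers like Cristiano Ronaldo or Usain Bolt, "
          "but more like Dorothy Beal or Dean Karnazes."],
)
print(prompt)
```

Keeping each part separate makes it easy to change one piece, such as the audience or the examples, and rerun the same test.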
You don’t need to be a prompt engineer to explore.
Experts in various fields are finding frameworks that work best for their needs. Christopher S. Penn suggests the prompt framework PARE (prime, augment, refresh, evaluate). Prompt writing can also be more advanced to maximize efficiency. Prompt engineers are working on creating prompt libraries of common tasks.
But most people’s jobs will not switch to prompt engineer. We need discipline experts to test the best uses of AI in their specific roles. Over time you’ll develop knowledge of how to prompt AI for your profession and which LLMs are better at each task. Penn suggests creating your own prompt library. You’ll gain marketable skills as you explore the jagged frontier of AI for tasks unique to your industry.
LLM developers are already introducing AI tools to improve prompts. Anthropic Console takes your goal and generates the Claude prompt for you. Microsoft is adding Copilot AI features to improve prompts as you write, promising to turn anyone into a prompt engineer. And Apple Intelligence is coming, running efficient, more specific, task-focused AI agents integrated into Apple apps.
In the article The Rise and Fall of Prompt Engineering, tech writer Nahla Davies says, “Even the best prompt engineers aren’t really ‘engineers.’ But at the end of the day, they’re just that–single tasks that, in most cases, rely on previous expertise in a niche.” The Survey of Prompting Techniques also finds prompt engineering must engage with domain experts who know in what ways they want the computer to behave and why.
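A personal prompt library doesn't need special software. As a rough sketch, it could be a small Python dictionary of reusable templates built with the standard library's `string.Template`. The library name, the task name, and the placeholders here are my own assumptions for illustration, not something Penn prescribes.

```python
from string import Template

# Hypothetical library: task name -> reusable template with $placeholders.
PROMPT_LIBRARY = {
    "influencer_recs": Template(
        "You are a social media manager for $brand. "
        "Your target audience is $audience. "
        "Which influencers would you recommend for $brand "
        "to appeal to and engage this target audience?"
    ),
}


def fill_prompt(name, **values):
    """Look up a saved template and fill in its placeholders."""
    # substitute() raises KeyError if a placeholder is left unfilled.
    return PROMPT_LIBRARY[name].substitute(**values)


print(fill_prompt(
    "influencer_recs",
    brand="Saucony running shoes",
    audience="34-55-year-old males who run marathons",
))
```

Each time a prompt works well for a task in your field, you could save it as another entry and swap in new details later.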
Thus, we don’t need everyone to be prompt engineers. We need discipline experts who have AI skills. In my next post, I’ll explore the challenges of teaching students to be discipline experts with AI.
This Was Human Created Content!