Researchers have studied how different prompting methods can be used to generate ideas with AI, and which methods produce the greatest diversity of ideas.
The working paper by Lennart Meincke, Ethan Mollick and Christian Terwiesch from the Wharton School of the University of Pennsylvania focuses on idea generation with GPT-4.
The team investigated how different prompting methods can influence the diversity of ideas generated. Specifically, the goal was to develop new products for students that cost less than $50.
The researchers tested various prompting methods, including minimal prompts, prompts that have the AI model adopt different personas, and prompts that have it apply creativity techniques from the existing literature.
The diversity of ideas was measured using cosine similarity, a metric of how semantically similar two ideas are, without comparing them to pre-existing ideas. The researchers also measured the number of unique ideas and the rate at which the idea space was exhausted.
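To make the metric concrete, here is a minimal sketch of how diversity might be scored with cosine similarity, assuming each idea has already been converted to an embedding vector (the study's exact embedding model and aggregation procedure are not specified here; the toy vectors are invented for illustration):

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two embedding vectors; 1.0 means identical direction."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def mean_pairwise_similarity(embeddings: np.ndarray) -> float:
    """Average cosine similarity over all distinct pairs of idea embeddings.
    Lower values indicate a more diverse pool of ideas."""
    n = len(embeddings)
    sims = [cosine_similarity(embeddings[i], embeddings[j])
            for i in range(n) for j in range(i + 1, n)]
    return float(np.mean(sims))

# Toy 2-D "embeddings": two near-duplicate ideas and one unrelated idea.
ideas = np.array([[1.0, 0.0],
                  [0.99, 0.1],
                  [0.0, 1.0]])
print(round(mean_pairwise_similarity(ideas), 3))
```

A pool of near-duplicate ideas would push the average toward 1.0, while a varied pool pulls it toward 0, which is why a lower mean pairwise similarity is read as higher diversity.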
The team found that different prompting methods had markedly different effects on the diversity of the generated ideas. Chain-of-thought (CoT) prompting, a long-established method, came out on top by a wide margin and nearly reached the diversity level of a group of students who served as a benchmark.
This method also generated the most unique ideas, suggesting that CoT prompting can open up the idea space more effectively and yield a greater variety of possible solutions.
Getting to better AI ideas – step by step
CoT prompting asks the AI model to solve a task in multiple steps. You don’t have to specify these steps; just asking it to proceed step by step can improve the outcome.
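In its minimal form, this can be as simple as appending a step-by-step instruction to the task. A small illustrative helper (the phrasing is generic, not the prompt used in the study):

```python
def with_cot(task: str) -> str:
    """Append a generic step-by-step instruction to a task prompt.
    The wording is illustrative, not the study's prompt."""
    return f"{task}\n\nThink through this step by step before giving your final answer."

print(with_cot("Generate a new product idea for college students that costs under $50."))
```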
It is not entirely clear why this works; the working hypothesis is that the prompt steers the model toward the more analytical, higher-quality portions of its training data.
In their experiment, the researchers specified a task for each step: first, GPT-4 was to generate 100 ideas, from which it was then to filter out the strongest and most diverse ones.
In the final step, GPT-4 was told to give the result a name and a product description. The researchers excluded about 15 percent of the runs from the statistics because the model had not completed the second step correctly.
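The three-stage pipeline described above can be sketched as a sequence of chat turns. The step wording is paraphrased from the article, not the study's exact prompts, and `call_llm` is a hypothetical stand-in for any chat-completion client:

```python
# Paraphrased three-stage CoT pipeline; `call_llm` is a placeholder for a real
# chat-completion client (e.g., an OpenAI API wrapper).
STEPS = [
    "Step 1: Generate 100 ideas for new products for college students that cost $50 or less.",
    "Step 2: From those 100 ideas, select the strongest and most diverse ones.",
    "Step 3: For each selected idea, give it a name and write a short product description.",
]

def run_pipeline(call_llm):
    """Feed the steps to the model one at a time, keeping the conversation
    history so each step builds on the previous output."""
    messages = []
    outputs = []
    for step in STEPS:
        messages.append({"role": "user", "content": step})
        reply = call_llm(messages)
        messages.append({"role": "assistant", "content": reply})
        outputs.append(reply)
    return outputs

# Usage with a dummy model that just echoes the start of the last message:
echo = lambda msgs: f"(response to: {msgs[-1]['content'][:7]})"
print(run_pipeline(echo))
```

Keeping the full history in `messages` is what lets step 2 filter the ideas produced in step 1; the article notes that runs where the model failed the second step were excluded, which a real implementation would need to detect and discard.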
Image: Meincke et al.

Overall, the researchers conclude that AI can be a useful tool for improving the ideation process.
However, they emphasize that choosing the right prompting method is critical to maximizing the diversity of the generated ideas. Because the overlap between the ideas produced by different prompts is relatively small, the team says, hybrid prompting is possible: generating smaller pools of ideas with several different prompting methods and combining them.
Ethan Mollick, a co-author of the study, has published a GPT for idea generation that follows the step-by-step principle, though it does not use the exact prompt from the study.
Another recent study showed that the length of reasoning steps in CoT prompts is directly related to the performance of language models in complex problem-solving tasks. This was true even when the longer prompt did not contain significant new information.