Article by ICG member Dan Young, Shed Research
In 2025, ChatGPT, Perplexity and Gemini all released their deep research tools. They take a brief, clarify it with you, run the desk research, and produce a fully referenced report in less than 30 minutes.
This sounds impressive. But how good is the content? How good is the sourcing? And does it handle insight as well as a human with over 20 years of research experience? (That's me, by the way.)
I compared the output of these three tools with my own piece of ‘manual’ insight synthesis. I looked at where these tools shine, where they’re lacking, and what this means for the future of AI in the insight sector.
I’m no AI expert. I’m a specialist in insight synthesis. My “Deep Research Challenge” focused on the quality of the output rather than what sits behind the models, the ethical issues around IP, or any issues of data security. (There’s an internet full of information on all of this if you want to find it.)
Where did the deep research tools shine?
- Multiplying productivity – they processed large numbers of sources (up to 50) into a simple, readable report (up to 30 pages) in less than half an hour. Even the fastest reader can’t come close to this level of productivity.
- Expanding sources – the tools (especially Perplexity) drew from a larger number of sources than I used in my own synthesis. (They also missed some key sources, but more on this later.)
- Researching the right topic – they performed well where they could tap into a vast amount of published research available online. (This won't be the case for all topics.)
- Dealing with (some) nuance – the tools were able to handle some nuanced arguments in the insight, such as the say-do gap around ethical investments
- Showing their working – where the tools showed their thinking and asked clarifying questions (ChatGPT and Perplexity), or asked me to check their plans (Gemini), they stuck to the brief and built my trust in their conclusions
Where did they fall short?
- Over-reliance on ‘official’ sources – all the tools relied heavily on governments, regulators and the media for their content. This might be caution on the part of the tools, wary of quoting an untrustworthy source. My own synthesis drew from a broader range of sources, including more published corporate and academic research
- Uploading sources – even the paid versions of these tools had restrictions on file type, size and quantity. This makes it difficult to use different types of insight (e.g. video or dynamic websites) or your own sources. The tools instead seem designed to scour the internet
- Tending towards the mean – all the tools gave a good aggregate view of the topic. But they often presented it in passive language, heavily couched, with many caveats. None of the tools seemed able to make decisive statements or write for impact
- Not sweating the small stuff – they often missed niche studies and tangential insights. They struggled to provide “killer quotes”, even with direct prompting. These smaller pieces of insight are gold in research. These are the things the audience remembers most afterwards
- Inconsistent content – all the tools missed something: an important piece of insight (Gemini), a key prompt (Perplexity) or sources for some of their statements (ChatGPT). They also tended to overly favour digital solutions, e.g. ChatGPT’s first two quoted studies were both about AI. This raises concerns about the robustness and balance of their findings
- Poor UX – some cut and pastes were patchy, the number of different model names is bewildering, and Perplexity in particular presented findings in blocks of text in an overly academic tone. Gemini had the best UX overall
- Missing vital researcher skills – the tools were unable to use critical thinking, think creatively about the topic, spot gaps in their knowledge, and, most crucially, give a convincing argument for why their insight was important – the “so what?” These are all core researcher skills, and they needed overlaying on all of the outputs.
What does this mean for the future of market research and insight?
You might look at a short list of positives, and a longer list of negatives, and assume I’m not a fan. That’s not true.
This experiment clearly showed me what these deep research AI tools are great at – multiplying productivity in that difficult first stage of desk research, for the right topic, and when carefully prompted.
But to get the best out of them, we need to acknowledge their limitations. We need to be aware when they misinterpret, miss a key piece of insight or don’t address the research objectives. We still need to use our core research skills – verify sources, check methods, add tangential insight, use metaphors, tell stories, visualise, and ask so what?
As Chase Ballard (AI commentator) put it:
When investing, understand that you’re getting a productivity multiplier, not a replacement. Used strategically, these tools can dramatically accelerate initial research phases. But set your expectations accordingly – there’s still work for you here.
If we’re able to use them with our eyes open, and with caution, I can see a very bright future for these tools in the researcher toolkit. The future of research and insight will likely involve a collaborative approach. One where we’ll leverage the strengths of both AI and human expertise to deliver richer, more insightful, and ultimately more impactful results.
You can watch the full presentation from the Insight Platforms summit by registering here