Keeping up with Joe Rogan
TL;DR: We’re building an AI tool to better understand the Joe Rogan Experience, the world’s biggest podcast. As the U.S. election showed, Rogan is a powerful force in the public discourse—but he and other popular podcasters, streamers, and YouTube personalities are in a huge media blind spot. We urgently need better tools to follow how ideas spread. As we expand this project, we’re looking for collaborators and support.
The way Donald Trump just pulled off re-election should make every journalist, media analyst, and political strategist stop to think, hard. It was a striking demonstration of the way public conversations are shifting from traditional media to sprawling, unpredictable new spaces where they’re often overlooked.
Inspired by his son Barron’s media habits, Trump largely bypassed traditional news outlets in favor of friendly podcasts and YouTube interviews, sitting for hours-long conversations with hosts outside the mainstream who are wildly popular with younger audiences—especially young men.
These interviews gave Trump unfiltered access to millions of listeners, with some rivaling the reach of presidential debates: The Trump episode of The Joe Rogan Experience has over 50 million views on YouTube, and that’s not counting Rogan’s millions of followers on Spotify.
The president-elect’s chat with Rogan picked up plenty of mainstream coverage, but Trump is not a typical guest. What about when Rogan interviews lesser-known tech executives, stand-up comedians, or musicians? Then, for the most part, the show gets ignored.
But these conversations can spawn narratives and boost conspiracies that reverberate through other platforms and become widely influential—we think they’re too important to tune out. So we built Roganbot.
It uses AI to analyze episodes of The Joe Rogan Experience, transforming sprawling conversations into data. With each new episode, our system downloads the audio, transcribes it into searchable text, and identifies key moments and emerging ideas.
When we spot something significant, such as a claim about election fraud or vaccine side effects, we can see when and how it developed within the conversation. Important quotes are transcribed and linked to the exact moment in the video where they appear, so we can see them in context.
Our blind spot
News outlets routinely dissect Saturday Night Live sketches, but the most influential podcast on the planet gets barely a second glance. And it’s not just Rogan: He’s the heavyweight in a loose network of commentators and media creators that’s exerting an ever-growing influence on U.S. politics and society.
Maybe the media overlooks these conversations because they seem too wacky or even outright conspiratorial to merit coverage. On Rogan, a chat about whales can morph into a segment on drugs, which leads to a story about nuclear weapons in North Korea. Rogan and his ilk don’t hold themselves up as journalistic interviewers—if a guest says something crazy, they’re rarely going to fact-check it. In fact, that’s one of the qualities that makes these conversations attractive to their audience.
But wild as many of these conversations are, they matter. Ideas emerge on one podcast, then evolve and spread on other YouTube channels or livestreams. A claim made on Rogan might bounce across shows like Theo Von’s This Past Weekend or Andrew Schulz’s Flagrant until it shapes how millions of people understand an issue.
If we keep ignoring this media universe, we’re missing something big.
🤓 Nerd Mode: How Roganbot works
Roganbot is powered by a chained language model workflow designed to handle long-form podcasts. We coded it in collaboration with Anthropic’s Claude 3.5 Sonnet language model, using the tool Cursor. Here’s a detailed look at how it all comes together:
Episode Retrieval
When we fetch an episode, we download the audio along with metadata such as the title, guest names, and runtime, so that everything later steps need is in one place.
Transcription with Speaker Diarization
Transcription is handled by AssemblyAI, which converts audio into detailed text with timestamps. But speaker diarization (identifying who said what) has been particularly tricky. To tackle this, we’ve added a language-model-driven step for speaker attribution, using Claude 3.5 Sonnet to label speakers more accurately, even in messy, multi-person conversations.
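To make the speaker-attribution idea concrete, here is a minimal Python sketch, not Roganbot's actual code. It assumes the transcription step yields utterances with a generic speaker label (such as "A" or "B"), a start time in milliseconds, and the spoken text, and that a separate step (in the pipeline described above, a language model) has produced a mapping from labels to real names. The `Utterance` class and both function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Utterance:
    speaker: str   # generic diarization label, e.g. "A" (assumed shape)
    start_ms: int  # start time in milliseconds (assumed unit)
    text: str


def format_timestamp(ms: int) -> str:
    """Render milliseconds as HH:MM:SS."""
    total_seconds = ms // 1000
    hours, remainder = divmod(total_seconds, 3600)
    minutes, seconds = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}"


def label_transcript(utterances, names):
    """Replace generic labels with names; keep a fallback for unknown speakers."""
    lines = []
    for u in utterances:
        name = names.get(u.speaker, f"Speaker {u.speaker}")
        lines.append(f"[{format_timestamp(u.start_ms)}] {name}: {u.text}")
    return lines


utterances = [
    Utterance("A", 3_723_000, "Welcome back."),
    Utterance("B", 3_726_500, "Thanks for having me."),
]
labeled = label_transcript(utterances, {"A": "Joe Rogan"})
```

Keeping the fallback label means an unresolved speaker stays visible in the transcript instead of being silently misattributed.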
High-Level Summary
Once the transcript is finalized, it’s passed to GPT-4o, which generates a concise, high-level summary of the episode. This summary captures the main narrative arcs and significant moments, providing a blueprint for the more detailed analysis to follow.
Step-by-Step Analysis
The transcript is divided into manageable chunks to stay within the model’s context window. This chunking process not only makes analysis more efficient but also improves accuracy by keeping the language model focused on smaller, relevant pieces of the conversation at a time.
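One simple way to do this kind of chunking is sketched below. It is an illustration under stated assumptions, not the project's implementation: it uses a word count as a rough stand-in for model tokens, it never splits a transcript line across chunks, and the `chunk_lines` name and `max_words` default are hypothetical.

```python
def chunk_lines(lines, max_words=1500):
    """Group transcript lines into chunks of at most max_words words.

    A line is never split; a single line longer than max_words
    ends up in a chunk of its own.
    """
    chunks = []
    current = []
    count = 0
    for line in lines:
        n = len(line.split())
        # Start a new chunk if adding this line would exceed the budget.
        if current and count + n > max_words:
            chunks.append(current)
            current, count = [], 0
        current.append(line)
        count += n
    if current:
        chunks.append(current)
    return chunks


example = chunk_lines(["a b c", "d e", "f g h i", "j"], max_words=5)
```

Splitting on line boundaries keeps each speaker turn intact, which matters when quotes later need to be attributed and timestamped.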
Each chunk undergoes detailed processing in the following areas:
- Key Topics: Identifying recurring themes or subjects that shape the conversation, from cultural debates to niche technical topics.
- Notable Quotes: Extracting word-for-word excerpts with timestamps, highlighting moments of insight, humor, or controversy.
- Emerging Controversies: Spotting statements that challenge norms, introduce polarizing views, or spark debate.
- Fact-Checkable Claims: Flagging numerical data, historical assertions, or scientific claims for potential verification.
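The four areas above map naturally onto a small structured record per chunk. The sketch below assumes, purely for illustration, that the model is asked to reply with a JSON object whose keys match the four areas; the field names, `ChunkAnalysis`, `parse_chunk_reply`, and `merge` are all hypothetical.

```python
import json
from dataclasses import dataclass, field

FIELDS = ("key_topics", "notable_quotes", "controversies", "claims")


@dataclass
class ChunkAnalysis:
    key_topics: list = field(default_factory=list)
    notable_quotes: list = field(default_factory=list)
    controversies: list = field(default_factory=list)
    claims: list = field(default_factory=list)


def parse_chunk_reply(reply: str) -> ChunkAnalysis:
    """Parse one model reply (assumed to be a JSON object) into a record."""
    data = json.loads(reply)
    return ChunkAnalysis(**{name: list(data.get(name, [])) for name in FIELDS})


def merge(analyses) -> ChunkAnalysis:
    """Combine per-chunk records, dropping exact duplicates, keeping order."""
    merged = ChunkAnalysis()
    for analysis in analyses:
        for name in FIELDS:
            target = getattr(merged, name)
            for item in getattr(analysis, name):
                if item not in target:
                    target.append(item)
    return merged


replies = [
    '{"key_topics": ["whales", "drugs"], "claims": ["claim one"]}',
    '{"key_topics": ["drugs", "North Korea"], "notable_quotes": [{"t": 95, "text": "a quote"}]}',
]
episode = merge(parse_chunk_reply(r) for r in replies)
```

Missing keys default to empty lists, so a chunk with no fact-checkable claims does not break the merge.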
Structured Outputs and Delivery
All outputs are compiled into a detailed, interactive report. This includes:
- Clickable timestamps for diving directly into the most critical moments.
- Speaker-labeled segments to make multi-person conversations easy to follow.
- A comprehensive summary of key insights and takeaways.
The report is then sent to a Verso Slack channel, notifying us when there’s a new episode.
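As a rough sketch of the delivery step, the code below builds timestamped YouTube links and a Slack message body. It assumes the report is posted as plain message text using Slack's `<url|label>` link syntax; the video ID, the quote fields, and the function names are hypothetical, and the code only builds the JSON payload without sending it.

```python
import json


def youtube_link(video_id: str, seconds: int) -> str:
    """Link that opens the video at the given offset."""
    return f"https://www.youtube.com/watch?v={video_id}&t={seconds}s"


def slack_payload(title: str, video_id: str, quotes) -> str:
    """Build a JSON message body with one clickable timestamp per quote."""
    lines = [f"*New episode analyzed:* {title}"]
    for q in quotes:
        url = youtube_link(video_id, q["seconds"])
        stamp = f'{q["seconds"] // 60}:{q["seconds"] % 60:02d}'
        lines.append(f'- <{url}|{stamp}> {q["speaker"]}: "{q["text"]}"')
    return json.dumps({"text": "\n".join(lines)})


body = slack_payload(
    "Example episode",
    "abc123",  # placeholder video ID
    [{"seconds": 95, "speaker": "Guest", "text": "An example quote."}],
)
```

Posting `body` to a Slack incoming webhook URL with a JSON content type would deliver the message; that call is left out here so the example runs without network access.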
Scaling up
We think Roganbot could be the first step toward a new way of understanding how political ideas form and spread in the decentralized world of new media. Extending this tool could help journalists, researchers, and advocacy groups understand what people who have largely checked out of the mainstream are hearing.
The ultimate goal isn’t just to follow Rogan, who among his peers is actually pretty well known, but to keep up with the enormous new media world he’s part of. A full-fledged system that follows dozens or even hundreds of podcasters, YouTubers, and streamers would give us awareness of a whole universe of information we’re currently half-blind to. We could:
- Analyze how conspiracy theories and viral claims gain traction
- Decode the narratives that resonate, why they matter, and who they influence
- Spot emerging controversies and cultural shifts before they hit mainstream awareness
This isn’t just for journalists. Researchers, nonprofits, PR professionals, and anyone else who needs to understand how ideas move through media could benefit from a tool like this.
Visibility tools
This is just one example in a category of automated research we’re calling visibility tools. It’s one of the areas we’re most excited about in AI-assisted journalism. The same framework we’re applying here to Rogan and friends can also power systems for local reporters to keep tabs on their city governments and police forces, or help science and technology journalists, professionals, and enthusiasts stay abreast of the latest academic research in their field.
Visibility tools are about reclaiming our ability to see what’s happening in the spaces where people actually get their information. Without that, we can’t have an informed democracy.
We’re writing publicly about our project because this is a shared challenge. We’re just two people here at Verso. Between client work and independent projects, we have limited time and resources to fully build this out.
If you share our concern for the evolving media ecosystem and believe in this project, we’d love to hear from you. We are currently looking for collaborators as well as financial support for the project. Get in touch: [email protected]