Agentic AI for Automated UGC Discovery
Before a creative project such as an ad campaign kicks off, the team writes a “creative brief,” which serves as the functional specification for the project and acts as its single source of truth. The brief describes the strategic requirements that the final output must meet, such as the objectives, target audience (user profile), key message (value proposition), and tone (design system). Rather than detailing how the creative team should execute the project, it focuses on the essential inputs and desired outcomes. Creative briefs are often ambiguous, multidimensional documents that emphasize emotion and style over verbose descriptions.
The pain point for automation lies in the translation gap. These briefs prioritize ambiguity and human interpretation through subjective, qualitative language ("evoke a sense of nostalgia," "feel authentic") over explicit, machine-readable instructions. The desired output is often an emotional effect, which is difficult for a system to map directly to structured media tags or database queries. Consequently, even a straightforward requirement can introduce algorithmic complexity, necessitating intricate constraint satisfaction across visual, thematic, and emotional dimensions.
Translating these nuanced briefs into actionable search criteria for user-generated content (UGC) has historically required human curators to break down requirements, define themes, and manually query large media libraries. Content Genie introduces an agentic AI workflow that automates this process end-to-end. By combining large language model (LLM) reasoning, semantic search, and media analysis, Content Genie enables precise retrieval of on-brief UGC images and videos.
The Road to Content Genie
Content Genie isn't just a technical experiment; it is the automation of the most sophisticated content-sourcing expertise in the industry. Catch+Release is fundamentally a licensing company, and for years, we have been the licensing experts, helping the world's top brands find and license the perfect UGC to meet their most demanding creative needs.
Our solution was built hand-in-hand with our best-in-class curation team, the very people who spend their days translating ambiguous creative concepts into actionable content strategies. Every technical decision, every agentic step, and every scoring mechanism within Content Genie was informed by this expert curator experience. This deep, operational knowledge is what makes Catch+Release uniquely suited to build a tool that bridges the gap between creative intent and technical execution. We understand the language of the brief because we wrote the playbook for fulfilling it.
This robust, intelligent platform is underpinned by Temporal, a critical component of our infrastructure that manages the workflow and state of all our services. By leveraging Temporal, we ensure that Content Genie's complex, multi-step process, from initial brief decomposition by one LLM agent to final vector-based ranking by another, is resilient, observable, and scalable.
Specifically, Temporal provides durable execution, guaranteeing that our agentic workflows run to completion, regardless of infrastructure failures, network outages, or application crashes. This is crucial for long-running, multi-stage processes. Normal applications lose their execution state when a failure occurs, but Temporal tracks the progress of our workflows and persists a detailed history of every event. If a worker crashes mid-search or during a complex media analysis, Temporal ensures that the process automatically resumes exactly where it left off, eliminating the need for extensive, error-prone recovery coding and allowing our engineers to focus purely on the business logic of content discovery. From our clearance automation to content ingestion, Temporal is the foundation.
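To make the idea of durable execution concrete, here is a minimal sketch of the underlying pattern in plain Python. It is not Temporal's API and not our production code; the `DurableRun` class, the JSON history file, and the step names are all hypothetical, chosen only to show how a persisted history lets a restarted run skip work that already finished.

```python
import json
from pathlib import Path
from typing import Callable


class DurableRun:
    """Toy stand-in for durable execution (hypothetical, not Temporal's API).

    Each completed step's result is written to a history file. A restarted
    run loads that history and replays recorded results instead of
    re-executing steps that already finished.
    """

    def __init__(self, history_path: Path) -> None:
        self.history_path = history_path
        self.history: dict[str, object] = {}
        if history_path.exists():
            self.history = json.loads(history_path.read_text())

    def step(self, name: str, fn: Callable[[], object]) -> object:
        if name in self.history:
            # Step finished before the crash: reuse the recorded result.
            return self.history[name]
        result = fn()  # if this raises, nothing is recorded for the step
        self.history[name] = result
        self.history_path.write_text(json.dumps(self.history))
        return result
```

If a process dies during a later step, constructing a new `DurableRun` with the same history path and calling the same steps again resumes from the first step that has no recorded result. A real orchestrator adds much more (timers, retries, versioning, distributed workers), which is exactly the code we prefer not to write ourselves.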
Ultimately, Content Genie draws its power from the vast Catch+Release marketplace, which houses millions of pieces of high-quality, licensable content (15 million at the time of writing) and continues to grow every day. The agentic AI is not searching a generic internet pool; it is searching a massive, quality-controlled library, so the content it retrieves is always ready to be integrated into a creative campaign. However, a library of content isn’t very useful to our users if finding the perfect shot for a campaign is time-consuming or difficult.
Our users need to be able to find what they’re looking for, even if they don’t actually know what they’re looking for. They are creatives and producers on hectic timelines, looking for the right content to satisfy a feel in their project. This isn’t the sort of searching that comes easily with the traditional boolean logic found in most search engines, which is why we based our marketplace search on semantic similarity rather than traditional textual indices. This allows us to rely on the meaning and feel of the text in the brief, rather than programmatically building complex, error-prone boolean queries out of derived keywords.
The AI Curation Workflow: Under the Hood of Content Genie
Content Genie’s power lies in its agentic architecture, which mimics the rigorous, multi-faceted process of an expert human curator. The workflow is orchestrated by Temporal, which manages the reliable execution of the complex, stateful processes and ensures that each step of the content curation "journey" is completed successfully. We build the agents that perform the heavy lifting in n8n, which enables close collaboration between product and engineering teams as well as rapid iteration on prompts.
Here’s how our coordinated network of AI agents tackles a creative brief:
Brief Analysis and Decomposition
The first step is converting the uploaded PDF document into a machine-readable format, which in our case is simple Markdown. Our first agent either analyzes the document directly with AI or defers to its tools to extract the content for later agents. We found that Gemini is unable to process certain PDF documents, so as a fallback we use a Golang PDF library to extract the text. Simple text extraction does not preserve formatting as well, which is why Gemini remains our preferred approach for document processing.
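The preferred-then-fallback logic described above can be sketched as follows. This is an illustration only: the production fallback is a Golang library, while this sketch uses Python, and `extract_markdown`, `primary`, and `fallback` are hypothetical names standing in for the Gemini-based and library-based extractors.

```python
from typing import Callable, Optional

# An extractor takes raw PDF bytes and returns text, or None if it cannot.
Extractor = Callable[[bytes], Optional[str]]


def extract_markdown(
    pdf_bytes: bytes, primary: Extractor, fallback: Extractor
) -> tuple[str, str]:
    """Return (text, source), preferring the primary extractor.

    The fallback is used when the primary raises or returns no usable text.
    """
    try:
        text = primary(pdf_bytes)
    except Exception:
        text = None  # treat any primary failure as "could not process"
    if text and text.strip():
        return text, "primary"

    text = fallback(pdf_bytes)
    if not text or not text.strip():
        raise ValueError("no extractor produced text for this document")
    return text, "fallback"
```

Returning the source alongside the text makes it easy for later steps (or a human reviewing the run) to know when formatting may have been lost.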
Next is understanding the intent of the brief. This is where the power of Gemini, Google’s LLM, comes into play. A dedicated agent is responsible for processing the brief text, meticulously breaking down its often-vague, high-level language into meaningful, concrete components. This includes identifying key requirements such as content format, demographics, and style. Here, we use a few-shot prompting approach to guide Gemini’s reasoning from a vague concept (e.g., "authentic moments of joy") to concrete themes and search queries. Our use of n8n is also helpful in ensuring that the output from the prompt maintains the expected JSON format, thanks to its auto-fixing output parsers. These parsers take a JSON schema and use it to validate the output of the prompt; if the output fails validation, they retry the call for us until the output passes.
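The validate-and-retry behavior can be approximated in a few lines. The sketch below is not n8n's implementation; it only illustrates the pattern. The field names in `REQUIRED_FIELDS`, the `call_model` callable, and the repair wording appended to the prompt are all assumptions made for the example.

```python
import json
from typing import Callable

# Hypothetical, simplified schema: field name -> required JSON type.
REQUIRED_FIELDS = {"content_format": str, "demographics": list, "style": list}


def validate(raw: str) -> dict:
    """Parse model output and check it against REQUIRED_FIELDS."""
    data = json.loads(raw)  # malformed JSON raises a ValueError subclass
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    for name, expected in REQUIRED_FIELDS.items():
        if not isinstance(data.get(name), expected):
            raise ValueError(f"field {name!r} missing or not {expected.__name__}")
    return data


def call_with_repair(
    call_model: Callable[[str], str], prompt: str, max_attempts: int = 3
) -> dict:
    """Call the model until its output validates, feeding back the last error."""
    last_error = None
    for _ in range(max_attempts):
        full_prompt = prompt
        if last_error is not None:
            full_prompt = (
                f"{prompt}\n\nPrevious output was invalid: {last_error}. "
                "Return valid JSON only."
            )
        try:
            return validate(call_model(full_prompt))
        except ValueError as err:
            last_error = err
    raise RuntimeError(f"no valid output after {max_attempts} attempts: {last_error}")
```

Bounding the number of attempts matters in practice: an unbounded loop against a nondeterministic model can hide a prompt or schema problem behind silent retries.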
Following this, a third agent utilizes Gemini to extract core creative themes and concepts for each theme. For instance, a brief request for “authentic moments of joy” might be translated into themes like “spontaneous laughter,” “family connection,” or “celebration.” Each of these themes is further enhanced with fields that subsequent agents can leverage to find content. This agent also uses the extracted themes and decomposed requirements to develop a series of highly tailored search queries. These are not merely simple keywords, but complex semantic descriptions specifically designed to probe our content marketplace for the most contextually relevant content. This initial analysis is crucial, as it effectively translates creative vision into actionable search parameters. This is why we’ve taken care to validate our results during development with the help of our expert curators, and we have found that Content Genie is able to extract the correct themes and concepts with an 80% accuracy rate.
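As a rough illustration of the difference between a keyword and a semantic description, the sketch below turns a theme into full descriptive sentences. The `Theme` fields, the `build_queries` helper, and the sentence template are hypothetical; in Content Genie the queries are written by an LLM agent, not by a fixed template.

```python
from dataclasses import dataclass, field


@dataclass
class Theme:
    """Hypothetical shape for an extracted theme."""
    name: str                                   # e.g. "spontaneous laughter"
    concepts: list[str] = field(default_factory=list)
    mood: str = ""                              # optional, e.g. "energetic"


def build_queries(theme: Theme, style: str) -> list[str]:
    """Build one descriptive sentence per concept instead of a bag of keywords."""
    mood = f", {theme.mood} mood" if theme.mood else ""
    return [
        f"{style} footage of {concept}, conveying {theme.name}{mood}"
        for concept in theme.concepts
    ]
```

A sentence like this carries the subject, the style, and the feeling together, which is what a semantic search can use and a keyword index cannot.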
Deep Media Analysis: Unlocking UGC Context
Before the search, the content in our content marketplace must be intelligently indexed. We don’t rely solely on user-provided tags. Instead, AI agents are continuously at work:
- Using CLIP (Contrastive Language–Image Pre-training), we analyze the media (images and videos) in the Catch+Release marketplace. CLIP embeds visual content in the same vector space as text, which lets us represent each image and each video scene (as segmented by PySceneDetect) in a form that can be compared directly with text queries. Specifically, we’re using a model that generates a 1024-dimensional vector embedding.
- Beyond what is in the shot, Gemini describes what is happening (the action), the style (e.g., documentary, cinematic, handheld), and the mood (e.g., contemplative, energetic, nostalgic). This depth of analysis allows us to match the nuanced requirements of a creative brief.
- These rich descriptions are then converted into vector embeddings and stored in Qdrant, a high-performance vector database. This enables rapid, semantic similarity-based search across both the general visual representation provided by CLIP and the more specific style and feel aspects we get from Gemini.
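The indexing steps above can be pictured as building one record per image or scene. The sketch below is a toy in-memory stand-in, not Qdrant's client API; `SceneRecord`, `SceneIndex`, the id format, and the payload keys are hypothetical, and real embeddings would come from the models rather than being passed in by hand.

```python
import math
from dataclasses import dataclass


@dataclass
class SceneRecord:
    scene_id: str         # hypothetical id, e.g. "<asset>:<scene index>"
    vector: list[float]   # unit-length embedding
    payload: dict         # e.g. {"action": ..., "style": ..., "mood": ...}


def normalize(vec: list[float]) -> list[float]:
    """Scale a vector to unit length so dot products equal cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


class SceneIndex:
    """Toy in-memory index keyed by scene id (stand-in for a vector database)."""

    def __init__(self, dim: int) -> None:
        self.dim = dim
        self.records: dict[str, SceneRecord] = {}

    def upsert(self, scene_id: str, vector: list[float], payload: dict) -> None:
        if len(vector) != self.dim:
            raise ValueError(f"expected {self.dim} dimensions, got {len(vector)}")
        self.records[scene_id] = SceneRecord(scene_id, normalize(vector), payload)
```

Rejecting vectors of the wrong dimensionality at write time is a cheap guard against mixing embeddings from different models in the same collection.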
Agent-Driven Search Strategy and Content Retrieval
Following brief analysis and content indexing, a dedicated search agent develops and executes a tailored search strategy based on the output of previous agents. Queries are converted into vectors and used to search the Qdrant database. This semantic similarity search, based on vector space rather than exact keyword matches, enables Content Genie to surface conceptually relevant content even when exact keywords are absent from media descriptions.
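For readers unfamiliar with vector search, the core operation is small enough to show directly. The sketch below ranks stored vectors by cosine similarity to a query vector using a brute-force scan. It is a toy with hypothetical names (`cosine`, `search`) and tiny hand-made vectors; a vector database performs the same comparison at scale with approximate nearest-neighbor indexes.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def search(
    query_vec: list[float], index: dict[str, list[float]], top_k: int = 3
) -> list[tuple[str, float]]:
    """Return the top_k (id, similarity) pairs, most similar first."""
    scored = [(item_id, cosine(query_vec, vec)) for item_id, vec in index.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:top_k]
```

Nothing in this comparison looks at words. Two items rank close together because their vectors point in similar directions, which is why content can match a query without sharing any keyword with it.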
The agent then retrieves the most semantically relevant images and videos, presenting them as a curated “haul” of shots. This process ensures that the results are not only visually similar but also aligned with the required style, mood, and creative intent derived from the brief analysis. It accomplishes this by reviewing each piece of content and, through a series of scoring algorithms, returning only the most relevant content based on a weighted final score. One of our scoring algorithms employs an LLM-as-a-judge to score how relevant a piece of content is to the initial brief. Additional functions employ cosine similarity and cross-encoder models to produce scores. This ensemble of scores provides a backstop against the known issues of LLM hallucination and non-determinism within the LLM-as-a-judge framework. As an additional measure, we have trained a machine learning model to determine the weights based on expert curator feedback, which creates a feedback loop for continuous improvement.
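A weighted ensemble of this kind can be sketched as below. The weights, the threshold, and the assumption that every score is already scaled to the range 0 to 1 are all hypothetical choices for illustration; in Content Genie the weights come from a trained model rather than constants.

```python
from dataclasses import dataclass


@dataclass
class Scores:
    """Per-item scores, each assumed to be scaled to [0, 1]."""
    llm_judge: float
    cosine: float
    cross_encoder: float


# Hypothetical fixed weights, for illustration only.
WEIGHTS = {"llm_judge": 0.5, "cosine": 0.2, "cross_encoder": 0.3}


def final_score(s: Scores, weights: dict[str, float] = WEIGHTS) -> float:
    """Weighted average of the three scores."""
    total = sum(weights.values())
    weighted = (
        weights["llm_judge"] * s.llm_judge
        + weights["cosine"] * s.cosine
        + weights["cross_encoder"] * s.cross_encoder
    )
    return weighted / total


def select(
    candidates: dict[str, Scores],
    threshold: float,
    weights: dict[str, float] = WEIGHTS,
) -> list[str]:
    """Return ids scoring at or above the threshold, best first."""
    scored = {cid: final_score(s, weights) for cid, s in candidates.items()}
    ranked = sorted(scored, key=scored.get, reverse=True)
    return [cid for cid in ranked if scored[cid] >= threshold]
```

The backstop effect is visible in the arithmetic: an item that the judge rates very highly but that both similarity measures rate poorly still ends up with a mediocre final score, so a single overconfident judgment cannot carry an item into the haul on its own.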
Why Agentic AI is the Future of Curation
The core value of Content Genie lies in its agentic approach. A single, monolithic AI model would struggle to manage the state, reliability, and precision required for high-stakes creative projects. It would also be far too brittle due to the nondeterministic nature of LLMs and performance limitations. Each failure would be slower to detect and take longer to resolve via retries. Our agentic approach allows each agent to retry individually, as necessary to correct any failures.
By using a coordinated network of specialized agents, Content Genie achieves:
- Modularity and Reliability: Temporal ensures that if one part of the complex workflow fails, it can be reliably retried or compensated for, thus ensuring the reliability of the whole process.
- Specialization: Each agent is optimized for a specific task, such as brief decomposition, theme extraction, query generation, or search execution, resulting in higher precision than a general-purpose system. This also makes each agent’s role easier for humans to reason about. Furthermore, using multiple agents allows for more accurate tuning and simplified troubleshooting. It’s far more difficult to understand what went wrong in a monolithic AI agent than it is with a series of smaller ones.
- Creative Intent Matching: By emphasizing deep, AI-driven analysis of both the creative brief (using Gemini) and the media content (using CLIP and Gemini), Content Genie moves beyond literal interpretation to understand and match the core creative intent, enabling creatives and producers to focus on what matters most: telling their story.
- Speed: From brief upload to final delivery of content, Content Genie takes roughly two hours, significantly faster than the eight to ten hours a human curator would need for the same work. This is also just the start, as we expect to reduce that time by further parallelizing the agents.
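As a small illustration of the parallelization mentioned under Speed, independent searches can be fanned out across a worker pool. This is a generic sketch using Python's standard library, not our Temporal workflow code; `run_parallel` and `search_fn` are hypothetical names.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable


def run_parallel(
    queries: list[str],
    search_fn: Callable[[str], list[str]],
    max_workers: int = 4,
) -> dict[str, list[str]]:
    """Run search_fn for every query concurrently and key results by query."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map yields results in the same order as the input queries.
        results = list(pool.map(search_fn, queries))
    return dict(zip(queries, results))
```

Threads suit this kind of work because each search spends most of its time waiting on network calls. The gain is limited to steps that do not depend on one another's output, which is why a pipeline of specialized agents leaves more room for parallelism than a single sequential one.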
Content Genie in Action
Interested in learning more or seeing Content Genie in action? Dive deeper into its capabilities and witness real-world examples by visiting the post put together by our curation director, accessible [here](INSERT LINK TO LIZ’S BLOG), which offers an in-depth exploration of Content Genie.
Within this blog post, you'll find a series of practical samples demonstrating Content Genie in action, showcasing its versatility and power across various content types. This resource is designed to help you understand not only what Content Genie does, but also how it empowers creatives and producers to optimize their workflows, enhance content quality, and achieve their project goals.
Closing Thoughts
Content Genie marks a pivotal advancement in content curation, surpassing the limitations of traditional manual methods and rigid keyword-based approaches. This revolutionary platform leverages the capabilities of agentic AI, large language models (LLMs), and sophisticated media understanding to empower creatives and producers with unprecedented efficiency and effectiveness. It enables creatives to discover and integrate the ideal user-generated content (UGC) that precisely aligns with their creative vision.
This innovative methodology not only dramatically streamlines content workflows but also transforms how authenticity is preserved and amplified. Content Genie ensures that the inherent emotion, distinctive style, and essence of every story are accurately identified, captured, and conveyed in a compelling manner. By moving beyond simple keyword matching, Content Genie is able to uncover the deeper meaning and emotional resonance of UGC, enabling creatives and producers to craft narratives that genuinely connect with their audiences. The integration of agentic AI allows the system to intelligently interpret creative briefs and user preferences, proactively suggesting content that might otherwise be overlooked, thereby unlocking new creative possibilities and fostering more engaging and impactful storytelling.

