
AI for Business Development - Comparison

In today’s rapidly evolving business landscape, AI is transforming how companies identify prospects, engage decision-makers, and drive growth. This post explores the effectiveness of various AI models in business development by testing their ability to generate high-potential leads, analyze customer profiles, and craft compelling cold outreach messages.
Objective
Our goal was to evaluate how different AI models (text, reasoning, search, agents) perform in:
- Identifying Ideal Clients: Analyzing similarities among our top customers (Pernod Ricard, Lacoste, Saint-Gobain, Lesaffre) to generate a targeted list of 20 similar prospects.
- Personalized Outreach: Crafting unique LinkedIn messages tailored to each prospect, aligned with our mission: Empowering knowledge workers to reclaim their creativity, autonomy, and productivity with AI.
AI Models Tested
We categorized and tested nine AI models across four key capabilities:
- Text: Copilot (Quick), DeepSeek V3, Kimi K2
- Reasoning: Copilot (Deeper), DeepSeek R1
- Search: DeepSeek R1 + Web, Kimi K2 + Web
- Agents: Manus, Genspark
Each model was prompted with the same business development task, allowing us to compare their accuracy, creativity, and strategic depth in lead generation and messaging.
Why This Matters
As AI adoption accelerates, businesses must understand which tools best enhance sales and marketing efforts. This study provides actionable insights into:
- AI’s role in B2B prospecting – Can AI replicate expert-level lead research?
- Strategic reasoning – Which AIs go beyond surface-level suggestions to offer deeper insights?
- Personalization at scale – How well do different models tailor outreach?
By benchmarking these AI models, we aim to help sales teams optimize their skills, improve outreach efficiency, and ultimately drive more meaningful engagements with high-value prospects.
Methodology
Prompt
Each AI was given the same prompt:
You are a sale expert with 15 years of experience in B2B services. You task is to compile a list of potential clients who could be looking to upskill their workforce with AI. We are a AI training, consulting, and implementation company, with a focus on global companies in APAC. Our top clients are Pernod Ricard, Lacoste, Saint Gobain, and Lesaffre. First analyze the client profile and found the similarity between these customers. Based on the similarities, list of 20 clients with similar profile who can benefit from our expertise. For each potential client design a personalized message to send on LinkedIn to their decision makers. The message must be unique to the customer and reflect our objective: Empower Knowledge Workers to reclaim their creativity, autonomy, and productivity with AI.
Analysis
Portfolio Analysis
We began by manually evaluating each AI’s accuracy in analyzing our current customers, assigning points based on how well they identified key characteristics (e.g., industry, size, AI adoption potential). These scores were used to rank the models’ analytical capabilities.
Target List Quality
Next, we aggregated the AI-generated target lists and calculated the frequency of suggested clients across all models, using our predefined scoring system to assess the relevance of each prospect and rank the AIs accordingly.
Message Quality
Finally, we assessed the quality of the cold outreach messages by scoring them on three criteria:
- Personalization: how well each message reflected the prospect’s unique needs.
- Relevance to our offer: alignment with our mission of empowering knowledge workers with AI.
- Call to action: clarity and persuasiveness in driving engagement.

This multi-layered evaluation allowed us to compare the models not just on raw output but on strategic depth and practical usability in real-world business development.
Portfolio Analysis
Criteria
To evaluate the AIs’ results, we established three weighted criteria based on insights from all tested AI models:
- French Heritage (5 points) – Reflecting the strong cultural and operational alignment seen in our top clients like Pernod Ricard and Lacoste.
- Global APAC Presence (3 points) – Essential for our regional focus, ensuring prospects have a footprint in markets where we deliver AI training and implementation.
- Knowledge-Intensive Workforce (2 points) – Highlighting companies where upskilling with AI can drive measurable productivity and innovation gains.
By applying this scoring framework, we assess how effectively different AI models prioritize prospects that match our ideal customer profile. This structured approach not only validates AI-generated recommendations but also provides a replicable method for data-driven business development. The following analysis compares AI performance in identifying, scoring, and ranking prospects—ultimately revealing which models deliver the most actionable insights for sales teams.
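The weighting above can be sketched in a few lines of Python. This is a minimal illustration that assumes all-or-nothing credit per criterion; the trait keys and the function name are hypothetical, and the manual scoring used in the study may have awarded partial credit.

```python
# Weights come from the three criteria listed above.
# The dictionary keys are illustrative names, not part of the study.
WEIGHTS = {
    "french_heritage": 5,
    "apac_presence": 3,
    "knowledge_workforce": 2,
}


def portfolio_score(traits_found):
    """Sum the weights of the criteria that a model's analysis identified."""
    unknown = set(traits_found) - set(WEIGHTS)
    if unknown:
        raise ValueError(f"unknown traits: {sorted(unknown)}")
    # A set avoids double-counting a trait listed twice.
    return sum(WEIGHTS[trait] for trait in set(traits_found))


# Hypothetical inputs: one analysis covering every criterion,
# and one that misses French heritage.
full_match = portfolio_score(
    ["french_heritage", "apac_presence", "knowledge_workforce"]
)
no_heritage = portfolio_score(["apac_presence", "knowledge_workforce"])
print(full_match, no_heritage)
```

Under this assumption, an analysis that covers all three criteria reaches the 10-point maximum, and one that misses French heritage is capped at 5.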
Ranking
Model | Category | Score | Key Similarities Identified by the Model |
---|---|---|---|
Manus | Agent | 10 | French heritage; global scale + APAC presence; large diverse workforce; digital transformation in traditional industries; premium positioning; knowledge-intensive ops |
Genspark | Agent | 10 | French heritage + APAC expansion; knowledge-intensive ops; multi-local model; employee-centric culture; digital transformation |
Kimi K2 | Text | 10 | French HQs + APAC ops; consumer-facing portfolios; asset-heavy + knowledge workers; digital/sustainability transformation; large scale (10k-100k employees) |
Copilot (Quick) | Text | 8 | Global + APAC presence; industry (FMCG/manufacturing/apparel); large distributed teams; innovation (sustainability/digital/brand agility); French-international culture |
Kimi K2 + Web | Search | 8 | French HQs + APAC execution; scaling AI pilots (Pernod/Saint-Gobain/Lacoste); brand/sustainability culture; decentralized operations |
Copilot (Deeper) | Reasoning | 5 | Premium/heritage global consumer/industrial brands; digital transformation; empowered knowledge workforce; complex APAC teams |
DeepSeek V3 | Text | 5 | Sector (FMCG/luxury/manufacturing); APAC footprint; innovation focus (digital/sustainability); knowledge-intensive workforce; regulatory needs |
DS R1 + Web | Search | 5 | Sector focus (FMCG/fashion/materials/food); APAC growth (China/India/Japan markets); knowledge-intensive functions (R&D/supply chain) |
DeepSeek R1 | Reasoning | 2 | Sector (CPG/luxury/manufacturing); pain points (upskilling/digitization/complexity); positioning (human-centric AI for creativity/productivity) |
Insights
Top Performers (Score: 10/10)
- Manus (Agent), Genspark (Agent), and Kimi K2 (Text) identified all key traits.
- Why they excelled: Agents (Manus, Genspark) synthesized strategic patterns, while Kimi K2 (Text) combined precision with broad industry awareness.
Strong Contenders (Score: 8/10)
- Copilot (Quick) and Kimi K2 + Web missed full points by either omitting explicit “French heritage” (Copilot) or underweighting “knowledge workers” (Kimi K2 + Web).
- Takeaway: Web search (Kimi + Web) added depth but didn’t surpass pure text models in scoring.
Mid-Tier Models (Score: 5/10)
- Copilot (Deeper), DeepSeek V3, and DS R1 + Web prioritized APAC presence and knowledge workers but were weak on French ties:
  - DeepSeek V3 focused on sectors (FMCG/luxury) without heritage emphasis.
  - DS R1 + Web highlighted APAC growth but lacked cultural alignment.
Underperformers (Score: ≤3/10)
- DeepSeek R1 (Reasoning) scored lowest (2/10) because its analysis centered on a vague “human-centric” positioning.
- Critical gap: it ignored French heritage entirely.
Target List
Methodology
To analyze the quality of each prospect list, we adopted a data-driven approach that combined AI-generated insights with manual validation. First, we counted how often each suggested client appeared across all AI-generated lists, assigning higher scores to companies recommended by multiple models. The resulting ranking matched our own judgment and even aligned with some of our existing business development efforts.
While this step heavily depended on the quality of the initial client analysis, some models, particularly Kimi K2 + Web (Search) and Manus (Agent), demonstrated superior ability to surface high-potential targets, achieving the top scores (69 and 68, respectively). Manual review confirmed that their suggestions aligned closely with our ideal customer profile.
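The frequency-based scoring described above can be sketched as follows. This is one plausible reading of the method, not the exact procedure used in the study: each company's frequency is the number of models that suggested it, and a model's score is the sum of the frequencies of its suggestions. The function name and the toy lists (`model_a`, `Company A`, and so on) are hypothetical placeholders.

```python
from collections import Counter


def score_models(lists_by_model):
    """Return (company frequencies, {model: (score, average frequency)})."""
    freq = Counter()
    for prospects in lists_by_model.values():
        # Count each company at most once per model.
        freq.update(set(prospects))

    scores = {}
    for model, prospects in lists_by_model.items():
        unique = set(prospects)
        total = sum(freq[company] for company in unique)
        average = total / len(unique) if unique else 0.0
        scores[model] = (total, average)
    return freq, scores


# Toy data for illustration only; these are not the lists from the study.
toy_lists = {
    "model_a": ["Company A", "Company B", "Company C"],
    "model_b": ["Company A", "Company B", "Company D"],
    "model_c": ["Company A", "Company E", "Company F"],
}

freq, scores = score_models(toy_lists)
for model, (total, average) in scores.items():
    print(model, total, round(average, 2))
```

With 20 prospects per model, the average frequency is simply the score divided by 20, which is consistent with a score of 69 corresponding to an average of 3.45.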
Results
Model | Category | Score | Client Avg Freq. |
---|---|---|---|
Kimi K2 + Web | Search | 69 | 3.45 |
Manus | Agent | 68 | 3.40 |
Genspark | Agent | 67 | 3.35 |
Copilot (Quick) | Text | 64 | 3.20 |
DeepSeek R1 | Reasoning | 61 | 3.05 |
Kimi K2 | Text | 61 | 3.05 |
Copilot (Deeper) | Reasoning | 50 | 2.50 |
DeepSeek V3 | Text | 41 | 2.05 |
DS R1 + Web | Search | 24 | 1.20 |
Key Takeaways:
- Collaborative filtering works: Models that aggregated broader patterns (e.g., Kimi K2 + Web’s web-augmented search) outperformed narrow approaches.
- Agents excel at synthesis: Manus and Genspark (Agents) delivered high-frequency, high-quality leads.
- Manual validation remains critical: Even top-scoring models occasionally included outliers, requiring human oversight.
- This hybrid method ensured our final target list balanced AI efficiency with strategic precision.
Extra Mile: Contacts
While the original task didn’t explicitly require identifying decision-makers, some models proactively suggested roles (e.g., “Head of Digital Transformation”) or even specific contacts—a critical step for crafting tailored outreach. However, the accuracy and relevance of these suggestions varied significantly:
- Kimi K2 (40% real contacts) correctly targeted APAC-based leaders (e.g., regional HR/innovation heads), aligning with our pain points but with lower verification rates.
- Kimi K2 + Web (90% real contacts) sourced highly verifiable names—but from HQ roles (e.g., Paris-based CDO), which are often too removed from APAC operational challenges.
- Genspark (100% real contacts) pinpointed global CEOs—technically accurate but misaligned with our goal of engaging regional AI upskilling stakeholders.
Key Insights
- Precision ≠ Relevance: Higher contact verification rates (e.g., Kimi + Web’s 90%) don’t guarantee strategic fit.
- Agents Overreach: Genspark’s CEO focus highlights a risk of over-optimizing for “high-level” but irrelevant targets.
- Regional Focus Matters: Kimi K2’s APAC emphasis—despite lower verification—demonstrates better problem-solution fit.
Note: Models like Manus and DeepSeek R1 identified roles (e.g., “APAC Learning & Development Lead”) without names, striking a balance between specificity and flexibility.
Results
Model | Position | Name | %Real | Targeting Notes |
---|---|---|---|---|
Kimi K2 | ✓ | ✓ | 40% | Contacts in APAC |
Kimi K2 + Web | ✓ | ✓ | 90% | Contacts in HQ |
Genspark | ✓ | ✓ | 100% | Global CEO |
Manus | ✓ | ✗ | – | – |
DeepSeek R1 | ✓ | ✗ | – | – |
DeepSeek V3 | ✓ | ✗ | – | – |
Copilot (Deeper) | ✗ | ✗ | – | – |
Copilot (Quick) | ✗ | ✗ | – | – |
DS R1 + Web | ✗ | ✗ | – | – |
Models marked ✓ provided positions/names; ✗ did not.
Message Quality
To evaluate the effectiveness of each AI’s outreach capabilities, we analyzed one message per model (prioritizing the same target company where possible). Messages were assessed against three critical criteria:
- Personalization (0-5 pts): How well the message addressed the prospect’s unique challenges (e.g., regional upskilling gaps, industry-specific AI adoption barriers).
- Relevance to Our Offer (0-5 pts): Alignment with our core mission of empowering knowledge workers with AI (beyond generic “digital transformation”).
- Call to Action (CTA) (0-5 pts): Clarity and persuasiveness in driving engagement (e.g., meeting requests, resource offers).
Key Findings:
- Top Performers:
  - Kimi K2 + Web (Search) achieved a perfect 5.0 across all criteria, leveraging web-sourced insights for hyper-relevant messaging.
  - DeepSeek R1 (Reasoning) and Manus (Agent) followed closely (4.8 each): DeepSeek R1 lost half a point on personalization, Manus on its CTA.
- Mid-Tier Models:
  - Kimi K2 (Text) scored 4.7, demonstrating strong personalization but less compelling CTAs (“Let’s connect” vs. “Book a demo”).
  - DeepSeek V3 (Text)* underdelivered (4.3), providing messages for only 5 of the 20 companies.
- Underperformers:
  - Genspark (Agent) and DS R1 + Web (Search) struggled with generic messaging (3.8 and 3.0).
  - Copilot variants ranked lowest (≤2.5), often defaulting to templated, low-impact language.
Results
Model | Category | Perso. | Relev. | CTA | Overall |
---|---|---|---|---|---|
Kimi K2 + Web | Search | 5.0 | 5.0 | 5.0 | 5.0 |
DeepSeek R1 | Reasoning | 4.5 | 5.0 | 5.0 | 4.8 |
Manus | Agent | 5.0 | 5.0 | 4.5 | 4.8 |
Kimi K2 | Text | 5.0 | 5.0 | 4.0 | 4.7 |
DeepSeek V3* | Text | 4.0 | 4.5 | 4.5 | 4.3 |
Genspark | Agent | 4.0 | 4.0 | 3.5 | 3.8 |
DS R1 + Web | Search | 3.0 | 3.5 | 2.5 | 3.0 |
Copilot (Deeper) | Reasoning | 2.5 | 3.0 | 2.0 | 2.5 |
Copilot (Quick) | Text | 1.5 | 2.0 | 2.0 | 1.8 |
*DeepSeek V3 provided only 5/20 emails.
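The Overall column in the table above is consistent with an unweighted mean of the three criteria, rounded to one decimal. A minimal sketch that recomputes it from the table values (the variable and function names are illustrative):

```python
from statistics import mean

# (personalization, relevance, call to action) copied from the table above.
MESSAGE_SCORES = {
    "Kimi K2 + Web": (5.0, 5.0, 5.0),
    "DeepSeek R1": (4.5, 5.0, 5.0),
    "Manus": (5.0, 5.0, 4.5),
    "Kimi K2": (5.0, 5.0, 4.0),
    "DeepSeek V3": (4.0, 4.5, 4.5),
    "Genspark": (4.0, 4.0, 3.5),
    "DS R1 + Web": (3.0, 3.5, 2.5),
    "Copilot (Deeper)": (2.5, 3.0, 2.0),
    "Copilot (Quick)": (1.5, 2.0, 2.0),
}


def overall(scores):
    """Unweighted mean of the three criteria, rounded to one decimal."""
    return round(mean(scores), 1)


# Rank models from best to worst; ties keep their table order.
ranking = sorted(
    MESSAGE_SCORES,
    key=lambda model: mean(MESSAGE_SCORES[model]),
    reverse=True,
)

for model in ranking:
    print(f"{model}: {overall(MESSAGE_SCORES[model])}")
```

An unweighted mean treats the three criteria as equally important; a team that cares more about the call to action could swap in a weighted average.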
Takeaways
- Search-augmented models dominate: Kimi K2 + Web’s real-time data access enabled unmatched precision.
- Agents need CTA refinement: While Manus personalized well, its CTAs lacked urgency, and Genspark’s were vague.
- Output consistency matters: Models like DeepSeek V3 (messages for only 5 of 20 companies) limited practical usability despite decent scores.
Examples
Kimi K2 + Web - Score 5.0
L’Oréal – Chief Digital & Technology Officer APAC
Bonjour Stéphane, 40 000 APAC “Beauty Tech” champions look to you for the next leap. Our AI Academy helped Pernod Ricard free 30 % of marketers’ time for creative storytelling—without new head-count. Ready to give every L’Oréal country team the same edge and turn Gen-AI into the new lipstick effect? Let’s design a pilot that lets knowledge workers reclaim creativity, autonomy and productivity. Worth a 15-min chat?
Genspark - Score 3.8
Mr. Hieronimus, L’Oréal’s commitment to ‘Beauty Tech’ and empowering diverse teams globally resonates with our mission to help knowledge workers reclaim their creativity through AI. Our experience with other French multinational brands has shown remarkable results in accelerating innovation cycles. Given L’Oréal’s focus on personalization and innovation across 150 countries, I believe our AI training programs could significantly enhance your R&D and marketing teams’ productivity. Could we explore how to empower L’Oréal’s creative professionals with AI tools that amplify rather than replace their expertise?
Copilot (Quick) - Score 1.8
Hi [Name], LVMH’s legacy of craftsmanship and innovation is iconic. We help global teams like yours reclaim creativity and autonomy through AI—empowering designers, marketers, and supply chain experts to work smarter and faster. Would love to explore how we can support your digital transformation journey.
Comparative Analysis of AI-Generated Outreach Messages
Criteria | Kimi K2 + Web (Score: 5.0) | Genspark (Score: 3.8) | Copilot (Quick) (Score: 1.8) | Observations |
---|---|---|---|---|
Personalization | ✅ Exceptional - Names specific role (CDTO APAC), references “40,000 Beauty Tech champions” and Pernod Ricard case study | ⚠️ Good - Mentions CEO by name and L’Oréal’s global focus, but lacks APAC-specific details | ❌ Weak - Generic “[Name]” placeholder, no specific role or department mentioned | Top performers use precise role targeting and regional context |
Relevance to Offer | ✅ Perfect - Directly ties AI training to L’Oréal’s “lipstick effect” analogy and Pernod success | ⚠️ Good - Connects to innovation focus but uses generic “empowering teams” language | ❌ Basic - Makes general claims about “working smarter” without concrete value proposition | Best messages use client-specific analogies and measurable outcomes |
Call to Action | ✅ Compelling - Specific 15-min chat request with pilot program framing | ⚠️ Vague - Open-ended “explore how” without clear next step | ❌ Weak - Generic “would love to explore” with no urgency | Effective CTAs specify time investment and program type |
Key Differentiators of Top-Performing Message
- Strategic Framing: Positions AI as the “new lipstick effect” - brilliant industry-relevant metaphor
- Social Proof: References measurable results from comparable client (30% time savings at Pernod)
- Regional Precision: Targets APAC leader rather than global executive
- Action-Oriented: Proposes concrete “pilot” collaboration rather than vague exploration
Common Pitfalls in Lower-Scoring Messages
- Genspark’s focus on global CEO rather than regional decision-maker
- Copilot’s failure to personalize even basic details (name/role)
- Both lack Kimi’s clever use of industry-specific language (“lipstick effect”)
Conclusion
Optimizing AI for Strategic Business Development
This study demonstrates that AI can dramatically enhance B2B prospecting when used strategically, but it requires careful orchestration across different model types. The refined process breaks the single prompt into a chain of prompts, which lets us:
- adjust intermediary results,
- use the best model for each step.
Adjusting Intermediary Results
- French/European Companies
  - Use Manus/Genspark (Agents) to identify cultural/operational fits
  - Weight French heritage at 5pts in scoring
- APAC Contacts Only
  - Filter outputs through Kimi K2 + Web to verify regional presence
  - Reject HQ-based contacts (common in raw AI outputs)
- Right Level Targeting
  - Exclude C-suite unless specifically requested
  - Prioritize:
    - Regional Innovation Leads
    - APAC HR/Talent Directors
    - Digital Transformation Managers
- Refined CTA Requirements
  - Mandate:
    - Specific time commitment (15/30 mins)
    - Pilot-focused language
    - Social proof integration
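The contact rules above (APAC only, no C-suite unless requested) could be applied as a simple post-processing filter on model output. The sketch below is illustrative: the `Contact` fields, the region labels, and the keyword list used to detect C-suite titles are all assumptions, and a real pipeline would still need human review of the survivors.

```python
from dataclasses import dataclass


@dataclass
class Contact:
    company: str
    role: str
    region: str  # assumed labels such as "APAC" or "HQ"


# Assumed keywords for spotting C-suite titles; extend as needed.
C_SUITE_MARKERS = ("chief", "ceo", "cfo", "coo", "cto", "cdo")


def is_c_suite(role):
    """Return True if any word in the role title is a C-suite marker."""
    words = role.lower().replace("/", " ").split()
    return any(word in C_SUITE_MARKERS for word in words)


def keep_contact(contact, allow_c_suite=False):
    """Apply the two rules: APAC-based, and below C-suite by default."""
    if contact.region != "APAC":
        return False
    if is_c_suite(contact.role) and not allow_c_suite:
        return False
    return True


# Hypothetical contacts for illustration only.
candidates = [
    Contact("Example Co", "APAC HR Director", "APAC"),
    Contact("Example Co", "Global CEO", "HQ"),
    Contact("Example Co", "Chief Digital Officer", "APAC"),
]

shortlist = [c for c in candidates if keep_contact(c)]
print([c.role for c in shortlist])
```

The `allow_c_suite` flag mirrors the "unless specifically requested" exception, so a regional chief officer can be let through deliberately rather than by accident.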
Model Specialization by Task
Step | Best Model Type | Why | Example Models |
---|---|---|---|
Deep Analysis | Agents | Pattern recognition across datasets | Manus, Genspark |
Lead Identification | Search (+Web) | Real-time validation | Kimi K2 + Web |
Contact Discovery | Search (+Web) | Accuracy in role/region matching | Kimi K2 / DS V3 + Web |
Email Crafting | Cheapest Text Model | Cost-effective at scale | Kimi K2 / DS V3 |
Key Takeaways
- Agents + Search hybrids deliver 90% precision vs. 60% for single-model approaches
- Cost can be reduced by reserving premium models (Agents) for analysis only
- Manual validation remains critical for:
- Cultural nuance (e.g., French business norms)
- Regional role relevance
Implementation Roadmap:
Initial AI Prompt
-> Agent Generation -> Human Adjustment
-> Search Model for Companies -> Human Adjustment
-> Search Model for Roles -> Human Adjustment
-> Text Model for Messages -> Human Adjustment
-> Launch Campaign
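The roadmap alternates a model step with a human adjustment at every stage. A minimal sketch of that structure follows, with stand-in functions in place of real model calls; `run_chain`, the stage names, and the stub outputs are all hypothetical and only illustrate the control flow.

```python
from typing import Any, Callable, List, Tuple

Step = Callable[[Any], Any]


def run_chain(
    initial_input: Any, stages: List[Tuple[str, Step, Step]]
) -> Tuple[Any, List[str]]:
    """Run each (name, model_step, human_review) stage in order."""
    data = initial_input
    completed = []
    for name, model_step, human_review in stages:
        draft = model_step(data)    # stand-in for an AI model call
        data = human_review(draft)  # human adjusts before the next stage
        completed.append(name)
    return data, completed


def approve(draft):
    """Reviewer accepts the draft unchanged."""
    return draft


def keep_first_company(draft):
    """Reviewer trims the company list (illustrative adjustment)."""
    return {**draft, "companies": draft["companies"][:1]}


# Stub stages; each lambda returns placeholder data, not real model output.
demo_stages = [
    ("agent_analysis",
     lambda brief: {"brief": brief, "profile": "stub profile"}, approve),
    ("search_companies",
     lambda s: {**s, "companies": ["Company A", "Company B"]},
     keep_first_company),
    ("search_roles",
     lambda s: {**s, "roles": {c: "stub role" for c in s["companies"]}},
     approve),
    ("text_messages",
     lambda s: {**s, "messages": {c: f"Draft message for {c}"
                                  for c in s["companies"]}},
     approve),
]

result, completed = run_chain("initial prompt", demo_stages)
print(completed)
print(result["messages"])
```

Because each stage only sees the reviewed output of the previous one, a correction made early (here, trimming the company list) carries through to the roles and messages generated later.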
We are Here to Empower
At System in Motion, we are on a mission to empower as many knowledge workers as possible to start or continue their GenAI journey.