AI for Business Development - Comparison

In today’s rapidly evolving business landscape, AI is transforming how companies identify prospects, engage decision-makers, and drive growth. This post explores the effectiveness of various AI models in business development by testing their ability to generate high-potential leads, analyze customer profiles, and craft compelling cold outreach messages.

Objective

Our goal was to evaluate how different AI models (text, reasoning, search, agents) perform in:

  • Identifying Ideal Clients: Analyzing similarities among our top customers (Pernod Ricard, Lacoste, Saint-Gobain, Lesaffre) to generate a targeted list of 20 similar prospects.
  • Personalized Outreach: Crafting unique LinkedIn messages tailored to each prospect, aligned with our mission: Empowering knowledge workers to reclaim their creativity, autonomy, and productivity with AI.

AI Models Tested

We categorized and tested 9 AI models across four key capabilities:

  • Text: Copilot (Quick), DeepSeek V3, Kimi K2
  • Reasoning: Copilot (Deeper), DeepSeek R1
  • Search: DeepSeek R1 + Web, Kimi K2 + Web
  • Agents: Manus, Genspark

Each model was prompted with the same business development task, allowing us to compare their accuracy, creativity, and strategic depth in lead generation and messaging.

Why This Matters

As AI adoption accelerates, businesses must understand which tools best enhance sales and marketing efforts. This study provides actionable insights into:

  • AI’s role in B2B prospecting – Can AI replicate expert-level lead research?
  • Strategic reasoning – Which AIs go beyond surface-level suggestions to offer deeper insights?
  • Personalization at scale – How well do different models tailor outreach?

By benchmarking these AI models, we aim to help sales teams optimize their skills, improve outreach efficiency, and ultimately drive more meaningful engagements with high-value prospects.

Methodology

Prompt

Each AI was given the same prompt:

You are a sale expert with 15 years of experience in B2B services. You task is to compile a list of potential clients who could be looking to upskill their workforce with AI.​ We are a AI training, consulting, and implementation company, with a focus on global companies in APAC. Our top clients are Pernod Ricard, Lacoste, Saint Gobain, and Lesaffre. First analyze the client profile and found the similarity between these customers. Based on the similarities, list of 20 clients with similar profile who can benefit from our expertise. For each potential client design a personalized message to send on LinkedIn to their decision makers. The message must be unique to the customer and reflect our objective: Empower Knowledge Workers to reclaim their creativity, autonomy, and productivity with AI.

Analysis

Portfolio Analysis

We began by manually evaluating each AI’s accuracy in analyzing our current customers, assigning points based on how well they identified key characteristics (e.g., industry, size, AI adoption potential). These scores were used to rank the models’ analytical capabilities.

Target List Quality

Next, we aggregated the AI-generated target lists and calculated the frequency of suggested clients across all models, using our predefined scoring system to assess the relevance of each prospect and rank the AIs accordingly.

Message Quality

Finally, we assessed the quality of the cold outreach messages by scoring them on three criteria:

  1. Personalization: how well each message reflected the prospect’s unique needs,
  2. Relevance to our offer: alignment with our mission of empowering knowledge workers with AI,
  3. Call to action: clarity and persuasiveness in driving engagement.

This multi-layered evaluation allowed us to compare the models not just on raw output but on strategic depth and practical usability in real-world business development.

Portfolio Analysis

Criteria

To evaluate the AIs’ results, we established three weighted criteria based on insights from all tested AI models:

  • French Heritage (5 points) – Reflecting the strong cultural and operational alignment seen in our top clients like Pernod Ricard and Lacoste.
  • Global APAC Presence (3 points) – Essential for our regional focus, ensuring prospects have a footprint in markets where we deliver AI training and implementation.
  • Knowledge-Intensive Workforce (2 points) – Highlighting companies where upskilling with AI can drive measurable productivity and innovation gains.

By applying this scoring framework, we assess how effectively different AI models prioritize prospects that match our ideal customer profile. This structured approach not only validates AI-generated recommendations but also provides a replicable method for data-driven business development. The following analysis compares AI performance in identifying, scoring, and ranking prospects—ultimately revealing which models deliver the most actionable insights for sales teams.
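The weighted criteria above can be expressed as a small scoring function. This is a minimal sketch, not the authors' actual tooling: the trait keys and the `portfolio_score` helper are illustrative names, and it assumes a reviewer has already decided which traits a model explicitly identified.

```python
# Weights come from the three criteria above; the keys are illustrative names.
WEIGHTS = {
    "french_heritage": 5,
    "apac_presence": 3,
    "knowledge_workforce": 2,
}


def portfolio_score(traits_found: set[str]) -> int:
    """Sum the weights of the criteria a model explicitly identified."""
    return sum(points for trait, points in WEIGHTS.items() if trait in traits_found)


# Hypothetical reviewer judgments, chosen to show the possible totals.
all_traits = {"french_heritage", "apac_presence", "knowledge_workforce"}
print(portfolio_score(all_traits))                                # 10
print(portfolio_score({"french_heritage", "apac_presence"}))      # 8
print(portfolio_score({"apac_presence", "knowledge_workforce"}))  # 5
print(portfolio_score({"knowledge_workforce"}))                   # 2
```

With these weights, a model that misses French heritage cannot score above 5, which explains why that single trait separates the top tier from the mid tier.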

Ranking

Model | Category | Score | Key Similarities Identified by the Model
Manus | Agent | 10 | French heritage; global scale + APAC presence; large diverse workforce; digital transformation in traditional industries; premium positioning; knowledge-intensive ops
Genspark | Agent | 10 | French heritage + APAC expansion; knowledge-intensive ops; multi-local model; employee-centric culture; digital transformation
Kimi K2 | Text | 10 | French HQs + APAC ops; consumer-facing portfolios; asset-heavy + knowledge workers; digital/sustainability transformation; large scale (10k-100k employees)
Copilot (Quick) | Text | 8 | Global + APAC presence; industry (FMCG/manufacturing/apparel); large distributed teams; innovation (sustainability/digital/brand agility); French-international culture
Kimi K2 + Web | Search | 8 | French HQs + APAC execution; scaling AI pilots (Pernod/Saint-Gobain/Lacoste); brand/sustainability culture; decentralized operations
Copilot (Deeper) | Reasoning | 5 | Premium/heritage global consumer/industrial brands; digital transformation; empowered knowledge workforce; complex APAC teams
DeepSeek V3 | Text | 5 | Sector (FMCG/luxury/manufacturing); APAC footprint; innovation focus (digital/sustainability); knowledge-intensive workforce; regulatory needs
DeepSeek R1 + Web | Search | 5 | Sector focus (FMCG/fashion/materials/food); APAC growth (China/India/Japan markets); knowledge-intensive functions (R&D/supply chain)
DeepSeek R1 | Reasoning | 2 | Sector (CPG/luxury/manufacturing); pain points (upskilling/digitization/complexity); positioning (human-centric AI for creativity/productivity)

Insights

Top Performers (Score: 10/10)

  • Manus (Agent), Genspark (Agent), and Kimi K2 (Text) identified all three key traits.
  • Why they excelled: Agents (Manus, Genspark) synthesized strategic patterns, while Kimi K2 (Text) combined precision with broad industry awareness.

Strong Contenders (Score: 8/10)

  • Copilot (Quick) and Kimi K2 + Web missed full points by either omitting explicit “French heritage” (Copilot) or underweighting “knowledge workers” (Kimi K2 + Web).
  • Takeaway: Web search (Kimi K2 + Web) added depth but didn’t surpass pure text models in scoring.

Mid-Tier Models (Score: 5/10)

  • Copilot (Deeper), DeepSeek V3, and DeepSeek R1 + Web prioritized APAC presence and knowledge workers but were weak on French ties:
    • DeepSeek V3 focused on sectors (FMCG/luxury) without heritage emphasis.
    • DeepSeek R1 + Web highlighted APAC growth but lacked cultural alignment.

Underperformers (Score: ≤3/10)

  • DeepSeek R1 (Reasoning) scored lowest (2/10) because of its vague “human-centric” positioning.
  • Critical gap: it ignored French heritage entirely.

Target List

Methodology

To analyze the quality of the prospect lists, we adopted a data-driven approach that combined AI-generated insights with manual validation. First, we counted how often each suggested client appeared across all AI-generated lists, assigning higher scores to companies recommended by multiple models. The results matched our own judgment and even aligned with some of our existing business development efforts.

While this step heavily depended on the quality of the initial client analysis, some models, particularly Kimi K2 + Web (Search) and Manus (Agent), demonstrated superior ability to surface high-potential targets, achieving the top scores (69 and 68, respectively). Manual review confirmed that their suggestions aligned closely with our ideal customer profile.
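The frequency-based scoring can be sketched as follows. This is one plausible reading of the method, not a confirmed implementation: it assumes a model's score is the sum, over its suggested prospects, of the number of models that suggested each prospect, and that the average frequency is that score divided by the number of prospects. The reported figures are consistent with this reading (for example, 69 / 20 = 3.45). The function name and the placeholder company names are hypothetical.

```python
from collections import Counter


def frequency_scores(lists_by_model: dict[str, list[str]]) -> dict[str, tuple[int, float]]:
    """Return (total score, average frequency) for each model's prospect list."""
    # Count how many models suggested each company (at most once per model).
    freq = Counter(
        company
        for prospects in lists_by_model.values()
        for company in set(prospects)
    )
    scores = {}
    for model, prospects in lists_by_model.items():
        unique = set(prospects)
        total = sum(freq[company] for company in unique)
        average = total / len(unique) if unique else 0.0
        scores[model] = (total, average)
    return scores


# Toy data with placeholder companies, three prospects per model instead of 20.
toy_lists = {
    "model_a": ["Company A", "Company B", "Company C"],
    "model_b": ["Company A", "Company B", "Company D"],
    "model_c": ["Company A", "Company E", "Company F"],
}
for model, (total, average) in frequency_scores(toy_lists).items():
    print(f"{model}: score={total}, avg_freq={average:.2f}")
```

A model that only suggests companies nobody else proposed ends up with an average frequency close to 1, which is why a low score signals an outlier list rather than necessarily a wrong one.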

Results

Model | Category | Score | Client Avg. Freq.
Kimi K2 + Web | Search | 69 | 3.45
Manus | Agent | 68 | 3.40
Genspark | Agent | 67 | 3.35
Copilot (Quick) | Text | 64 | 3.20
DeepSeek R1 | Reasoning | 61 | 3.05
Kimi K2 | Text | 61 | 3.05
Copilot (Deeper) | Reasoning | 50 | 2.50
DeepSeek V3 | Text | 41 | 2.05
DeepSeek R1 + Web | Search | 24 | 1.20

Key Takeaways:

  • Collaborative filtering works: Models that aggregated broader patterns (e.g., Kimi K2 + Web’s web-augmented search) outperformed narrow approaches.
  • Agents excel at synthesis: Manus and Genspark (Agents) delivered high-frequency, high-quality leads.
  • Manual validation remains critical: Even top-scoring models occasionally included outliers, requiring human oversight.

This hybrid method ensured our final target list balanced AI efficiency with strategic precision.

Extra Mile: Contacts

While the original task didn’t explicitly require identifying decision-makers, some models proactively suggested roles (e.g., “Head of Digital Transformation”) or even specific contacts—a critical step for crafting tailored outreach. However, the accuracy and relevance of these suggestions varied significantly:

  • Kimi K2 (40% real contacts) correctly targeted APAC-based leaders (e.g., regional HR/innovation heads), aligning with our pain points but with lower verification rates.
  • Kimi K2 + Web (90% real contacts) sourced highly verifiable names—but from HQ roles (e.g., Paris-based CDO), which are often too removed from APAC operational challenges.
  • Genspark (100% real contacts) pinpointed global CEOs—technically accurate but misaligned with our goal of engaging regional AI upskilling stakeholders.

Key Insights

  1. Precision ≠ Relevance: Higher contact verification rates (e.g., Kimi + Web’s 90%) don’t guarantee strategic fit.
  2. Agents Overreach: Genspark’s CEO focus highlights a risk of over-optimizing for “high-level” but irrelevant targets.
  3. Regional Focus Matters: Kimi K2’s APAC emphasis—despite lower verification—demonstrates better problem-solution fit.

Note: Models like Manus and DeepSeek R1 identified roles (e.g., “APAC Learning & Development Lead”) without names, striking a balance between specificity and flexibility.

Results

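Model | Position | Name | % Real | Targeting Notes
Kimi K2 | ✓ | ✓ | 40% | Contacts in APAC
Kimi K2 + Web | ✓ | ✓ | 90% | Contacts in HQ
Genspark | ✓ | ✓ | 100% | Global CEO
Manus | ✓ | ✗ | |
DeepSeek R1 | ✓ | ✗ | |
DeepSeek V3 | | | |
Copilot (Deeper) | | | |
Copilot (Quick) | | | |
DeepSeek R1 + Web | | | |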

Models marked ✓ provided positions/names; ✗ did not.

Message Quality

To evaluate the effectiveness of each AI’s outreach capabilities, we analyzed one message per model (prioritizing the same target company where possible). Messages were assessed against three critical criteria:

  1. Personalization (0-5 pts):
    • How well the message addressed the prospect’s unique challenges (e.g., regional upskilling gaps, industry-specific AI adoption barriers).
  2. Relevance to Our Offer (0-5 pts):
    • Alignment with our core mission of empowering knowledge workers with AI (beyond generic “digital transformation”).
  3. Call to Action (CTA) (0-5 pts):
    • Clarity and persuasiveness in driving engagement (e.g., meeting requests, resource offers).

Key Findings:

  • Top Performers:
    • Kimi K2 + Web (Search) achieved a perfect 5.0 across all criteria, leveraging web-sourced insights for hyper-relevant messaging.
    • DeepSeek R1 (Reasoning) and Manus (Agent) followed closely (4.8 each): DeepSeek R1 lost half a point on personalization, Manus on its CTA.
  • Mid-Tier Models:
    • Kimi K2 (Text) scored 4.7, demonstrating strong personalization but less compelling CTAs (“Let’s connect” vs. “Book a demo”).
    • DeepSeek V3 (Text)* underdelivered (4.3), providing only 5 emails out of 20 companies.
  • Underperformers:
    • Genspark (Agent) and DeepSeek R1 + Web (Search) struggled with generic messaging (3.8 and 3.0, respectively).
    • Copilot variants ranked lowest (≤2.5), often defaulting to templated, low-impact language.

Results

Model | Category | Personalization | Relevance | CTA | Overall
Kimi K2 + Web | Search | 5.0 | 5.0 | 5.0 | 5.0
DeepSeek R1 | Reasoning | 4.5 | 5.0 | 5.0 | 4.8
Manus | Agent | 5.0 | 5.0 | 4.5 | 4.8
Kimi K2 | Text | 5.0 | 5.0 | 4.0 | 4.7
DeepSeek V3* | Text | 4.0 | 4.5 | 4.5 | 4.3
Genspark | Agent | 4.0 | 4.0 | 3.5 | 3.8
DeepSeek R1 + Web | Search | 3.0 | 3.5 | 2.5 | 3.0
Copilot (Deeper) | Reasoning | 2.5 | 3.0 | 2.0 | 2.5
Copilot (Quick) | Text | 1.5 | 2.0 | 2.0 | 1.8

*DeepSeek V3 provided only 5/20 emails.
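
The Overall column appears to be the unweighted mean of the three criteria, rounded to one decimal. That is inferred from the published figures rather than stated in the methodology, so treat the short check below as a sketch under that assumption; the `overall` helper is an illustrative name.

```python
def overall(personalization: float, relevance: float, cta: float) -> float:
    """Unweighted mean of the three criteria, rounded to one decimal."""
    return round((personalization + relevance + cta) / 3, 1)


# (personalization, relevance, CTA, reported overall), taken from the results table.
reported = {
    "Kimi K2 + Web": (5.0, 5.0, 5.0, 5.0),
    "DeepSeek R1": (4.5, 5.0, 5.0, 4.8),
    "Manus": (5.0, 5.0, 4.5, 4.8),
    "Kimi K2": (5.0, 5.0, 4.0, 4.7),
    "DeepSeek V3": (4.0, 4.5, 4.5, 4.3),
    "Genspark": (4.0, 4.0, 3.5, 3.8),
    "DeepSeek R1 + Web": (3.0, 3.5, 2.5, 3.0),
    "Copilot (Deeper)": (2.5, 3.0, 2.0, 2.5),
    "Copilot (Quick)": (1.5, 2.0, 2.0, 1.8),
}

for model, (perso, relev, cta, expected) in reported.items():
    status = "matches" if overall(perso, relev, cta) == expected else "differs"
    print(f"{model}: computed {overall(perso, relev, cta)} ({status})")
```

Because the mean is unweighted, a weak call to action costs exactly as much as weak personalization; a team that cares more about one criterion could swap in weights.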

Takeaways

  1. Search-augmented models dominate: Kimi K2 + Web’s real-time data access enabled unmatched precision.
  2. Agents need CTA refinement: While Manus personalized well, its CTAs lacked urgency.
  3. Output consistency matters: Models like DeepSeek V3 (5/20 emails) limited practical usability despite decent scores.

Examples

Kimi K2 + Web - Score 5.0

L’Oréal – Chief Digital & Technology Officer APAC
Bonjour Stéphane, 40 000 APAC “Beauty Tech” champions look to you for the next leap. Our AI Academy helped Pernod Ricard free 30 % of marketers’ time for creative storytelling—without new head-count. Ready to give every L’Oréal country team the same edge and turn Gen-AI into the new lipstick effect? Let’s design a pilot that lets knowledge workers reclaim creativity, autonomy and productivity. Worth a 15-min chat?

Genspark - Score 3.8

Mr. Hieronimus, L’Oréal’s commitment to ‘Beauty Tech’ and empowering diverse teams globally resonates with our mission to help knowledge workers reclaim their creativity through AI. Our experience with other French multinational brands has shown remarkable results in accelerating innovation cycles. Given L’Oréal’s focus on personalization and innovation across 150 countries, I believe our AI training programs could significantly enhance your R&D and marketing teams’ productivity. Could we explore how to empower L’Oréal’s creative professionals with AI tools that amplify rather than replace their expertise?

Copilot (Quick) - Score 1.8

Hi [Name], LVMH’s legacy of craftsmanship and innovation is iconic. We help global teams like yours reclaim creativity and autonomy through AI—empowering designers, marketers, and supply chain experts to work smarter and faster. Would love to explore how we can support your digital transformation journey.

Comparative Analysis of AI-Generated Outreach Messages

Criteria | Kimi K2 + Web (Score: 5.0) | Genspark (Score: 3.8) | Copilot (Quick) (Score: 1.8) | Observations
Personalization | Exceptional - Names specific role (CDTO APAC), references “40,000 Beauty Tech champions” and Pernod Ricard case study | Good - Mentions CEO by name and L’Oréal’s global focus, but lacks APAC-specific details | Weak - Generic “[Name]” placeholder, no specific role or department mentioned | Top performers use precise role targeting and regional context
Relevance to Offer | Perfect - Directly ties AI training to L’Oréal’s “lipstick effect” analogy and Pernod success | Good - Connects to innovation focus but uses generic “empowering teams” language | Basic - Makes general claims about “working smarter” without concrete value proposition | Best messages use client-specific analogies and measurable outcomes
Call to Action | Compelling - Specific 15-min chat request with pilot program framing | Vague - Open-ended “explore how” without clear next step | Weak - Generic “would love to explore” with no urgency | Effective CTAs specify time investment and program type

Key Differentiators of Top-Performing Message

  1. Strategic Framing: Positions AI as the “new lipstick effect” - brilliant industry-relevant metaphor
  2. Social Proof: References measurable results from comparable client (30% time savings at Pernod)
  3. Regional Precision: Targets APAC leader rather than global executive
  4. Action-Oriented: Proposes concrete “pilot” collaboration rather than vague exploration

Common Pitfalls in Lower-Scoring Messages

  • Genspark’s focus on global CEO rather than regional decision-maker
  • Copilot’s failure to personalize even basic details (name/role)
  • Both lack Kimi’s clever use of industry-specific language (“lipstick effect”)

Conclusion

Optimizing AI for Strategic Business Development

This study demonstrates that AI can dramatically enhance B2B prospecting when used strategically, but requires careful orchestration across different model types. Here’s the refined process for optimal results.

This means breaking the single prompt down into a chain of prompts in order to:

  • adjust intermediary results,
  • use the best model for each step.

Adjusting Intermediary Results

  1. French/European Companies

    • Use Manus/Genspark (Agents) to identify cultural/operational fits
    • Weight French heritage at 5pts in scoring
  2. APAC Contacts Only

    • Filter outputs through Kimi K2 + Web to verify regional presence
    • Reject HQ-based contacts (common in raw AI outputs)
  3. Right Level Targeting

    • Exclude C-suite unless specifically requested
    • Prioritize:
      • Regional Innovation Leads
      • APAC HR/Talent Directors
      • Digital Transformation Managers
  4. Refined CTA Requirements

    • Mandate:
      • Specific time commitment (15/30 mins)
      • Pilot-focused language
      • Social proof integration
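
The contact rules above (APAC only, no C-suite unless requested) can be applied as a simple filter before any message is drafted. This is a minimal sketch under stated assumptions: the `Contact` fields, the `region` labels, and the keyword list used to detect C-suite titles are all hypothetical, and a real pipeline would need a more reliable way to classify titles and locations.

```python
from dataclasses import dataclass


@dataclass
class Contact:
    company: str
    title: str
    region: str  # assumed label such as "APAC" or "HQ"


# Illustrative keywords only; real titles vary far more than this.
C_SUITE_TOKENS = {"chief", "ceo", "cfo", "cto", "cdo", "coo"}


def is_c_suite(title: str) -> bool:
    """Rough check: does any word of the title look like a C-level marker?"""
    return any(token in C_SUITE_TOKENS for token in title.lower().split())


def keep_contact(contact: Contact, allow_c_suite: bool = False) -> bool:
    """Keep APAC-based contacts, dropping C-suite unless explicitly allowed."""
    if contact.region != "APAC":
        return False
    return allow_c_suite or not is_c_suite(contact.title)


# Placeholder contacts (titles only, no real people).
contacts = [
    Contact("Company A", "APAC HR Director", "APAC"),
    Contact("Company A", "Chief Digital Officer", "HQ"),
    Contact("Company B", "Global CEO", "HQ"),
    Contact("Company B", "Regional Innovation Lead", "APAC"),
    Contact("Company C", "Chief Executive Officer", "APAC"),
]
shortlist = [c for c in contacts if keep_contact(c)]
print([c.title for c in shortlist])
```

The `allow_c_suite` flag covers the "unless specifically requested" case, for instance a regional C-level role that is still close to APAC operations.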

Model Specialization by Task

Step | Best Model Type | Why | Example Models
Deep Analysis | Agents | Pattern recognition across datasets | Manus, Genspark
Lead Identification | Search (+Web) | Real-time validation | Kimi K2 + Web
Contact Discovery | Search (+Web) | Accuracy in role/region matching | K2 / DS3 + Web
Email Crafting | Cheapest Text Model | Cost-effective at scale | Kimi K2 / DS V3

Key Takeaways

  • Agents + Search hybrids deliver 90% precision vs. 60% for single-model approaches
  • Cost can be reduced by reserving premium models (Agents) for analysis only
  • Manual validation remains critical for:
    • Cultural nuance (e.g., French business norms)
    • Regional role relevance

Implementation Roadmap:

[Initial AI Prompt]
-> [Agent Generation] -> [Human Adjustment]
-> [Search Model for Companies] -> [Human Adjustment]
-> [Search Model for Roles] -> [Human Adjustment]
-> [Text Model for Messages] -> [Human Adjustment]
-> [Launch Campaign]
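
The roadmap alternates a model step with a human adjustment, and that alternation can be sketched as a small chain runner. Nothing here calls a real model: `run_chain`, `fake_model`, and `fake_review` are hypothetical stand-ins. In practice each step would call the chosen agent, search, or text model, and the review would be a person editing the intermediate result.

```python
from typing import Callable

ModelStep = Callable[[str], str]
ReviewStep = Callable[[str, str], str]


def run_chain(initial_prompt: str,
              steps: list[tuple[str, ModelStep]],
              review: ReviewStep) -> str:
    """Run each model step, then pass its output through a human review."""
    result = initial_prompt
    for name, step in steps:
        draft = step(result)          # model output for this stage
        result = review(name, draft)  # human adjustment before the next stage
    return result


def fake_model(label: str) -> ModelStep:
    """Stand-in for a model call: just tags the text with the stage label."""
    return lambda text: f"{text} -> {label}"


def fake_review(stage: str, draft: str) -> str:
    """Stand-in for a human reviewer: approves the draft unchanged."""
    return f"{draft} (reviewed)"


steps = [
    ("agent_generation", fake_model("client profile")),
    ("search_companies", fake_model("companies")),
    ("search_roles", fake_model("roles")),
    ("text_messages", fake_model("messages")),
]
print(run_chain("prompt", steps, fake_review))
```

Keeping the review as an explicit function between stages makes it hard to skip, and each model step can be replaced independently as better or cheaper models appear.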

We are Here to Empower

At System in Motion, we are on a mission to empower as many knowledge workers as possible to start or continue their GenAI journey.
