Modernizing AI Infrastructure: How Directeam Helped Toluna Migrate AI Workloads to AWS Bedrock

Meet Toluna
Toluna is a global leader in digital consumer intelligence and survey research, operating one of the world’s largest online consumer panels with members across 70+ countries. Toluna’s platform connects brands with real consumer opinions through surveys, polls, and a community called Toluna Influencers, enabling data-driven decision-making for enterprises worldwide. To advance its research capabilities, Toluna’s Machine Learning division has built an AI-powered synthetic persona platform, a large-scale agentic system that generates virtual representations of real consumers capable of participating in surveys and simulating authentic human behavior.
The Challenge
As Toluna sought to consolidate and scale its AI workloads on AWS Bedrock, several significant challenges emerged:
- Model Intelligence and Cost Requirements: Toluna’s agentic systems demand high-quality reasoning from the underlying LLMs. Each AI agent requires robust chain-of-thought capabilities, a methodology Toluna pioneered internally before “thinking models” became an industry standard. The platform needed foundation models that met a strict intelligence threshold while maintaining competitive per-token pricing and low latency.
- Agent-Level Migration Complexity: Unlike a typical application migration, Toluna’s transition could not be executed at the project or system level. Each individual AI agent required independent evaluation against its own quality benchmarks, assessed through both automated LLM-as-a-judge systems and expert human review.
- Latency-Sensitive Real-Time Processing: Many of Toluna’s AI agents operate in real time behind user-facing survey interactions. With P75 latency targets of approximately three seconds, a hard maximum cutoff of seven seconds, and no use of token streaming, total generation latency was a critical acceptance criterion.
- Population-Level Statistical Fidelity: Beyond individual persona consistency, Toluna’s platform must ensure that aggregate response patterns across thousands of synthetic personas accurately reflect real-world demographic distributions, preventing systematic bias from any single foundation model’s inherent tendencies.
- Privacy, Regulatory Compliance, and Vendor Strategy: Toluna operates under stringent data privacy requirements. Each new AI provider relationship triggers extensive client-facing compliance reviews. Toluna’s leadership adopted an AWS-first strategy to minimize vendor sprawl, simplify compliance narratives for clients, and consolidate data processing within a single trusted cloud ecosystem.
- Next-Generation Survey Environment Challenges: Looking ahead, Toluna’s synthetic personas will need to interact with surveys hosted on third-party websites, introducing significant complexity around automated content extraction (video, audio, images, varied DOM layouts) from unknown website structures.
The Solution
Directeam engaged with Toluna’s Machine Learning leadership through in-depth technical discovery sessions to map the full scope of the migration, understand the agent-level architecture, and design a phased approach that maximized Bedrock adoption while respecting the platform’s stringent quality and performance requirements.
- Comprehensive Model Assessment and Selection: Directeam conducted a thorough assessment of the Bedrock model ecosystem alongside Toluna’s engineering team, evaluating models across reasoning quality, latency, and cost, from lightweight, low-latency models for real-time detection tasks to more capable models for complex reasoning agents.
- Agent-by-Agent Migration Framework: Directeam helped Toluna formalize a structured, repeatable evaluation methodology. Each agent’s migration follows a defined process: it is tested against its existing quality benchmarks using the candidate model, outputs are assessed through automated evaluation and expert review, and accuracy thresholds must be met before production deployment.
- Latency and Parameter Optimization: Directeam benchmarked time-to-last-token performance across candidate Bedrock models and advised on inference parameter differences (temperature, sampling, generation controls) to ensure behavioral consistency during model transitions.
- Population-Level Bias Mitigation Through Model Diversity: Directeam validated and supported Toluna’s approach to reducing aggregate bias through model diversification. Distributing agent inference across multiple Bedrock foundation models, each with different inherent tendencies, produces a natural anti-biasing effect at the population level.
- Vendor Consolidation and Compliance Simplification: By routing LLM inference through AWS Bedrock, Toluna consolidates AI workloads inside the AWS ecosystem, governed by existing security controls and agreements, significantly simplifying enterprise-client compliance reviews.
- Technical Consulting on Forward-Looking Challenges: Directeam provided initial architectural guidance on Toluna’s next-generation challenge of automated media extraction from third-party websites, drawing on expertise in web content processing and browser automation.
- Migration Program Structure and Funding: Directeam structured a hybrid migration support program combining direct AWS credits applied to Toluna’s accounts with dedicated Directeam technical consulting capacity, with the balance adjustable as the project progressed (MAP-Lite scale).
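The agent-by-agent promotion rule described above can be sketched in a few lines. This is an illustrative sketch only, not Toluna's or Directeam's actual implementation: the `EvalResult` type, its field names, the 0-to-1 score scale, and the example agent and model names are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EvalResult:
    """Outcome of evaluating one agent on one candidate model (hypothetical shape)."""
    agent: str
    candidate_model: str
    judge_score: float      # automated LLM-as-a-judge score, assumed to lie in [0, 1]
    judge_threshold: float  # the agent's own benchmark threshold (assumed per agent)
    expert_approved: bool   # result of the expert human review


def can_promote(result: EvalResult) -> bool:
    """An agent is promoted only if BOTH gates pass: automated judge and expert review."""
    judge_pass = result.judge_score >= result.judge_threshold
    return judge_pass and result.expert_approved


def promotable_agents(results: list[EvalResult]) -> list[str]:
    """Return the names of agents cleared for production on their candidate model."""
    return [r.agent for r in results if can_promote(r)]


if __name__ == "__main__":
    # Hypothetical agents, models, and scores for illustration only.
    results = [
        EvalResult("persona-responder", "model-a", 0.93, 0.90, True),
        EvalResult("fraud-detector", "model-b", 0.95, 0.90, False),
        EvalResult("open-end-coder", "model-a", 0.84, 0.90, True),
    ]
    print(promotable_agents(results))
```

Because each agent carries its own threshold, a model that is good enough for one agent can still be rejected for another, which is the point of evaluating at the agent level rather than the system level.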
Quantitative Outcomes
The engagement is measured against the following quantitative outcomes:
- Bedrock Migration Coverage: ≈75–80% of the 20+ agentic workstreams identified as migration candidates, against the original assumption of substantially fewer; coverage is being incrementally promoted into production through the agent-by-agent framework.
- Real-Time Agent Latency: P75 ≤3.0s and 100% of invocations under the 7s hard cutoff, sustained on migrated real-time agents (no token streaming) and measured continuously via CloudWatch p75/p99 dashboards.
- Cost Delta vs. Prior Provider: ≈25–35% lower per-token cost with the tiered Bedrock routing strategy versus the prior OpenAI/Azure path, captured in the AWS Pricing Calculator estimate and tracked monthly against actual usage.
- Compliance Cycle Time: Vendor consolidation onto AWS reduced the per-client privacy/regulatory review burden by an estimated 40–50% (fewer net-new vendor relationships to document per enterprise client onboarding).
- Agent Quality Regression Rate: Zero quality regressions on agent classes promoted through the evaluation framework; every promotion is gated on an automated LLM-as-a-judge pass plus an expert review pass.
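The latency acceptance criterion above (P75 at or below 3.0s, every invocation under the 7s cutoff) can be expressed as a simple check over total generation times. This is a minimal sketch under stated assumptions: the function names are hypothetical, the nearest-rank percentile method is chosen for simplicity and may differ from how CloudWatch computes percentiles, and the sample latencies are made up.

```python
import math


def percentile(samples: list[float], pct: float) -> float:
    """Nearest-rank percentile: the smallest sample with at least pct% of samples at or below it."""
    if not samples:
        raise ValueError("at least one latency sample is required")
    ordered = sorted(samples)
    rank = math.ceil(pct / 100 * len(ordered))  # 1-based rank
    return ordered[max(rank, 1) - 1]


def meets_latency_targets(
    latencies_s: list[float],
    p75_target_s: float = 3.0,   # P75 target from the outcomes list
    hard_cutoff_s: float = 7.0,  # hard maximum cutoff from the outcomes list
) -> bool:
    """True only if P75 is within target AND every invocation is under the hard cutoff."""
    p75_ok = percentile(latencies_s, 75) <= p75_target_s
    cutoff_ok = max(latencies_s) < hard_cutoff_s
    return p75_ok and cutoff_ok


if __name__ == "__main__":
    # Made-up total generation times in seconds (no streaming, so time to last token).
    sample = [1.2, 2.0, 2.5, 2.9, 3.5, 4.0, 2.2, 1.8]
    print(percentile(sample, 75), meets_latency_targets(sample))
```

Checking the maximum separately from the percentile matters here: a batch can meet the P75 target comfortably and still fail acceptance if a single invocation exceeds the hard cutoff.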
The Results
The engagement between Directeam and Toluna established a clear, actionable path for migrating one of the most sophisticated agentic AI platforms in the market research industry to AWS Bedrock. Key outcomes include:
- Broad Migration Coverage Identified: Through comprehensive model assessment, migration pathways were identified for a substantial portion of the platform’s 20+ workstreams, well beyond what initial assumptions had suggested was possible.
- Rigorous, Repeatable Migration Methodology: Toluna now operates a proven agent-level evaluation framework combining automated quality assessment with expert validation, enabling confident, incremental migration without exposing customers to regression risk.
- Simplified Vendor Landscape and Strengthened Compliance Posture: Consolidating AI workloads within AWS directly addresses the privacy/regulatory reviews triggered by each enterprise client engagement, translating to faster client onboarding, reduced legal overhead, and a clearer data governance narrative.
- Optimized Cost and Performance Profile: By matching specific Bedrock models to specific agent requirements, Toluna achieved an optimized balance of inference quality, latency, and cost: real-time agents meet sub-three-second targets, while complex reasoning agents leverage more capable models where extended inference time is acceptable.
- Clear Forward-Looking Technical Roadmap: Directeam and Toluna identified collaborative opportunities around third-party content extraction, advanced browser automation, and ongoing model evaluation as the Bedrock ecosystem continues to expand.
- Directeam’s Ongoing Support: Directeam continues to serve as Toluna’s technical partner across the full breadth of Toluna’s AWS infrastructure, from the agentic AI platform through compute workloads, custom model training environments, and operational infrastructure.