Labs POC + Rival 2.0
The Labs POC demonstrates a real architectural improvement over RAG. This document shows where it stops and what Rival 2.0 adds to make it work end-to-end.
The Labs POC
The POC ran on the Trade Winds dataset. One corpus, one topic, one ingestion. Here is what it demonstrated and what it did not.
RAG answers each query from scratch against raw documents. The wiki compiles understanding once and improves it over time. For research programmes with repeated waves, that matters: the system gets better rather than starting fresh every time.
The researcher asks a question, gets an analysis page, and can insert it back into the wiki. That analysis is then available to future queries. The POC demonstrated this working and it is the right model.
The POC ingested a snapshot of Trade Winds data. Once done, the wiki is static. When the next wave fields, someone has to export and re-run ingestion. There is no connection back to the platform, so the wiki does not update when new data comes in.
The POC generated a plain-text list of questions at the end. No card types, no screener logic, no routing. Getting that into Rival requires a programmer to translate it manually, which is the same manual step we have today.
The agent asks a human questions during ingestion to gather context. That works for a demo. For a programme running 50 studies, it means someone has to be present every time new data is ingested. That does not scale.
The demo used one corpus. In a real programme, Wave 2 may contradict Wave 1. Without wave, market, and population metadata attached to each finding, the wiki cannot surface the conflict. It will either smooth over the contradiction or present both claims without context.
Without Rival 2.0
The POC ends at a text list of questions. Turning it into a survey, fielding it, and getting the results back into the wiki are all manual steps outside the system.
New study data does not update the wiki automatically. Someone has to re-export and re-ingest manually. The wiki reflects one point in time.
Wiki answers are qualitative synthesis. There is no way to ask "how many respondents said that?" The underlying response data is not accessible.
Wiki, survey tool, and analytics have no shared data model. Moving between them requires manual export steps at each handoff.
Attended ingestion, text-only questions, manual programming, QA: a researcher who wants to act on a wiki finding still waits days before fieldwork can start.
With Rival 2.0
Rival 2.0 provides the structured data, the authoring pipeline, and the deployment infrastructure that the wiki needs to work end-to-end. Each piece addresses a specific gap in the POC.
A wiki finding goes through create_survey via studio-mcp to a FlowDefinition that is validated and deployed. No programmer, no manual translation step.
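A minimal sketch of that path from an agent's side, using the TypeScript MCP SDK. The tool names preview_survey and create_survey and the FlowDefinition output come from this document; the studio-mcp URL, the argument shapes, and the result field are assumptions for illustration.

```typescript
import { Client } from "@modelcontextprotocol/sdk/client/index.js";
import { StreamableHTTPClientTransport } from "@modelcontextprotocol/sdk/client/streamableHttp.js";

async function authorFromFinding(): Promise<void> {
  // Endpoint is a placeholder, not the real studio-mcp address.
  const client = new Client({ name: "wiki-agent", version: "0.1.0" });
  await client.connect(
    new StreamableHTTPClientTransport(new URL("https://studio-mcp.example.com/mcp"))
  );

  // Draft a survey from a wiki finding. The argument names are illustrative,
  // not the actual tool schema.
  const preview = await client.callTool({
    name: "preview_survey",
    arguments: {
      objective: "Quantify the decline in brand trust surfaced in the Trade Winds wiki",
    },
  });

  // Hand the drafted FlowDefinition to create_survey for validation and
  // deployment. The result field name here is an assumption.
  const flow = (preview as { structuredContent?: unknown }).structuredContent;
  await client.callTool({ name: "create_survey", arguments: { flow } });

  await client.close();
}

authorFromFinding().catch(console.error);
```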
When a study closes, the wiki agent ingests the results automatically using the FlowDefinition as metadata. No attended re-ingestion.
DataTalk can hit ClickHouse in the same call. A wiki answer of "brand trust is declining" can be followed immediately by "show me the numbers," same tool, same session.
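To make the "show me the numbers" step concrete, here is a sketch of the kind of ClickHouse query that could back it, using the official @clickhouse/client package. The answers table is named in this document; the column names, connection details, and study identifiers are assumptions.

```typescript
import { createClient } from "@clickhouse/client";

// Connection details are placeholders.
const clickhouse = createClient({ url: "http://localhost:8123", database: "rival" });

async function brandTrustByWave(): Promise<void> {
  // Hypothetical follow-up to "brand trust is declining": the actual
  // distribution per wave, across studies, in one query. Column names
  // (wave, numeric_value, question_key, study_id) are assumptions.
  const result = await clickhouse.query({
    query: `
      SELECT wave, avg(numeric_value) AS mean_trust, count() AS n
      FROM answers
      WHERE study_id IN ({studies: Array(String)})
        AND question_key = 'brand_trust'
      GROUP BY wave
      ORDER BY wave
    `,
    query_params: { studies: ["trade-winds-w1", "trade-winds-w2"] },
    format: "JSONEachRow",
  });

  console.table(await result.json());
}

brandTrustByWave().catch(console.error);
```

Widening the study list is all it takes to turn the same query into the programme-level synthesis described in the comparison table below.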
When fieldwork closes, results are ingested automatically. Wave 3 starts with everything Waves 1 and 2 learned, already in the wiki, without anyone doing anything.
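A sketch of what that unattended feedback loop could look like. Every name here (StudyClosedEvent, fetchFlowDefinition, fetchAnswers, wiki.ingest) is hypothetical; the point is that the FlowDefinition carries enough metadata for ingestion to run without a human in the loop.

```typescript
// Hypothetical shape of the unattended ingestion step. None of these names
// are existing Rival 2.0 APIs; they stand in for the platform calls the
// document describes.
interface StudyClosedEvent {
  studyId: string;
  wave: string;
  market: string;
}

declare function fetchFlowDefinition(studyId: string): Promise<unknown>;
declare function fetchAnswers(studyId: string): Promise<unknown[]>;
declare const wiki: {
  ingest(input: {
    source: StudyClosedEvent;
    schema: unknown;
    records: unknown[];
  }): Promise<void>;
};

async function onStudyClosed(event: StudyClosedEvent): Promise<void> {
  // The FlowDefinition doubles as ingestion metadata: card types, screener
  // criteria, routing. No one needs to be present to explain the study.
  const flow = await fetchFlowDefinition(event.studyId);
  const answers = await fetchAnswers(event.studyId); // raw ClickHouse answers rows

  // Wave and market context travel with every finding, which is what lets
  // the wiki surface contradictions instead of averaging them away.
  await wiki.ingest({ source: event, schema: flow, records: answers });
}
```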
Direct comparison
Where the POC stops and what Rival 2.0 adds at each point.
| Dimension | ✗ Labs POC | ✓ Rival 2.0 |
|---|---|---|
| Source material | Research data exported manually and ingested as a snapshot. When a new wave fields, someone must re-export and re-ingest. | Raw ClickHouse answers table. Every verbatim, every numeric, every session. No intermediate export format. |
| Ingestion | Human must be present for every ingestion to provide contextual detail. Bottleneck at programme scale. | Agent ingests with full study metadata (FlowDefinition, card types, screener criteria). Unattended. Human reviews on demand. |
| Answer quality | Synthesis only. "Most respondents said X" with no n=, no segments, no statistical backing. | DataTalk hits ClickHouse in the same call. Every synthesis grounded in actual cross-tab data. |
| Contradictions across studies | Unhandled in demo. Wiki may silently smooth over conflicting claims from different waves or markets. | Study metadata (wave, market, population) is part of the data model. Contradictions are surfaced with context, not averaged away. |
| From insight to deployed survey | Text questions with no schema. Requires a programmer to translate them manually before anything can be fielded. | preview_survey then create_survey via studio-mcp. FlowDefinition generated, validated, deployed. Minutes, not days. |
| Survey quality gate | None. The text output is unchecked. Logic errors, missing end cards, broken screeners: discovered after fielding. | 11-layer pipeline: Zod schema, 3 semantic passes, forge(), 3 VM execution paths. Broken surveys cannot be deployed (see the sketch after this table). |
| Feedback loop | Survey responses from the next study never update the wiki. The knowledge base ages from the moment fieldwork starts. | Answers flow from WfP to ClickHouse to wiki ingestion cycle. The knowledge base grows with every study fielded. |
| Multi-study synthesis | Depends entirely on the analyst deciding to re-export and re-ingest across studies. Manual, ad hoc. | DataTalk can query across all studies in ClickHouse in one SQL call. Programme-level synthesis is a query, not a project. |
| Scale | Demonstrated on one corpus, one topic. No evidence it handles 50-study, multi-wave, multi-market programmes. | Cloudflare edge, Durable Objects per session, ClickHouse for analytics. Architecture designed for this scale from day one. |
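The quality-gate row above names a Zod schema as the first of the eleven layers. The actual schema is not reproduced in this document; the sketch below shows the kind of structural check that stops a broken survey (missing end card, dangling route) before deployment. The field names are assumptions.

```typescript
import { z } from "zod";

// Candidate FlowDefinition produced by the authoring step; treated as unknown
// until it passes validation.
declare const candidateFlow: unknown;

// Illustrative slice of a FlowDefinition schema; field names are assumptions.
const CardSchema = z.object({
  id: z.string().min(1),
  type: z.enum(["single_select", "multi_select", "open_text", "end"]),
  prompt: z.string().min(1),
  options: z.array(z.string()).optional(),
  routeTo: z.string().optional(), // next card id; checked against the card list below
});

const FlowDefinitionSchema = z
  .object({
    studyId: z.string(),
    screener: z.array(CardSchema),
    cards: z.array(CardSchema).min(1),
  })
  .refine((flow) => flow.cards.some((card) => card.type === "end"), {
    message: "Survey has no end card",
  })
  .refine(
    (flow) => {
      const all = [...flow.screener, ...flow.cards];
      const ids = new Set(all.map((card) => card.id));
      return all.every((card) => card.routeTo === undefined || ids.has(card.routeTo));
    },
    { message: "A route points at a card that does not exist" }
  );

// A flow that fails either check never reaches deployment.
const check = FlowDefinitionSchema.safeParse(candidateFlow);
if (!check.success) throw new Error(check.error.message);
```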
What is not built yet
The Labs POC built something Rival 2.0 does not have yet, and the reverse is also true. What follows maps which pieces exist on each side and which still have to be built.
Labs built the wiki layer. Rival 2.0 built the structured data, the authoring pipeline, the query layer, and the deployment infrastructure. Neither is complete without the other. The wiki without Rival 2.0 is disconnected from the platform that generates the data and fields the next study. Rival 2.0 without the wiki answers per-study questions but has no cross-programme memory.
The full cycle is: structured research data, wiki, deployable survey, fieldwork, data back into the wiki. The Labs POC covers the middle section. Rival 2.0 covers the rest: the structured data layer, the authoring pipeline that turns wiki findings into a live survey, and the infrastructure that brings new results back in.
Agent Ready
Labs is exploring what it would take to make Rival agent-ready: exposing it as MCP and headless CLI so agents can author, field, and query studies programmatically. That is the right direction. But building those capabilities on top of the current platform means Labs is doing infrastructure work that the platform should be providing. Rival 2.0 ships MCP and headless access as first-class features from day one, so Labs can focus on what agents do with the platform, not on how to wire up the platform itself.
Authoring and data are MCP-native in Rival 2.0. Labs agents call the same tools a human researcher uses in Claude. No custom integration layer to maintain.
When MCP and headless access are platform primitives, Labs can focus on what agents do with research data, not on how to get the platform to accept programmatic input.
Rival 2.0's responding layer makes no distinction between a human respondent and a synthetic one at the API level. Labs gets headless fielding without any responding-layer changes.
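A sketch of headless fielding from a Labs agent's side, assuming synthetic respondents submit through the same responding API a human session uses. The endpoint, payload shape, and function names are hypothetical; nothing in this document specifies them.

```typescript
// Hypothetical synthetic-respondent submission. The endpoint, headers, and
// payload shape are assumptions for illustration; the point is that the
// responding layer needs no separate "synthetic" code path.
interface SyntheticAnswer {
  cardId: string;
  value: string | number;
}

async function submitSyntheticSession(
  studyId: string,
  answers: SyntheticAnswer[]
): Promise<void> {
  const response = await fetch(
    `https://respond.example.com/studies/${studyId}/sessions`,
    {
      method: "POST",
      headers: { "content-type": "application/json" },
      body: JSON.stringify({ answers }), // same shape a human session would send
    }
  );
  if (!response.ok) throw new Error(`Submission failed: ${response.status}`);
}
```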