
Labs POC + Rival 2.0

What Labs built, and what it still needs.

The Labs POC demonstrates a real architectural improvement over RAG. This document shows where it stops and what Rival 2.0 adds to make it work end-to-end.

The Labs POC

What the POC does, and where it ends

The POC ran on the Trade Winds dataset. One corpus, one topic, one ingestion. Here is what it demonstrated and what it did not.

Wiki beats RAG for accumulated knowledge

RAG answers each query from scratch against raw documents. The wiki compiles understanding once and improves it over time. For research programmes with repeated waves, that matters: the system gets better rather than starting fresh every time.

On-demand analysis that persists back into the wiki

The researcher asks a question, gets an analysis page, and can insert it back into the wiki. That analysis is then available to future queries. The POC demonstrated this working and it is the right model.

⚠️
Ingestion is a one-time manual step

The POC ingested a snapshot of Trade Winds data. Once done, the wiki is static. When the next wave fields, someone has to export and re-run ingestion. There is no connection back to the platform, so the wiki does not update when new data comes in.

⚠️
The survey output is a text list, not a deployable survey

The POC generated a list of question text at the end. No card types, no screener logic, no routing. Getting that into Rival requires a programmer to translate it manually, which is the same manual step we have today.

⚠️
Someone must be present for every ingestion

The agent asks questions during ingestion for context. That works for a demo. For a programme running 50 studies, someone has to be present every time new data is ingested. That does not scale.

⚠️
No mechanism to handle contradictions across waves

The demo used one corpus. In a real programme, Wave 2 may contradict Wave 1. Without wave, market, and population metadata attached to each finding, the wiki cannot surface that. It will either smooth over the contradiction or present both without context.

Without Rival 2.0

Where the current workflow breaks down

The POC ends at a text list of questions. Turning it into a survey, fielding it, and getting the results back into the wiki are all manual steps outside the system.

Current workflow
💬
Research chats run
Qualitative sessions conducted on current platform. Responses captured as raw chat logs.
Structured data discarded
📂
Research data exported manually
Someone decides when to export and what to include. No automatic trigger when a study closes. Ingestion is a one-time snapshot, not a live feed.
No automatic trigger
🤖
Agent ingests chat files
Claude Code reads chat transcripts, asks questions for context during ingestion. Human must be present to answer.
Bottleneck per study
📚
Wiki compiled
Obsidian markdown files. Concept graph. Pages for entities and findings.
🔍
Researcher queries wiki
Natural language questions answered from compiled wiki. Analysis pages created on demand.
📋
Survey questions drafted
Agent outputs a text list of questions. No platform schema. No card types. No logic.
Not deployable
GAP
✍️
Manual survey authoring
Programmer translates text questions into survey tool format. Days of work. Error-prone.
Manual translation required
🔌
Fielded on separate platform
Survey runs on a different system. Response data lands in a silo. Never feeds back into the wiki.
No feedback loop
🕳️
Wiki goes stale after ingestion

New study data does not update the wiki automatically. Someone has to re-export and re-ingest manually. The wiki reflects one point in time.

📉
No numbers behind the synthesis

Wiki answers are qualitative synthesis. There is no way to ask "how many respondents said that?" The underlying response data is not accessible.

🔀
Three disconnected systems

Wiki, survey tool, and analytics have no shared data model. Moving between them requires manual export steps at each handoff.

⏱️
Days from wiki insight to fielded survey

Attended ingestion, text questions, manual programming, QA. A researcher who wants to act on a wiki finding still waits days before fieldwork can start.

With Rival 2.0

What changes when the wiki is connected to the platform

Rival 2.0 provides the structured data, the authoring pipeline, and the deployment infrastructure that the wiki needs to work end-to-end. Each piece addresses a specific gap in the POC.

Rival 2.0 + wiki
📊
Raw study data in ClickHouse
Every response, every session. Structured row-per-answer model. Queryable directly.
✓ Queryable, not just readable
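
A hedged sketch of what that row-per-answer shape implies: one record per answer, so a study-wide aggregate is a single query. The field and table names below are illustrative assumptions, not the actual schema.

```typescript
// Hypothetical row-per-answer record. Names are assumptions, not the real schema.
interface AnswerRow {
  studyId: string;               // which study the answer belongs to
  sessionId: string;             // one respondent session
  questionId: string;            // the card that was answered
  answerText: string | null;     // verbatim, for open-ended cards
  answerNumeric: number | null;  // value, for scaled or choice cards
  answeredAt: string;            // when the answer landed
}

// One row per answer means per-question aggregates are a single pass,
// with no export step in between (ClickHouse named-parameter syntax):
const topBoxByQuestion = `
  SELECT questionId,
         countIf(answerNumeric >= 4) / count(*) AS topBoxRate
  FROM answers
  WHERE studyId = {studyId:String}
  GROUP BY questionId
`;
```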
🔄
Wiki updates on study close
When a study closes, the wiki agent ingests the results using the study schema as metadata, revises pages, flags contradictions.
✓ No manual re-ingestion
✍️
create_survey via studio-mcp
Wiki finding → brief → FlowDefinition → deployed survey. Card types, screener logic, routing: all generated and validated.
✓ Deployable, not just a text list
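
A sketch of what that call could look like from any MCP client, using the official TypeScript SDK. The tool names (preview_survey, create_survey) come from this document; the argument shape and the studio-mcp launch command are assumptions.

```typescript
// Hedged sketch: driving the studio-mcp authoring tools programmatically.
// Argument shapes and the server command are assumptions for illustration.
import { Client } from "@modelcontextprotocol/sdk/client/index.js";
import { StdioClientTransport } from "@modelcontextprotocol/sdk/client/stdio.js";

const client = new Client({ name: "wiki-agent", version: "0.1.0" });
await client.connect(new StdioClientTransport({ command: "studio-mcp" }));

// Draft first: preview_survey returns the FlowDefinition for review.
const preview = await client.callTool({
  name: "preview_survey",
  arguments: { brief: "Follow up on declining brand trust flagged in Wave 2" },
});

// Once approved, create_survey generates, validates, and deploys.
const deployed = await client.callTool({
  name: "create_survey",
  arguments: { brief: "Follow up on declining brand trust flagged in Wave 2" },
});
```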
🔁
Results feed back into the wiki
When fieldwork closes, results are ingested automatically. The wiki reflects the full programme, not just the last manual export.
✓ No manual step to close the loop
💬
DataTalk queries
Wiki context + live ClickHouse numbers in the same answer. Synthesis backed by data.
✓ Statistically grounded
From insight to deployed survey in one conversation

Wiki finding → create_survey via studio-mcp → FlowDefinition validated and deployed. No programmer, no manual translation step.
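
The "validated" step is worth making concrete. The comparison table below names a Zod schema as the first of the pipeline's layers; here is a hedged miniature of how such a gate refuses a broken FlowDefinition. The field names are illustrative assumptions, and the real schema is far larger.

```typescript
// Minimal sketch of a schema gate in the spirit of the validation pipeline.
import { z } from "zod";

const CardSchema = z.object({
  id: z.string(),
  type: z.enum(["single_choice", "multi_choice", "open_text", "end"]),
  text: z.string().min(1),
  next: z.string().nullable(), // null is only meaningful on an end card
});

const FlowDefinitionSchema = z
  .object({ cards: z.array(CardSchema).min(1) })
  // A survey with no end card is exactly the kind of error a text-list
  // output would only reveal after fielding.
  .refine((flow) => flow.cards.some((c) => c.type === "end"), {
    message: "Survey has no end card",
  });

declare const candidateFlow: unknown; // whatever the generator produced

const result = FlowDefinitionSchema.safeParse(candidateFlow);
if (!result.success) {
  // Deployment never happens for a flow that fails the gate.
  throw new Error(result.error.issues.map((i) => i.message).join("; "));
}
```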

📈
Wiki stays current as studies close

When a study closes, the wiki agent ingests the results automatically using the FlowDefinition as metadata. No attended re-ingestion.

🔢
Synthesis backed by actual response data

DataTalk can hit ClickHouse in the same call. A wiki answer of "brand trust is declining" can be followed immediately by "show me the numbers," same tool, same session.

🔁
Every study makes the wiki smarter

When fieldwork closes, results are ingested automatically. Wave 3 starts with everything Waves 1 and 2 learned, already in the wiki, without anyone doing anything.

Direct comparison

Labs POC vs. Rival 2.0

Where the POC stops and what Rival 2.0 adds at each point.

Source material
✗ Labs POC: Research data exported manually and ingested as a snapshot. When a new wave fields, someone must re-export and re-ingest.
✓ Rival 2.0: Raw ClickHouse answers table. Every verbatim, every numeric, every session. No intermediate export format.

Ingestion
✗ Labs POC: Human must be present for every ingestion to provide contextual detail. Bottleneck at programme scale.
✓ Rival 2.0: Agent ingests with full study metadata (FlowDefinition, card types, screener criteria). Unattended. Human reviews on demand.

Answer quality
✗ Labs POC: Synthesis only. "Most respondents said X" with no n=, no segments, no statistical backing.
✓ Rival 2.0: DataTalk hits ClickHouse in the same call. Every synthesis grounded in actual cross-tab data.

Contradictions across studies
✗ Labs POC: Unhandled in demo. Wiki may silently smooth over conflicting claims from different waves or markets.
✓ Rival 2.0: Study metadata (wave, market, population) is part of the data model. Contradictions are surfaced with context, not averaged away.

From insight to deployed survey
✗ Labs POC: Text questions with no schema. Requires a programmer to translate them manually before anything can be fielded.
✓ Rival 2.0: preview_survey then create_survey via studio-mcp. FlowDefinition generated, validated, deployed. Minutes, not days.

Survey quality gate
✗ Labs POC: None. The text output is unchecked. Logic errors, missing end cards, broken screeners: discovered after fielding.
✓ Rival 2.0: 11-layer pipeline: Zod schema, 3 semantic passes, forge(), 3 VM execution paths. Broken surveys cannot be deployed.

Feedback loop
✗ Labs POC: Survey responses from the next study never update the wiki. The knowledge base ages from the moment fieldwork starts.
✓ Rival 2.0: Answers flow from WfP to ClickHouse to the wiki ingestion cycle. The knowledge base grows with every study fielded.

Multi-study synthesis
✗ Labs POC: Depends entirely on the analyst deciding to re-export and re-ingest across studies. Manual, ad hoc.
✓ Rival 2.0: DataTalk can query across all studies in ClickHouse in one SQL call. Programme-level synthesis is a query, not a project (see the query sketch below the table).

Scale
✗ Labs POC: Demonstrated on one corpus, one topic. No evidence it handles 50-study, multi-wave, multi-market programmes.
✓ Rival 2.0: Cloudflare edge, Durable Objects per session, ClickHouse for analytics. Architecture designed for this scale from day one.
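
The query sketch referenced in the multi-study synthesis row, assuming the hypothetical answers table sketched earlier and the @clickhouse/client package. Everything except the client API is an assumption, not the real schema.

```typescript
// Hedged sketch: programme-level synthesis as one cross-wave query.
import { createClient } from "@clickhouse/client";

const clickhouse = createClient({ url: "http://localhost:8123" });

// How did agreement with a brand-trust statement move across every wave?
const result = await clickhouse.query({
  query: `
    SELECT studyId AS wave,
           avg(answerNumeric)        AS meanTrustScore,
           count(DISTINCT sessionId) AS respondents
    FROM answers
    WHERE questionId = {questionId:String}
    GROUP BY studyId
    ORDER BY studyId
  `,
  query_params: { questionId: "brand_trust_agreement" }, // hypothetical card id
  format: "JSONEachRow",
});
console.table(await result.json());
```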

What is not built yet

Gaps on the Rival 2.0 side

The Labs POC built something Rival 2.0 does not have yet. These are the pieces still to build.

1
The programme-level wiki itself
DataTalk answers questions about a single study. The Labs wiki synthesises across all studies in a programme: "what do we know about brand trust across all our waves?" That cross-study layer is not built in Rival 2.0 yet. The data is all there in ClickHouse; what is missing is the wiki agent that sits on top of it.
2
Automatic wiki update when a study closes
The right trigger is: study closes in D1, wiki agent wakes, ingests new results using the FlowDefinition as metadata, revises relevant pages, flags contradictions with prior waves. That agent is not built. A researcher would have to trigger ingestion manually today. A minimal sketch of this trigger follows the list.
3
FlowDefinition as a reviewable artifact in Reachy
When preview_survey returns the FlowDefinition JSON, Claude renders it inline. The better experience is Reachy showing it in a side panel the researcher can inspect and approve before committing to a full pipeline run. MCP has no artifact protocol, so this is a Reachy feature to build.
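
Item 2 describes the missing trigger precisely enough to sketch. Below is a minimal, hypothetical Cloudflare Worker cron handler: the studies table in D1, the wiki_ingested flag, and the ingestIntoWiki entry point are all assumptions, since none of this is built yet.

```typescript
// Hypothetical sketch of the study-close trigger from item 2.
// Types come from @cloudflare/workers-types.
interface Env {
  DB: D1Database; // D1 binding that tracks study state (assumed)
}

declare function ingestIntoWiki(studyId: string, flow: unknown): Promise<void>; // assumed wiki-agent entry point

export default {
  async scheduled(_controller: ScheduledController, env: Env): Promise<void> {
    // Find studies that closed since the last ingestion pass.
    const { results } = await env.DB.prepare(
      "SELECT id, flow_definition FROM studies WHERE status = 'closed' AND wiki_ingested = 0"
    ).all<{ id: string; flow_definition: string }>();

    for (const study of results) {
      // The FlowDefinition travels with the results as ingestion metadata,
      // so the agent knows card types, screeners, and wave context.
      await ingestIntoWiki(study.id, JSON.parse(study.flow_definition));
      await env.DB.prepare("UPDATE studies SET wiki_ingested = 1 WHERE id = ?")
        .bind(study.id)
        .run();
    }
  },
};
```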
💡 How the two pieces fit together

Labs built the wiki layer. Rival 2.0 built the structured data, the authoring pipeline, the query layer, and the deployment infrastructure. Neither is complete without the other. The wiki without Rival 2.0 is disconnected from the platform that generates the data and fields the next study. Rival 2.0 without the wiki answers per-study questions but has no cross-programme memory.

Labs built half the loop. Rival 2.0 is the other half.

The full cycle is: structured research data → wiki → deployable survey → fieldwork → data back into the wiki. The Labs POC covers the middle section. Rival 2.0 covers the rest: the structured data layer, the authoring pipeline that turns wiki findings into a live survey, and the infrastructure that brings new results back in.

🔁 The two pieces need to be connected

Agent Ready

The platform's job is to give Labs primitives, not make Labs do platform work

Labs is exploring what it would take to make Rival agent-ready: exposing it via MCP and a headless CLI so agents can author, field, and query studies programmatically. That is the right direction. But building those capabilities on top of the current platform means Labs is doing infrastructure work that the platform should be providing. Rival 2.0 ships MCP and headless access as first-class features from day one, so Labs can focus on what agents do with the platform, not on how to wire up the platform itself.

Authoring
On current Rival: An MCP or CLI layer would need to wrap Rival's existing authoring UI, a system designed for human interaction, not programmatic access. Custom integration work with no standard interface.
In Rival 2.0: Studio MCP exposes authoring as a tool any MCP client can call. Claude, Reachy, a Labs agent, or a custom script: create_survey, publish_survey, translate_survey. Designed for programmatic access from the start.

Data & Analysis
On current Rival: Querying study data programmatically means going through Redshift or Sisense APIs, dealing with per-study schemas, and writing custom connectors for each research domain.
In Rival 2.0: DataTalk MCP exposes study data directly. Any MCP client queries ClickHouse in plain English. One schema, every study, no per-domain connectors.

Fielding
On current Rival: Headless survey completion requires bypassing the current responding UI, which is not built for programmatic respondents. Synthetic or automated participants have no clean entry point.
In Rival 2.0: Each study runs as an isolated Worker with a clean JSON API. Synthetic respondents, automated QA runs, and programmatic panel participants all interact through the same interface as human respondents (see the sketch below the table).

Participants
On current Rival: Customer study participants and panel respondents have no programmatic recruitment or routing layer. Connecting an agent to participant acquisition means building outside the platform entirely.
In Rival 2.0: The per-study Worker architecture makes participant routing first-class. Whether the respondent is a panel member, a customer, or a synthetic agent, the entry point is the same clean interface. Labs builds the routing logic on top, not the plumbing underneath.
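
The sketch referenced in the fielding row: a synthetic respondent driving a study through the same JSON API a human-facing client would use. Endpoint paths, payload shapes, and the generateAnswer helper are all hypothetical.

```typescript
// Hedged sketch: headless fielding through the per-study Worker API.
declare function generateAnswer(card: unknown): Promise<unknown>; // e.g. an LLM or QA script (assumed)

const base = "https://study-abc123.rival.example"; // per-study Worker URL (hypothetical)

// Start a session exactly as the human-facing client does.
const start = await fetch(`${base}/api/session`, { method: "POST" });
let { sessionId, card } = (await start.json()) as {
  sessionId: string;
  card: unknown | null;
};

// Answer cards until the flow hands back no next card.
while (card !== null) {
  const res = await fetch(`${base}/api/session/${sessionId}/answer`, {
    method: "POST",
    headers: { "content-type": "application/json" },
    body: JSON.stringify({ answer: await generateAnswer(card) }),
  });
  ({ card } = (await res.json()) as { card: unknown | null });
}
```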
🔌
MCP from day one, not bolted on

Authoring and data are MCP-native in Rival 2.0. Labs agents call the same tools a human researcher uses in Claude. No custom integration layer to maintain.

🏗️
Platform does the infrastructure, Labs does the innovation

When MCP and headless access are platform primitives, Labs can focus on what agents do with research data, not on how to get the platform to accept programmatic input.

🧪
Synthetic and real participants through the same interface

Rival 2.0's responding layer makes no distinction between a human respondent and a synthetic one at the API level. Labs gets headless fielding without any responding-layer changes.