Invisible, continuous verification

Catch sophisticated fraud without slowing real users down.

Proof of Human continuously verifies behavior across surveys, panels, and high-stakes user flows. It helps teams catch bots, AI agents, and low-quality traffic while preserving completion rates and giving reviewers evidence they can actually use.

Agent detection
98.8%

Held-out agent detection on the current benchmark set.

False positive rate
0.2%

Share of legitimate users flagged incorrectly on the current held-out benchmark.

Scoring latency
<200ms

p99 scoring latency, built for survey and onboarding flows that cannot tolerate delay.

Behavioral features
150+

Signals summarized into interpretable evidence instead of a bare score.

01 Demo

See how modern fraud actually shows up.

Watch how quickly a convincing bot can produce plausible answers, then see how Proof of Human surfaces the behavioral evidence static checks miss.

See whether it catches the fraud already contaminating your data

Buyers need confidence that detection claims hold up against the sophisticated sessions their current checks keep missing.

See why protection does not have to slow down your surveys

Detection has to stay fast and invisible, or it creates new friction, drop-off, and false positives for legitimate respondents.

See how quickly it fits into the stack you already run

Teams want proof that rollout is lightweight, platform-friendly, and realistic for live survey and panel operations.

02 Why Legacy Checks Fail

Static defenses are failing.

Buyers are not looking for another blunt filter. They need a way to spot sessions that look acceptable on the surface but fall apart when you inspect how they were completed.

01

CAPTCHA adds friction

Good respondents pay the price, while determined attackers increasingly route around it.

02

IP and device checks stay blunt

They help at the edges, but they still create false positives and miss in-session behavior.

03

Manual review does not scale

Open-end checks and row-by-row QA burn time exactly where fraud is getting harder to see.

04

Plausible answers are cheap now

AI agents can produce convincing outputs. The process underneath is where they still slip.

03 Outcomes

Confidence you can depend on.

Cleaner data, faster review decisions, fewer missed fraud events, and less friction for legitimate users.

01

Catch sophisticated bots and AI agents

Behavior-based detection catches sessions that look polished in the final answer but break on process.

02

Reduce manual review burden

Prioritize the sessions that deserve attention instead of pushing every edge case into a human queue.

03

Make defensible data decisions

Give teams evidence they can point to when they need to trust, defend, or challenge the sessions behind a study.

04

Protect completion rates

Preserve the experience for legitimate users by keeping verification invisible during the session.

04 What You Get

A score you can defend.

The value is not a mysterious number. Teams need a top-line recommendation plus enough evidence to automate confidently, review efficiently, and explain what happened when a session gets challenged.

01

Risk score

Top-line score and recommendation for accept, review, block, or route decisions.

02

Heuristic breakdown

Visible reasons the session was flagged, not just an unexplained model output.

03

Action log

Concrete interaction history for reviewers who need to inspect how the session unfolded.

04

Replay context

Playback and motion context that make suspicious behavior easier to understand quickly.

05

Export

Structured output for analysts, suppliers, and post-field QA workflows.

06

API data

Session-level fields that can be consumed in internal dashboards or decision systems.

05 Session Report

See exactly why a session was flagged.

Compare a flagged bot session against a human session in the same interface. Buyers can move from the score to the underlying evidence without guessing what the model saw.

06 Evidence Delivery

The evidence leaves the dashboard.

The same evidence that drives review decisions in the product can be delivered to your own systems through exports and APIs. You are not forced to manage fraud from the dashboard alone.

API response

Session payload your team can act on

Pass risk score, top flags, and supporting evidence into internal decision systems, QA dashboards, or supplier workflows.

{
  "session_id": "sess_01JX4M0J8K1A4M7Y9QF3R2",
  "risk_score": 93,
  "risk_tier": "high",
  "recommendation": "review_or_block",
  "summary_flags": [
    "programmatic_typing",
    "teleporting_mouse",
    "all_pasted"
  ],
  "evidence": {
    "paste_ratio": 0.94,
    "corrections": 0,
    "scroll_pattern": "jump",
    "environment": ["vpn", "bot_browser"]
  }
}
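
For teams wiring this payload into their own systems, here is a minimal TypeScript sketch of acting on the recommendation field. The endpoint path, API key placeholder, and handler functions are illustrative assumptions, not the documented API surface.

// Minimal sketch of acting on the payload above. The endpoint, credential
// placeholder, and handler functions are assumptions for illustration only.
interface SessionPayload {
  session_id: string;
  risk_score: number;
  risk_tier: string;
  recommendation: string;
  summary_flags: string[];
  evidence: Record<string, unknown>;
}

const POH_API_KEY = "YOUR_API_KEY"; // placeholder credential

async function routeSession(sessionId: string): Promise<void> {
  // Hypothetical endpoint; substitute the real API host for your account.
  const res = await fetch(`https://YOUR_API_HOST/v1/sessions/${sessionId}`, {
    headers: { Authorization: `Bearer ${POH_API_KEY}` },
  });
  const payload = (await res.json()) as SessionPayload;

  if (payload.recommendation === "review_or_block") {
    // Keep the summary flags attached so reviewers see why it was flagged.
    await enqueueForReview(payload.session_id, payload.summary_flags);
  } else {
    await acceptSession(payload.session_id);
  }
}

// Stubs standing in for your internal decision systems.
async function enqueueForReview(id: string, flags: string[]): Promise<void> {
  console.log("review", id, flags);
}
async function acceptSession(id: string): Promise<void> {
  console.log("accept", id);
}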
Export view

Export built for review and analysis

Sort by risk, inspect top flags, and join route or supplier context without opening the session UI for every record.

session_id     score  tier    top flag             route
sess_01JX4...  93     high    programmatic_typing  /survey/customer-trust
sess_01JX5...  81     review  all_pasted           /survey/concept-test
sess_01JX8...  14     low     none                 /survey/brand-lift
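
A short sketch of the post-field workflow the export supports: sort by risk and keep only the sessions worth a reviewer's time. The field names mirror the table columns above and are not a documented schema.

// Sketch: build a reviewer queue from export rows shaped like the table
// above. Field names mirror the columns; they are not a documented schema.
interface ExportRow {
  session_id: string;
  score: number;
  tier: string;
  top_flag: string;
  route: string;
}

function reviewQueue(rows: ExportRow[], minScore = 50): ExportRow[] {
  // Highest-risk sessions first; drop low scores from the human queue.
  return rows
    .filter((row) => row.score >= minScore)
    .sort((a, b) => b.score - a.score);
}
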
07 Integration

One line in. Decision-ready data out.

Proof of Human fits alongside the tools you already use. Add the script to a survey, panel, or web flow, then start collecting behavioral signals without interrupting legitimate users. The live dashboard brings route health, session reports, exports, and API-backed workflows into one operating surface.

SDK install

Lightweight JavaScript setup

Drop the script into a survey or web flow and begin collecting behavioral signals immediately.

<script defer src="YOUR_PROOF_OF_HUMAN_SDK_URL" data-site-id="YOUR_SITE_ID"></script>
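
For single-page apps where a static tag is awkward, the same script can be injected at runtime. This sketch assumes nothing beyond the URL and data-site-id attribute shown in the tag above.

// Sketch: inject the SDK programmatically. Uses only the script URL and
// data-site-id attribute from the tag above; nothing else is assumed.
function loadProofOfHuman(siteId: string): void {
  const script = document.createElement("script");
  script.defer = true;
  script.src = "YOUR_PROOF_OF_HUMAN_SDK_URL";
  script.dataset.siteId = siteId; // serialized as data-site-id
  document.head.appendChild(script);
}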
01

Real-time, frictionless tracking

Users are never interrupted during the session, and teams can view sessions and risk scores live.

02

Filtering

Filter by originating route or use tags to group traffic by campaign, survey, or product surface.

03

API access

Ingest risk scores and behavioral signals directly into internal dashboards or decision systems.

04

Easy data export

Share from the app or download the data to produce transparent reports for clients and partners.

08 Interaction Examples

See the interaction patterns that give agents away.

Final answers can look plausible. The process underneath still diverges. Compare humans, naive agents, and stealth agents performing the same tasks side by side before you look at the model that scores them.

Gallery: side-by-side session recordings of a human (Milena), a naive agent (Claude), and a stealth agent (Claude) completing the same three tasks.
09 Proof

Built on 30k human sessions and 10k engineered agent sessions.

The model is trained on verified human behavior and engineered agent runs, then evaluated on held-out data. The result is a score grounded in observed behavior, not guesswork, with benchmark coverage that reflects real interaction patterns and modern attack behavior.

Scoring is also informed by benchmark data, real-world traffic patterns, and the risk thresholds teams use to decide when fraud is becoming operationally expensive.

150+

Behavioral features

Mouse, typing, paste, scroll, click, timing, and environment signals summarized for scoring.

Millions

Monthly sessions observed

Production traffic gives the model and review workflows exposure to real fraud pressure at scale.

99.99%

Uptime

Reliability built for live survey, panel, and onboarding flows that cannot afford detection downtime.

5-30%

Observed contamination range

Across live traffic, contamination can represent a meaningful share of incoming sessions when left unchecked.

Explore the major components behind the score: the model inputs, the benchmark results, and the score distribution across incoming sessions.

Model overview

Signals become features, then a session score

Raw interaction data is transformed into behavioral features and fed into a gradient boosting model that learns to separate human sessions from engineered agent behavior.

Gradient boosting model pipeline from raw behavioral data to final prediction
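
As a rough illustration of the feature step, here is a hedged sketch that summarizes raw input events into a few of the evidence fields named earlier (paste_ratio, corrections). The event shape is an assumption for illustration, not the product's telemetry schema.

// Illustrative sketch of summarizing raw input events into session-level
// evidence fields like those in the payload (paste_ratio, corrections).
// The event shape here is an assumption, not Proof of Human's schema.
type SessionEvent = { kind: "keydown" | "paste" | "backspace"; t: number };

function summarizeSession(events: SessionEvent[]) {
  const keys = events.filter((e) => e.kind === "keydown").length;
  const pastes = events.filter((e) => e.kind === "paste").length;
  const corrections = events.filter((e) => e.kind === "backspace").length;

  // Inter-event timing: programmatic input tends toward near-constant gaps,
  // so unusually low timing variance is itself a signal.
  const gaps = events.slice(1).map((e, i) => e.t - events[i].t);
  const mean = gaps.reduce((a, b) => a + b, 0) / Math.max(gaps.length, 1);
  const variance =
    gaps.reduce((a, g) => a + (g - mean) ** 2, 0) / Math.max(gaps.length, 1);

  return {
    paste_ratio: pastes / Math.max(keys + pastes, 1),
    corrections,
    timing_variance: variance,
  };
}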
Signal layer

Readable indicators, backed by dense telemetry

Teams get quick-read behavioral indicators up front, while the model evaluates far richer session telemetry underneath the score.

Quick-read summary indicators
Programmatic Typing · Teleporting Mouse · No Corrections · All Pasted · Jump Scrolling · Centered Clicks · Agent Behavior
Under these summaries, the system analyzes more than 150 behavioral features and captures thousands of event-level telemetry samples per session, including movement deltas, timing intervals, and interaction events.
The model weights those raw signals in real time to generate a continuous risk score, and the contribution of each feature can be customized to fit the workflow and risk posture of the deployment.
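
A minimal sketch of what customizable feature contributions could look like. The weights and feature names are illustrative assumptions; the production model is the gradient boosting pipeline described above, not a simple weighted sum.

// Sketch of customizable feature contributions combined into a 0-100 risk
// score. Weights and feature names are illustrative; the production model
// is the gradient boosting pipeline described above, not a weighted sum.
const featureWeights: Record<string, number> = {
  paste_ratio: 40,
  low_timing_variance: 25,
  jump_scrolling: 20,
  centered_clicks: 15,
};

function riskScore(features: Record<string, number>): number {
  // Assumes each feature value is pre-normalized to the [0, 1] range.
  const raw = Object.entries(featureWeights).reduce(
    (sum, [name, weight]) => sum + weight * (features[name] ?? 0),
    0,
  );
  return Math.min(100, Math.max(0, Math.round(raw)));
}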
Held-out results

Current benchmark performance

The current held-out benchmark shows a 0.2% false positive rate on legitimate users alongside 98.8% agent detection.

Confusion matrix showing a 0.2 percent false positive rate on legitimate users and strong agent detection
0.2%

False positive rate

98.8%

Agent detection

99.3%

Overall accuracy

Distribution view

Separation across incoming sessions

Score distribution matters because it shows separation across the session stream, not just performance at a single cutoff.

Probability distribution for human and agent session scores
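
To make the cutoff point concrete, a small sketch that reads detection and false positive rates off the two distributions at a chosen threshold. The score arrays are illustrative inputs, not product data.

// Sketch: read detection and false positive rates off the two score
// distributions at a chosen cutoff. The score arrays are illustrative.
function separationAtCutoff(
  humanScores: number[],
  agentScores: number[],
  cutoff: number,
) {
  const falsePositiveRate =
    humanScores.filter((s) => s >= cutoff).length / humanScores.length;
  const detectionRate =
    agentScores.filter((s) => s >= cutoff).length / agentScores.length;
  return { cutoff, detectionRate, falsePositiveRate };
}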
10 Next Step

Book a call to see the fraud hitting your system.

Run a pilot on live traffic and get a clear picture of the bots, agents, and high-risk sessions already reaching your routes.