SRM

Category: Marketing

What is SRM (Sample Ratio Mismatch)?

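SRM is a data quality problem that occurs when the observed split of users between the variants of an experiment deviates significantly from the planned allocation. For example, an experiment designed as a 50/50 split shows SRM if one variant receives noticeably more users than chance alone would explain.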

Main causes of SRM:

  • Implementation errors - bugs in the assignment code
  • Corrupted data - loss or duplication of records
  • Assignment bias - systematic errors in randomization
  • Bot interference - traffic from non-human sources
  • Third-party issues - problems with analytics or tracking tools

Methods for detecting SRM:

Chi-square test

A goodness-of-fit test that checks whether the observed split between variants deviates from the planned split by more than chance would explain
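As a minimal sketch of the chi-square check, assuming a two-variant experiment and using only the Python standard library: the function name srm_chi_square and its parameters are hypothetical, chosen for illustration. With two variants the test has one degree of freedom, so the p-value can be computed with math.erfc; experiments with more variants would need a general chi-square distribution, for example from a statistics library.

```python
import math

def srm_chi_square(control_count, treatment_count, expected_control_share=0.5):
    """Chi-square goodness-of-fit test for a two-variant experiment.

    Returns (chi2 statistic, p-value). Hypothetical helper for illustration.
    """
    total = control_count + treatment_count
    expected_control = total * expected_control_share
    expected_treatment = total * (1 - expected_control_share)

    # Sum of (observed - expected)^2 / expected over both variants.
    chi2 = ((control_count - expected_control) ** 2 / expected_control
            + (treatment_count - expected_treatment) ** 2 / expected_treatment)

    # With 1 degree of freedom, P(X > chi2) = erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# Planned 50/50 split, observed 5000 vs 5300 users.
chi2, p = srm_chi_square(5000, 5300)
print(f"chi2={chi2:.2f}, p={p:.4f}")  # p is about 0.003
```

In this example a difference of only 300 users out of 10,300 already produces a p-value of about 0.003, which illustrates why SRM is hard to spot by eye and is usually checked with a test.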

Visual inspection

Real-time monitoring of the traffic split between variants on dashboards

Automated alerts

Systems that raise an alert when the deviation from the planned split exceeds a defined threshold
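A threshold-based alert of this kind could look roughly like the sketch below. The function name srm_alerts, the dictionary inputs, and the choice to measure deviation relative to the expected user count are all assumptions made for illustration; a real system would read the counts from its own tracking data and send the messages to its own notification channel.

```python
def srm_alerts(observed, expected_shares, max_relative_deviation=0.05):
    """Return one alert message per variant whose observed count deviates
    from its expected count by more than the threshold.

    observed: dict of variant name -> observed user count
    expected_shares: dict of variant name -> planned traffic share (sums to 1)
    """
    total = sum(observed.values())
    if total == 0:
        return []  # no traffic yet, nothing to compare

    alerts = []
    for variant, count in observed.items():
        expected = total * expected_shares[variant]
        # Relative deviation of the observed count from the expected count.
        deviation = abs(count - expected) / expected
        if deviation > max_relative_deviation:
            alerts.append(
                f"{variant}: observed {count}, expected {expected:.0f} "
                f"({deviation:.1%} off)"
            )
    return alerts

# Planned 50/50 split, observed 4500 vs 5500 users: both variants are 10% off.
for message in srm_alerts({"A": 4500, "B": 5500}, {"A": 0.5, "B": 0.5}):
    print(message)
```

A fixed percentage threshold is easy to explain but ignores sample size, so in practice it is often combined with a statistical test.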

Stopping criteria for experiments:

Statistical criteria:

  • p-value < 0.01 for SRM test
  • Deviation > 5% from the expected distribution
  • Discrepancy between segments - SRM appears in some subgroups but not in others
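The three statistical criteria above can be combined into a single check, sketched below for a two-variant experiment. All names (srm_p_value, max_deviation, stop_reasons) and the sample counts are hypothetical. The sketch also assumes that "deviation > 5%" means relative deviation from the expected user count, which the criteria do not specify.

```python
import math

P_VALUE_THRESHOLD = 0.01  # from the criteria above
MAX_DEVIATION = 0.05      # assumed to be relative to the expected count

def _expected(a, b, share_a):
    total = a + b
    return total * share_a, total * (1 - share_a)

def srm_p_value(a, b, share_a=0.5):
    """Two-variant chi-square p-value (1 degree of freedom)."""
    exp_a, exp_b = _expected(a, b, share_a)
    chi2 = (a - exp_a) ** 2 / exp_a + (b - exp_b) ** 2 / exp_b
    return math.erfc(math.sqrt(chi2 / 2))

def max_deviation(a, b, share_a=0.5):
    """Largest relative deviation from the expected count across variants."""
    exp_a, exp_b = _expected(a, b, share_a)
    return max(abs(a - exp_a) / exp_a, abs(b - exp_b) / exp_b)

def stop_reasons(overall, segments, share_a=0.5):
    """Return the list of statistical stopping criteria that are met.

    overall: (count_a, count_b) for the whole experiment
    segments: dict of segment name -> (count_a, count_b)
    """
    reasons = []
    if srm_p_value(*overall, share_a) < P_VALUE_THRESHOLD:
        reasons.append("overall SRM p-value below threshold")
    if max_deviation(*overall, share_a) > MAX_DEVIATION:
        reasons.append("overall deviation above threshold")
    flagged = [name for name, counts in segments.items()
               if srm_p_value(*counts, share_a) < P_VALUE_THRESHOLD]
    # SRM in some segments but not all points to a segment-specific cause.
    if flagged and len(flagged) < len(segments):
        reasons.append("SRM only in segments: " + ", ".join(sorted(flagged)))
    return reasons

reasons = stop_reasons(
    overall=(5000, 5300),
    segments={"desktop": (3000, 3010), "mobile": (2000, 2290)},
)
print(reasons)
```

With these sample counts the overall p-value criterion is met while the 5% deviation criterion is not, and the mismatch is concentrated in the mobile segment. Testing many segments increases the chance of a false alarm, so a flagged segment is best treated as a lead to investigate rather than proof.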

Business criteria:

  • Significant negative impact on key metrics
  • Technical problems that affect the user experience
  • Functional errors that cannot be fixed quickly

Data quality criteria:

  • High drop-out rate in a certain variant
  • Traffic anomalies - unusual behavior patterns
  • Tracking problems - missing or incomplete data

Proactive measures to prevent SRM:

  • Regular checks of the distribution during the experiment
  • Automated monitoring systems
  • Pilot tests before full launch
  • Validation of implementation before start
  • Documented procedures for responding to SRM

Good practices:

  • Check for SRM in real time during the first days of the experiment
  • Have clearly defined thresholds for action
  • Document all cases of SRM and the actions taken
  • Train the team to recognize the signs of SRM
  • Introduce multi-level quality checks

SRM checks are critical for the validity of CRO experiments. Detecting and responding to distribution problems early saves time and resources and prevents decisions based on invalid data.