SRM
Category: Marketing
What is SRM (Sample Ratio Mismatch)?
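SRM is a data-quality problem that occurs when the observed split of users between the variants of an experiment deviates significantly from the planned split. Because it points to a flaw in assignment or data collection, the results of an affected experiment should not be trusted until the cause is found.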
Main causes of SRM:
- Implementation errors - bugs in the assignment code
- Corrupted data - loss or duplication of records
- Assignment bias - systematic errors in randomization
- Bot interference - traffic from non-human sources
- Third-party issues - problems with analytics or tracking tools
Methods for detecting SRM:
- Chi-square test - a statistical test of whether the deviation from the expected distribution is significant
- Visual inspection - monitoring the distribution in real time through dashboards
- Automated alarms - systems that report when the deviation exceeds a defined threshold
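The chi-square check can be sketched in a few lines of Python. This is a minimal illustration, not a production implementation: it assumes exactly two variants (one degree of freedom), so the p-value can be computed with the standard library alone. The function name srm_p_value and its parameters are hypothetical. For three or more variants, a library routine such as scipy.stats.chisquare would be the usual choice.

```python
import math


def srm_p_value(control: int, treatment: int,
                expected_control_share: float = 0.5) -> float:
    """Chi-square goodness-of-fit p-value for a two-variant experiment.

    Hypothetical helper: assumes exactly two variants and non-zero
    expected counts in both.
    """
    total = control + treatment
    expected_control = total * expected_control_share
    expected_treatment = total - expected_control

    # Chi-square statistic: sum of (observed - expected)^2 / expected.
    statistic = (
        (control - expected_control) ** 2 / expected_control
        + (treatment - expected_treatment) ** 2 / expected_treatment
    )

    # With 1 degree of freedom the upper-tail probability of the
    # chi-square distribution equals erfc(sqrt(statistic / 2)).
    return math.erfc(math.sqrt(statistic / 2))


# Planned 50/50 split, observed 5,000 vs 5,300 users.
p = srm_p_value(5000, 5300)
print(f"p-value: {p:.4f}")
```

A p-value below the chosen threshold suggests that the observed split is unlikely to be a chance outcome of the planned split.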
Stopping criteria for experiments:
Statistical criteria:
- p-value < 0.01 for SRM test
- Deviation of more than 5% from the expected distribution
- Segment discrepancies - SRM that differs between subgroups
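The statistical criteria above could be turned into an automated check along the following lines. This is a sketch under stated assumptions: two variants only, the 5% deviation is read as a relative deviation of the observed count from the expected count (the criterion could also be read as percentage points), and the names srm_check and segments_with_srm are hypothetical. The two flags are reported separately because the list does not say how the criteria should be combined.

```python
import math


def srm_check(control, treatment, expected_control_share=0.5,
              p_threshold=0.01, deviation_threshold=0.05):
    """Evaluate the p-value and deviation criteria for two variants."""
    total = control + treatment
    expected_control = total * expected_control_share
    expected_treatment = total - expected_control

    statistic = (
        (control - expected_control) ** 2 / expected_control
        + (treatment - expected_treatment) ** 2 / expected_treatment
    )
    # Chi-square upper-tail probability for 1 degree of freedom.
    p_value = math.erfc(math.sqrt(statistic / 2))

    # Assumption: "deviation" is the largest relative gap between
    # observed and expected counts across the two variants.
    deviation = max(
        abs(control - expected_control) / expected_control,
        abs(treatment - expected_treatment) / expected_treatment,
    )

    return {
        "p_value": p_value,
        "deviation": deviation,
        "p_flag": p_value < p_threshold,
        "deviation_flag": deviation > deviation_threshold,
    }


def segments_with_srm(segments, **kwargs):
    """Return the names of segments whose split fails the p-value criterion.

    `segments` maps a segment name to a (control, treatment) count pair.
    """
    return [
        name
        for name, (control, treatment) in segments.items()
        if srm_check(control, treatment, **kwargs)["p_flag"]
    ]


overall = srm_check(5000, 5300)
flagged = segments_with_srm({"desktop": (3000, 3010), "mobile": (2000, 2290)})
print(overall["p_flag"], overall["deviation_flag"], flagged)
```

In this example the overall split fails the p-value criterion while staying under the 5% deviation threshold, and only the mobile segment is flagged, which would narrow down where to look for the cause. Testing many segments raises the chance of a false alarm, so a stricter per-segment threshold may be appropriate.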
Business criteria:
- Significant negative impact on key metrics
- Technical problems that affect the user experience
- Functional errors that cannot be fixed quickly
Data quality criteria:
- High drop-out rate in a certain variant
- Traffic anomalies - unusual behavior patterns
- Tracking problems - missing or incomplete data
Proactive measures to prevent SRM:
- Regular checks of the distribution during the experiment
- Automated monitoring systems
- Pilot tests before full launch
- Validation of implementation before start
- Documented procedures for responding to SRM
Good practices:
- Check for SRM in real time during the first days of the experiment
- Have clearly defined thresholds for action
- Document all cases of SRM and the actions taken
- Train the team to recognize the signs of SRM
- Introduce a multi-level quality-check system