Recommendations to ensure the reliability of your test
Digital experimentation is full of challenges, and a vital one is telling accurate data apart from data that may be erroneous. This is where Twyman's law comes into play: the more striking the data, the more likely it is to be the result of an error. In digital experimentation, where we want our experiments to lift the business and change user behavior, it is easy to get caught up in that desire and overlook data reliability. However fantastic the initial results seem, they deserve scrutiny, because they may not be representative of future behavior. Applying this law helps ensure that business decisions rest on reliable data rather than on momentary anomalies.
A/B testing, a powerful tool for validating assumptions in digital products, is not without risks related to data validity.
One of the most frequent problems stems from instrumentation. Here, errors such as incorrect traffic distribution, misconfiguration of metrics or unavailable conversion events can seriously affect the reliability of the results obtained. Ensuring that data is collected correctly is an important first step in mitigating instrumentation-related risks.
Beyond the technical aspect, certain phenomena related to the nature of the business can also compromise the validity of experiments. Two examples are novelty effects and temporality effects. A novelty effect occurs when a new element in the product attracts unusual attention from users, but the behavior is not sustained in the long term. A temporality effect occurs when a test runs during a very specific period of the year whose behavior does not reflect the rest of the year.
Fortunately, there are mechanisms and practices that can help mitigate these risks, ensuring that the data collected is as reliable as possible.
Guardrail metrics, or safety metrics, are an essential technique: alongside the primary metric, you measure variables that the experiment does not directly target, such as the number of product returns or the frequency of contact with customer service. Tracking them shows the impact of the experiment on unforeseen aspects, and this broader view may reveal negative effects that were not initially considered.
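As an illustration of the idea, here is a minimal sketch of a guardrail check. The function name `guardrail_breaches`, the metric names, and the 10% tolerance are all hypothetical choices for this example, not something prescribed by the lesson. It only compares relative changes; a real analysis would also test whether each change is statistically significant.

```python
def guardrail_breaches(control, variant, max_relative_increase=0.10):
    """Return guardrail metrics that worsened beyond a tolerance.

    `control` and `variant` map metric names to rates where higher is worse
    (e.g. return rate). The 10% tolerance is an illustrative assumption.
    """
    breaches = {}
    for name, control_value in control.items():
        if control_value == 0:
            # Relative change is undefined for a zero baseline; skip it here.
            continue
        change = (variant[name] - control_value) / control_value
        if change > max_relative_increase:
            breaches[name] = change
    return breaches


# Hypothetical numbers, for illustration only.
control = {"return_rate": 0.040, "support_contact_rate": 0.020}
variant = {"return_rate": 0.041, "support_contact_rate": 0.026}
print(guardrail_breaches(control, variant))
```

With these made-up inputs, only the support contact rate exceeds the tolerance, which is the kind of side effect a primary conversion metric alone would not reveal.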
Sample Ratio Mismatch, known as SRM, is another valuable check for ensuring test validity. An SRM detector raises an alert when the observed traffic distribution does not match the configured one. An experiment configured for an even split, for example, should show close to 50% of users in the control and 50% in the variant. Statistically significant deviations from that split suggest possible errors in the experiment and should be investigated before the results are trusted.
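One common way to detect SRM is a chi-square goodness-of-fit test on the user counts per group. The sketch below assumes a two-group experiment and uses only the standard library; the function name `srm_check` and the strict threshold of 0.001 are assumptions for this example, not values given in the lesson.

```python
import math


def srm_check(control_users, variant_users, expected_control_share=0.5, alpha=0.001):
    """Chi-square goodness-of-fit test (1 degree of freedom) for a two-group split.

    Returns (chi2, p_value, srm_detected). `alpha` is an illustrative threshold.
    """
    total = control_users + variant_users
    expected_control = total * expected_control_share
    expected_variant = total - expected_control
    chi2 = (
        (control_users - expected_control) ** 2 / expected_control
        + (variant_users - expected_variant) ** 2 / expected_variant
    )
    # With 1 degree of freedom, P(chi-square > x) equals erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value, p_value < alpha


# Hypothetical counts: a small imbalance versus a large one.
print(srm_check(5000, 5100))
print(srm_check(5000, 5500))
```

A small imbalance such as 5,000 versus 5,100 users is compatible with random assignment, while 5,000 versus 5,500 is flagged. A flagged result says that the split is unlikely to be due to chance; it does not say where the problem is.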
Running A/A tests can be key to evaluating the accuracy of the system before launching a real A/B test. An A/A test replicates the setup of a regular test, but with no changes between control and variation. If, after a while, the results are nearly identical and show no statistically significant difference, that is good evidence that the data are reliable. If not, there is likely an underlying problem that needs to be addressed before moving forward with more complex testing. This step is essential to ensure that the system is working properly and that the data we collect is trustworthy.
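To make "without statistical significance" concrete, the sketch below compares the conversion rates of the two identical groups with a two-proportion z-test. This is one possible choice of test, not necessarily the one your experimentation tool uses, and the function name and sample numbers are hypothetical.

```python
import math


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Two-sided z-test for the difference between two conversion rates.

    Returns (z, p_value) using the pooled standard error.
    """
    rate_a = conv_a / n_a
    rate_b = conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        # No variation at all (e.g. zero conversions in both groups).
        return 0.0, 1.0
    z = (rate_b - rate_a) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical A/A result: both groups saw the same experience.
z, p = two_proportion_z_test(conv_a=500, n_a=10_000, conv_b=512, n_b=10_000)
print(f"z={z:.2f}, p={p:.3f}")
```

Keep in mind that at a 5% significance level, roughly one in twenty A/A tests will look significant purely by chance, so a single significant A/A result is a reason to investigate rather than proof of a broken system.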
“The more striking the observed data, the more likely it is to be due to an error”
Thank you
Experimentation risks:
-Instrumentation.
-Reliability.
-Temporality.
Mechanisms to mitigate the risks:
-Guardrail metrics: measure the number of returns or complaints.
-SRM (Sample Ratio Mismatch) detectors: a detector that fires when traffic is not correctly distributed.