Stata Tips #20 – Power Analysis

The ultimate question in applied research is: does event A cause event B? The search for causal relationships has pushed applied scientists to rely more heavily on experiments. To detect a treatment effect (the causal impact) when running an experiment, researchers need to determine the required sample size. Stata 15 has a powerful command, power, that gives users great flexibility in determining sample size, the power of the test, and graphs. In the examples that follow we compute sample sizes for a specified power; we could also compute power for a specified sample size.
Suppose we want to estimate the impact of an innovative education program on General Equivalency Development (GED) completion rates among young adults (roughly 17 to 25 years old). To do this, we need to run a randomized controlled trial and randomly assign participants to the innovative program. The literature tells us that the proportion of the population in question with a GED is around 66%. Experts in the field of education expect the new program to increase GED completion rates by 20 percentage points …

The command below tells us the total number of participants needed in the study, along with the number in each group (treatment and control).

power twoproportions 0.66 0.86, power(0.8) alpha(0.05)

The twoproportions method is used because we are comparing two proportions: 0.66 is the proportion of the population with a GED, and 0.86 is the proportion we expect among those who complete the program. power() is the power of the test and alpha() is the significance level of the test; both values used here are Stata's defaults.

Performing iteration ...

Estimated sample sizes for a two-sample proportions test
Pearson's chi-squared test
Ho: p2 = p1 versus Ha: p2 != p1

Study parameters:

alpha = 0.0500
power = 0.8000
delta = 0.2000 (difference)
p1 = 0.6600
p2 = 0.8600

Estimated sample sizes:

N = 142
N per group = 71

The power analysis tells us that we need to recruit 142 participants: 71 to enroll in the new program and the remaining 71 to serve as a control group.
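As a rough cross-check outside Stata, the per-group sample size can be approximated with the usual normal-approximation formula for comparing two proportions. The Python sketch below is an illustration only, not Stata's implementation: the function name n_per_group is hypothetical, it assumes equal group sizes and a two-sided test, and it ignores the negligible chance of rejecting in the wrong direction, which Stata's iterative solution accounts for.

```python
from math import ceil, sqrt
from statistics import NormalDist


def n_per_group(p1, p2, power=0.8, alpha=0.05):
    """Approximate per-group sample size for a two-sided two-proportions test.

    Hypothetical helper: assumes equal allocation and a pooled-variance
    (Pearson chi-squared) test; not Stata's exact iterative algorithm.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value of the test
    z_power = NormalDist().inv_cdf(power)          # quantile for the target power
    p_bar = (p1 + p2) / 2                          # pooled proportion under H0
    sd_null = sqrt(2 * p_bar * (1 - p_bar))        # per-observation SD under H0
    sd_alt = sqrt(p1 * (1 - p1) + p2 * (1 - p2))   # per-observation SD under Ha
    n = ((z_alpha * sd_null + z_power * sd_alt) / (p2 - p1)) ** 2
    return ceil(n)                                 # round up to whole participants


per_group = n_per_group(0.66, 0.86)
print(per_group, 2 * per_group)
```

For the GED example this gives 71 per group and 142 in total, in agreement with the Stata output above.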

Let us say we now want to relax the assumption that the power of the test is 0.8 and instead allow four different values, ranging from a low of 0.6 to a high of 0.9:

power twoproportions 0.66 0.86, power(0.6 0.7 0.8 0.9) alpha(0.05)

Performing iteration ...

Estimated sample sizes for a two-sample proportions test
Pearson's chi-squared test
Ho: p2 = p1 versus Ha: p2 != p1

+-----------------------------------------------------------------+
|   alpha   power       N      N1      N2   delta      p1      p2 |
|-----------------------------------------------------------------|
|     .05      .6      90      45      45      .2     .66     .86 |
|     .05      .7     112      56      56      .2     .66     .86 |
|     .05      .8     142      71      71      .2     .66     .86 |
|     .05      .9     188      94      94      .2     .66     .86 |
+-----------------------------------------------------------------+

The command gives us the sample size of the 4 different scenarios with a low of 90 participants to a high of 188. As expected, the higher the power of the test the larger the sample size that is required.
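The relationship between power and sample size can also be seen by evaluating the normal-approximation formula for each power level in turn. The Python sketch below is a hedged illustration rather than Stata's own algorithm; the constant names P1, P2 and ALPHA are hypothetical, and equal group sizes and a two-sided test are assumed.

```python
from math import ceil, sqrt
from statistics import NormalDist

# Assumed inputs, taken from the GED example in the text.
P1, P2, ALPHA = 0.66, 0.86, 0.05

z_alpha = NormalDist().inv_cdf(1 - ALPHA / 2)    # two-sided critical value
p_bar = (P1 + P2) / 2                            # pooled proportion under H0
sd_null = sqrt(2 * p_bar * (1 - p_bar))
sd_alt = sqrt(P1 * (1 - P1) + P2 * (1 - P2))

totals = {}
for power in (0.6, 0.7, 0.8, 0.9):
    z_power = NormalDist().inv_cdf(power)
    # Per-group size, rounded up to a whole participant.
    n = ceil(((z_alpha * sd_null + z_power * sd_alt) / (P2 - P1)) ** 2)
    totals[power] = 2 * n
    print(f"power = {power:.1f}: N = {2 * n}, N per group = {n}")
```

Only z_power changes between iterations, which is why the required sample size grows steadily as the target power rises.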

We can graph the table above by adding the graph option to the command:

power twoproportions 0.66 0.86, power(0.6 0.7 0.8 0.9) alpha(0.05) graph

[Image: Stata graph of estimated total sample size against power]

In some cases, the sample size is predetermined. For a variety of reasons, such as program eligibility or financial constraints, the number of participants in a study is known and fixed. In the example above, assume the number of participants eligible for the innovative education program is 120. How much power does this sample size give us across a range of program effect sizes? As a reminder, the power of a test is the probability of making the correct decision (in other words, rejecting the null hypothesis) when the null hypothesis is actually false.

power twoproportions 0.66 (0.71 0.76 0.81 0.86), n(120) alpha(0.05) graph

In the above command, we fix the total sample size at 120 and specify four different effect sizes, each 5 percentage points larger than the last, starting from the initial 66% GED completion rate.

[Image: Stata graph of estimated power against the treatment-group proportion for a total sample size of 120]

The resulting graph shows low power throughout, with the highest being 73% for an increase of 20 percentage points in the GED completion rate. This is not surprising, given that the first command required 142 participants to detect the same effect with a power of 80%. With fewer participants, we expect lower power.
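The power values can be approximated by hand by turning the sample-size calculation around: fix the sample size and solve for power. The Python sketch below does this under the normal approximation; the function name power_two_proportions is hypothetical, equal group sizes and a two-sided test are assumed, and it is an illustration rather than Stata's internal code.

```python
from math import sqrt
from statistics import NormalDist


def power_two_proportions(p1, p2, n_total, alpha=0.05):
    """Approximate power of a two-sided two-proportions test with equal groups.

    Hypothetical helper using the normal approximation with a pooled
    variance under the null hypothesis.
    """
    norm = NormalDist()
    n = n_total / 2                               # participants per group
    z_alpha = norm.inv_cdf(1 - alpha / 2)         # two-sided critical value
    p_bar = (p1 + p2) / 2                         # pooled proportion under H0
    sd_null = sqrt(2 * p_bar * (1 - p_bar))
    sd_alt = sqrt(p1 * (1 - p1) + p2 * (1 - p2))
    shift = abs(p2 - p1) * sqrt(n)                # effect scaled by sample size
    upper = norm.cdf((shift - z_alpha * sd_null) / sd_alt)   # reject in the expected direction
    lower = norm.cdf((-shift - z_alpha * sd_null) / sd_alt)  # reject in the opposite direction
    return upper + lower


powers = {p2: power_two_proportions(0.66, p2, n_total=120)
          for p2 in (0.71, 0.76, 0.81, 0.86)}
for p2, pw in powers.items():
    print(f"p2 = {p2:.2f}: power = {pw:.2f}")
```

For p2 = 0.86 the approximation gives a power of about 0.73, in line with the graph, and the smaller effects yield progressively lower power.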
