Statistics

Published on April 2017

In the name of God, the Most Gracious, the Most Merciful

STATISTICS
By Dr. Mohamed Elnagar
[email protected]

IMPORTANT TOPICS
Study Design
Types of Evidence and Recommendations
New Drugs
Incidence and Prevalence
Normal Distribution & Skewed Distributions
The Standard Error of the Mean (SEM)
Funnel Plot
Correlation and Linear Regression
Significance Tests
Sensitivity and Specificity
Odds Ratio
Pre and Post Test Odds and Probability
Relative Risk (RR)
Numbers Needed to Treat and Absolute Risk Reduction

Cohort – Case control – Cross sectional
All are observational studies: they only observe what happens, without intervention.

Cohort: prospective; gives relative risk (RR).
Case control: retrospective; suited to rare diseases.
Cross sectional: gives prevalence.

Crossover design: a patient has one drug or treatment, then a washout period, and then another drug, and the effect is compared between the two in a single individual. For this reason it is a good study design for treatment of chronic conditions, but not appropriate for acute conditions.

Intention to treat analysis is a method of analysis for randomized controlled trials in which all patients randomly assigned to one of the treatments are analysed together, regardless of whether or not they completed or received that treatment. Intention to treat analysis is done to avoid the effects of crossover and drop-out, which may affect the randomization to the treatment groups.

Q1
You are asked to design a study to assess whether living near electricity pylons is a risk factor for childhood leukaemia. What is the most appropriate type of study design? A. Cross-over trial B. Cohort study C. Cross-sectional survey D. Case-control study E. Randomised controlled trial

D. Case-control study. As the outcome (childhood leukaemia) is relatively rare, a cohort study would take an extremely long time to provide significant results.

Q2 A researcher is trying to design a study to find out the cause (or causes) of a rare disease, about which very little is known. What study design is most likely to be appropriate? A : Cross over B : Cross-sectional C : Cohort D : Intervention E : Case control.

Q3
What level of evidence does a randomised controlled trial offer? A. Ia B. Ib C. IIa D. IIb E. IV

B. Ib
Levels of Evidence:
Ia - evidence from meta-analysis of randomised controlled trials
Ib - evidence from at least one randomised controlled trial
IIa - evidence from at least one well designed controlled trial which is not randomised
IIb - evidence from at least one well designed experimental trial
III - evidence from case, correlation and comparative studies
IV - evidence from a panel of experts

Grading of recommendation:
Grade A - based on evidence from at least one randomised controlled trial (i.e. Ia or Ib)
Grade B - based on evidence from non-randomised controlled trials (i.e. IIa, IIb or III)
Grade C - based on evidence from a panel of experts (i.e. IV)

NEW DRUGS STUDY:
Superiority: whilst this may seem the natural aim of a trial, one problem is the large sample size needed to show a significant benefit over an existing treatment
Equivalence: an equivalence margin is defined (-delta to +delta) on a specified outcome. If the confidence interval of the difference between the two drugs lies within the equivalence margin then the drugs may be assumed to have a similar effect
Non-inferiority: similar to equivalence trials, but only the lower confidence limit needs to lie within the equivalence margin (i.e. above -delta). Small sample sizes are needed for these trials. Once a drug has been shown to be non-inferior, large studies may be performed to show superiority
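The three trial aims above can be sketched as checks on a confidence interval. This is only an illustration: the function name `classify_trial`, the example numbers, and the assumption that a higher outcome value is better are all hypothetical and not taken from the slides.

```python
def classify_trial(ci_low, ci_high, delta):
    """Classify a confidence interval for (new drug - existing drug).

    Assumes higher outcome values are better and delta > 0 (illustrative only).
    """
    return {
        # Superiority: the whole interval lies above zero
        "superior": ci_low > 0,
        # Equivalence: the whole interval lies inside (-delta, +delta)
        "equivalent": -delta < ci_low and ci_high < delta,
        # Non-inferiority: only the lower limit must stay above -delta
        "non_inferior": ci_low > -delta,
    }


# Hypothetical interval of -1.5 to 2.0 with an equivalence margin of 3.0
result = classify_trial(ci_low=-1.5, ci_high=2.0, delta=3.0)
print(result)
```

With these made-up numbers the new drug would be judged equivalent and non-inferior, but not superior.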

DISTRIBUTION
Normal: mean = median = mode
Skewed distributions: alphabetical order, mean - median - mode. Use '>' for positive skew (mean > median > mode) and '<' for negative skew (mean < median < mode).
Standard deviation: the standard deviation (SD) represents the average difference of each observation in a sample from the sample mean. SD = square root (variance)
Standard error of the mean (SEM) = standard deviation / square root (number of patients). It shows how 'accurate' the calculated sample mean is as an estimate of the true population mean.

Mean = the arithmetic average = (sum of all observed values) / (n, the sample size)
Median = the middle value when the observations are placed in order (with an even number of observations, the average of the two middle values)
Mode = the value that occurs most often
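As a rough illustration of these definitions, the sketch below computes each quantity with Python's standard `statistics` module. The sample values are hypothetical and were chosen to be positively skewed.

```python
import math
import statistics

# Hypothetical, positively skewed sample (not from the slides)
values = [2, 3, 3, 4, 5, 7, 11]

mean = statistics.mean(values)      # arithmetic average: sum / n
median = statistics.median(values)  # middle value of the ordered data
mode = statistics.mode(values)      # most frequent value
sd = statistics.stdev(values)       # sample standard deviation
sem = sd / math.sqrt(len(values))   # standard error of the mean

print(mean, median, mode)
print(round(sd, 2), round(sem, 2))
```

For this sample the mean is 5, the median 4 and the mode 3, which matches the rule for positive skew: mean > median > mode.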

[Figure: example distributions with the mean, median and mode marked]

Skewed distribution = asymmetrical

Normal distribution

Properties of the Normal Distribution:
Symmetrical, i.e. mean = median = mode
68.3% of values lie within 1 SD of the mean
95.4% of values lie within 2 SD of the mean
99.7% of values lie within 3 SD of the mean
This is often reversed, so that within 1.96 SD of the mean lie 95% of the sample values
The range from mean - (1.96 * SD) to mean + (1.96 * SD) is the range in which 95% of individual values are expected to lie, i.e. if 100 further observations are taken from the same group, 95 of them would be expected to lie in that range. This range is often loosely called the 95% confidence interval; the 95% confidence interval for the mean itself is mean +/- (1.96 * standard error of the mean)

SD gives a measure of the spread of the data values. Standard error is a measure of how precisely the sample mean approximates the population mean.
E.g. FEV1 in 100 students: mean = 4.5 litres, SD = 0.5 litres.
- The interval where about 95% of values lie = mean +/- 2 SD = 4.5 +/- 1 = 3.5 to 5.5 litres
- Standard error = 0.5 / square root (100) = 0.5 / 10 = 0.05 litres
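The FEV1 example can be reproduced in a few lines. This is a minimal sketch: the variable names are illustrative, and the last step (a 95% confidence interval for the mean using 1.96 standard errors) is an addition that assumes normally distributed data.

```python
import math

n, mean, sd = 100, 4.5, 0.5  # FEV1 in litres, from the example above

# Range expected to hold about 95% of individual values (mean +/- 2 SD)
low, high = mean - 2 * sd, mean + 2 * sd

# Standard error of the mean: SD / sqrt(n)
sem = sd / math.sqrt(n)

# 95% confidence interval for the mean itself (assumption: mean +/- 1.96 SEM)
ci_low, ci_high = mean - 1.96 * sem, mean + 1.96 * sem

print(low, high)
print(sem)
print(round(ci_low, 3), round(ci_high, 3))
```

The standard error comes out in the same units as the measurement (0.05 litres), not as a percentage.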

Q4
The serum potassium is measured in 1,000 patients taking an ACE inhibitor. The mean potassium is 4.6 mmol/l with a standard deviation of 0.3 mmol/l. Which one of the following statements is correct? A. 95% of values lie between 4.5 and 4.75 mmol/l B. 95.4% of values lie between 4.3 and 4.9 mmol/l C. 99.7% of values lie between 4.0 and 5.2 mmol/l D. 68.3% of values lie between 4.5 and 4.75 mmol/l E. 68.3% of values lie between 4.3 and 4.9 mmol/l

E. 68.3% of values lie between 4.3 and 4.9 mmol/l
68.3% of values lie within 1 SD of the mean, so the range runs from 4.6 - 0.3 = 4.3 to 4.6 + 0.3 = 4.9 mmol/l

Q5
A follow-up study is performed looking at the height of 100 adults who were given steroids during childhood. The average height of the adults is 169cm, with a standard deviation of 16cm. What is the standard error of the mean?

A. Cannot be calculated B. 1.69 C. 0.16 D. 1.6 E. 1.3

The standard error of the mean is calculated by the standard deviation / square root (number of patients) = 16 / square root (100) = 16 / 10 = 1.6

Q6
A study is performed to find the normal reference range for IgE levels in adults. Assuming IgE levels follow a normal distribution, what percentage of adults will have an IgE level above 2 standard deviations from the mean?

A. 1.25 B. 2.3 C. 1.96 D. 5 E. 0.5

B. 2.3
For normally distributed data 95.4% of values lie within 2 standard deviations of the mean, leaving 4.6% outside this range. Therefore 2.3% of values will be higher and 2.3% will be lower than 2 standard deviations from the mean. This figure is sometimes approximated to 2.5%
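The percentages quoted for the normal distribution can be checked with `statistics.NormalDist` from the Python standard library (Python 3.8 or later). This is just a verification sketch; the variable names are illustrative.

```python
from statistics import NormalDist

z = NormalDist()  # standard normal distribution: mean 0, SD 1

within_1sd = z.cdf(1) - z.cdf(-1)  # fraction within 1 SD of the mean
within_2sd = z.cdf(2) - z.cdf(-2)  # fraction within 2 SD
within_3sd = z.cdf(3) - z.cdf(-3)  # fraction within 3 SD
above_2sd = 1 - z.cdf(2)           # upper tail beyond +2 SD
z_95 = z.inv_cdf(0.975)            # multiplier that encloses the central 95%

for name, value in [("within 1 SD", within_1sd), ("within 2 SD", within_2sd),
                    ("within 3 SD", within_3sd), ("above +2 SD", above_2sd)]:
    print(f"{name}: {value * 100:.1f}%")
print(f"z for 95%: {z_95:.2f}")
```

The upper tail beyond 2 SD comes out at about 2.3%, and the multiplier for the central 95% at about 1.96.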

FUNNEL PLOTS
Funnel plots are primarily used to demonstrate the existence of publication bias in meta-analyses. They are usually drawn with treatment effects on the horizontal axis and study size on the vertical axis.
Interpretation:
A symmetrical, inverted funnel shape indicates that publication bias is unlikely
Conversely, an asymmetrical funnel indicates a relationship between treatment effect and study size. This indicates either publication bias or a systematic difference between smaller and larger studies ('small study effects')

CORRELATION & REGRESSION
It is the relationship between 2 variables in the same individual (X & Y).

R = correlation coefficient: it is denoted by the value R, which may lie anywhere between -1 and 1. For example:
1. R = 1 - strong positive correlation (e.g. systolic blood pressure always increases with age)
2. R = 0 - no correlation (e.g. there is no correlation between systolic blood pressure and age)
3. R = -1 - strong negative correlation (e.g. systolic blood pressure always decreases with age)

B = regression coefficient: in contrast to the correlation coefficient, which only measures the strength of the relationship, linear regression may be used to predict how much one variable changes when a second variable is changed.
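To make the difference between R and B concrete, here is a minimal sketch that computes both from first principles. The function name and the age and blood pressure values are hypothetical; the data were made perfectly linear so that R comes out as 1.

```python
import math


def pearson_r_and_regression(xs, ys):
    """Return (r, slope, intercept) for the least-squares line y = a + b*x."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    r = sxy / math.sqrt(sxx * syy)   # correlation coefficient, between -1 and 1
    slope = sxy / sxx                # regression coefficient B
    intercept = mean_y - slope * mean_x
    return r, slope, intercept


ages = [30, 40, 50, 60, 70]           # hypothetical ages in years
systolic = [115, 120, 125, 130, 135]  # hypothetical systolic BP in mmHg

r, b, a = pearson_r_and_regression(ages, systolic)
print(r, b, a)
```

Here R = 1 says the relationship is perfectly linear, while B = 0.5 says how much: each extra year of age adds 0.5 mmHg in this made-up data set.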

SIGNIFICANCE TESTS
The choice of test depends on whether the data is parametric (something which can be measured, usually normally distributed) or non-parametric.

Parametric tests
1. Student's t-test - paired or unpaired
2. Pearson's product-moment coefficient - correlation

Non-parametric tests
1. Mann-Whitney - unpaired data
2. Wilcoxon matched-pairs - compares two sets of observations on a single sample (paired data)
3. Spearman, Kendall rank - correlation
4. McNemar's test - used on nominal data to determine whether the row and column marginal frequencies are equal
5. Chi-squared tests - used to compare two percentages or proportions, i.e. to test the difference between 2 nominal variables (count data)

PAIRED & UNPAIRED?
Paired data refers to data obtained from a single group of patients, e.g. measurement before and after an intervention. Unpaired data comes from two different groups of patients, e.g. comparing response to different interventions in two groups.
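The distinction matters because the two kinds of t statistic use different standard errors. The sketch below computes both on the same hypothetical before and after measurements; the unpaired version uses the unequal-variance (Welch) form, which is an assumption made here rather than something the slides specify.

```python
import math
import statistics

# Hypothetical cholesterol values (mmol/l) for the same five patients
before = [6.2, 5.8, 7.1, 6.5, 6.9]
after = [5.6, 5.5, 6.3, 6.1, 6.2]
n = len(before)

# Paired: analyse the within-patient differences
diffs = [b - a for b, a in zip(before, after)]
mean_diff = statistics.mean(diffs)
t_paired = mean_diff / (statistics.stdev(diffs) / math.sqrt(n))

# Unpaired: treat the two columns as if they came from different patients
se_unpaired = math.sqrt(statistics.variance(before) / n
                        + statistics.variance(after) / n)
t_unpaired = (statistics.mean(before) - statistics.mean(after)) / se_unpaired

print(round(t_paired, 2), round(t_unpaired, 2))
```

In this made-up data the paired statistic is much larger, because pairing removes the variation between patients. The statistic would then be compared with a t distribution to get a p value.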

Q7
An endocrinologist performs a study to assess whether a patient's HbA1c level is correlated to their LDL level. Assuming both HbA1c and LDL are normally distributed, which one of the following statistical tests is it most appropriate to perform? A. Chi-squared test B. Pearson's product-moment coefficient C. Mann-Whitney test D. Spearman's rank correlation coefficient E. McNemar's test

Pearson's product-moment coefficient test is most appropriate as the data is parametric and the study is assessing the correlation of two variables

Q8
A study measures patients' serum cholesterol before and after a new lipid-lowering therapy has been given. What type of significance test should be used to analyse the data? A. Student's paired t-test B. Student's unpaired t-test C. Chi-squared test D. Pearson's test E. Spearman test

A. Student's paired t-test

NULL HYPOTHESIS & P VALUE

The null hypothesis states that two treatments are equally effective
A significance test uses the sample data to assess how likely the null hypothesis is to be correct
The p value is the probability of obtaining a result at least as extreme as the one that was actually observed, assuming that the null hypothesis is true
The alternative hypothesis is the opposite of the null hypothesis, i.e. there is a difference between the two treatments

Null hypothesis = no significant difference.
The P value is the probability of seeing a difference at least as large as the one observed by chance alone, if there is truly no difference.
If the P value is > 0.05, results like these would be found by chance more than 1 time in 20.
The conventional cut-off for significance is P = 0.05, or a 1-in-20 chance. Hence if 20 trials of treatments with no real difference were conducted, you would expect to get one that was 'positive' by chance alone.
The null hypothesis is rejected if the p-value is smaller than or equal to the significance level.

Example: null hypothesis: there is no difference in doctors studying for MRCP. In this case the p-value would be less than the significance level, and so the null hypothesis would be rejected. Alternative hypothesis: there is a difference.

Q9
Which one of the following statements regarding the power of a study is correct?

A. Is the probability of rejecting the null hypothesis when it is false B. Decreases with increasing sample size C. Lies within 2 standard deviations of the mean D. Is the chance a significant p value will be reached E. Is equal to 1 - (the probability of a type I error)

Two types of errors may occur when testing the null hypothesis:
1. Type I: the null hypothesis is rejected when it is true, i.e. showing a difference between two groups when it doesn't exist (the probability of this error is the significance level)
2. Type II: the null hypothesis is accepted when it is false, i.e. failing to spot a difference when one really exists

The power of a study is the probability of (correctly) rejecting the null hypothesis when it is false
power = 1 - the probability of a type II error
power can be increased by increasing the sample size
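The link between sample size and power can be illustrated with a rough normal-approximation sketch for comparing two group means. The function name, the effect size of 5 and the SD of 10 are hypothetical, and the formula ignores the negligible chance of rejecting in the wrong direction.

```python
import math
from statistics import NormalDist


def approx_power(delta, sd, n_per_group, alpha=0.05):
    """Approximate power of a two-sided test comparing two group means.

    Normal approximation, equal group sizes and a known common SD
    (all simplifying assumptions made for this illustration).
    """
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)      # 1.96 for alpha = 0.05
    se = sd * math.sqrt(2 / n_per_group)   # SE of the difference in means
    return z.cdf(abs(delta) / se - z_crit)


# Hypothetical true difference of 5 units with SD 10
powers = {n: round(approx_power(delta=5, sd=10, n_per_group=n), 2)
          for n in (20, 50, 100)}
print(powers)
```

Power rises steadily as the number of patients per group goes from 20 to 100, and the probability of a type II error (1 - power) falls by the same amount.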

Sensitivity, Specificity, Odds Ratio, Pre & Post Test Odds

Sensitivity is the probability that a test will be positive when a patient has the condition. Sensitivity = true +ve / all diseased.

Specificity is the probability that a test will be negative when a patient does not have the condition. Specificity = true -ve / all non diseased (healthy).
Positive predictive value of a test (PPV) = post test probability of disease after a +ve test = true +ve / all +ves.
Negative predictive value of a test (NPV) = post test probability of no disease after a -ve test = true -ve / all -ves.

               Diseased     Non diseased
New test +ve   True +ve     False +ve
New test -ve   False -ve    True -ve

Q 10
A new test to screen for pulmonary embolism (PE) is used in 100 patients who present to the Emergency Department. The test is positive in 30 of the 40 patients who are proven to have a PE. Of the remaining 60 patients, only 5 have a positive test. What is the sensitivity of the new test? A. 8.33% B. 30% C. 40% D. 66.66% E. 75%

            Diseased    Non diseased
Positive    30          5
Negative    10          55

Sensitivity = 30 / 40 = 3 / 4 = 75% (answer E)
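The same 2x2 counts give all four test characteristics. A minimal sketch, using the numbers from this pulmonary embolism question; the function name `diagnostic_metrics` is illustrative.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Test characteristics from the four cells of a 2x2 table."""
    return {
        "sensitivity": tp / (tp + fn),  # true +ve / all diseased
        "specificity": tn / (tn + fp),  # true -ve / all non diseased
        "ppv": tp / (tp + fp),          # true +ve / all +ves
        "npv": tn / (tn + fn),          # true -ve / all -ves
    }


# PE question: 30 true +ve, 10 false -ve, 5 false +ve, 55 true -ve
metrics = diagnostic_metrics(tp=30, fp=5, fn=10, tn=55)
for name, value in metrics.items():
    print(f"{name}: {value:.1%}")
```

Sensitivity is 75% as in the worked answer; the other three values follow from the same table.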

Q 11
A new diagnostic test for malabsorption has been analysed and the results have yielded the following 2x2 contingency table.

Disease present    Test +ve    Test -ve
Yes                0.9         0.2
No                 0.1         0.8

Applying this test to a case of chronic diarrhoea from a patient group where the prevalence of malabsorption is known to be 20% (probability = 0.2), what is the likelihood ratio of a positive test?

Sensitivity = 0.9 / (0.9 + 0.2) = 0.818
Specificity = 0.8 / (0.1 + 0.8) = 0.889
Likelihood ratio of a positive test (LR+) = sensitivity / (1 - specificity) = 0.818 / (1 - 0.889) = approximately 7.2 (about 7.4 if the intermediate values are not rounded)

ODDS RATIO

Q12: A new diagnostic test for malabsorption has been analysed and the results have yielded the following 2x2 contingency table.

Disease present    Test +ve    Test -ve
Yes                0.9         0.2
No                 0.1         0.8

Applying this test to a case of chronic diarrhoea from a patient group where the prevalence of malabsorption is known to be 20% (probability = 0.2), what is the probability of a patient having malabsorption if they have a positive test? 1) 0.16 2) 0.24 3) 0.48 4) 0.64 5) 0.8

ANSWER: 4) 0.64
Sensitivity = 0.9 / (0.9 + 0.2) = 0.818
Specificity = 0.8 / (0.1 + 0.8) = 0.889
Likelihood ratio of a positive test (LR+) = 0.818 / (1 - 0.889) = approximately 7.2 (about 7.4 if the intermediate values are not rounded)
Pre-test odds = pre-test probability / (1 - pre-test probability) = 0.2 / (1 - 0.2) = 0.25
Post-test odds = pre-test odds x LR+ = 0.25 x 7.2 = 1.8
Post-test probability = post-test odds / (1 + post-test odds) = 1.8 / (1.8 + 1) = 0.64
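The chain from pre-test probability to post-test probability can be sketched as a small function. The function name is illustrative. Note that the worked answer above rounds its intermediate values (LR+ = 7.2); carrying full precision gives an LR+ of about 7.4 and a post-test probability of about 0.65, which still points to option 4.

```python
def post_test_probability(pre_test_prob, sensitivity, specificity):
    """Return (LR+, post-test probability) after a positive test."""
    lr_pos = sensitivity / (1 - specificity)        # likelihood ratio, +ve test
    pre_odds = pre_test_prob / (1 - pre_test_prob)  # probability -> odds
    post_odds = pre_odds * lr_pos
    return lr_pos, post_odds / (1 + post_odds)      # odds -> probability


sensitivity = 0.9 / (0.9 + 0.2)
specificity = 0.8 / (0.1 + 0.8)
lr_pos, prob = post_test_probability(0.2, sensitivity, specificity)
print(round(lr_pos, 1), round(prob, 2))
```

The conversion between probability and odds at each end is the step most often forgotten: likelihood ratios multiply odds, not probabilities.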

RELATIVE RISK (RR)
Relative risk is the ratio of risk in the experimental group (experimental event rate, EER) to risk in the control group (control event rate, CER)
Control event rate = (Number who had particular outcome with the control) / (Total number who had the control)
Experimental event rate = (Number who had particular outcome with the intervention) / (Total number who had the intervention)

Relative Risk = EER / CER
Absolute Risk Reduction = CER - EER
Relative Risk Reduction = (CER - EER) / CER
NNT = 1 / (CER - EER), or 1 / Absolute Risk Reduction

Numbers Needed to Treat: numbers needed to treat (NNT) is a measure that indicates how many patients would require an intervention to reduce the expected number of outcomes by 1. It is rounded up to the next highest whole number

Q 13
A new drug is trialled for the treatment of lung cancer. Drug A is given to 500 people with early stage non-small cell lung cancer and a placebo is given to 450 people with the same condition. After 5 years 200 people who received drug A had died compared to 225 who received the placebo. What is the number needed to treat to prevent one death?

Control (placebo) event rate = 225 / 450 = 0.5
Experimental (drug A) event rate = 200 / 500 = 0.4
Absolute risk reduction = 0.5 - 0.4 = 0.1
Number needed to treat = 1 / 0.1 = 10
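The same calculation as a short sketch, using the lung cancer trial numbers. The function name is illustrative, and the small rounding step before `math.ceil` is only there to stop floating-point noise from pushing an exact answer such as 10 up to 11.

```python
import math


def risk_measures(events_control, n_control, events_treated, n_treated):
    """Event rates, RR, ARR, RRR and NNT from trial counts."""
    cer = events_control / n_control  # control event rate
    eer = events_treated / n_treated  # experimental event rate
    arr = cer - eer                   # absolute risk reduction
    return {
        "CER": cer,
        "EER": eer,
        "RR": eer / cer,
        "ARR": arr,
        "RRR": arr / cer,
        # NNT is rounded up to the next whole number
        "NNT": math.ceil(round(1 / arr, 9)),
    }


# Placebo: 225 deaths out of 450; drug A: 200 deaths out of 500
trial = risk_measures(225, 450, 200, 500)
print(trial)
```

For this trial the relative risk is 0.8, the relative risk reduction 0.2, and ten patients need to be treated to prevent one death.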

Miscellaneous Questions

Q 14
A study examines patients with bowel carcinoma. The mortality rate of those given drug R is 7%, compared with 10% in those not given drug R. What conclusion can be drawn? A. The relative risk of death when given drug R is 1.5 B. The relative risk of death when given drug R is 7/10 C. The number needed to treat to prevent one death is 3 D. The number needed to treat to prevent one death is 10 E. The absolute risk reduction is 3%

ANSWER: E)
The absolute risk reduction is 3%. The absolute risk reduction is 10 - 7 = 3%. The relative risk reduction is 3/10. The number needed to treat is 100 / absolute risk reduction (in %) = 100 / 3 = 33.3, which is rounded up to 34.

Q15
Which test is the best of the following to compare two groups of categorical data, e.g. developed MI / did not develop MI when a drug or placebo is given? A. Pearson's correlation coefficient B. Student's t test C. Chi square test D. Wilcoxon rank test E. Multivariate analysis

ANSWER: C)
Chi square test. Chi-squared tests are used to compare percentages or proportions of categorical data. Data such as the above can be organised into a 2x2 contingency table. From the chi-squared value a p value is read off a statistical table (depends on degrees of freedom) to give the degree of significance. Normally distributed data can be compared with a Student's t-test. Skewed continuous data can be compared with a Wilcoxon rank-sum test or a Mann-Whitney U-test.
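For a 2x2 table the chi-squared statistic has a shortcut formula, and with one degree of freedom the p value can be computed directly instead of being read from a table. A minimal sketch, without continuity correction; the function name and the MI counts are hypothetical.

```python
import math


def chi_squared_2x2(a, b, c, d):
    """Pearson chi-squared test for the table [[a, b], [c, d]].

    No continuity correction is applied. Returns (chi2, p) with 1 degree
    of freedom.
    """
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # With 1 degree of freedom, the upper tail equals erfc(sqrt(chi2 / 2))
    p = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p


# Hypothetical trial: drug 10 MI / 90 no MI, placebo 25 MI / 75 no MI
chi2, p = chi_squared_2x2(10, 90, 25, 75)
print(round(chi2, 2), round(p, 4))
```

With these made-up counts the statistic is about 7.79 and the p value is well below 0.05, so the difference in proportions would be called significant.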

Q 16
A new test has been designed and tested for the diagnosis of rhabdomyolysis. The sensitivity was reported as 90%. Which of these statements is true regarding the test? A. 90% of the patients with rhabdomyolysis will test positive B. 90% of the patients with rhabdomyolysis will test negative C. 90% of patients in the general population who have the test, will test negative D. 90% of patients in the general population who have the test, will test positive E. 90% of patients who test positive will have a correct diagnosis

ANSWER: A)
90% of the patients with rhabdomyolysis will test positive. Sensitivity is the probability that a test will be positive when a patient has the condition. Specificity is the probability that a test will be negative when a patient does not have the condition.

Q 17
Which of these statements is a good description of 'bias'? A. There is a flaw in statistical analysis B. The patients are misled C. The authors want a certain result D. Both study design and statistical analysis are flawed E. There is a flaw in study design that leads to a likelihood that a wrong result may be obtained.

E. There is a flaw in study design that leads to a likelihood that a wrong result may be obtained.

Q 18
Which of the following accurately describe cohort studies? A. There is no randomisation B. They are superior for measuring prevalence in a population C. They are superior for measuring incidence in a population D. Two groups are always required E. They are not useful for rare exposures

C. They are superior for measuring incidence in a population

Q 19
An ethics committee is meeting to discuss the approval of a trial of a new treatment against an established treatment for melanoma. Which one of the following would suggest that the trial was ethical? A. The current treatment is not effective B. The current treatment is less effective than the new treatment C. The new treatment is cheaper D. The new treatment has fewer side effects E. It is uncertain which treatment is more effective

ANSWER: E)
It is uncertain which treatment is more effective. A trial is only ethical if there is uncertainty regarding whether the new treatment is better than the current treatment. If it is known that one treatment is better than another, the best treatment would already be the accepted standard treatment. It would therefore be unethical to perform the trial because it would mean some patients would receive care that is known to be worse than standard care outside the trial.

Q 20

Thank You
