For this exam, you may use any resources you have with the understanding that you are doing the work. You may ask questions of me or either of the TAs and, unless the question is “What is the answer?”, we will attempt to answer them. This is how the real world of statistical analysis works. You, the statistician-of-the-moment, use your knowledge and consult others, as needed, to be sure you’re meeting the needs. Please be courteous, though, and don’t ask until you’re sure you don’t know and, above all, don’t wait until the last moment to ask. We, like any of your real colleagues in the world, will do our best to answer in a timely, informative manner, but we won’t always be able to respond as well or as quickly as you’d like. Bear that in mind, too.
In this exam, if there is a right/wrong answer, you’ll be graded as right or wrong. However, this exam is designed to test your ability to reason with and understand the statistical concepts, so many of the answers don’t have a distinct right or wrong. Turn in your answers before Friday, August 18 at 11:59 PM. Yes, Friday. That’s when the semester ends.
You’ve been asked to help in the planning of a clinical trial designed to test the effect of a new medication (Improvasol) on a certain kind of intestinal cancer that has a moderate risk of short-term mortality, frequent pain, and no current effective treatment. The research plan currently proposes a fifty-two-week trial in which subjects are assessed weekly. Subjects will be randomized to receive either the medication or a placebo and followed for signs of progression, therapeutic benefit, and death. The researchers believe that a particular biomarker, TNF (tumor necrosing factor), is likely to be a sign of therapeutic effect of the new medication.
 So far, the research team is undecided on the appropriate outcome to use. Of the following outcome measures, describe the appropriate summary of interest, the measure of variability of the outcome, and how it might be compared between treatment arms. (10 points)
Outcome  Primary summary measure  Measure of variability  Method of comparison 
Cancer progression (versus stable disease or improvement)  
Days to progression  
Tumor necrosing factor (TNF)  
Days per week with severe pain 
 Which of the above do you recommend using for the study? Why? (10 points)
 The research team has considered your answers to the above. They’ve narrowed their interest to two outcomes (for now): days per week with severe pain and cancer progression. However, they want to know how many subjects they should recruit for each outcome. For simplicity, they think it’s best to have equal allocation to each arm. They’ve decided that they’d like to achieve a study power of 0.8 and use an alpha level of 0.05. Since the medication is relatively unknown (it did cause rats with intestinal cancer to report fewer symptoms of depression), they believe a two-sided test is best. They’re not sure what difference to expect between the arms, but have settled on what they think is a reasonable effect size. In the chart below, they’ve summarized the outcome expected in the control group and the clinically meaningful difference. Please provide a sample size estimate for each outcome, assuming, unrealistically, that there is no attrition during the trial. (20 points, 10 each)
Outcome  Control group result  Meaningful difference  Sample size 
Days per week with severe pain  3 (SD=1.732)  ±2  
Cancer progression (versus stable disease or improvement)  0.3  ±0.15 
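As a rough illustration of the kind of calculation this question calls for, the sketch below applies the standard normal-approximation formulas for comparing two means and two proportions with equal allocation. The function names are hypothetical, and the proportion formula uses the unpooled variance; software that uses t-based, pooled, or continuity-corrected methods will give somewhat different numbers, so treat this as a check on your reasoning rather than a definitive method.

```python
from math import ceil
from statistics import NormalDist


def n_per_arm_means(sd, delta, alpha=0.05, power=0.80):
    """Subjects per arm to detect a difference `delta` between two means
    with common standard deviation `sd` (two-sided, normal approximation)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for a two-sided test
    z_beta = NormalDist().inv_cdf(power)           # quantile matching the desired power
    return ceil(2 * (z_alpha + z_beta) ** 2 * sd ** 2 / delta ** 2)


def n_per_arm_proportions(p_control, p_treated, alpha=0.05, power=0.80):
    """Subjects per arm to detect p_control versus p_treated
    (two-sided, normal approximation, unpooled variance)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    variance_sum = p_control * (1 - p_control) + p_treated * (1 - p_treated)
    return ceil((z_alpha + z_beta) ** 2 * variance_sum / (p_control - p_treated) ** 2)


if __name__ == "__main__":
    # Inputs taken from the chart: SD 1.732 and difference 2 for pain days,
    # control proportion 0.3 and difference 0.15 for progression.
    print("Pain days, per arm:", n_per_arm_means(sd=1.732, delta=2))
    print("Progression 0.30 vs 0.15, per arm:", n_per_arm_proportions(0.30, 0.15))
    print("Progression 0.30 vs 0.45, per arm:", n_per_arm_proportions(0.30, 0.45))
```

Because the variance of a proportion depends on the proportion itself, the required size for the progression outcome differs depending on whether the 0.15 difference is a decrease or an increase, which is why both directions are shown.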
For some annoying reason, the research team didn’t contact you after you answered the last question. They waited until the last second to ask for your analysis of the study results (less than 2 weeks!). What happened in the meantime? Rather than decide on one outcome, they gathered all of them and have included them in the dataset progression.dta (or progression.csv). This data is summarized in a separate file and should be used to answer the remainder of the questions on this exam.
 Before you actually analyze the data for efficacy, the research team wants to see a general summary of the data for each of their arms. Construct a summary table, reporting the summary measure of interest for each variable. Specifically, they’d like to see age, gender, days per week with severe pain, days observed, progression, and tumor necrosing factor (TNF). In addition to the primary summary measure, be sure to examine the data for any other notable patterns. The study did encounter some unspecified problems, so there may (or may not) be unusual results. For the table, use a format of your choice, but consider some version of the following: (20 points)
Variable  Placebo result – summary measure (measure of variability)  Drug – summary measure (measure of variability)  Overall – summary measure (measure of variability) 
Variable 1  6 (???)  Etc. 
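One way to build such a table for a numeric variable is sketched below. It is only an illustration: the column names `arm` and `pain` and the example rows are made up, since the actual variable names in progression.csv are not given here. It reports the count, mean, and standard deviation per arm and overall, and skips missing values so that the count reveals how much data is absent.

```python
from collections import defaultdict
from statistics import mean, stdev


def summarize_by_arm(rows, arm_key, value_key):
    """Return {arm: (n, mean, sd)} for one numeric variable, plus an
    'Overall' entry. Missing values (None or "") are skipped, so n is
    the number of non-missing observations."""
    groups = defaultdict(list)
    for row in rows:
        value = row.get(value_key)
        if value is None or value == "":
            continue  # skip missing values rather than treating them as zero
        groups[row[arm_key]].append(float(value))
        groups["Overall"].append(float(value))
    return {
        arm: (len(values), mean(values), stdev(values) if len(values) > 1 else float("nan"))
        for arm, values in groups.items()
    }


if __name__ == "__main__":
    # Made-up rows for illustration only; real data would come from
    # csv.DictReader over progression.csv with its actual column names.
    example_rows = [
        {"arm": "Placebo", "pain": 3},
        {"arm": "Placebo", "pain": 5},
        {"arm": "Drug", "pain": 1},
        {"arm": "Drug", "pain": ""},
        {"arm": "Drug", "pain": 3},
    ]
    for arm, (n, avg, sd) in summarize_by_arm(example_rows, "arm", "pain").items():
        print(f"{arm}: n={n}, mean={avg:.2f}, sd={sd:.2f}")
```

A mean and standard deviation suit roughly symmetric variables; a skewed variable may be better described by a median and interquartile range, and a binary variable by a count and proportion.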
 For each of the following variables, describe what, if anything, is notable about the variable in the above table. If there is anything unusual, describe the probable impact of the issue and what you’d do to address it. Please keep your answers to just a few sentences (or fewer). (20 points, 4 each)
 Age
 Days per week with severe pain
 Tumor necrosing factor (TNF)
 Cancer progression (versus stable disease or improvement)
 Days to progression
 Of the variables described in the previous two questions, pick the one you feel best addresses the researchers’ aim to identify a difference in efficacy between the two arms. Answer the following questions:
 a. Why did you choose this variable? (2 points)
 b. Test for a difference between the arms and, in a short paragraph, describe the observed difference, the test you chose to use, and the results. (4 points)
 c. In the previous test, what value of the difference is associated with the null hypothesis? (2 points)
 d. What value (or values) of the tested difference would be associated with the alternative hypothesis? (2 points)
 e. How would you interpret the p-value? (4 points)
 f. Based on any information you have or can glean from this exam, is this difference a meaningful difference? Why? (2 points)
 g. What would you conclude about this drug’s efficacy? (2 points)
 h. What do you believe would have been the likely result if you’d chosen one of the other outcomes (pick one more)? Why? (2 points)
For this exam, you’ll follow the statistical aspects of one research project, start to finish. The project itself is hypothetical but is loosely based on typical cancer trials. When I say “loosely”, I mean that the subject is the same (intestinal cancer), but everything else is different. In fact, the data is entirely simulated and should not be taken to mean anything real even if the variables sound like something real.
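If cancer progression is analyzed as a binary outcome, a test of the difference between arms could be sketched as a two-sided, pooled two-proportion z-test, as below. This is one reasonable choice among several (a chi-squared or Fisher's exact test would address the same question), the function name is hypothetical, and the counts in the usage lines are invented purely to show the call. Under the null hypothesis the difference in proportions is 0, and the p-value is the probability of a difference at least this extreme if that null were true.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(events_a, n_a, events_b, n_b):
    """Two-sided z-test of H0: p_a - p_b = 0 using the pooled proportion.
    Returns (observed difference, z statistic, p-value).
    Assumes the pooled proportion is strictly between 0 and 1."""
    p_a = events_a / n_a
    p_b = events_b / n_b
    pooled = (events_a + events_b) / (n_a + n_b)
    standard_error = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / standard_error
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))  # both tails
    return p_a - p_b, z, p_value


if __name__ == "__main__":
    # Invented counts for illustration only, not results from the exam data.
    difference, z, p_value = two_proportion_z_test(30, 100, 15, 100)
    print(f"difference={difference:.3f}, z={z:.2f}, p={p_value:.4f}")
```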
For this exam, you may use any resources you have with the understanding that you are doing the work. You may ask questions of me or either of the TAs and, unless the question is “What is the answer?”, we will attempt to answer them. This is how the real world of statistical analysis works. You, the statisticianofthemoment, use your knowledge and consult others, as needed, to be sure you’re meeting the needs. Please be courteous, though, and don’t ask until you’re sure you don’t know and, above all, don’t wait to the last moment to ask. We, like any of your real colleagues in the world, will do our best to answer in a timely informative manner, but we won’t always be able to respond as well or as quickly as you’d like. Bear that in mind, too.
In this exam, if there is a right/wrong answer, you’ll be graded as right or wrong, but this exam is designed to test your ability to reason with and understand the statistical concepts so many of the answers don’t have a distinct right or wrong. Turn in your answers before Friday, August 18 at 11:59 PM. Yes, Friday. That’s when the semester ends.
You’ve been asked to help in the planning of a clinical trial designed to test the effect of a new medication (Improvasol) on a certain kind of intestinal cancer that has moderate risk of shortterm mortality, frequent pain, and no current effective treatment. The research plan currently proposes afiftytwo week trial where subjects are assessed weekly. Subjects will be randomized towards receiving the medication or a placebo and followed for signs of progression, therapeutic benefit, and death. The researchers believe that a particular biomarker, TNF (tumor necrosing factor) is likely to be a sign of therapeutic effect of the new medication.  So far, the research team is undecided on the appropriate outcome to use. Of the following outcome measures, describe the appropriate summary of interest, the measure of variability of the outcome, and how it might be compared between treatment arms. (10 points)

Outcome Primary summary measure Measure of variability Method of comparison Cancer progression (versus stable disease or improvement) Days to progression Tumor necrosing factor (TNF) Days per week with severe pain  Which of the above do you recommend using for the study. Why? (10 points)
 The research team has considered your answers to the above. They’ve narrowed their interest to two outcomes (for now): Days per week with severe pain and cancer progression. However, they want to know how many subjects they should recruit for each outcome. For simplicity, they think it’s best to have equal allocation to each arm. They’ve decided that they’d like to achieve a study power of 0.8, and use an alpha level of 0.05. Since the medication is relatively unknown (it did cause rats with intestinal cancer to report less symptoms of depression), they believe a twosided test is best. They’re not sure what difference to expect between the arms, but have settled on what they think is a reasonable effect size. In the chart below, they’ve summarized the outcome expected in the control group and the clinically meaningful difference. Please provide a sample size estimate assuming, unrealistically, that there is no attrition during the trial. (20 points, 10 each)

Outcome Control group result Meaningful difference Sample size Days per week with severe pain 3 (SD=1.732) ±2 Cancer progression (versus stable disease or improvement) 0.3 ±0.15 For some annoying reason, the research team didn’t contact you after you answered the last question. They waited until the last second to ask your analysis of the study results (less than 2 weeks!). What happened in the meantime? Rather than decide on one outcome, they gathered all of them and have included them in the dataset progression.dta (or progression.csv). This data is summarized in a separate file and should be used to answer the remainder of the questions on this exam.
 Before you actually analyze the data for efficacy, the research team wants to see a general summary of the data for each of their arms. Construct a summary table, reporting the summary measure of interest for each variable. Specifically, they’d like to see age, days per week with severe pain, days observed, progression, and tumornecrosing factor (TNF), and gender. In addition to the primary summary measure, be sure to examine the data for any other notable patterns. The study did encounter some unspecified problems, so there may (or may not) be unusual results. For the table, use a format of your choice, but consider some version of the following: (20 points)

Variable Placebo result –
summary measure (measure of variability)Drug –
Summary measure (measure of variability)Overall –
summary measure (measure of variability)Variable 1 6 (???) Etc.  For each of the following variables, describe what, if anything is notable about the variables in the above table. If there is anything unusual, describe the probable impact of such an issue and what you’d do to address them. Please keep your answers to just a few sentences (or less) (20 points, 4 each)
 Age
 Days per week with severe pain
 Tumor necrosing factor (TNF)
 Cancer progression (versus stable disease or improvement)
 Days to progression
 Of the variables described in the previous two questions, pick which one you feel best addresses the researchers aim to identify a difference of efficacy between the two arms. Answer the following questions
 Why did you choose this variable? (2 point)
 Test for a difference between the arms and, in a short paragraph, describe the observed difference, the test you chose to use, and the results (4 points)
 In the previous test, what value of the difference is associated with the null hypothesis (2 points)
 What value (or values) of the tested difference would be associated with the alternative hypothesis? (2 points)
 How would you interpret the pvalue? (4 points)
 Based on any information you have or can glean from this exam, is this difference a meaningful difference? Why? (2 point)
 What would you conclude about this drug’s efficacy? (2 point)
 h. What do you believe would have been the likely result if you’d chosen one of the other outcomes (pick one more)? Why? (2 point)For this exam, you’ll follow the statistical aspects of one research project, start to finish. The project itself is hypothetical but is loosely based on typical cancer trials. When I say “loosely”, I mean that the subject is the same (intestinal cancer), but everything else is different. In fact, the data is entirely simulated and should not be taken to mean anything real even if the variables sound like something real.
For this exam, you may use any resources you have with the understanding that you are doing the work. You may ask questions of me or either of the TAs and, unless the question is “What is the answer?”, we will attempt to answer them. This is how the real world of statistical analysis works. You, the statisticianofthemoment, use your knowledge and consult others, as needed, to be sure you’re meeting the needs. Please be courteous, though, and don’t ask until you’re sure you don’t know and, above all, don’t wait to the last moment to ask. We, like any of your real colleagues in the world, will do our best to answer in a timely informative manner, but we won’t always be able to respond as well or as quickly as you’d like. Bear that in mind, too.
In this exam, if there is a right/wrong answer, you’ll be graded as right or wrong, but this exam is designed to test your ability to reason with and understand the statistical concepts so many of the answers don’t have a distinct right or wrong. Turn in your answers before Friday, August 18 at 11:59 PM. Yes, Friday. That’s when the semester ends.
You’ve been asked to help in the planning of a clinical trial designed to test the effect of a new medication (Improvasol) on a certain kind of intestinal cancer that has moderate risk of shortterm mortality, frequent pain, and no current effective treatment. The research plan currently proposes afiftytwo week trial where subjects are assessed weekly. Subjects will be randomized towards receiving the medication or a placebo and followed for signs of progression, therapeutic benefit, and death. The researchers believe that a particular biomarker, TNF (tumor necrosing factor) is likely to be a sign of therapeutic effect of the new medication.  So far, the research team is undecided on the appropriate outcome to use. Of the following outcome measures, describe the appropriate summary of interest, the measure of variability of the outcome, and how it might be compared between treatment arms. (10 points)

Outcome Primary summary measure Measure of variability Method of comparison Cancer progression (versus stable disease or improvement) Days to progression Tumor necrosing factor (TNF) Days per week with severe pain  Which of the above do you recommend using for the study. Why? (10 points)
 The research team has considered your answers to the above. They’ve narrowed their interest to two outcomes (for now): Days per week with severe pain and cancer progression. However, they want to know how many subjects they should recruit for each outcome. For simplicity, they think it’s best to have equal allocation to each arm. They’ve decided that they’d like to achieve a study power of 0.8, and use an alpha level of 0.05. Since the medication is relatively unknown (it did cause rats with intestinal cancer to report less symptoms of depression), they believe a twosided test is best. They’re not sure what difference to expect between the arms, but have settled on what they think is a reasonable effect size. In the chart below, they’ve summarized the outcome expected in the control group and the clinically meaningful difference. Please provide a sample size estimate assuming, unrealistically, that there is no attrition during the trial. (20 points, 10 each)

Outcome Control group result Meaningful difference Sample size Days per week with severe pain 3 (SD=1.732) ±2 Cancer progression (versus stable disease or improvement) 0.3 ±0.15 For some annoying reason, the research team didn’t contact you after you answered the last question. They waited until the last second to ask your analysis of the study results (less than 2 weeks!). What happened in the meantime? Rather than decide on one outcome, they gathered all of them and have included them in the dataset progression.dta (or progression.csv). This data is summarized in a separate file and should be used to answer the remainder of the questions on this exam.
 Before you actually analyze the data for efficacy, the research team wants to see a general summary of the data for each of their arms. Construct a summary table, reporting the summary measure of interest for each variable. Specifically, they’d like to see age, days per week with severe pain, days observed, progression, and tumornecrosing factor (TNF), and gender. In addition to the primary summary measure, be sure to examine the data for any other notable patterns. The study did encounter some unspecified problems, so there may (or may not) be unusual results. For the table, use a format of your choice, but consider some version of the following: (20 points)

Variable Placebo result –
summary measure (measure of variability)Drug –
Summary measure (measure of variability)Overall –
summary measure (measure of variability)Variable 1 6 (???) Etc.  For each of the following variables, describe what, if anything is notable about the variables in the above table. If there is anything unusual, describe the probable impact of such an issue and what you’d do to address them. Please keep your answers to just a few sentences (or less) (20 points, 4 each)
 Age
 Days per week with severe pain
 Tumor necrosing factor (TNF)
 Cancer progression (versus stable disease or improvement)
 Days to progression
 Of the variables described in the previous two questions, pick which one you feel best addresses the researchers aim to identify a difference of efficacy between the two arms. Answer the following questions
 Why did you choose this variable? (2 point)
 Test for a difference between the arms and, in a short paragraph, describe the observed difference, the test you chose to use, and the results (4 points)
 In the previous test, what value of the difference is associated with the null hypothesis (2 points)
 What value (or values) of the tested difference would be associated with the alternative hypothesis? (2 points)
 How would you interpret the pvalue? (4 points)
 Based on any information you have or can glean from this exam, is this difference a meaningful difference? Why? (2 point)
 What would you conclude about this drug’s efficacy? (2 point)
 h. What do you believe would have been the likely result if you’d chosen one of the other outcomes (pick one more)? Why? (2 point)For this exam, you’ll follow the statistical aspects of one research project, start to finish. The project itself is hypothetical but is loosely based on typical cancer trials. When I say “loosely”, I mean that the subject is the same (intestinal cancer), but everything else is different. In fact, the data is entirely simulated and should not be taken to mean anything real even if the variables sound like something real.
For this exam, you may use any resources you have with the understanding that you are doing the work. You may ask questions of me or either of the TAs and, unless the question is “What is the answer?”, we will attempt to answer them. This is how the real world of statistical analysis works. You, the statisticianofthemoment, use your knowledge and consult others, as needed, to be sure you’re meeting the needs. Please be courteous, though, and don’t ask until you’re sure you don’t know and, above all, don’t wait to the last moment to ask. We, like any of your real colleagues in the world, will do our best to answer in a timely informative manner, but we won’t always be able to respond as well or as quickly as you’d like. Bear that in mind, too.
In this exam, if there is a right/wrong answer, you’ll be graded as right or wrong, but this exam is designed to test your ability to reason with and understand the statistical concepts so many of the answers don’t have a distinct right or wrong. Turn in your answers before Friday, August 18 at 11:59 PM. Yes, Friday. That’s when the semester ends.
You’ve been asked to help in the planning of a clinical trial designed to test the effect of a new medication (Improvasol) on a certain kind of intestinal cancer that has moderate risk of shortterm mortality, frequent pain, and no current effective treatment. The research plan currently proposes afiftytwo week trial where subjects are assessed weekly. Subjects will be randomized towards receiving the medication or a placebo and followed for signs of progression, therapeutic benefit, and death. The researchers believe that a particular biomarker, TNF (tumor necrosing factor) is likely to be a sign of therapeutic effect of the new medication.  So far, the research team is undecided on the appropriate outcome to use. Of the following outcome measures, describe the appropriate summary of interest, the measure of variability of the outcome, and how it might be compared between treatment arms. (10 points)

Outcome Primary summary measure Measure of variability Method of comparison Cancer progression (versus stable disease or improvement) Days to progression Tumor necrosing factor (TNF) Days per week with severe pain  Which of the above do you recommend using for the study. Why? (10 points)
 The research team has considered your answers to the above. They’ve narrowed their interest to two outcomes (for now): Days per week with severe pain and cancer progression. However, they want to know how many subjects they should recruit for each outcome. For simplicity, they think it’s best to have equal allocation to each arm. They’ve decided that they’d like to achieve a study power of 0.8, and use an alpha level of 0.05. Since the medication is relatively unknown (it did cause rats with intestinal cancer to report less symptoms of depression), they believe a twosided test is best. They’re not sure what difference to expect between the arms, but have settled on what they think is a reasonable effect size. In the chart below, they’ve summarized the outcome expected in the control group and the clinically meaningful difference. Please provide a sample size estimate assuming, unrealistically, that there is no attrition during the trial. (20 points, 10 each)

Outcome Control group result Meaningful difference Sample size Days per week with severe pain 3 (SD=1.732) ±2 Cancer progression (versus stable disease or improvement) 0.3 ±0.15 For some annoying reason, the research team didn’t contact you after you answered the last question. They waited until the last second to ask your analysis of the study results (less than 2 weeks!). What happened in the meantime? Rather than decide on one outcome, they gathered all of them and have included them in the dataset progression.dta (or progression.csv). This data is summarized in a separate file and should be used to answer the remainder of the questions on this exam.
 Before you actually analyze the data for efficacy, the research team wants to see a general summary of the data for each of their arms. Construct a summary table, reporting the summary measure of interest for each variable. Specifically, they’d like to see age, days per week with severe pain, days observed, progression, and tumornecrosing factor (TNF), and gender. In addition to the primary summary measure, be sure to examine the data for any other notable patterns. The study did encounter some unspecified problems, so there may (or may not) be unusual results. For the table, use a format of your choice, but consider some version of the following: (20 points)

Variable Placebo result –
summary measure (measure of variability)Drug –
Summary measure (measure of variability)Overall –
summary measure (measure of variability)Variable 1 6 (???) Etc.  For each of the following variables, describe what, if anything is notable about the variables in the above table. If there is anything unusual, describe the probable impact of such an issue and what you’d do to address them. Please keep your answers to just a few sentences (or less) (20 points, 4 each)
 Age
 Days per week with severe pain
 Tumor necrosing factor (TNF)
 Cancer progression (versus stable disease or improvement)
 Days to progression
 Of the variables described in the previous two questions, pick which one you feel best addresses the researchers aim to identify a difference of efficacy between the two arms. Answer the following questions
 Why did you choose this variable? (2 point)
 Test for a difference between the arms and, in a short paragraph, describe the observed difference, the test you chose to use, and the results (4 points)
 In the previous test, what value of the difference is associated with the null hypothesis (2 points)
 What value (or values) of the tested difference would be associated with the alternative hypothesis? (2 points)
 How would you interpret the pvalue? (4 points)
 Based on any information you have or can glean from this exam, is this difference a meaningful difference? Why? (2 point)
 What would you conclude about this drug’s efficacy? (2 point)
 h. What do you believe would have been the likely result if you’d chosen one of the other outcomes (pick one more)? Why? (2 point)For this exam, you’ll follow the statistical aspects of one research project, start to finish. The project itself is hypothetical but is loosely based on typical cancer trials. When I say “loosely”, I mean that the subject is the same (intestinal cancer), but everything else is different. In fact, the data is entirely simulated and should not be taken to mean anything real even if the variables sound like something real.
For this exam, you may use any resources you have with the understanding that you are doing the work. You may ask questions of me or either of the TAs and, unless the question is “What is the answer?”, we will attempt to answer them. This is how the real world of statistical analysis works. You, the statisticianofthemoment, use your knowledge and consult others, as needed, to be sure you’re meeting the needs. Please be courteous, though, and don’t ask until you’re sure you don’t know and, above all, don’t wait to the last moment to ask. We, like any of your real colleagues in the world, will do our best to answer in a timely informative manner, but we won’t always be able to respond as well or as quickly as you’d like. Bear that in mind, too.
In this exam, if there is a right/wrong answer, you’ll be graded as right or wrong, but this exam is designed to test your ability to reason with and understand the statistical concepts so many of the answers don’t have a distinct right or wrong. Turn in your answers before Friday, August 18 at 11:59 PM. Yes, Friday. That’s when the semester ends.
You’ve been asked to help in the planning of a clinical trial designed to test the effect of a new medication (Improvasol) on a certain kind of intestinal cancer that has moderate risk of shortterm mortality, frequent pain, and no current effective treatment. The research plan currently proposes afiftytwo week trial where subjects are assessed weekly. Subjects will be randomized towards receiving the medication or a placebo and followed for signs of progression, therapeutic benefit, and death. The researchers believe that a particular biomarker, TNF (tumor necrosing factor) is likely to be a sign of therapeutic effect of the new medication.  So far, the research team is undecided on the appropriate outcome to use. Of the following outcome measures, describe the appropriate summary of interest, the measure of variability of the outcome, and how it might be compared between treatment arms. (10 points)

Outcome Primary summary measure Measure of variability Method of comparison Cancer progression (versus stable disease or improvement) Days to progression Tumor necrosing factor (TNF) Days per week with severe pain  Which of the above do you recommend using for the study. Why? (10 points)
 The research team has considered your answers to the above. They’ve narrowed their interest to two outcomes (for now): Days per week with severe pain and cancer progression. However, they want to know how many subjects they should recruit for each outcome. For simplicity, they think it’s best to have equal allocation to each arm. They’ve decided that they’d like to achieve a study power of 0.8, and use an alpha level of 0.05. Since the medication is relatively unknown (it did cause rats with intestinal cancer to report less symptoms of depression), they believe a twosided test is best. They’re not sure what difference to expect between the arms, but have settled on what they think is a reasonable effect size. In the chart below, they’ve summarized the outcome expected in the control group and the clinically meaningful difference. Please provide a sample size estimate assuming, unrealistically, that there is no attrition during the trial. (20 points, 10 each)

Outcome Control group result Meaningful difference Sample size Days per week with severe pain 3 (SD=1.732) ±2 Cancer progression (versus stable disease or improvement) 0.3 ±0.15 For some annoying reason, the research team didn’t contact you after you answered the last question. They waited until the last second to ask your analysis of the study results (less than 2 weeks!). What happened in the meantime? Rather than decide on one outcome, they gathered all of them and have included them in the dataset progression.dta (or progression.csv). This data is summarized in a separate file and should be used to answer the remainder of the questions on this exam.
 Before you actually analyze the data for efficacy, the research team wants to see a general summary of the data for each of their arms. Construct a summary table, reporting the summary measure of interest for each variable. Specifically, they’d like to see age, days per week with severe pain, days observed, progression, and tumornecrosing factor (TNF), and gender. In addition to the primary summary measure, be sure to examine the data for any other notable patterns. The study did encounter some unspecified problems, so there may (or may not) be unusual results. For the table, use a format of your choice, but consider some version of the following: (20 points)

Variable Placebo result –
summary measure (measure of variability)Drug –
Summary measure (measure of variability)Overall –
summary measure (measure of variability)Variable 1 6 (???) Etc.  For each of the following variables, describe what, if anything is notable about the variables in the above table. If there is anything unusual, describe the probable impact of such an issue and what you’d do to address them. Please keep your answers to just a few sentences (or less) (20 points, 4 each)
 Age
 Days per week with severe pain
 Tumor necrosing factor (TNF)
 Cancer progression (versus stable disease or improvement)
 Days to progression
 Of the variables described in the previous two questions, pick the one you feel best addresses the researchers’ aim to identify a difference in efficacy between the two arms. Answer the following questions:
 a. Why did you choose this variable? (2 points)
 b. Test for a difference between the arms and, in a short paragraph, describe the observed difference, the test you chose to use, and the results. (4 points)
 c. In the previous test, what value of the difference is associated with the null hypothesis? (2 points)
 d. What value (or values) of the tested difference would be associated with the alternative hypothesis? (2 points)
 e. How would you interpret the p-value? (4 points)
 f. Based on any information you have or can glean from this exam, is this difference a meaningful difference? Why? (2 points)
 g. What would you conclude about this drug’s efficacy? (2 points)
 h. What do you believe would have been the likely result if you’d chosen one of the other outcomes (pick one more)? Why? (2 points)
For this exam, you’ll follow the statistical aspects of one research project, start to finish. The project itself is hypothetical but is loosely based on typical cancer trials. When I say “loosely”, I mean that the subject is the same (intestinal cancer), but everything else is different. In fact, the data is entirely simulated and should not be taken to mean anything real even if the variables sound like something real.
For this exam, you may use any resources you have with the understanding that you are doing the work. You may ask questions of me or either of the TAs and, unless the question is “What is the answer?”, we will attempt to answer them. This is how the real world of statistical analysis works. You, the statisticianofthemoment, use your knowledge and consult others, as needed, to be sure you’re meeting the needs. Please be courteous, though, and don’t ask until you’re sure you don’t know and, above all, don’t wait to the last moment to ask. We, like any of your real colleagues in the world, will do our best to answer in a timely informative manner, but we won’t always be able to respond as well or as quickly as you’d like. Bear that in mind, too.
In this exam, if there is a right/wrong answer, you’ll be graded as right or wrong, but this exam is designed to test your ability to reason with and understand the statistical concepts so many of the answers don’t have a distinct right or wrong. Turn in your answers before Friday, August 18 at 11:59 PM. Yes, Friday. That’s when the semester ends.
You’ve been asked to help in the planning of a clinical trial designed to test the effect of a new medication (Improvasol) on a certain kind of intestinal cancer that has moderate risk of shortterm mortality, frequent pain, and no current effective treatment. The research plan currently proposes afiftytwo week trial where subjects are assessed weekly. Subjects will be randomized towards receiving the medication or a placebo and followed for signs of progression, therapeutic benefit, and death. The researchers believe that a particular biomarker, TNF (tumor necrosing factor) is likely to be a sign of therapeutic effect of the new medication.  So far, the research team is undecided on the appropriate outcome to use. Of the following outcome measures, describe the appropriate summary of interest, the measure of variability of the outcome, and how it might be compared between treatment arms. (10 points)

Outcome Primary summary measure Measure of variability Method of comparison Cancer progression (versus stable disease or improvement) Days to progression Tumor necrosing factor (TNF) Days per week with severe pain  Which of the above do you recommend using for the study. Why? (10 points)
 The research team has considered your answers to the above. They’ve narrowed their interest to two outcomes (for now): Days per week with severe pain and cancer progression. However, they want to know how many subjects they should recruit for each outcome. For simplicity, they think it’s best to have equal allocation to each arm. They’ve decided that they’d like to achieve a study power of 0.8, and use an alpha level of 0.05. Since the medication is relatively unknown (it did cause rats with intestinal cancer to report less symptoms of depression), they believe a twosided test is best. They’re not sure what difference to expect between the arms, but have settled on what they think is a reasonable effect size. In the chart below, they’ve summarized the outcome expected in the control group and the clinically meaningful difference. Please provide a sample size estimate assuming, unrealistically, that there is no attrition during the trial. (20 points, 10 each)

Outcome Control group result Meaningful difference Sample size Days per week with severe pain 3 (SD=1.732) ±2 Cancer progression (versus stable disease or improvement) 0.3 ±0.15 For some annoying reason, the research team didn’t contact you after you answered the last question. They waited until the last second to ask your analysis of the study results (less than 2 weeks!). What happened in the meantime? Rather than decide on one outcome, they gathered all of them and have included them in the dataset progression.dta (or progression.csv). This data is summarized in a separate file and should be used to answer the remainder of the questions on this exam.
 Before you actually analyze the data for efficacy, the research team wants to see a general summary of the data for each of their arms. Construct a summary table, reporting the summary measure of interest for each variable. Specifically, they’d like to see age, days per week with severe pain, days observed, progression, and tumornecrosing factor (TNF), and gender. In addition to the primary summary measure, be sure to examine the data for any other notable patterns. The study did encounter some unspecified problems, so there may (or may not) be unusual results. For the table, use a format of your choice, but consider some version of the following: (20 points)

Variable | Placebo: summary measure (measure of variability) | Drug: summary measure (measure of variability) | Overall: summary measure (measure of variability)
Variable 1 | 6 (???) | |
Etc. | | |

 For each of the following variables, describe what, if anything, is notable about the variable in the above table. If there is anything unusual, describe the probable impact of the issue and what you’d do to address it. Please keep your answers to just a few sentences (or fewer). (20 points, 4 each)
 Age
 Days per week with severe pain
 Tumor necrosing factor (TNF)
 Cancer progression (versus stable disease or improvement)
 Days to progression
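The per-arm summary table requested above can be put together in many ways. The following is a minimal sketch using only the Python standard library. The column names `arm` and `age` and the inline sample data are hypothetical stand-ins, since the actual layout of progression.csv is not shown here. It reports mean (SD), which suits roughly symmetric continuous variables; a skewed or binary variable would call for a different summary, such as median (IQR) or count (percent).

```python
import csv
import io
from statistics import mean, stdev

# Hypothetical stand-in for progression.csv; real column names may differ.
SAMPLE_CSV = """arm,age
placebo,60
placebo,70
drug,50
drug,54
drug,
"""


def summarize_by_arm(rows, arm_col, value_col):
    """Return {label: (n, mean, sd)} for each arm and for all arms combined.

    Rows with a blank value are skipped, so n counts non-missing values only.
    """
    groups = {}
    for row in rows:
        raw = (row[value_col] or "").strip()
        if not raw:
            continue  # missing value: leave it out of the summary
        value = float(raw)
        groups.setdefault(row[arm_col], []).append(value)
        groups.setdefault("overall", []).append(value)

    summary = {}
    for label, values in groups.items():
        # The sample SD needs at least two observations.
        sd = stdev(values) if len(values) > 1 else float("nan")
        summary[label] = (len(values), mean(values), sd)
    return summary


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
summary = summarize_by_arm(rows, "arm", "age")
for label, (n, avg, sd) in summary.items():
    print(f"{label}: n={n}, mean={avg:.1f} (SD={sd:.2f})")
```

Reporting n alongside each summary makes missing values visible, which is one of the quickest ways to spot the kind of unusual patterns the questions above ask about. To run this against the real file, the `io.StringIO(SAMPLE_CSV)` argument would be replaced with an opened file handle.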
 Of the variables described in the previous two questions, pick the one you feel best addresses the researchers’ aim to identify a difference in efficacy between the two arms. Answer the following questions:
 a. Why did you choose this variable? (2 points)
 b. Test for a difference between the arms and, in a short paragraph, describe the observed difference, the test you chose to use, and the results. (4 points)
 c. In the previous test, what value of the difference is associated with the null hypothesis? (2 points)
 d. What value (or values) of the tested difference would be associated with the alternative hypothesis? (2 points)
 e. How would you interpret the p-value? (4 points)
 f. Based on any information you have or can glean from this exam, is this difference a meaningful difference? Why? (2 points)
 g. What would you conclude about this drug’s efficacy? (2 points)
 h. What do you believe would have been the likely result if you’d chosen one of the other outcomes (pick one more)? Why? (2 points)
For this exam, you’ll follow the statistical aspects of one research project, start to finish. The project itself is hypothetical but is loosely based on typical cancer trials. When I say “loosely”, I mean that the subject is the same (intestinal cancer), but everything else is different. In fact, the data is entirely simulated and should not be taken to mean anything real even if the variables sound like something real.
Attachments: