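Valuable information from the 2008 Summer School Q&A thread
These are all questions that I answered in the Summer School Q&A thread.
For ease of reading, questions are marked in blue and answers in red.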
20x General Questions
For the odds ratio, do we only need to know how to write the executive summary?
Yes, because the technical notes for the odds ratio are just the R code.
1e-3=0.001?
Yes.
20x Tutorial Questions
Tutorial 10 ADD data:
"The plot of Cook’s distance shows none of the observations have an undue influence
on the final model. Observation 78 has a large value of Cook’s distance, relative to the
other observations, but removing it did not change any of the coefficients by more
than a standard error, so it was retained."
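How do you tell whether there is a SIGNIFICANT CHANGE after removing the unusual observations?
Also, how can you see that it "did not change any of the coefficients by more than a standard error"?
new coefficient - old coefficient
Do I have to try every VARIABLE?
Could you use this example to show me how to calculate it? Thanks.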
> exam.fit3<-lm(Exam~Assign+Test+I(Test^2)+Stage1,
data=course.df[-106,])
> summary(exam.fit3)
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 31.42284 6.37936 4.926 2.35e-06 ***
Assign 1.35513 0.24074 5.629 9.65e-08 ***
Test -0.83163 1.11701 -0.745 0.45782
I(Test^2) 0.12584 0.04633 2.716 0.00745 **
Stage1B -5.22603 2.19378 -2.382 0.01856 *
Stage1C -12.78126 2.34233 -5.457 2.16e-07 ***
> summary(exam.fit2)
Estimate Std. Error t value Pr(>|t|)
(Intercept) 30.69567 6.49900 4.723 5.58e-06 ***
Assign 1.56651 0.23068 6.791 2.93e-10 ***
Test -1.24720 1.12706 -1.107 0.27036
I(Test^2) 0.13846 0.04698 2.947 0.00376 **
Stage1B -4.45745 2.21620 -2.011 0.04621 *
Stage1C -11.84911 2.35973 -5.021 1.54e-06 ***
IF((new coefficient - old coefficient)/old standard error)>1
THEN influential
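Leaving out the intercept:
Assign (1.56651-1.35513)/0.24074=?
The Test, I(Test^2), Stage1B and Stage1C coefficients below it all need to be checked as well. If many of them are > 1, observation 106 is influential.
Also... I was just looking at CASE STUDY: PERUVIAN INDIANS..
If ((new coefficient - old coefficient)/old standard error) comes out as -1, -2 or similar, does that count as well?
Because when I worked it out, weight gives (0.82103-1.263)/0.31261=-1.4139
Yes, it does. It is the absolute value that matters.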
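The check above can be sketched in R for all five coefficients at once. This is only an illustration under a few assumptions: the numbers are typed in by hand from the two summaries, exam.fit2 is taken to be the fit on all observations, the names `full`, `se_full`, `dropped` and `std_change` are made up for this example, and the change is scaled by the exam.fit2 standard errors (scaling by the exam.fit3 ones, as in the Assign line above, leads to the same conclusion here).

```r
# Estimates and standard errors copied by hand from summary(exam.fit2)
# (assumed to be the fit on all observations)
full <- c(Assign = 1.56651, Test = -1.24720, Test2 = 0.13846,
          Stage1B = -4.45745, Stage1C = -11.84911)
se_full <- c(Assign = 0.23068, Test = 1.12706, Test2 = 0.04698,
             Stage1B = 2.21620, Stage1C = 2.35973)

# Estimates copied by hand from summary(exam.fit3) (observation 106 removed)
dropped <- c(Assign = 1.35513, Test = -0.83163, Test2 = 0.12584,
             Stage1B = -5.22603, Stage1C = -12.78126)

# Size of each change in units of standard errors (absolute value)
std_change <- abs(dropped - full) / se_full
round(std_change, 3)

# TRUE if any coefficient moved by more than one standard error
any(std_change > 1)
```

None of the five changes exceeds one standard error, so observation 106 would not be flagged by this rule. With the fitted objects still in the workspace, the same vector could be built from `coef(exam.fit3)`, `coef(exam.fit2)` and `summary(exam.fit2)$coefficients[, "Std. Error"]` instead of typing the numbers in.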
Tutorial 11: Logistic Regression & Time Series
The Meningitis Data question...
Using the output from the seasonal factor model (page 506), calculate the forecast of
the log of the number of notified cases of meningitis for January 1998.
The seasonal factor log model is:
log.meningitis_t = 0.628026 + 0.019481 × time - 0.252158 × month2 + … +
0.064355 × month12 + 0.321058 × log.meningitis_(t-1)
We are forecasting for January 1998, i.e. time period 97:
log.meningitis_97 = 0.628026 + 0.019481 × 97 - 0.252158 × 0 + … +
0.064355 × 0 + 0.321058 × log.meningitis_96
= 0.628026 + 0.019481 × 97 - 0.252158 × 0 + … +
0.064355 × 0 + 0.321058 × 3.988984
= 3.798378225
What I'd like to know is... how was log.meningitis_96, i.e. 3.988984, obtained?
Isn't it the 54 given in the question? log(54)=3.988984
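The forecast can be reproduced in R with a few lines. This is a minimal sketch that uses only the coefficients quoted above; the variable names are made up for this example, and the final `exp()` line goes one step beyond what the question asks for.

```r
# Coefficients copied from the seasonal factor model quoted above
intercept <- 0.628026
b_time    <- 0.019481
b_lag     <- 0.321058

time_jan98 <- 97   # January 1998 is time period 97
cases_prev <- 54   # notified cases in period 96, as given in the question

# January is the baseline month, so the month2..month12 terms are all 0
log_forecast <- intercept + b_time * time_jan98 + b_lag * log(cases_prev)

log_forecast        # forecast on the log scale
exp(log_forecast)   # back-transformed to a number of cases
```

Note that `log()` in R is the natural logarithm, which is what the worked answer uses.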
20x Past Exam Questions
PAST EXAM PAPER....2007 S2....SECTION A.......
QUESTION 7&10... however I calculate them I can't get the right answers.... could 阿童木 work through them? Thanks...
Question 10
> exp(log((35*155)/(129*11))-1.96*sqrt(1/35+1/11+1/129+1/155))
[1] 1.867189
The answer is the first option.
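The one-liner above only gives the lower limit. Broken into steps it looks roughly like this, with both limits of the 95% interval. The names `n11`, `n12`, `n21` and `n22` are hypothetical labels for the four counts in the formula, and which exam table cell each one belongs to is an assumption.

```r
# The four counts from the formula above; the cell labels are an assumption
n11 <- 35;  n12 <- 129
n21 <- 11;  n22 <- 155

# Log odds ratio and its standard error
log_or    <- log((n11 * n22) / (n12 * n21))
se_log_or <- sqrt(1/n11 + 1/n12 + 1/n21 + 1/n22)

# 95% confidence interval, back-transformed to the odds-ratio scale
ci <- exp(log_or + c(-1, 1) * 1.96 * se_log_or)
ci   # ci[1] is the lower limit computed above
```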
Question 7
> exp(2.797203+0.010852*85+0.509277*log(104660))
[1] 14854.93
The answer is the fourth option.
I'd like to ask about.... 2007 S1
question 3: why is option 3 wrong?
question 7: why is option 2 wrong? And how are options 3 and 5 calculated...
question 14: why is option 1 wrong?
Thanks
2007S1
Question 3
(3) We estimate that the odds of dying for a patient admitted to ICU with an infection are about 0.77 times the odds for a patient admitted without an infection.
Coefficients:
Estimate Std. Error z value Pr(>|z|)
INF 0.76525 0.41681 1.836 0.066360 .
> exp(0.76525)
[1] 2.149532
To get the odds you need to back-transform the coefficient. The 0.77 in the option has not been back-transformed.
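As a rough sketch, here is the back-transformation together with an approximate 95% interval. The interval is an extra that the question does not ask for; it assumes the usual estimate ± 1.96 × standard error on the log-odds scale, and the variable names are made up.

```r
est <- 0.76525   # INF estimate from the output above (log-odds scale)
se  <- 0.41681   # its standard error

odds_ratio <- exp(est)                        # multiplier for the odds of dying
wald_ci    <- exp(est + c(-1, 1) * 1.96 * se) # approximate 95% interval

odds_ratio
wald_ci
```

The interval includes 1, which fits with the P-value of 0.066 being above 0.05.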
Question 7
Coefficients:
Estimate Std. Error t value Pr(>|t|)
time[-1] 1.194e-02 1.792e-03 6.664 9.77e-11 ***
2) For each additional unit of time, the CO2 concentration increases by about 0.012 ppm, on average.
This statement is wrong because it is not the CO2 concentration that increases by about 0.012; it is the mean CO2 concentration that increases by about 0.012.
(3) We estimate that the autocorrelation coefficient (rho) is about 0.78
Coefficients:
Estimate Std. Error t value Pr(>|t|)
dioxide.ts[-384] 7.777e-01 3.289e-02 23.648 < 2e-16 ***
7.777e-01 is the estimated rho.
(5) A point estimate for the difference in average CO2 concentration between July and August of each year is about 0.887 ppm, having allowed for the trend and for autocorrelation.
Coefficients:
Estimate Std. Error t value Pr(>|t|)
month[-1]7 -1.636e+00 1.273e-01 -12.846 < 2e-16 ***
month[-1]8 -2.523e+00 9.083e-02 -27.782 < 2e-16 ***
> -2.523--1.636
[1] -0.887
Question 14
(1) A high value for R2 indicates that our model will be useful for prediction.
A regression model is useful for prediction only if the R2 is high and the normality assumption is also satisfied.
2007 SS
question 9
How is option 1 calculated?
Why is option 5 wrong?
2007SS Question 9
(1) Using the regression model, our point prediction for the 1st quarter 2003 is about 15.4 million litres.
> beer.pred
fit upr lwr
2003 Q1 15224.24 19383.60 11064.886
15224 is roughly equal to 15.4 million.
(5) The point prediction for the 4th quarter of 2004, using the regression model does not lie within the prediction interval we obtained using the Holt-Winters technique.
In Holt-Winters, the point estimate is the midpoint of the C.I., so "does not lie within" is wrong.
thanks...
For 2006 Second semester ...
question 10. how to get the answers for option 4 and 5?
2006SC Question 10
> predictions
fit upr lwr
Jan 1973 0.3860656 0.7937101 -0.021578940
(4) Our point prediction for ozone concentration for January 1973 is about 1.47 ppm.
(5) We estimate ozone concentration for January 1973 will be between 0.98 ppm and
2.21 ppm.
> exp(c(0.3860656,0.7937101,-0.021578940))
[1] 1.4711812 2.2115864 0.9786522
For first semester 2006
question 5
the option 2
why can't we quantify the difference in average ages using this analysis?
That's because the analysis was done on the transformed data.
summer school 2006
question 18 b) Why is a chi-square test used... isn't hourly attendance a quantitative variable?
Because this is a trick: hourly attendance is recorded (counted) only once, so you do not have a quantitative variable. It is not a variable, it is only a count.
2007 S2
Q9 option 4: how is it calculated? Thanks
2007SC Question 9
(4) We estimate that if a patient required resuscitation, the odds of survival are
multiplied by about 0.2
Coefficients:
Estimate Std. Error z value Pr(>|z|)
CPR -1.63066 0.61553 -2.649 0.00807 **
> exp(-1.63066)
[1] 0.1958003
I'm back again~
I'd like to ask about.. QUESTION 8 of 2006 SUMMER SCHOOL
Why is the first option correct... and the second one wrong?
2006 SS
Question 8
(1) The residuals appear to be reasonably symmetric.
Residuals:
Min 1Q Median 3Q Max
-0.060829 -0.013325 0.002093 0.012705 0.047020
The five-number summary shows that the residuals are roughly symmetric.
(2) The t-statistic for testing the hypothesis that there is no trend is about 6 million litres.
Coefficients:
Estimate Std. Error t value Pr(>|t|)
time[-1] 0.0016375 0.0002039 ***** 2.43e-13 ***
t = Estimate / Std. Error:
> 0.0016375/0.0002039
[1] 8.030897
So the t-statistic is about 8, not 6, and a t-statistic has no units.
2007 SS
question 10:
Why is the first option correct and the second option wrong?
Also, doesn't question 18 ask us to choose which form of analysis should be used to investigate?
In what situations would we use logistic regression?
2007SS Question 10
(1) The confidence interval for the odds of dying for patients admitted in a semiconscious state compared to the odds of dying for patients admitted in a conscious state will contain 1
Coefficients:
Estimate Std. Error z value Pr(>|z|)
LOC1 18.60221 1039.99658 0.018 0.985729
The P-value 0.985729 refers to the hypothesis that the log odds associated with LOC1 is zero. After back-transforming, exp(0) = 1, so the confidence interval should contain 1, because with a non-significant P-value the interval contains the hypothesised value.
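This can be checked numerically as well. The sketch below assumes an approximate 95% interval of the form estimate ± 1.96 × standard error on the log-odds scale; the variable names are made up.

```r
est <- 18.60221     # LOC1 estimate from the output above (log-odds scale)
se  <- 1039.99658   # its standard error

# Approximate 95% interval on the log-odds scale
ci_log <- est + c(-1, 1) * 1.96 * se
ci_log

# TRUE when 0 is inside the interval, i.e. 1 is inside after back-transforming
ci_log[1] < 0 && 0 < ci_log[2]
```

The interval is so wide that `exp(ci_log)` would underflow and overflow on a computer, which is why the check is done on the log scale here.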
(2) For each additional year of age of patients admitted to ICU, the odds of dying increase by about 0.03
Coefficients:
Estimate Std. Error z value Pr(>|z|)
AGE 0.03072 0.01268 2.423 0.015377 *
The log odds increase by about 0.03, not the odds.
For question 18, you use logistic regression when your response variable is a qualitative binary variable.
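To see what the AGE coefficient means for the odds themselves, it can be back-transformed like the other logistic regression coefficients. A minimal sketch, with made-up variable names:

```r
b_age <- 0.03072   # AGE estimate from the output above (log-odds scale)

odds_mult  <- exp(b_age)             # the odds are multiplied by this per extra year
pct_change <- 100 * (odds_mult - 1)  # the same effect as a percentage

odds_mult
pct_change
```

So each additional year of age multiplies the odds of dying by a factor slightly above 1 (an increase of about 3%), rather than adding 0.03 to the odds.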
[ Last edited by z-score on 2008-2-28 21:11 ]