Wednesday, February 7, 2007

jv ch. 3

here are my hopefully helpful but nevertheless random thoughts as i read back through JV chapter 3. unfortunately, our two texts begin to diverge for a bit at this point. it was a nice bit of synchronicity that JV chapter 2 and G&E chapter 3 largely overlapped in terms of summary statistics, but, for the next little while, each book takes a somewhat different tack. in a way, i think this is good, because, on its own, G&E would be a bit too theoretical, whereas JV would be a bit too pragmatic. i think the two balance each other nicely in this regard, although i do wish the content they cover meshed a bit more consistently. looking ahead, we'll return to synchronized treatments of the fairly detailed topics of regression (which we touch on in this chapter) and ANOVA.

imho, some things are better done in spreadsheets, at least until you get the hang of the R way of doing things, so if you find yourself getting bogged down with binding vectors and adding margins in the early part of the chapter, i'd say you can safely skim it, and just be aware that you can do such things. in general, spreadsheets (such as microsoft excel) do this more easily and intuitively, and may be the tool of choice if you wish to do this for a big data set.
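for what it's worth, here is a minimal sketch of what "binding vectors and adding margins" boils down to in R. the counts and the names (smokers, nonsmokers, tab) are made up by me for illustration and are not from JV; base R also has addmargins() for tables, but i've spelled the totals out by hand so you can see what's going on.

```r
# made-up counts, purely for illustration (not from JV)
smokers    <- c(low = 12, medium = 7, high = 3)
nonsmokers <- c(low = 20, medium = 15, high = 9)

# bind the two vectors together as the rows of a small table
tab <- rbind(smokers, nonsmokers)

# add a column of row totals, then a row of column totals
with_rows <- cbind(tab, total = rowSums(tab))
with_both <- rbind(with_rows, total = colSums(with_rows))

print(with_both)
```

this is the spreadsheet equivalent of summing across each row and down each column.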

i do think it's pretty cool when JV shows you how to produce the side-by-side boxplots and overlapping density plots, and that skill will be useful in the future.
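a rough sketch of both plots, using simulated data rather than JV's examples (the vectors a and b are my own invention, and the plotting choices are just one way to do it):

```r
# simulated data, purely for illustration (not JV's data sets)
set.seed(1)
a <- rnorm(50, mean = 10, sd = 2)
b <- rnorm(50, mean = 12, sd = 3)

# side-by-side boxplots: one box per vector
boxplot(a, b, names = c("a", "b"), main = "side-by-side boxplots")

# overlapping density plots: draw the first, then add the second on top
da <- density(a)
db <- density(b)
plot(da,
     xlim = range(da$x, db$x),        # make room for both curves
     ylim = c(0, max(da$y, db$y)),
     main = "overlapping densities")
lines(db, lty = 2)                    # dashed line for the second group
```

setting xlim and ylim from both density estimates keeps the second curve from being clipped by the axes of the first.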

the q-q plots are a bit arcane, and i wouldn't spend too long on them. imho, there are better (if less visual) ways of checking the normality of your data.
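for reference, a small sketch of a q-q plot next to one non-visual check. the shapiro-wilk test is my own pick for the example, not something JV covers at this point, and the data are simulated:

```r
# simulated data, purely for illustration
set.seed(2)
x <- rnorm(100)

# the visual check: points should fall near the line if x is roughly normal
qqnorm(x)
qqline(x)

# a non-visual check: shapiro-wilk test of normality
# (a small p-value is evidence against normality)
sw <- shapiro.test(x)
print(sw$p.value)
```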

scatterplots are particularly important, as are correlation and regression. be aware, though, that we'll be coming back to regression later in the semester; it still makes good sense to at least introduce it here. i think the short bits on transformations and outliers are worth reading closely, too, though, again, we'll be coming back to them.
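to tie the three together, here is a minimal sketch with simulated data (x, y, and the "true" slope of 0.5 are made up by me, not taken from JV):

```r
# simulated data, purely for illustration
set.seed(3)
x <- runif(40, min = 0, max = 10)
y <- 2 + 0.5 * x + rnorm(40)

plot(x, y)            # scatterplot

r <- cor(x, y)        # pearson correlation coefficient
fit <- lm(y ~ x)      # simple linear regression of y on x
abline(fit)           # add the fitted line to the scatterplot

print(r)
print(coef(fit))      # intercept and slope
```

with a single predictor, the R-squared reported by summary(fit) is just the square of the correlation, which is a handy way to see how the two ideas connect.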

2 comments:

Nicole Michel said...

Hey Mike,

Thanks for sharing your thoughts on the chapter and what's important to focus on. 'Course, I didn't think to check this until after I'd already read the chapter and blogged about it (I commented on the qqplots, too - I don't get them). C'est la vie...

Rebecca Hazen said...

Mike,

I'm glad you wrote a bit about the two texts. I have to admit, I was a bit concerned that two stats texts would be a lot to take on in one course, but they have been melding together well so far. Also, as you noted in your posting, they balance each other nicely since G&E gives a theoretical spin to the material, while JV shows us how to apply it.

One more thing: it sounds like you might be implying that we will be able to make R and Excel "talk" to each other. Is that possible? That would be such a relief! Even though I don't have a very large data set, I've been dreading the idea of typing all of my data into R.

Rebecca