Statistics

Read: How Numbers Rule the World

Disappointing.

How Numbers Rule The World is mostly not about the use and abuse of statistics in global politics, as promised on the book’s cover. The lack of good, appropriate statistics and the over-reliance on only seemingly objective measures are themes of Fioramonti’s book, yes. His focus, however, is on his aversion to market solutions for social challenges and to the creeping commodification of all domains of life.

I agree: a badly designed market, a sloppy implementation of a well-designed market, and special allowances that circumvent the intention of a well-intentioned policy (like a market for emissions) are all issues that deserve a good rant. (Blind) management by numbers, fudging of data, badly designed incentives, and rent-seeking behavior are also topics that deserve a good rant. Put these in a book with a title that promises an in-depth analysis of the use and abuse of statistics in politics and of how statistics shape the world, and you have a serious case of false advertising, wilful deceit, and a disappointed reader.

Pair this with constant references to quips by prominent and less well-known persons that are irrelevant to the issues at hand, references to anecdotal screw-ups in support of the rant, and the occasional tangential discussion of philosophical issues, and the reader is not just disappointed.

Read: Counterfactuals and Causal Inference

Counterfactuals and Causal Inference is a very practical book that discusses the different approaches to identifying causal effects (in non-experimental and experimental data) at a very abstract level. Depending on the reader, this may be a good or a not so good thing. I had to expend substantial effort to work through the text, and I fear that even though I understand directed acyclic graphs, I have not developed any intuition for their application that would help me in my applied modelling. Often, the text remains at too abstract a level.

What the text is missing is an even more practical guide with more concrete applied problems and their solutions. Yet, the text is good. It’s not a handbook for a quick how to do it. It’s not a textbook for undergraduates. It’s a critical survey of the state of the art of statistical approaches for the identification of causal effects. It’s a valuable reminder that the regression approach is no magic bullet.

That being said, the text raises the important question of identification and alerted me that some effects that we estimate and report may not be the effects that we would like them to be. I guess I will have to be even more careful when I interpret regressions in the future.

Addendum: I read the first edition, which had been sitting on my to-read shelf for some years. I just discovered that there is a second, revised edition available.

Read: Statistics Done Wrong

Reinhart’s Statistics Done Wrong is a refreshingly entertaining exposition of typical and embarrassingly widespread problems with the statistical analysis in (published) research.

It is not a textbook. It is non-technical. There are no formulas and only very few numbers. Nevertheless, it teaches the art of statistics. It may even instill the wish in the (un)initiated reader to pick up a statistics textbook and finally learn the stuff. As such it may be a good gift for a first year PhD researcher. Knowing about statistical power and related concepts before any data is collected can dramatically improve any research design and thus the final research (article).
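To make the point about statistical power concrete, here is a minimal sketch that is not taken from the book. It assumes a two-sided comparison of two group means with equal group sizes and a known common standard deviation, so the normal approximation applies; the function name `two_sample_power` and the chosen effect size of 0.5 are my own illustrative assumptions.

```python
from statistics import NormalDist


def two_sample_power(effect_size, n_per_group, alpha=0.05):
    """Approximate power of a two-sided two-sample z-test.

    effect_size is the standardized mean difference (Cohen's d).
    Assumes equal group sizes and a known common standard deviation,
    which is a simplification compared to a real t-test.
    """
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    # Expected value of the test statistic under the alternative.
    shift = effect_size * (n_per_group / 2) ** 0.5
    # Probability of landing in either rejection region.
    return std_normal.cdf(shift - z_crit) + std_normal.cdf(-shift - z_crit)


if __name__ == "__main__":
    # How power grows with sample size for a medium effect (d = 0.5).
    for n in (10, 20, 40, 64, 100):
        print(n, round(two_sample_power(0.5, n), 2))
```

Under these assumptions, a study with 20 participants per group would detect a medium effect only about a third of the time, while roughly 64 per group are needed to reach the conventional 80%. This is the kind of calculation that is cheap before data collection and useless afterwards.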

There is nothing new in Statistics Done Wrong. All problems and all the examples chosen to illustrate them are already well known or were at least discussed in the usual blogs on applied statistics and data analysis. It is obvious that Reinhart follows, e.g., Andrew Gelman’s blog. Of course he does. Everyone interested in the use and abuse, in good and bad practice of statistics follows (or should follow) Andrew’s blog. Nevertheless, Reinhart adds value. His writing is clear and accessible.

I have only one quibble: Reinhart states in the preface that he is not advocating any of the recent trends in and attempts to improve the practice of statistics, be it the complete abandonment of p-values, the use of “new statistics” based on confidence intervals, or a switch to Bayesian methods. Actually, he is advocating rather strongly for the use of the “new statistics”. He advocates the use of effect size estimates and confidence intervals over vanilla p-values. This is absolutely fine. Yet, he should openly stand by this position and not deny it.

Read: Generalized Linear Models for Categorical and Continuous Limited Dependent Variables

On first impression, the small textbook by Smithson and Merkle is a nice companion for Agresti’s Categorical Data Analysis and Analysis of Ordinal Categorical Data. It briefly discusses the theoretical foundation of the applied modelling approaches, explains the models using concrete examples, and provides a brief introduction to the relevant R (and Stata) functions.

On more careful inspection, however, it becomes clear that the discussions are often too shallow. In particular, the applied models would have benefited from more detail. The reader is referred to other textbooks for the missing details that would be necessary to really learn and understand why a certain approach should be taken and how to interpret and check any estimations. The text cannot stand alone. Its contribution is, thus, a mere cursory overview of a few select functions in R (and Stata). Some additional functions for R are provided on an accompanying webpage. This, of course, raises the question of why the authors did not package these functions in an R library made available on CRAN, the standard electronic archive for R.

What really made me question the text, however, were phrases like: “…its p value is 0.057, which conventionally would not be regarded as not quite significant…”, and “This model is not quite significantly superior to the preceding one (… p=0.068).” This is not quite good scientific practice. In a textbook of all things.

Read: Understanding The New Statistics

Understanding The New Statistics is about understanding statistics and applying statistical methods that are not new at all. They are just under-used in the social and behavioral sciences.

It is all about abandoning Null Hypothesis Significance Tests and replacing them with the more informative Effect Sizes and Confidence Intervals. Targeted at students as a complementary text to their standard textbook, the most important and distinguishing feature of Cumming’s book is its attempt to create intuition for the variability of data and derived statistics. The many exercises that rely on simulating (small) data (sets) and observing the variability of summary statistics are a great tool for understanding the properties and interpretation of these statistics.
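The kind of simulation exercise described here does not need a spreadsheet. The following is a rough sketch of my own, not a reproduction of the book's software: it repeatedly draws small samples from a normal population and records the sample mean with its 95% confidence interval. The population parameters, the sample size of 10, and the function names are assumptions chosen for illustration, and the t critical value is hard-coded for 9 degrees of freedom to keep the example within the standard library.

```python
import random
import statistics

# Two-sided 95% critical value of Student's t with 9 degrees of freedom.
T_CRIT_DF9 = 2.262


def simulate_intervals(n_experiments=1000, n=10, mu=50.0, sigma=10.0, seed=1):
    """Return a list of (mean, lower, upper) for repeated samples of size n."""
    if n != 10:
        raise ValueError("the hard-coded t value only applies to n = 10")
    rng = random.Random(seed)
    results = []
    for _ in range(n_experiments):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        mean = statistics.fmean(sample)
        std_error = statistics.stdev(sample) / n ** 0.5
        margin = T_CRIT_DF9 * std_error
        results.append((mean, mean - margin, mean + margin))
    return results


def coverage(results, mu):
    """Share of intervals that contain the true population mean."""
    return sum(lower <= mu <= upper for _, lower, upper in results) / len(results)


if __name__ == "__main__":
    intervals = simulate_intervals()
    means = [mean for mean, _, _ in intervals]
    print("smallest and largest sample mean:", min(means), max(means))
    print("share of intervals covering mu:", coverage(intervals, 50.0))
```

Looking at how far the individual sample means stray from the true value, and at the fact that roughly one interval in twenty misses it, conveys the variability of summary statistics more directly than a formula does.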

Nevertheless, beyond facilitating said intuition, the text has little additional value. The theory and the necessary math are often not presented. The exercises and indeed much of the book rely on (free) proprietary software that I cannot use, since it depends on commercial software that I don’t own and would never have used for statistics (Excel). Therefore, much of the text remained cryptic. I would have preferred an open source approach, maybe an R package.

Further, for a text that is advocating replacing NHST with substantial statistics on effect sizes and uncertainty there are too many asterisks signifying different levels of statistical significance. More surprising was, however, the absence of any glimpse at Bayesian methods that would fit the bill perfectly, showing likely effect sizes and their corresponding uncertainty. In the context of meta-analysis I would have expected an updating of our beliefs, a Bayesian aggregation of the accumulating evidence. Instead, the text remains 100% frequentist.
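What I mean by a Bayesian aggregation of accumulating evidence can be sketched in a few lines. This is a deliberately simple illustration under strong assumptions, not something from the book: a normal prior on the effect, normally distributed study estimates with known standard errors, and a single common effect across studies. The function names and the study numbers are made up for the example.

```python
def update_normal(prior_mean, prior_sd, estimate, std_error):
    """Conjugate normal update: combine a prior with one study estimate.

    Both sources are weighted by their precision (1 / variance).
    """
    prior_precision = 1 / prior_sd ** 2
    data_precision = 1 / std_error ** 2
    post_precision = prior_precision + data_precision
    post_mean = (prior_precision * prior_mean + data_precision * estimate) / post_precision
    return post_mean, post_precision ** -0.5


def pool(studies, prior_mean=0.0, prior_sd=10.0):
    """Update beliefs study by study; return the (mean, sd) after each one."""
    mean, sd = prior_mean, prior_sd
    history = []
    for estimate, std_error in studies:
        mean, sd = update_normal(mean, sd, estimate, std_error)
        history.append((mean, sd))
    return history


if __name__ == "__main__":
    # Hypothetical (effect estimate, standard error) pairs, invented for illustration.
    studies = [(0.60, 0.30), (0.20, 0.25), (0.35, 0.15)]
    for step, (mean, sd) in enumerate(pool(studies), start=1):
        print(f"after study {step}: mean = {mean:.3f}, sd = {sd:.3f}")
```

Each new study pulls the posterior mean towards its estimate in proportion to its precision, and the posterior standard deviation shrinks with every study. With a very wide prior, the final result coincides with the usual inverse-variance weighted fixed-effect estimate.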

In the end, the text is maybe not for the student but for the teacher. And maybe the text should not be read for its content in a narrower sense but for the ideas on pedagogy on how to teach introductory statistics.

Read: How not to be wrong

With “How not to be wrong” being about mathematical thinking I was a bit surprised about how much of it was about statistics. And even though it (may) lack(s) the depth of critique of the (ab)use of statistics that can be found in the works of Ziliak and McCloskey or Gigerenzer it is a very good popular treatment of the topic. Worth the read.

A particular added value is, in my opinion, the reminder that most things in the real world are not linear. Linearity is just an approximation, valid for only (very) small ranges. I agree with Ellenberg: we (I) forget this too often.

The only thing that I did not like was the sports references (I can condone idiosyncratic tastes in music). The book includes lots of footnotes and endnotes with references. So many, and so many recent ones that I, indeed, found a few new sources that I added to my to-read list. That is rare.
