Meta-Analysis and the Aggregation of Chaos
December 7th, 2014
by Philip Pilkington
Did you know that if you are male and eat beans every Tuesday morning at exactly 8.30am you are more likely to marry a supermodel? No. That’s not true. I just made that up. But I hear of statistical studies in the media that sound only slightly less ridiculous all the time. Often these have to do with diet, sexual psychology or… economics.
All three of these spheres are, of course, the sorts of things you find dealt with in the religious and mythological texts of old. This is because they are key psychological aspects of how we as humans form our identities. The manner in which we eat, what would today be called our sexual orientation/preferences (it should be noted that this was treated very differently prior to the 19th century…) and how we organise our societies are things that constitute key components of our personal identities.
These are slippery aspects of existence. Because they are effectively moral issues we as humans need to feel that they are constant throughout time and space. But anyone with any historical or cultural understanding knows that these shift this way and that over time. Diet fads fluctuate rapidly, while cuisines of various types go in and out of fashion. Sexual norms change from decade to decade (homosexuality was considered a mental disorder in the West until 1973!). And if you need to be told that fads in economic policies are historically contingent and reflective of the politics of the day then you probably shouldn’t be reading this blog.
The Age of Reason and Assumed Constancy
Science dreams of reducing all of this to Reason. It has since at least the 19th century when religion fell by the wayside and science tried to fill the void. In every era there is some hocus pocus thrown up wearing the clothes of the scientist and handing down Moral Truths: about how we should eat, how we should conduct ourselves sexually and how we should run our societies. In the past 40 or so years these questions have increasingly fallen to social science disciplines (and dieticians) who use statistical techniques.
The problem is that the nature of the material that they are dealing with is not suited to the techniques they are using. The nature of the material is that it changes and evolves through time. We cannot anticipate these changes to any large extent either. Doing so would be like trying to predict what style of dress will be popular in 2080. This leads to the statistical literature generally being a mess. Indeed, the literature itself seems to evolve through time together with the data and the ideological fads that emerge and die off. I increasingly think that the statistical literature is coming to mirror the trends themselves but with a lag.
Seeking Order Out of Chaos
The latest attempt to impose some order on this chaos is the practice of so-called ‘meta-regression’. The idea is to take all of the studies showing all of the contradictory results, aggregate them and run regressions on them. In sciences where the material is suited to statistical study — that is, in sciences where causality does not change and evolve through time — this is quite sensible. But where the material doesn’t accommodate this, such analysis likely only amplifies the underlying problems.
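To make concrete what a meta-regression actually does mechanically, here is a minimal sketch using entirely made-up numbers (none of these figures come from any real study): each ‘study’ contributes an effect estimate, a standard error and a study characteristic, and the estimates are regressed on that characteristic, weighting each study by the inverse of its variance.

```python
# A minimal meta-regression sketch with invented numbers.
# Each "study" reports an effect estimate, its standard error, and a
# characteristic (here, a hypothetical sample size) to regress on.
import numpy as np

# Hypothetical results from eight studies of the "same" effect
effects = np.array([0.42, 0.10, 0.55, -0.05, 0.30, 0.18, 0.61, 0.07])
std_errs = np.array([0.20, 0.08, 0.25, 0.05, 0.15, 0.10, 0.30, 0.06])
sample_n = np.array([50, 400, 30, 900, 120, 250, 25, 700])

# Weight each study by inverse variance, as in a fixed-effect model
weights = 1.0 / std_errs**2

# Weighted least squares: effect_i = b0 + b1 * log(n_i) + error_i
X = np.column_stack([np.ones_like(effects), np.log(sample_n)])
W = np.diag(weights)
beta = np.linalg.solve(X.T @ W @ X, X.T @ W @ effects)
print(f"intercept: {beta[0]:.3f}, slope on log(n): {beta[1]:.3f}")
```

In these made-up data the small studies report big effects and the large studies report small ones, so the slope comes out negative — the sort of pattern meta-regression practitioners read as publication bias. Notice, though, that nothing in the machinery asks whether the underlying causal mechanism was stable across the studies being pooled; that assumption is simply built in.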
Take, for example, the following paper, ‘Wheat From Chaff: Meta-Analysis As Quantitative Literature Review’ by T.D. Stanley. In the paper Stanley says that we should use meta-regressions to do our literature reviews. The problem is that this assumes that the studies on which we run the meta-regressions have some underlying validity in the first place: that is, that they can give us information about certain causal laws that will hold into the future.
Some of the examples that Stanley gives where meta-analyses have been applied in the past seem reasonable, others do not.
There are many examples where meta-analyses clarified a controversial area of research. For example, meta-analysis has been used to establish a connection between exposure to TV violence and aggressive behavior (Paik and Comstock, 1994), the efficacy of coronary bypass surgery (Held, Yusuf and Furberg, 1989), the risk of secondhand smoke (He et al., 1999), and the effectiveness of spending more money on schools (Hedges, Laine and Greenwald, 1994; Krueger, 1999). (p133)
The efficacy of coronary bypass surgery seems very reasonable. We know the mechanism through which this is supposed to work. But there still arises the question of environment. I should hope, for example, that the meta-analysis is being run on people in countries with similar diets and weather, and who come from similar income groups. This raises an issue that we shall encounter more critically in a moment.
The risk of second-hand smoke is slightly more dubious. This, as is well-known, is not something that is particularly easy to prove. I do not know how they do these studies but I would assume that they would look for instances of lung and heart disease in non-smoking people who co-habit with smokers. Something along these lines would be a reasonable approach. Again, this is because we know the mechanism through which smoking causes these diseases and we know that this has relative constancy through time and space.
Spending money on schools is far more difficult. First of all, Stanley doesn’t say what spending more money on schools is effective for. We can only assume that it has to do with educational outcomes. Personally I believe that spending more money on schools is generally effective in this regard simply due to intuition and personal experience. But it is not quite clear that we can meaningfully test it in statistical terms, nor is it clear that we should ever make such claims except in a very general sense. The causal mechanism is not clear here. There are many ways in which this money can be spent. It is also not clear that spending money will fix problems in all schools. Some schools may have issues related to funding. But some may have issues that have little to do with this: the class background of the children who attend or the structure of the testing regime come to mind as issues that may not be related to funding. Here we are beginning to see that the causes and effects become murky. While every smoker suffers from basically the same cause and effect mechanisms, this seems less likely in the case of schools.
The study linking TV violence and aggression sounds the alarm for me. That sounds like garbage. The causal link here seems highly abstract and based on some crude mechanistic stimulus-response view of human psychology. The methodological issues also seem problematic: is this a lab experiment or is it based on survey results? Both suffer from serious problems. I also see no way to establish causation: do people with violent tendencies watch violent TV programs or vice versa? If we cannot establish causation any information we do glean from the study — even if we believe in the study itself — will be largely useless.
I could look at all the studies individually, but I — like you, dear reader — have limited time. We all need some sort of filtering system to sort sense from nonsense, and what I just demonstrated above is how I tend to think about these issues, whether in economics or when reading the newspaper. And I think it is pretty functional.
Anyway, back to meta-regressions. The problem with these is that they aggregate even more than the studies themselves. This is fine when we are dealing with material that is homogeneous through time — that is, material where the causality is fairly stable — but it will not work where the causality is slippery. In the above examples again I would highlight the studies linking TV violence to aggression.
I have dealt with this question on here before. But let me give a practical example: that of the multiplier. Let’s say that I need to give a politician a number for the fiscal multiplier in their country. Now, many economists — assuming that causality is constant through time — would get as much time-series data as possible for the country in question and run regressions. But let’s say that some extreme event had happened in the past five years like, oh I don’t know, a financial crisis. I would think that the multiplier would likely have changed from before this crisis. Thus the question is raised whether we should estimate the multiplier using the whole time-series or using the data from after the crisis. My gut would say that we should probably use the data after the crisis but there are probably some ways to look into this in more depth.
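The stakes of that choice can be illustrated with a toy simulation (all numbers invented, purely for illustration): generate quarterly data where the true ‘multiplier’ — here just the coefficient of output growth on spending growth — shifts after a crisis, then estimate it on the full sample and on the post-crisis sub-sample alone.

```python
# A toy illustration of full-sample vs post-crisis estimation,
# with entirely made-up data. The true multiplier is 0.6 before
# the "crisis" and 1.5 after it; pooling the two regimes yields
# an estimate that describes neither.
import numpy as np

rng = np.random.default_rng(0)

# 40 quarters pre-crisis, 20 quarters post-crisis
g_pre = rng.normal(1.0, 0.5, 40)   # spending growth, pre-crisis
g_post = rng.normal(1.0, 0.5, 20)  # spending growth, post-crisis
y_pre = 0.6 * g_pre + rng.normal(0, 0.3, 40)    # output growth
y_post = 1.5 * g_post + rng.normal(0, 0.3, 20)

g = np.concatenate([g_pre, g_post])
y = np.concatenate([y_pre, y_post])

def ols_slope(x, y):
    """Slope from a one-regressor OLS with an intercept."""
    X = np.column_stack([np.ones_like(x), x])
    return np.linalg.lstsq(X, y, rcond=None)[0][1]

print(f"full-sample multiplier estimate: {ols_slope(g, y):.2f}")
print(f"post-crisis multiplier estimate: {ols_slope(g_post, y_post):.2f}")
```

The full-sample estimate lands somewhere between the two regimes, a number that was never the multiplier at any point in time. Whether to split the sample — and where — is exactly the judgment call the aggregating economist skips over; a formal check along the lines of a Chow test for a structural break would be one way to look into it, though even that presupposes you know the break date.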
Working from an Insular, Consensus-driven and Insecure Basis
The point is that we at least need to raise the question. But economists often do not. They aggregate, aggregate, aggregate. They choose datasets willy-nilly. They assume constant, homogeneous causes. Why? Because, I think, they are more often than not already sure of what they are going to say and they use the empirical techniques to dress this up. There is a risk then that using meta-analyses will only give us a reflection of the average opinion of the economics community at any given moment in time. But these opinions are extremely prone to fads because the economics community is insular, pretentious, consensus-driven and ultimately insecure. Today NAIRU, yesterday monetarism. Tomorrow? God knows. Beans and supermodels probably wouldn’t be far off.