This is Idol's opening kill-the-messenger bit, which he'll back up by throwing a bit of Latin at you (curiously, after the fact) and waving a few terms of art he's reasonably sure won't be understood, diminishing a number of well-educated people, the institution they're authoring the paper for, and the conclusion, if indirectly, while promoting the idea of his authority to judge the work without establishing his credentials.
If I show how it's wrong, then the showing speaks for itself, and obviously obviates any need for my 'credentials,' since I'm in no way appealing to my own authority here.
Everything I wrote regarding statistics is factual. The study, while statistically valid (which I never denied), is practically insignificant due to the magnitudes of the estimated statistics involved. For one thing, the study seeks to show that the yearly variance is partly explained by a single binary parameter: whether or not the federal "AWB" was in force.
What someone versed in statistics knows is that this is only one part of the problem; the other, far more important part is the intercept, because relative to the intercept the variance is minuscule. If you removed all yearly variance, so that we had a straight line from year to year, that line would sit near 15,000 murders. So what good is, e.g. (Latin), making an 'AWB' law?
I mentioned the magnitude of the estimated coefficient: it's 9 parts in 10,000, or 0.09%, so we'd 'permanently' lower the fraction of murder victims who are murdered in a mass murder by less than a tenth of a percent. That it wouldn't lower the overall murder rate at all is only one of the unspoken conclusions of this study.
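To put that magnitude in concrete terms, here's a quick back-of-envelope in Python, using the ~15,000 murders/year intercept figure I mentioned above (the round baseline is mine, not the study's):

```python
# Back-of-envelope: what a 0.09% shift in the *proportion* of murder
# victims killed in mass murders amounts to in absolute numbers.
baseline_murders_per_year = 15_000   # rough intercept figure from above
coefficient = 0.0009                 # 9 parts in 10,000

shifted = baseline_murders_per_year * coefficient
print(f"~{shifted:.0f} victims/year moved out of the mass-murder column")
# ~14 victims/year -- with zero change to the overall murder count,
# since the dependent variable is a proportion of total murders.
```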
I also mentioned that there were a variety of things the "AWB" did, and some things it did not do (there were plenty of high-capacity magazines for assault weapons floating around, legally, in the US during the period the "AWB" was law), so even the selection of this very imprecise binary parameter isn't without its own difficulties as to what exactly it means that the "AWB" was law. Practically, it did not mean nearly as much as the study's authors appear to think it means.
Do you deny that 0.09% is a tiny change, Town?
Do you deny that that 0.09% estimate is actually the midpoint of a confidence interval that drops down even lower than 0.09%, but excludes zero (which is what is meant by a 0.05 p-value constituting statistical significance)? It means there's a decent chance that the actual value could be higher, or lower, than the already tiny 0.09%.
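For anyone who wants the mechanics of that CI/p-value relationship, here's a minimal sketch; the standard error is a number I invented for illustration, since it isn't reported in what's available of the study:

```python
from scipy import stats

estimate = 0.0009    # the 0.09% point estimate
std_error = 0.0003   # invented for illustration; not reported in the abstract

# A 95% CI that excludes zero is equivalent to p < 0.05 for the coefficient.
lo, hi = estimate - 1.96 * std_error, estimate + 1.96 * std_error
print(f"95% CI: ({lo:.4f}, {hi:.4f})")   # (0.0003, 0.0015) -- excludes zero

z = estimate / std_error                  # normal approximation
p = 2 * (1 - stats.norm.cdf(abs(z)))
print(f"p = {p:.4f}")                     # ~0.003, i.e. below 0.05
```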
Also, you quoted two paragraphs of mine but seem to only respond to the first one, so I'll requote it and ask a followup:
What the study tries to say is that all things being equal (ceteris paribus) there is an upward, unexplained trend every year of 0.7 in the proportion of people killed in mass murders. Controlling for this unexplained trend, they found that during 1994-2004, there was a 9 out of 10000 decrease in people killed by mass murders.
Can you identify a single error in the above?
Also, 'ceteris paribus' is common lingo in certain fields, perhaps especially fields employing statistical analysis; and I led with the English anyway, only adding the Latin parenthetically to clarify, to those in the know, what I'm talking about. In the age of the wiki, I don't fear the confusion that might have been caused in decades past by dropping a common Latin phrase into otherwise-English prose. Anybody can check to see what's meant there, and they'll see that what I wrote in English is precisely what the Latin means, or 'vice versa.'
It's a neat rhetorical trick.
And what I would say is that it's you with the neat rhetorical trick, trying to cast doubt on my brief analysis of the study you copied-and-pasted.
One thing you should be careful about is that superscript '2' next to the R statistic. The study itself has the 2, making it an R-squared rather than the bare R that your erroneous copy-and-paste has there. And it's incorrect that an R value of 0.3 means 'a third' of the variance is explained. The relationship between R and R-squared is just what you'd think: you square the R value to get R-squared, so an R of 0.3 explains closer to a third of a third of the variance, not a third. But as I said, I inspected the actual published study and saw that the authors did in fact at least get the R-squared interpretation right. So I didn't mention it, since I was addressing the study and not your copy-and-paste job.
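The arithmetic, for anyone following along at home:

```python
r = 0.3
print(r ** 2)        # 0.09 -- about a third of a third of the variance
print((1 / 3) ** 2)  # 0.111..., for comparison; nowhere near a third
```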
Here comes the second leg of that.
What he sort of noted there, is that each year of the ban contains a set of data.
I didn't say that, because the study doesn't indicate that. The study indicates that they took one summary-statistic estimate per year, and that this data set is the 'raw' dependent data used in their analysis, which looks like an extremely simple multiple regression (with two parameters / independent variables: the year, and the binary "AWB").
Each whole year is one datum, iow. As I said, there are about 37 data points in the whole analysis. That is enough to make a valid statistical analysis; it's just a very small dataset, 'ceteris paribus' (all other things being equal). To make a better estimate, you need more data.
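Here's roughly the shape of that analysis in Python (statsmodels), with invented placeholder numbers, since I don't have the study's actual yearly proportions:

```python
import numpy as np
import statsmodels.api as sm

# Placeholder data: ~37 yearly observations (1981-2017), one datum per year.
years = np.arange(1981, 2018)
awb = ((years >= 1995) & (years <= 2004)).astype(int)   # binary "AWB" parameter
rng = np.random.default_rng(0)
prop = (0.005 + 0.00007 * (years - 1981)     # fake upward yearly trend
        - 0.0009 * awb                       # fake "AWB" effect
        + rng.normal(0, 0.001, years.size))  # noise

X = sm.add_constant(np.column_stack([years, awb]))   # intercept, year, AWB
print(sm.OLS(prop, X).fit().summary())   # coefficients, p-values, R-squared, CIs
```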
I did not notice, in reading what is available of the study, which parameters were initially included in their analysis (tested for statistical significance) but subsequently (and rightly) excluded from their model as statistically insignificant contributors. That would have made the study far more helpful: not only identifying the parameters that do statistically significantly correlate with the dependent variable, but also identifying those ruled out as statistically insignificant, so that we can stop thinking about literally every possible factor as we discuss this and other statistical studies regarding gun control.
Also, the fact that of the (count 'em) two parameters they do include in their resultant model, one is a binary parameter and the other is an arbitrary 'year' parameter, tells those who understand statistics just how limited an analysis this is. What explanatory power is there in identifying the year the data was collected? Zero. Are we supposed to roll back calendars in order to achieve lower proportional mass-murder rates? Even given statistical significance, there's no practical significance at all, nothing actionable, in identifying the year as a statistically significant contributor to the dependent variable. Most statisticians wouldn't include this parameter at all, since it readily shows that we have no idea what causes this upward trend.
By normalizing the dependent variable through dividing by total murders, the study's authors lose what could have been clearer data. If they had instead counted the number of mass murders per year, multiplied by total victims, or something like that, they'd have a more sensitive metric to better answer the question the study sets out to answer.
Something else I noticed, as an uncredentialed person, is that while the study indicates that 'year' explains about 30% of the variance in the dependent variable, it never mentions how much additional variance is explained by "AWB." Given the extremely small estimate of the coefficient for that factor, my guess would be that it didn't provide a 'significant' enough boost to the model's explanatory power to warrant calling it out explicitly, but I could be wrong. 'Just a disinterested and uncredentialed person's thoughts on the matter.
There's actually ten years in the ban period
I gave them more credit than they deserved, then. I thought it was 1994-2004 inclusive, but apparently they take 1994's data as part of the '0' setting for the binary "AWB" parameter instead of a '1.' My bad. But it actually makes the study a bit weaker, because now one of the two datasets has only 10 data points instead of 11, which would have been marginally better, but better all the same.
I'll also mention that if we could see plots of the data they used, we might see something like a definitive step change between before and after the "AWB," which would lead a reasonable person to conclude that the real effect of the "AWB" was an otherwise unexplainable increase in the number or severity of mass murders, as compared with the period leading up to the "AWB." iow, perhaps the "AWB" somehow prompted people who wouldn't otherwise have committed mass murder to do so once the "AWB" expired. You can't unring that bell.
and two different periods, before and after the ban, to look at and essentially see if there was any statistical difference notable.
Yes, right. An exceptionally facile analysis.
The estimate was a 0.09% difference, right? It's still a tiny number. Which is why I said the study, while statistically significant, is practically insignificant.
They're saying that if you look at the same rough period before and after the ban you'll see a significant statistical variance occurs during, lower numbers of fatalities by gun.
Lower fatalities proportionally due to mass murders, specifically. 0.09% lower. That tiny number.
Idol says this without any practical demonstration of why it's so, how what appears significant is altered by his introduction of the term "practical."
The magnitude is tiny. It'd be like measuring how high off the ground you can lift a barbell. If I stand upright, I lift it maybe two or three feet; if you try, maybe you get it up a little bit. Say we're each given five or ten or twenty tries at it. I lift it all the way each time, so the estimate of my coefficient is, say, 24 inches, and the p-value of the measurement is 0.000. Your twenty attempts average a quarter-inch, with many measured at zero but a handful an inch or two off the ground, so your estimate is a quarter-inch. The variability of your measurements feeds into the confidence interval for that average; say the lower bound is an eighth or a sixteenth of an inch, which excludes zero, so the p-value of this estimate is also less than 0.05, and it too is statistically significant.
But if you're going to use this analysis to determine who you want lifting something heavy off the ground, you're going to go with me, because the estimate / coefficient when I lift the bar is so much greater than when you try to lift it.
That's the difference between statistical and practical significance, and why statistical significance in and of itself is insufficient to determine whether or not a parameter or model is worth getting excited about.
This study is not worth getting excited about, if you're looking to argue that it's a good idea to 'bring back' an "AWB." It'd be like hiring you instead of me as a furniture mover, thinking we'd be equal to the task, based strictly on the fact that we can each statistically significantly lift the barbell off the ground. The magnitude of the estimate of the coefficient matters. That tiny number.
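If you want to see the barbell example in numbers, here's a toy simulation (every figure invented, obviously):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

# Twenty attempts each, measured in inches off the ground.
me  = rng.normal(24.0, 0.5, 20)                    # ~two feet, every time
you = np.clip(rng.normal(0.25, 0.4, 20), 0, None)  # mostly near zero

# Both one-sample tests against "no lift at all" come back significant...
print(stats.ttest_1samp(me, 0).pvalue)    # effectively 0.000
print(stats.ttest_1samp(you, 0).pvalue)   # likely below 0.05 as well

# ...but the magnitudes are what tell you who to hire to move furniture.
print(me.mean(), you.mean())              # ~24 inches vs. ~a quarter inch
```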
An abstract is not the entirety of the thing, is a summary of sorts. It's examining the years in question to determine a particular impact. It manages to do exactly that.
Whatever that means. It's a simple binary parameter, a certain way to carve a relatively small dataset into two groups. As an aside, one could just as easily take the 1994-2004 years and compare them with the other years using a t-test, and the estimate and confidence interval for the difference would come out the same. A multiple regression is one way to essentially run many simultaneous t-tests at once, while also being able to include continuous independent variables, as in this study, where one of the two significant factors is 'year.'
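To illustrate that equivalence concretely (with made-up numbers, and assuming the equal-variance pooling that plain OLS uses):

```python
import numpy as np
import statsmodels.api as sm
from scipy import stats

rng = np.random.default_rng(2)
ban    = rng.normal(0.004, 0.001, 10)   # fake proportions, ban years
others = rng.normal(0.005, 0.001, 27)   # fake proportions, all other years

# Two-sample t-test, equal variances assumed (as in plain OLS)
t = stats.ttest_ind(ban, others, equal_var=True)
print(t.statistic, t.pvalue)

# The same comparison as a regression on a 0/1 "ban" indicator
y = np.concatenate([ban, others])
X = sm.add_constant(np.concatenate([np.ones(10), np.zeros(27)]))
fit = sm.OLS(y, X).fit()
print(fit.tvalues[1], fit.pvalues[1])   # identical t and p to the t-test
```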
To perform a categorical analysis (without any continuous variables), you could use an ANOVA to simultaneously analyze multiple categorical (not continuous) independent variables. For example, the authors could have had three categories of data: the years preceding the "AWB," the "AWB" years, and the years following the "AWB," and tested for a statistically significant signal between these three categories. That wouldn't have been the worst idea, and probably would have captured the general rise in people killed in mass murders more precisely, is my guess. Or it would have shown that before and after the "AWB" are relatively similar, with the "AWB" years being statistically lower. Either way, it would have provided more insight into the data.
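A sketch of that three-category version (again with invented yearly numbers):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
before = rng.normal(0.004, 0.001, 14)   # 1981-1994, fake proportions
during = rng.normal(0.003, 0.001, 10)   # 1995-2004, the "AWB" years
after  = rng.normal(0.006, 0.001, 13)   # 2005-2017

# One-way ANOVA: does at least one period's mean differ from the others?
f, p = stats.f_oneway(before, during, after)
print(f, p)   # p < 0.05 would indicate a signal between the three periods
```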
Idol's sorrows notwithstanding, he's wrong.
The years before and after are chosen for a purpose. The rule, the norm, the average that either continues uninterrupted or is impacted by some change is found in that examination. The reason you want years before and after, and why those years are markedly more valuable to understanding the answer to that particular inquiry than randomly chosen sets of data, isn't really difficult to see, unless you don't mean to. If a trend established in the years prior to the years of the bans is suddenly and significantly altered, then resumes with the ending of those bans, it speaks to the impact of them absent some other factor of equal or greater weight that began and ended in the same period.
The study makes its point, Idol's protest aside.
I've already touched on the matter of statistical 'versus' practical significance, and now I'll include some other thoughts that should occur to anyone good with statistical analysis.
2017 is included in the data. In 2017 the suicidal mass murderer Paddock murdered 58 people. What I'd want to see is how much of a contributor this single data point was to the overall analysis. Since it occurred right at the tail end of the data used in the study (with 1981 being the start), I'd think to check that this one event didn't inflate the 'year' coefficient enough to render that arbitrary factor statistically significant. Do you have any idea if they did that? What's available to read doesn't mention it.
Another thing is the residuals of the model: is there any statistically 'out of control' behavior in them? This could be one way to see if 2017 all by itself skewed the analysis in a somewhat counterintuitive way. The residuals of any good model should be in statistical control, showing no behavior indicative of some special cause that isn't captured by the analysis. 'In statistical control' means the residuals behave like random noise, with no trends, shifts, or outliers beyond common-cause variation; for a well-specified model they should also be roughly normally distributed. Any model that leaves residuals that are not normally distributed has a big weakness in it.
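If I had the fitted model (say, the `model` object from re-running the regression sketch earlier, assigned as `model = sm.OLS(prop, X).fit()`), the checks would look something like this:

```python
from scipy import stats

resid = model.resid   # residuals of the fitted OLS model from earlier

# Normality check (Shapiro-Wilk): a small p-value is evidence the
# residuals are NOT normal, i.e. the model is missing something.
print(stats.shapiro(resid))

# A crude out-of-control check: any residual beyond 3 standard deviations?
z = (resid - resid.mean()) / resid.std()
print(abs(z).max())   # a 2017-sized special cause would stick out here
```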
That's just some thoughts coming from an uncredentialed person.
And that's before we understand the data we have from every other Western Industrial Democracy with stronger gun laws and safer citizenry
How do you get safer citizenry? This factor seems pretty important to me. And by "safer citizenry," do you really just mean people who aren't murderers? Because for sure, if we could just lower the population of murderers, we can lower the number of murders. Ipso facto, id est, QED, ibid, et cetera. And supra.
, and before we examine the rule established by the top and bottom states here in terms of gun laws
What "rule," and what precisely do you mean by "established?"
, and the substantive difference in gun violence and deaths you find among the weaker laws as the rule.
I've seen pretty much the whole earth's data on the matter, taking murder rate as the dependent variable, and as independent variables gun control (as measured by the inverse of civilian gun ownership), population density, and their cross product (to test for any interaction between gun control and population density); none of those parameters is statistically significant in explaining the variance in murder rates globally.
With the US, there are 120 civilian-owned guns per 100 people, which is far and away the biggest number in the gun control dataset (it means the US has the least gun control of all nations; the next closest country, Yemen, has less than half that number). That makes the US's murder rate (around 5.0 per 100k people) a high-leverage contributor to the overall analysis, so I also ran the analysis excluding the US data to see the difference that it makes.
The difference between including and excluding the US from the global analysis of murders turns the coefficient for civilian gun ownership from slightly negative (the more civilian gun ownership, the fewer murders) when the US is excluded, to slightly more negative (even fewer murders with more civilian gun ownership) when the US is included.
But it doesn't matter, since in both cases the p-value for the estimate of the coefficient is greater than 0.05; statistically, then, we cannot say that gun control (the inverse of civilian gun ownership) has any influence on murders worldwide, because the null hypothesis is not rejected.
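For the curious, the shape of that global analysis in Python would be something like the following; the file and column names here are mine, not a real dataset you can download:

```python
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical file: one row per country, with columns
#   country, murder_rate (per 100k), guns (civilian guns per 100 people),
#   density (population per km^2)
df = pd.read_csv("countries.csv")

# Main effects plus the guns x density cross product (interaction)
fit = smf.ols("murder_rate ~ guns + density + guns:density", data=df).fit()
print(fit.summary())   # all p-values > 0.05 -> the null isn't rejected

# Rerun without the US to gauge its leverage on the guns coefficient
fit_no_us = smf.ols("murder_rate ~ guns + density + guns:density",
                    data=df[df.country != "United States"]).fit()
print(fit.params["guns"], fit_no_us.params["guns"])
```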
It means that we don't know what causes murders, until we can find independent parameters that statistically significantly explain at least some of the variance in murder rate.
In my next round of analysis, I hope to include other readily available and uncontroversial data (the civilian gun ownership data is collected by an anti-gun organization that tracks it, so there's no possibility of pro-gun bias), such as GDP 'per capita' as a measure of a country's overall wealth, along with civilian gun ownership and population density, and all the possible interactions between these parameters, to get an even better idea of how much, or how little, we know about what we can do to lower murder rates.
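In formula terms, that next round would look like this (same hypothetical dataframe as above, with a `gdp_pc` column added; in a patsy formula, `*` expands to all main effects plus every interaction):

```python
import statsmodels.formula.api as smf

# guns * density * gdp_pc expands to the three main effects, the three
# two-way interactions, and the three-way interaction.
fit = smf.ols("murder_rate ~ guns * density * gdp_pc", data=df).fit()
print(fit.rsquared)   # the hope is something toward 0.9, not 0.3 or less
```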
What we would love to see is a model that explains something like 90% or more of the variance, not 30% or even less, because if we can identify independent factors that explain the bulk of the variance, then we can act on them and do whatever we can to reduce murder rates. What we have instead is murder rate data with a lot of variance that is unexplained by the factors we're submitting as possible causes of that variance.