Posts tagged data

From kohenari:

As many of my friends who left the Republican party in the past decade can attest, George Bush Lost an Entire Generation for the Republican Party:

[F]or the past 40 years voting patterns haven’t differed much by age. In fact, there’s virtually no difference between generations at all until you get to the George Bush era. At that point, young voters suddenly leave the Republican Party en masse. Millennials may be far less likely than older generations to say there’s a big difference between Republicans and Democrats, but their actual voting record belies that.

Whatever it was that Karl Rove and George Bush did—and there are plenty of possibilities, ranging from Iraq to gays to religion—they massively alienated an entire generation of voters. Sure, they managed to squeak out a couple of presidential victories, but they did it at the cost of losing millions of voters who will probably never fully return. This chart is their legacy in a nutshell.

I’d be cautious about this conclusion. There’s a noticeable drop between 2008 and 2012, which could mean the 2008 election was merely a “spike” or “outlier” in the larger trend. Making a sweeping generalization based on three data points (and 2004 looks to be within the margin of normal error) isn’t a good idea.

Also, we don’t know (based on only six years of data) whether the young people who voted Democratic in 2008 and 2012 will eventually vote Republican 20 or 30 years from now. Stranger things have happened (e.g. the “solid South” went from strongly Democratic in the 1940s to strongly Republican after the 1970s). Besides, notice that the 2012 and 1972 gaps (+16%) are identical. The gap has narrowed before. Projecting that it will never narrow again is odd.

pritheworld:

How old were you when you had your first child? The United Nations gathered this data on when women in developing countries have theirs.

More on pregnancy and childbirth in a series called The Ninth Month.

Developing Countries To Grow 5.3% in 2014

From worldbank:

image

According to the latest Global Economic Prospects Report, the world economy will strengthen in 2014. Much of the initial acceleration will reflect a pick-up in growth in high-income countries, which, after years of extreme weakness and outright recessions, appear to be finally emerging from the global financial crisis.

Data from the Global Economic Prospects Database (GEP). See more infographics from the GEP report here and here.

Good news for “third world” countries! Well, at least many of them. 

I’m curious to know how long the terms “third world” and “developing world” will last as more and more countries achieve what countries like South Korea have.

"Pixar's Sad Decline—in 1 Chart" | The Atlantic

This is a great example of a terrible use of a linear regression model. Please don’t ever do this. It’s (at best) a passing C (and only because the graph looks like a good graph).

Mapping Billions of Tweets Around the World

worldbank:

Via Mashable:

“Ever wonder what it would look like to plot every single geotagged tweet since 2009 on a map? Twitter has done just that…They use billions of geotagged tweets: Every dot represents a tweet, with the brighter colors showing a higher concentration of tweets.”

Europe: 

image

Tokyo: 

image

São Paulo: 

image

Moscow:  

image

North America:

image

Source: Flickr and Mashable

Calculating the World’s Population

From worldbank:

How do demographers figure out how many people live on Earth? Can they accurately calculate the number of people that have ever lived? You asked our data help desk these questions, and our open data whiz drew the answers in this video.

Do you have more questions about how data is calculated? Ask them at the data help desk or on Twitter with the hashtag #dataquestion.

I’ve been happy ever since the World Bank made its database freely available online. It’s been a great resource for my own work, and especially for my students.

Visualizing the Relationship Between Internet Usage and GDP per capita

From worldbank:

As a country’s GDP per capita increases, how do internet penetration rates change?

Ramiro Gómez, a Berlin-based freelance software developer, created a data visualization to show the percentage of Internet users in relation to the GDP per capita for countries from 1990 to 2011.

Get the data from the World Development Indicators

image

Source: Visual.ly

From adam-wola:

Take the annual income of the wealthiest 20 percent, divide it by the annual income of the poorest 20 percent, and you get a number. The larger a country’s number is, the more economically unequal the country is.

Here is Latin America. The United States (wealthiest 20% earns 16 times more than the poorest 20%) is now in the middle of the pack. In 1980, the U.S. number was 10.5.

The source is the UN Economic Commission for Latin America and the Caribbean annual Statistical Yearbook, which was released January 10. All numbers are for 2011, except Bolivia (2009), El Salvador (2010), Guatemala (2006), Honduras (2010), and Nicaragua (2009). The U.S. figure comes from Census Bureau data cited by Congressional Research Service [PDF].

BTW, you can easily do this for any country in the world. Here’s the World Bank data for income share held by lowest 20% and income share held by highest 20%. Find any country you’re interested in, and calculate the ratio.
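As a rough sketch of that calculation: the ratio is the income share of the top 20% divided by the income share of the bottom 20%. The function name and the two example shares below are made up for illustration (they are not World Bank or ECLAC figures), so swap in the values for whichever country you look up.

```python
def quintile_ratio(top20_share, bottom20_share):
    """Income share of the richest 20% divided by that of the poorest 20%."""
    if bottom20_share <= 0:
        raise ValueError("bottom20_share must be positive")
    return top20_share / bottom20_share


# Hypothetical shares (percent of total income), not real data.
example_top = 48.0
example_bottom = 4.0

ratio = quintile_ratio(example_top, example_bottom)
print(f"The richest 20% earn {ratio:.1f} times what the poorest 20% earn")
```

Because both shares are percentages of the same total, the ratio comes out the same whether you use income shares or actual income amounts.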

I was compiling some data for some class exercises and lecture presentations for next semester, and wanted to make a small observation. This stems from the difficulty I find in explaining to my students why states (or “governments”—although there’s obviously an important conceptual distinction between state and government) are important. Many of my students have a fairly strong anti-government bias, which is reinforced by their biases against taxes.

Whether strong states (or “central governments”) are a good thing or not is a question for moral or political philosophy. Certainly, strong states can be authoritarian (although, ironically, many authoritarian regimes actually have weak states). 

Similarly, whether taxes are useful or not is also a question for moral or political philosophy. And certainly many countries (including our own) spend tax dollars wastefully and/or on things we’d rather they didn’t (e.g. pacifists still pay taxes that go to military spending).

But the question of whether high taxes erode, weaken, or otherwise undermine democracy is an empirical question. That is, we can test it with existing data. To compare taxes across the world, I used tax burden as percent of GDP (this is better than using tax rates, since it looks at taxes as a share of the total economy). I used data from the conservative Heritage Foundation. I wanted to see if there was any relationship between taxes and democracy. To measure democracy, I used the 2011 Democracy Index developed by The Economist (the higher the score, the higher the quality of democracy).

Turns out, there is a relationship between taxes and democracy. But it’s not what some of you might expect. The figure below plots 161 countries on both dimensions; the red line is the statistically estimated relationship (or “trendline”) between the two variables. The quality of democracy increases as tax burden as percent of GDP increases. The relationship is fairly strong (Pearson’s r = 0.6553; p < 0.001).
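For anyone who wants to see what goes into a Pearson’s r like the one above, here is a minimal sketch in plain Python. The function name and the six data points are invented for illustration; they are not the Heritage Foundation or Economist numbers, so the printed value will not match the one reported here.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of cross-products of deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return cov / math.sqrt(var_x * var_y)


# Made-up values for illustration only.
tax_burden = [5.0, 12.0, 20.0, 27.0, 35.0, 44.0]   # percent of GDP
democracy = [2.5, 4.0, 3.5, 7.0, 8.0, 9.0]         # index score

r = pearson_r(tax_burden, democracy)
print(f"r = {r:.3f}")
```

A value near +1 means the two variables rise together, a value near -1 means one falls as the other rises, and a value near 0 means there is no linear relationship.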

There are outliers, obviously. But for the most part, countries w/ high democracy scores have high tax burdens. In contrast, countries w/ low democracy scores tend to have low tax burdens.

How low? The ten countries with the lowest tax burdens as percent of GDP are:

  1. United Arab Emirates (1.4% of GDP)
  2. Kuwait (1.5% of GDP)
  3. Equatorial Guinea (1.7% of GDP)
  4. Oman (2.0% of GDP)
  5. Qatar (2.2% of GDP)
  6. Libya (2.7% of GDP)
  7. Chad (4.2% of GDP)
  8. Bahrain (4.8% of GDP)
  9. Burma (4.9% of GDP)
  10. Saudi Arabia (5.3% of GDP)

The United States comes in w/ a respectable tax burden of 26.9% of GDP, which ranks as the 57th highest in the world. That puts us just below Bolivia (27% of GDP), tied w/ South Africa, and just above South Korea (26.8% of GDP). Among the 34 wealthy OECD countries, the average tax burden is 36.2% of GDP. Among OECD countries, only South Korea and Chile (18.6% of GDP) have a lower tax burden than the United States.

Of course, this doesn’t answer the question of whether we should or shouldn’t have higher taxes. But it’s pretty clear that high taxes are not necessarily going to undermine our democracy.

I thought I’d infuse some (very simple) empiricism into the evolving gun debate. Using data readily available on Wikipedia on gun-related deaths across countries and gun ownership across countries, I plotted a simple linear regression using Excel.

(The whole process was very simple. If you ever want to try something like this, just save any Wikipedia entry that has a table of data, then open it w/ Excel. It took me about two minutes to match up the countries, then another minute to make the chart above. So there’s no excuse for not using empirical data!)

Because data on gun-related homicide death rates was lacking for many countries, I ended up matching only 73 countries. I trimmed the sample to limit it to OECD member & observer countries. This is what’s often used as a “peer set” for the United States (countries that we should be on par with on various levels). The result was a dataset of 50 countries. That’s not a massive dataset, but in comparative politics, that’s not too bad (after all, there are only about 200 countries in the world, so this is about a quarter of the total “universe”; plus, the sample is made up of mostly “comparable” countries).
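If you’d rather do the matching step in code than by hand in Excel, it might look something like the sketch below. This is only an illustration of the idea (keep the countries that appear in both tables, then keep the ones in the peer set). The function name, the country names, and all the numbers are placeholders, not the Wikipedia data.

```python
def match_countries(ownership, homicides, peer_set):
    """Keep countries that appear in both tables and belong to the peer set."""
    matched = {}
    for country, guns in ownership.items():
        if country in homicides and country in peer_set:
            matched[country] = (guns, homicides[country])
    return matched


# Placeholder values for illustration only.
guns_per_100_people = {"Country A": 30.0, "Country B": 5.0, "Country C": 12.0}
gun_homicides_per_100k = {"Country A": 1.5, "Country C": 0.4, "Country D": 2.0}
peers = {"Country A", "Country B", "Country C"}

sample = match_countries(guns_per_100_people, gun_homicides_per_100k, peers)
print(sorted(sample))
```

Country B drops out because it has no homicide figure, and Country D drops out because it has no ownership figure and is not in the peer set. That is the same reason the real sample shrank.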

Here’s how you read this simple bivariate regression: The red line is a simple regression estimate (you can do bivariate regressions in Excel by choosing the “trendline” option). The little equation (also an option in Excel) gives you the estimate. It’s basically the old y=mx+b equation you learned in basic middle school algebra. The “slope” (m) is the estimated relationship between the two variables. Notice that the slope goes up: more guns, more gun-related homicide deaths. The R2 (R-squared) number is an estimate of how much of the total variation is explained by the model (the equation). The R2 value is not too high, but the model still explains about 25% of the variation in the data.
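To make the y=mx+b and R2 pieces concrete, here is a minimal ordinary least squares sketch in plain Python. I’m assuming this is the same kind of fit a linear trendline produces; the function name and the toy numbers are mine, and they are not the gun data.

```python
def fit_line(xs, ys):
    """Least squares fit of y = m*x + b. Returns (m, b, r_squared)."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    ss_tot = sum((y - mean_y) ** 2 for y in ys)  # total variation in y
    if sxx == 0 or ss_tot == 0:
        raise ValueError("fit is undefined when a variable is constant")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    m = sxy / sxx                # slope
    b = mean_y - m * mean_x      # intercept
    # Variation left unexplained by the fitted line.
    ss_res = sum((y - (m * x + b)) ** 2 for x, y in zip(xs, ys))
    r_squared = 1 - ss_res / ss_tot
    return m, b, r_squared


# Toy numbers for illustration only.
toy_x = [1, 2, 3, 4]
toy_y = [2, 1, 4, 3]

m, b, r2 = fit_line(toy_x, toy_y)
print(f"y = {m:.2f}x + {b:.2f}, R2 = {r2:.2f}")
```

An R2 of 0.25, like the one in the chart, would mean the line leaves three quarters of the variation in the y variable unexplained.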

The red dot is the United States, which has the third-highest gun-related homicide death rate in the 50-country sample. The two countries above it are Mexico & South Africa. Notice, however, that even though (individually) Mexico & South Africa have both high gun-related homicide death rates and few guns per capita, the overall relationship among the 50 countries is still clear: more guns, more gun-related homicide deaths.

In particular, it’s useful to note that all the advanced industrial democracies (that is, all the “first world” wealthy countries) are clustered together: few guns, few gun-related homicide deaths. Why does that matter? Because unless you believe that American criminals are somehow much more resourceful than criminals in Japan, Germany, or Australia, the claim that “if guns were more restricted only criminals would have guns” is patently ridiculous. Despite strong gun restrictions in Japan, Germany, and Australia, criminals there don’t seem to get their hands on enough guns to commit as many murders.

ADDENDUM: A correction using logarithmic scales and regression through the origin is posted here.

CORRECTION: I originally misstated the main variable. I actually didn’t use gun-related deaths, but rather the more specific gun-related homicides (I had originally used the first, then corrected myself, but forgot to change the labels on the graph). Using the specific gun-related homicides removes all accidental gun-related deaths & gun-related suicides.