I have been critical of Edison/Mitofsky (E/M) for a number of things connected with the 2004 NEP exit poll, but I can’t criticize them for not making their datasets easily available. They have allowed the University of Michigan’s Inter-University Consortium for Political and Social Research (ICPSR) to make all their data immediately available with full documentation as part of ICPSR’s FastTrack data program. That means that you, Jane and John Q. Public, can now download each and every state dataset, as well as the national dataset, with copious accompanying documentation, simply by visiting this FTP file directory set up by ICPSR. Once downloaded, you can then fool around with them to your heart’s content. Have you been wondering how working class whites nationally or in Ohio (or any other state) voted in 2004? Now you can find out.
Here are a few things to note about these data, before you follow the link to the FTP site and start downloading.
1. If you are expecting to find multiple weight variables that will allow you to recreate the exit poll data as it looked in various stages of the data collection and weighting process (so that, for example, you could create datasets that would match up to the crosstabs published on the New Zealand website, Scoop), you will be disappointed. There is one and only one weighting variable provided, and it incorporates all the various sequential adjustments to the data: for non-response bias, for oversampling, for changing turnout patterns and, of course, to match the final reported election outcome. Therefore, these data will not allow you to replicate and pick apart the adjustments made to the data at different times on and shortly after election day.
2. To really get much out of these data, you need to have and know how to use a statistical package such as SPSS. Then you can take the datafiles and analyze them in much more detail than has been made available in crosstabular form on the web and in newspapers. For those who use SPSS, things are particularly easy since E/M provides fully-labelled and documented SPSS files for all states and nationally that are ready to go with no data preparation steps necessary.
But if you don’t use SPSS or something similar, there’s not much here for you beyond the set of final crosstabs that E/M provided to the NEP and clients. These crosstabs have already been widely circulated on the web and provide no new information.
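For readers who would rather script than point and click, here is a minimal sketch in Python of the kind of weighted crosstab a statistical package produces. It assumes you have already exported a datafile to a list of records. The field names (`group`, `vote`, `weight`) and the toy records are invented for illustration and are not the variable names or values in the E/M files; check the codebook for the real ones.

```python
from collections import defaultdict

def weighted_crosstab(rows, row_var, col_var, weight_var):
    """Return {row_value: {col_value: weighted percent}}; each row sums to 100."""
    totals = defaultdict(lambda: defaultdict(float))
    for r in rows:
        # Accumulate the survey weight, not a raw count of respondents.
        totals[r[row_var]][r[col_var]] += float(r[weight_var])
    table = {}
    for row_value, cells in totals.items():
        row_sum = sum(cells.values())
        if row_sum == 0:
            continue  # skip groups whose weights sum to zero
        table[row_value] = {c: 100.0 * w / row_sum for c, w in cells.items()}
    return table

# Toy records, invented for illustration only (not real exit poll data).
respondents = [
    {"group": "G1", "vote": "A", "weight": 1.5},
    {"group": "G1", "vote": "B", "weight": 0.5},
    {"group": "G2", "vote": "A", "weight": 1.0},
    {"group": "G2", "vote": "B", "weight": 3.0},
]

table = weighted_crosstab(respondents, "group", "vote", "weight")
for group, cells in sorted(table.items()):
    print(group, {k: round(v, 1) for k, v in sorted(cells.items())})
```

The point of the sketch is simply that every percentage is computed from summed weights rather than from respondent counts, which is what applying the single supplied weight variable amounts to.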
3. In the _ALL directory, E/M provide a datafile that combines the data from all 51 state surveys and even includes a weight that adjusts each survey to represent the portion of the vote cast by each state. Nice! That means you can easily use this combined datafile to create alternative estimates to those generated by the national datafile. Speaking as someone who has manually combined state datafiles to make an aggregated file in the past, I particularly appreciate this feature.
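To see what such a weight buys you, here is a rough Python sketch of the manual alternative: rescaling each state's weights so that the state's weighted total matches its share of the votes cast, then computing a combined estimate. This is my own illustration of the general idea, not E/M's documented procedure. The field names (`state`, `vote`, `weight`, `natl_weight`), the states "X" and "Y", and the vote totals are all made up.

```python
from collections import defaultdict

def rescale_weights(rows, votes_cast):
    """Scale each state's weights so they sum to that state's votes cast."""
    state_weight = defaultdict(float)
    for r in rows:
        state_weight[r["state"]] += r["weight"]
    rescaled = []
    for r in rows:
        factor = votes_cast[r["state"]] / state_weight[r["state"]]
        # "natl_weight" is a hypothetical name for the rescaled weight.
        rescaled.append({**r, "natl_weight": r["weight"] * factor})
    return rescaled

def weighted_share(rows, candidate, weight_var):
    """Weighted percent of respondents who chose `candidate`."""
    total = sum(r[weight_var] for r in rows)
    hits = sum(r[weight_var] for r in rows if r["vote"] == candidate)
    return 100.0 * hits / total

# Toy records and vote totals, invented for illustration only.
rows = [
    {"state": "X", "vote": "A", "weight": 1.0},
    {"state": "X", "vote": "B", "weight": 1.0},
    {"state": "Y", "vote": "A", "weight": 1.0},
    {"state": "Y", "vote": "B", "weight": 3.0},
]
votes_cast = {"X": 300, "Y": 100}

combined = rescale_weights(rows, votes_cast)
naive = weighted_share(rows, "A", "weight")              # ignores state size
adjusted = weighted_share(combined, "A", "natl_weight")  # respects state size
print(round(naive, 2), round(adjusted, 2))
```

In the toy data the larger state counts for more once the weights are rescaled, so the combined estimate shifts toward that state's result. With the combined file, that rescaling step has already been done for you.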
4. Documentation for the state datafiles is an improvement over past releases of exit poll data. For example, a map is provided for each state that shows the counties included in each region or “geocode” sampled within that state. And coding for all variables is clearly and thoroughly explained.
So hats off to E/M for providing easy and user-friendly access to their data. I know some will not be satisfied with the release of these data (for example, because of the provision of only one weight variable) and I myself still have many questions about how the polls were conducted and how and why the now-notorious problems with these polls arose. But let’s give credit where credit’s due: E/M and the NEP are providing a valuable resource about the 2004 election for free to all who are interested. Let’s go out there and use it.