The Bell Curve Page
Last updated: July 6, 1998/5 Feb., 2007. URL:
http://rasmusen.org/pacioli/bellcurve/bellcurve.htm
Administered by Eric Rasmusen, [email protected],
Kelley School of Business, Indiana University, BU 456, 1309
East Tenth Street, Bloomington, Indiana 47405-1701,
(812)855-9219.
Charles Murray has provided the data he used in the analysis in
his book, The Bell Curve.
The data files are each roughly 1 to 3 megabytes, and are
available in two formats. Please
note that the data includes weights for each observation, because the
survey from which it comes sampled different groups with different
weights. Also, some of the data is in the form of z-scores, which
means that it is measured in standard deviations away from a mean of
zero.
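To illustrate how observation weights and z-scores fit together, here is a minimal Python sketch. It assumes the weights are used for both the mean and the standard deviation; whether the z-scores in these files were standardized that way is not stated on this page. The function names and the toy numbers are hypothetical.

```python
import math

def weighted_mean(values, weights):
    """Mean in which each observation counts in proportion to its weight."""
    total_weight = sum(weights)
    return sum(v * w for v, w in zip(values, weights)) / total_weight

def weighted_std(values, weights):
    """Weighted (population-style) standard deviation around the weighted mean."""
    mean = weighted_mean(values, weights)
    total_weight = sum(weights)
    variance = sum(w * (v - mean) ** 2 for v, w in zip(values, weights)) / total_weight
    return math.sqrt(variance)

def z_scores(values, weights):
    """Express each value as standard deviations away from the weighted mean."""
    mean = weighted_mean(values, weights)
    std = weighted_std(values, weights)
    return [(v - mean) / std for v in values]

# Hypothetical toy data, not taken from the NLSY files.
scores = [90.0, 100.0, 110.0]
weights = [1.0, 2.0, 1.0]
z = z_scores(scores, weights)
```

By construction, the weighted mean of the resulting z-scores is zero, which matches the description above.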
If you encounter problems reading the data, please let me know. I
probably can't help, since I haven't been using this data in the past
few years, but you never know. If you solve your problem, let me
know about that too, so I can post the solution.
The files are saved as Macintosh text files with labels; a tab marks
the end of each field and a carriage return (CR) marks the end of each line.
- NATION.TXT, which has 12,686 cases and 50 variables.
This file includes variables scored for all NLSY subjects, one line
per subject.
Size: 3.112MB.
- CHILD1.TXT, which has 8,513 cases and 26 variables.
Variables scored for all NLSY children, representing one case per
child for whom data were available through SY90.
Size: 1.312MB.
- CHILD2.TXT, which has 17,040 cases and 40 variables.
Each case represents one child for one test year. A given child may
therefore be represented in up to three cases. TY=test year.
Percentiles on the developmental and behavioral indicators all
represent within-gender percentiles.
Size: 3.040MB.
- WOMEN.TXT, which has 6,283 cases and 28 variables.
Variables scored for all women in the NLSY (one case per subject).
Size: 1.032MB.
The documentation is available in several forms. The
original is a 51K RTF file,
1TBC_Documentation5.rtf.
The same content is available as a 33K Word file,
3TBC_Documentation.doc,
or as a 25K ASCII file, 2TBC_Documentation.ascii.
Finally, a 15K version describing just the NATION
variables is available as 1TBC_Nation.Documentation5.rtf.
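Because the .TXT files use tabs between fields and a bare CR at the end of each line, some modern tools will see the whole file as a single line. The following Python sketch shows one way to read such a file. It assumes that "with labels" means the first row holds the variable names, and that the text is in the Mac Roman encoding; both are guesses rather than facts stated on this page. The function name and the sample column names are hypothetical.

```python
import csv

def parse_mac_text(raw_bytes, encoding="mac_roman"):
    """Parse tab-delimited text whose lines end in CR (classic Macintosh style).

    Assumes the first row contains the variable labels.
    Returns a list of dicts, one per case, with all values kept as strings.
    """
    text = raw_bytes.decode(encoding)
    # Normalize CRLF and bare CR to LF so the lines can be split reliably.
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    lines = [line for line in text.split("\n") if line]
    reader = csv.reader(lines, delimiter="\t", quoting=csv.QUOTE_NONE)
    header = next(reader)
    return [dict(zip(header, row)) for row in reader]

# Hypothetical two-case sample in the same layout as the files above.
sample = b"ID\tAGE\r1\t25\r2\t31\r"
rows = parse_mac_text(sample)

# Usage with a downloaded file (not run here):
# with open("NATION.TXT", "rb") as f:
#     nation = parse_mac_text(f.read())
```

If the header assumption holds, the number of rows returned for NATION.TXT should equal the 12,686 cases listed above, which gives a quick check that the file was read correctly.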
I used Excel to convert the files into a format I
could use more easily, and made a few other small changes. The
output is a set of csv files, with entries separated by commas.
- nation.csv, which has 12,686 cases and 50 variables.
This 2.980M file includes variables scored for all NLSY subjects,
one line per subject. The variable names are listed in the
file nation.hdr.
- child1.csv, a 1.201M file. The variable names are listed in a
278-byte file, child1.hdr.
- child2a.csv and child2b.csv, 1.667M and 1.265M files. The variable
names seem to have gone astray since 1996. They should be listed in
child2.hdr, but they are not.
- women.csv, a 952K file. The variable names are listed in a
291-byte file, woman.hdr.
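Since the csv files keep their variable names in separate .hdr files, the two have to be combined before the columns can be referred to by name. The Python sketch below shows one way to do that. The layout of the .hdr files is not described on this page, so the code assumes the names are separated by commas, spaces, or line breaks; adjust the splitting rule if the real files differ. All function names and sample values are hypothetical.

```python
import csv
import re

def read_header_names(hdr_text):
    """Split header text into variable names.

    Assumption: names are separated by commas and/or whitespace.
    """
    return [name for name in re.split(r"[,\s]+", hdr_text.strip()) if name]

def attach_names(csv_lines, names):
    """Pair each comma-separated row with the variable names.

    Raises ValueError if a row's field count differs from the name count,
    which would suggest the header and data files do not match.
    """
    rows = []
    for line_number, row in enumerate(csv.reader(csv_lines), start=1):
        if not row:
            continue  # skip blank lines
        if len(row) != len(names):
            raise ValueError(
                f"line {line_number}: expected {len(names)} fields, got {len(row)}"
            )
        rows.append(dict(zip(names, row)))
    return rows

# Hypothetical header text and data rows, not taken from the real files.
names = read_header_names("id, age\nsex\n")
cases = attach_names(["1,25,F", "2,31,M"], names)

# Usage with downloaded files (not run here):
# with open("nation.hdr") as h, open("nation.csv", newline="") as d:
#     nation = attach_names(d, read_header_names(h.read()))
```

The field-count check is a deliberate choice: with 50 variables in nation.csv, a header that silently comes up one name short would otherwise shift every column label.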
I like the STATA program very much, and here include some input
and output files using the data above.
- bell2a.do, an input file using nation.csv. The output from this is
the log file, bell2a.log, which has regression results, and a STATA
data file, nation1.dta, which has a subset of the nation.txt variables
in the condensed STATA format.
- bell2.do, a 3K input file using nation.csv. The output from this is
the 13K file, bell2.log.
This do-file wasn't working in May 2002: my present version of
STATA, STATA 7.0, says there is not room enough for all the
observations.
- jan6c.do, a 2K input file using nation.csv. The output from this is
the 2K file, jan6c.log.