[net-gold] Seventeen Statements by Gold-Standard Skeptics

  • From: "David P. Dillard" <jwne@xxxxxxxxxx>
  • To: Temple University Net-Gold Archive <net-gold@xxxxxxxxxxxxxxxxxxx>, Temple Gold Discussion Group <TEMPLE-GOLD@xxxxxxxxxxxxxxxxxxx>, Net-Gold <net-gold@xxxxxxxxxxxxxxxx>, Educator Gold <Educator-Gold@xxxxxxxxxxxxxxx>, Educator Gold <Educator-Gold@xxxxxxxxxxxxxxxx>, K12AdminLIFE <K12AdminLIFE@xxxxxxxxxxxxxxx>, Net-Platinum <net-platinum@xxxxxxxxxxxxxxx>, NetGold <netgold@xxxxxxxxxxxxxxx>, "Net-Gold @ Nabble" <ml-node+3172864-337556105@xxxxxxxxxxxxx>, K-12ADMINLIFE <K12ADMIN@xxxxxxxxxxxxxxxxxxx>, net-gold@xxxxxxxxxxxxx
  • Date: Mon, 26 Apr 2010 22:40:14 -0400 (EDT)








Date: Mon, 26 Apr 2010 15:07:39 -0700
From: Richard Hake <rrhake@xxxxxxxxxxxxx>
Reply-To: Net-Gold@xxxxxxxxxxxxxxx
To: AERA-L@xxxxxxxxxxxxxxxxx
Cc: Net-Gold@xxxxxxxxxxxxxxx
Subject: [Net-Gold] Seventeen Statements by Gold-Standard Skeptics




If you reply to this encyclopedic (87 kB) post, please don't hit the
reply button unless you prune the copy of this post that may appear
in your reply down to a few relevant lines; otherwise the entire
already-archived post may be needlessly resent to subscribers.


***************************************


ABSTRACT: Andy Rudd in an EdResMeth post 6 Apr 2010 titled "Cause and
Effect" wrote: "Today I dealt with a doctoral student who was
adamantly opposed to the idea that causal relationships can be
studied using non experimental designs. . . . . . I am curious what
others think about the use of non experimental designs to study
causal relationships if it is not possible to use an experimental or
quasi-experimental design."


This initiated an 18-post thread of diverse comments on the student's
opinion, accessible at <http://tinyurl.com/y4um3g3>. Rudd's student
may have been influenced by the fact that the "Randomized Control
Trial" has been enthroned by the U.S. Dept. of Education (USDE, 2008)
and Mosteller & Boruch (2002) as the "gold standard" for
demonstrating causality in education research.


For consideration by Rudd's student and others, herewith are
SEVENTEEN STATEMENTS BY GOLD-STANDARD SKEPTICS:

(1) American Educational Research Association;

(2) American Evaluation Association;

(3) Hugh Burkhardt & Alan Schoenfeld;

(4) Tom Cook & Monique Payne;

(5) Margaret Eisenhart & Lisa Towne;

(6) European Evaluation Society;

(7) Richard Hake;

(8) Burke Johnson;

(9) Annette Lareau & Pamela Barnhouse;

(10) Joseph Maxwell;

(11) National Education Association;

(12) Dennis Phillips;

(13) Barbara Schneider, Martin Carnoy, Jeremy Kilpatrick, William Schmidt, & Richard Shavelson;

(14) Michael Scriven;

(15) Mack Shelley, Larry Yore, and Brian Hand;

(16) Deborah Stipek;

(17) Carol Weiss.



***************************************




Andy Rudd (2010) in his EdResMeth post "Cause and Effect" wrote:



"Today, I dealt with a doctoral student who was adamantly opposed to
the idea that causal relationships can be studied using non
experimental designs. I tried to explain to him that while there are
much stronger designs, e.g., a randomized design, it is still
possible to study causal relationships with non experimental designs.
This student was upset that I would suggest something so outlandish.
I am curious what others of you think about the use of non
experimental designs to study causal relationships if it is not
possible to use an experimental or quasi-experimental design."


Rudd's post initiated an 18-post thread on 6-7 April 2010 of diverse
comments on the student's opinion, accessible to EdResMeth
subscribers at


<http://tinyurl.com/y4um3g3>


Rudd's student may have been influenced by the fact that the "Randomized Control Trial" has been enthroned by the U.S. Dept. of Education (USDE, 2008) and Mosteller & Boruch (2002) as the "gold standard" for demonstrating causality in education research.



For consideration by Rudd's student and others, herewith are
SEVENTEEN STATEMENTS BY GOLD-STANDARD SKEPTICS [my CAPS; my inserts
at ". . . . . . [[insert]]. . . . ."]:



*************************************


1. AMERICAN EDUCATIONAL RESEARCH ASSOCIATION [AERA (2003)]: "We urge
you. . . . [[Rod Paige, Secretary of Education]]. . . . . . . to
modify the language for a 'Proposed Priority' to be used for 'any
appropriate programs in the Department of Education' in FY 2004 or
later. While we appreciate the value of experimental designs as an
evaluation method, WE BELIEVE THAT A JUDGMENT OF 'BEST,' AS SPECIFIED
IN THE PROPOSED LANGUAGE, DOES NOT ADEQUATELY ACCOUNT FOR OTHER
METHODS OF EVALUATION THAT MIGHT BE AS OR MORE APPROPRIATE DEPENDING
ON THE SPECIFIC EDUCATION PROGRAM. We are concerned that the proposed
priorities for application of scientifically based evaluation methods
(1) invoke an uncommonly narrow definition of evaluation as used in
the government and in the field, and (2) make no reference to the
standards for scientifically valid education evaluation adopted in
the legislation creating the Institute of Education Sciences (IES)."



*************************************


2. AMERICAN EVALUATION ASSOCIATION [AEA (2003)]: "[RCTs] are not the
only studies capable of generating understandings of causality. In
medicine, causality has been conclusively shown in some instances
without RCTs, for example, in linking smoking to lung cancer and
infested rats to bubonic plague. The secretary's proposal would
elevate experimental over quasi-experimental, observational,
single-subject, and other designs which are sometimes more feasible
and equally valid. RCTs ARE NOT ALWAYS BEST FOR DETERMINING
CAUSALITY AND CAN BE MISLEADING. RCTs examine a limited number of
isolated factors that are neither limited nor isolated in natural
settings. The complex nature of causality and the multitude of actual
influences on outcomes render RCTs less capable of discovering
causality than designs sensitive to local culture and conditions and
open to unanticipated causal factors. RCTs should sometimes be ruled
out for reasons of ethics. For example, assigning experimental
subjects to educationally inferior or medically unproven treatments,
or denying control group subjects access to important instructional
opportunities or critical medical intervention, is not ethically
acceptable even when RCT results might be enlightening. Such studies
would not be approved by Institutional Review Boards overseeing the
protection of human subjects in accordance with federal statute. In
some cases, data sources are insufficient for RCTs. Pilot,
experimental, and exploratory education, health, and social programs
are often small enough in scale to preclude use of RCTs as an
evaluation methodology, however important it may be to examine
causality prior to wider implementation."
**NOTE: See the reference "AEA (2003)" in the REFERENCE list for the
"Not AEA Statement" [Lipsey (2003)] signed by 8 prominent AEA
members.**



*************************************


3. HUGH BURKHARDT & ALAN SCHOENFELD (2003, p. 9) in "Improving
Educational Research: Toward a More Useful, More Influential, and
Better-Funded Enterprise": ". . . . it is essential for the research
community to delineate the many good ways of doing high-quality
research, and then live up to the standards it sets. SCIENCE ADVANCES
BY TESTING HYPOTHESES FROM ALL CREDIBLE VIEWPOINTS, NOT BY APPLYING
PREDETERMINED METHODS (e.g., RANDOMIZED CONTROLLED TRIALS)
INDEPENDENT OF CONTEXT. The goal is to provide rigorous,
evidence-based warrants for one's claims; the idea is to match the
method(s) with the issue at hand, and to only draw conclusions
warranted by each method or the methods in combination [see, e.g.,
Schoenfeld (2002), National Research Council (2002). . . .[[referenced
here as Shavelson & Towne (2002)]]. . . . .]."



*************************************


4. TOM COOK & MONIQUE PAYNE (2002, p. 174) in "Objecting to the
Objections to Using Random Assignment in Educational Research": "In
some quarters, particularly medical ones, the randomized experiment
is considered the causal 'gold standard.' IT IS CLEARLY NOT THAT IN
EDUCATIONAL CONTEXTS, given the difficulties with implementing and
maintaining randomly created groups, with the sometimes incomplete
implementation of treatment particulars, with the borrowing of some
treatment particulars by control group units, and with the
limitations to external validity that often follow from how the
random assignment is achieved."



*************************************


5. MARGARET EISENHART & LISA TOWNE (2003) in "Contestation and Change
in National Policy on 'Scientifically Based' Education Research" [see
that article for references other than Shavelson & Towne (2002)]:
"Recent federal education policies (e.g., the No Child Left Behind
[NCLB] Act of 2001 [NCLB, 2001] and the Education Sciences Reform Act
[ESRA] of 2002 [ESRA, 2002]) have generated considerable debate among
education researchers. . . . . . .Much of this public debate has
turned on two questions: 'What constitutes 'scientifically based'
research in education?' and 'Is scientifically based research the
only or the best approach to meaningful studies of educational
phenomena?' In response to a request from the National Educational
Research Policy and Priorities Board (NERPPB), a National Research
Council (NRC) committee took up the first question in late 2000. . .
. . . . . In the spring of 2002, the committee published its report,
SRE (NRC, 2002). . . . . [[referred to as Shavelson & Towne (2002) in
this post]]. . . . . , WHICH ARGUED FOR A POSTPOSITIVIST APPROACH . .
. . .[[see e.g., Phillips & Burbules (2000)]]. . . . TO
SCIENTIFICALLY BASED RESEARCH IN EDUCATION, INCLUDING A RANGE OF
RESEARCH DESIGNS (EXPERIMENTAL, CASE STUDY, ETHNOGRAPHIC, SURVEY) AND
MIXED METHODS (QUALITATIVE AND QUANTITATIVE) DEPENDING ON THE
RESEARCH QUESTIONS UNDER INVESTIGATION. Although SRE recognized the
legitimacy and importance of "nonscientific" ways of knowing for
education research (pp. 26, 74-76), the report attempted a broad,
inclusive answer to the first question and did not address the second
question in any detail."



*************************************


6. EUROPEAN EVALUATION SOCIETY
<http://www.europeanevaluation.org>

[EES (2007)] in "The Importance of a Methodologically Diverse
Approach to Impact Evaluation - Specifically with Respect to
Development Aid and Development Interventions," Nijkerk, The
Netherlands: December 2007; quoted in Donaldson (2009): "The EES,
consistent with its mission to promote the 'theory, practice, and
utilization of high quality evaluation,' notes the current interest
in improving impact evaluation and assessment (IE) with respect to
development and development aid. EES HOWEVER DEPLORES ONE PERSPECTIVE
CURRENTLY BEING STRONGLY ADVOCATED: THAT THE BEST OR ONLY RIGOROUS
AND SCIENTIFIC WAY OF DOING SO IS THROUGH RANDOMIZED CONTROLLED
TRIALS (RCTs). . . . . . ."



*************************************


7. RICHARD HAKE (2008a) in "Randomized Trials (was Can
Pre-to-posttest Gains Gauge Course Effectiveness?)": "ABSTRACT: In a
recent post "Can Pre-to-posttest Gains Gauge Course Effectiveness?
#2". . . . [[Hake (2008d)]]. . . ., I wrote: "These [pre/post studies
. . . . .[[demonstrating about a two-standard-deviation superiority
in average normalized gains <g> for "Interactive Engagement" over
"Traditional" passive-student lecture courses - Hake (1998a,b; 2002,
2008h)]]. . . . have been carried out on many different instructors,
in many different institutions, using many different texts, and
working with many different types of student populations from rural
high schools to Harvard." In response, AERA-D's Jeremy Miles (2008)
asked "WERE THESE RANDOMIZED TRIALS?" THE SHORT ANSWER IS "NO." The
long answer explains that: (a) RANDOMIZED CONTROL TRIALS (RCT's) ARE
ALMOST IMPOSSIBLE TO CARRY OUT IN UNDERGRADUATE PHYSICS EDUCATION
RESEARCH, and (b) CAREFUL NON-RCT RESEARCH CAN ESTABLISH CAUSALITY
TO A REASONABLE DEGREE - as argued by Shadish, Cook, & Campbell;
Shavelson & Towne; Schneider, Carnoy, Kilpatrick, Schmidt, &
Shavelson; and Michael Scriven."
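
A worked example of the metric Hake references: Hake (1998a) defines
the average normalized gain <g> as the actual average gain divided by
the maximum possible gain, i.e., <g> = (%post - %pre)/(100 - %pre),
where %pre and %post are class-average percentage scores. A minimal
sketch in Python (the function name and sample scores are
illustrative only, not from any study cited here):

  def normalized_gain(pre_pct, post_pct):
      """Average normalized gain <g> per Hake (1998a): actual gain
      divided by maximum possible gain, with class-average
      percentage scores (0-100)."""
      return (post_pct - pre_pct) / (100.0 - pre_pct)

  # Hypothetical class averaging 40% before and 70% after
  # instruction: <g> = (70 - 40)/(100 - 40) = 0.5.
  print(normalized_gain(40.0, 70.0))  # 0.5

In Hake's (1998a) survey, traditional lecture courses averaged <g>
of about 0.2 while interactive-engagement courses averaged about
0.5, the "two-standard-deviation superiority" cited above.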



*************************************


8. BURKE JOHNSON (2010) in EdResMeth post "Re: Cause and Effect": ".
. . if one wants to search for causation of the
scientific/nomological type (which I believe was assumed in the
original question . . . . [[Rudd (2010)]]. . . . ) then I teach that
randomized experiments are the best (when they are possible and no
moderator variable has been excluded), and I make the points Bruce
just made. . . . .[[Thompson (2010): "If you use (a) regression
discontinuity designs, or (b) create a control group using
propensity scores, I think you can come reasonably close to a true
experiment."]]. . . . However, note that experiments are best for what
Don Campbell called local molar causation or what Shadish, Cook, and
Campbell. . . . .[[(2002)]]. . . more recently call descriptive
causation. . . . [[(pp. 9-12)]]. . . . . Experiments are weaker on
demonstrating complex processes or what Shadish, Cook, and Campbell
call explanatory causation. . . . [[(pp. 9-12)]]. . . . Qualitative
research can be very useful (e.g., grounded theory) for generating
evidence of explanatory causation. Mixed research is especially
interested in connecting the two (descriptive and explanatory
causation) because both are important. Also, there are many variables
in the world that we cannot actively manipulate and we must still
search for causes; scientists do not give up; making a dogmatic claim
that the choice is either (a) an experiment or (b) nothing does not
suffice. Many entire disciplines must deal with this situation of not
being able to conduct experiments on many of their topics/variables
of interest (e.g., archaeology, sociology, economics, political
science, epidemiology, astronomy). In these cases, one has to do the
best one can and there are many strategies that can be used to
provide some warrant for assertions of causation in the absence of
experimentation, as scholars in these disciplines will readily
explain. Again, I SUGGEST THAT MAKING A BINARY CLAIM THAT EITHER AN
EXPERIMENT MUST BE DONE (WHICH IS THE BEST SINGLE METHOD) OR ONE CAN
HAVE ZERO EVIDENCE OF CAUSATION IS, SCIENTIFICALLY AND PRACTICALLY
SPEAKING, PROVINCIAL. A mixed research standpoint tends to replace
thinking in binary terms with thinking synechistically (i.e., in
terms of continua)."
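
For readers who haven't met the propensity-score approach Thompson
mentions: one models each subject's probability of receiving the
treatment from observed covariates, then compares treated and
untreated subjects whose estimated probabilities are similar. A
minimal sketch in Python, assuming NumPy and scikit-learn are
available (the data and variable names are illustrative, not from
any study cited here):

  import numpy as np
  from sklearn.linear_model import LogisticRegression

  rng = np.random.default_rng(0)
  X = rng.normal(size=(200, 3))  # observed covariates
  t = (X[:, 0] + rng.normal(size=200) > 0).astype(int)  # non-random treatment

  # Step 1: estimate each subject's propensity score P(treated | X).
  scores = LogisticRegression().fit(X, t).predict_proba(X)[:, 1]

  # Step 2: pair each treated subject with the untreated subject
  # whose propensity score is nearest (1-to-1 nearest-neighbor match).
  treated = np.flatnonzero(t == 1)
  control = np.flatnonzero(t == 0)
  matches = {i: control[np.argmin(np.abs(scores[control] - scores[i]))]
             for i in treated}

Comparing outcomes within the matched pairs then approximates the
treated-vs-control contrast of a randomized experiment, provided - as
Johnson notes above - no important moderator or covariate has been
omitted from the model.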



*************************************


9. ANNETTE LAREAU & PAMELA BARNHOUSE (2010) in "What Counts as
Credible Research?": "It is a critical moment in educational policy.
The Obama administration has renewed emphasis on educational policy
and No Child Left Behind is up for renewal. But in the current
debate, there has not been sufficient discussion of a crucial piece
of educational debates: what kinds of research should be considered
to be acceptable? In recent years, RANDOMIZED-CONTROLLED TRIALS WERE
ELEVATED TO THE POSITION AS THE "GOLD STANDARD" FOR EDUCATIONAL
RESEARCH. WE BELIEVE THIS POSITION TO BE HIGHLY PROBLEMATIC. As the
debate about education begins to pick up speed, it is important to
broaden the definition of legitimate educational research. . . . . . .
We suggest that federal department of education decision makers need
to acknowledge that there are many different research questions in
education, and that different research questions call for different
methods. There needs to be a realistic and critical assessment of the
limits of randomized-controlled trials and the relatively narrow
forms of knowledge that can be gained from their use (Phillips,
2009). Investigations that address a rich range of questions that
fall outside the realm of randomized controlled trials need to be
supported as well, such as the mechanisms through which parents
influence children's schooling experiences, the micro-interactional
patterns that build trust among school personnel, or political and
organizational impediments to reform."



*************************************


10. JOSEPH MAXWELL (2004, abstract) in "Causal Explanation,
Qualitative Research, and Scientific Inquiry in Education": "A
National Research Council report, 'Scientific Research in Education'.
. . . . . [[Shavelson & Towne (2002)]]. . . . . , has elicited
considerable criticism from the education research community. . . .
.[[see, e.g., Educational Researcher (2002), Eisenhart & Towne
(2003)]]. . . ., but this criticism has not focused on a key
assumption of the report-its HUMEAN, REGULARITY CONCEPTION OF
CAUSALITY. IT IS ARGUED THAT THIS CONCEPTION, WHICH ALSO UNDERLIES
OTHER ARGUMENTS FOR 'SCIENTIFICALLY-BASED RESEARCH,' IS NARROW AND
PHILOSOPHICALLY OUTDATED, AND LEADS TO A MISREPRESENTATION OF THE
NATURE AND VALUE OF QUALITATIVE RESEARCH FOR CAUSAL EXPLANATION. An
alternative, realist approach to causality. . . . .[[Campbell (1988),
House (1991), Pawson & Tilley (1997), Pawson (2006)]]. . . . . is
presented that supports the scientific legitimacy of using
qualitative research for causal investigation, reframes the arguments
for experimental methods in educational research, and can support a
more productive collaboration between qualitative and quantitative
researchers."



*************************************


11. NATIONAL EDUCATION ASSOCIATION [NEA (2003)]: "The NEA STRONGLY
ENDORSES the National Research Council's study, 'Scientific Research
in Education,'. . . . [[SHAVELSON & TOWNE (2002)]]. . . . . AND
RECOGNIZES THIS TO BE THE "GOLD STANDARD" in terms of selecting
methodology that is most appropriate for the question presented,
rather than framing the question to fit the methodology. If a federal
regulation were to reward or even tacitly endorse the latter
approach, we would no longer have true evidence-based education
initiatives. We also strongly agree with the comments of both the
American Education Research Association . . . .[[see above]]. . . .
and the National Education Knowledge Industry Association on this
point."



*************************************


12. DENNIS PHILLIPS (2009, p. 178) in "A Quixotic quest?
Philosophical issues in assessing the quality of education research":
"As the examples above illustrate, a narrow view of the nature of
science, and crucially of its methods, is fostered if too much
attention is paid to what the philosopher Hans Reichenbach. . . . .
[[<http://en.wikipedia.org/wiki/Hans_Reichenbach>]]. . . . termed the
'context of justification' and too little notice is given to the
vitally important 'context of discovery.' This distinction is
heuristically valuable, but it is too crude to be taken as marking an
absolute dichotomy. In practice, ideas are often tested as they are
formulated, leading to many of them quickly being discarded as
unworthy. There are not two temporally distinct processes occurring,
as a crude understanding of Reichenbach's distinction might suggest,
but one complex one in which probing, hypothesis formation,
critiquing, and testing are intermingled, as the case of William
Harvey amply illustrates. . . . .[['to convince his scientific peers
that blood circulates in arteries and veins and is pumped by the
heart']]. . . . .



Crude as it is, however, the discovery/justification distinction is
extremely helpful when applied to the recent debates concerning the
use of the so-called 'gold standard' in education research. Thus
those who insist that *the* criterion to use in identifying
scientifically rigorous educational research is whether or not the
study in question used randomized controlled field trials or
experiments (RFTs), or quasi-experimental designs that approximate
them, are guilty of focusing on only one-half of Reichenbach's
categorization of the logic of science. RFT methodology is well
suited to throw light only on the *justificatory* issue - that is
whether or not it can be claimed that a treatment actually caused
(produced) a desired effect. This focus is, of course, an important
one, but taken by itself (and it is often put forward by itself) IT
EGREGIOUSLY MISREPRESENTS THE NATURE OF SCIENTIFIC INQUIRY. For what
is omitted is the vital steps leading up to the *initial discovery or
production* of the treatment (or program or hypothesis) whose claim
of effectiveness is being subjected to justificatory investigation by
means of the RFT. It is often in this "phase" of discovery where
scientists display their creative genius, their range and depth of
background knowledge, their 'opportunism,' their ability to 'do their
damnedest.'"



*************************************


13. BARBARA SCHNEIDER, MARTIN CARNOY, JEREMY KILPATRICK, WILLIAM
SCHMIDT, & RICHARD SHAVELSON (2007, p. 117) in "Estimating Causal
Effects Using Experimental and Observational Designs: A Think Tank
White Paper": "As the NRC's Committee on Scientific Research on
Education makes clear in 'Scientific Research in Education' . . . .
.[[Shavelson & Towne (2002)]]. . . ., the question of causal effects
is but one of three general questions that drive research. This
report has focused on how to establish that there is an effect (i.e.,
'Is there a systematic effect?'). What has been less emphasized are
the two other questions identified by the NRC: (1) 'What is
happening?' (i.e., what is occurring in a particular context, usually
documented through thick description); and (2) 'Why or how is it
happening?' (i.e., What mechanisms are producing the effect that is
observed?). These two questions are central to the design of
experiments and their usefulness. They are also important for
developing theories of cognition, learning, and social and emotional
development. A PROGRAM OF EVALUATION BUILT ON A SOLID FOUNDATION OF
CLOSELY LINKED RESEARCH USING A VARIETY OF METHODS IS NEEDED TO
ESTABLISH THE BASIS FOR RELIABLE AND ENDURING KNOWLEDGE ABOUT THE
EFFECTS OF EDUCATIONAL INNOVATIONS."



*************************************


14. MICHAEL SCRIVEN (2008) in "A Summative Evaluation of RCT
Methodology: & An Alternative Approach to Causal Research": "Along
with the attempt to redefine the concepts of-or at least the
acceptable ways to establish-evidence and causation, the RCT campaign
also involves the less-remarked parallel effort, going back further,
to redefine the concept of an experiment. In standard scientific
usage, experiments are just carefully constrained explorations, and
the RCT is simply a special case of these. To call the RCT the only
'true experiment' is part of an attempt at redefinition that distorts
the original and continuing usage, and excludes experiments designed
to test many simple hypotheses about - or simple efforts to find out
- what happens if we do *this*.



This effort at persuasive redefinition is allied with an implicit
denigration of the so-called 'quasi-experimental' designs, which are
in fact perfectly respectable experiments, only 'quasi' with respect
to the one respect in which they have less control over one possible
way of excluding one type of alternative explanation. But in other
respects, equally important in the practical business of selecting
appropriate designs to get definite answers in the given
circumstances, they are often massively superior, e.g., with respect
to the number of subjects required in order to achieve useful
results; the extent to which they avoid intrusion into a natural
course of events that it may be very important not to disturb; their
cost, not just in money terms but in terms of other important values,
etc. Of particular importance, THE COMMONLY ACCEPTED IMPLICATION OF
THE 'QUASI' TERMINOLOGY - THAT THE CONCLUSIONS FROM THEM WILL BE LESS
SECURE - IS, AS ARGUED BELOW, CATEGORICALLY FALSE. It is based on an
abstract concept of proof or certainty that ignores the practical
process and standards used by working scientists and engineers-and by
historians and judges in courts of law, and by everyone when acting
as real people facing crucial decisions- all of whose approaches are
treated with more respect in the present paper. . . . . . . . . . .



SUMMATIVE PROPOSITIONS


A. *The RCT design is a theoretical construct of considerable
interest, BUT IT HAS ESSENTIALLY ZERO PRACTICAL APPLICATION TO THE
FIELD OF HUMAN AFFAIRS.* It is important to be clear that a true RCT
study has to be (at least) double-blind, as are all sound
pharmacological studies, whereas the applications in public health,
education, social services, law enforcement, etc., that are
currently advocated as RCTs are neither double-blind nor even single
blind, but 'zero-blind.' Such studies are of course open to the
unintended explanation of their results by appeal to the Hawthorne
effect or its converse. . . . .[[the "John Henry Effect"]]. . . .,
since it's usually easy for members of the experimental and control
groups to work out which one they are in. HENCE THE COMMON ARGUMENT
THAT THE RCT DESIGNS BEING ADVOCATED IN AREAS LIKE EDUCATION, PUBLIC
HEALTH, INTERNATIONAL AID, LAW ENFORCEMENT, ETC., HAVE THE (UNIQUE)
ADVANTAGE OF 'ELIMINATING ALL SPURIOUS EXPLANATIONS' IS COMPLETELY
INVALID. It was careless to suppose that randomization of subject
allocation would compensate for the failure to blind the subjects (as
in single blind studies), let alone the failure to blind the
treatment dispensers, a.k.a. service providers (the requirement that
distinguishes the double-blind study). The RCT banner in the applied
human sciences is in fact being flown over pseudo-RCTs. This failing
is not the result of carelessness, but of the almost complete
impossibility, at least within the constraints of the usual protocols
governing experimentation with human subjects, of arranging for even
single blind conditions. . . . . . . . . . .




G. *THE REAL 'GOLD STANDARD' FOR CAUSAL CLAIMS IS THE SAME ULTIMATE
STANDARD AS FOR ALL SCIENTIFIC CLAIMS; IT IS CRITICAL OBSERVATION.*
. . . . . Causation can be directly observed, in lab or home or field,
usually as one of many contextually embedded observations, such as
lead being melted by heating a crucible, eggs being fried in a pan,
or a hawk taking a pigeon. And causation can also be inferred from
non-causal direct observations with no experimentation, as by the
forensic pathologist performing an autopsy to determine the cause of
death."



*************************************


15. MACK SHELLEY, LARRY YORE, AND BRIAN HAND (2009b) in "Education
Research Meets the 'Gold Standard': Evaluation, Research Methods, and
Statistics after No Child Left Behind": " . . . . .Unfortunately, it
appears as if the Gold Standard for research practice (randomized
control trials, RCTs) is based on the stage 3 drug trial, or medical
model, without duly recognizing the stage 1 and stage 2 trials
necessitated by rarity of disease, risks, development of problem
space, availability of related technologies or innovations, and
costs. . . . . . . . . . .SOME INITIAL AND CURRENT INTERPRETATIONS OF
THE GOLD STANDARD HAVE PRIVILEGED A SINGLE APPROACH AND TYPE OF
EVIDENCE REGARDLESS OF THE DEVELOPMENT OF THE PROBLEM SPACE, SPECIFIC
RESEARCH QUESTION, AVAILABLE TECHNOLOGIES AND INSTRUMENTATION, AND
COST AND ETHICAL CONSIDERATIONS. If such interpretations of this
policy exclusively privilege RCT and quantitative evidence, it would
disregard high-quality, qualitative research approaches and other
contemporary approaches and, thus, the evidence flowing from such
inquiries. Such an oversight would not fully recognize education as a
social science that utilizes (a) epistemologies and methods that
involve both hypothetico-deductive inquiry and normal hierarchical
development and (b) inductive, nonexperimental inquiries that insert
new theoretical discourses alongside existing ones (Yore & Lerman,
2008)." . . . .[[but regarding "new theoretical discourses" see the
insightful "Expanded Social Scientist's Bestiary: A Guide to Fabled
Threats To, and Defenses of, Naturalistic Social Science" by
philosopher D.C. Phillips (2000)]]. . . . . . .



*************************************


16. DEBORAH STIPEK (2005) in "Scientifically Based Practice: It's
About More Than Improving the Quality of Research": ". . . . the
administration is also recommending significant changes in the way
education researchers do business. According to the Institute of
Education Sciences' director, Grover J. 'Russ' Whitehurst, the focus
of research should be on identifying effective teaching practices.
Borrowing from the field of medicine, the federal government has also
put its faith, and its money, in a particular methodology -
randomized field trials. This methodology is considered to be more
rigorous than any other used in education research, and it allows
causal conclusions that no other method can boast.



Also concerned with the quality and reputation of education research,
the National Research Council Committee on Scientific Principles in
Education Research. . . . [[see, e.g., Shavelson & Towne (2002)]]. .
.offers a somewhat different set of recommendations. The committee
suggests that the fit between the method and the questions being
asked is more important than the particular method. Its
recommendations focus primarily on the culture of education research
- the need to foster a greater commitment to objectivity, high
standards of scientific inquiry, replication, and the free flow of
constructive critique.



Yet a third set of recommendations is well articulated in two
documents - one . . .[[NAE (1999)]]. . . . issued by the National
Academy of Education. . . . . . . . .
..[[<http://www.naeducation.org/>]. . . . . . ., and another by the
National Research Council (Strategic Education Research Partnership,
SERP). . .[[see Donovan & Pellegrino (2003)]]. . . .. These reports
promote, as the administration does, research that focuses on the
problems of practice. Their recommendations differ from the
administration's strategy in several important ways, however. First,
they encourage research in what Donald Stokes . . . [[Stokes (1997)]]
. . . . . . .calls Pasteur's Quadrant - research on practical
problems that develops, at the same time, general principles that can
guide future research and practice. The reports suggest particular
qualities of research that they claim will be more useful for
improving education practice. They recommend, for example, research
that is embedded in practice and that involves collaborations between
researchers and practitioners. . . .[[see e.g., Kelly (2003)]]. . .
Unlike the traditional linear model of 'research-into-practice,'
their view of productive research and development involves moving
back and forth between research and practice. Innovations are
developed by researchers collaborating with practitioners. They are
tried out in classrooms, refined or developed by practitioners in
their schools and classrooms, and then systematically studied by
researchers. The link between research and practice is assumed to be
complex, reciprocal, and dynamic.



Productive use of research findings at the policy level also requires
many judgment calls. A policy found to be effective in one context is
not necessarily effective in another, and there are often many
details related to the original conditions of the research that need
to be attended to when applying findings in new contexts.



Consider the example of class-size reduction in California. A large,
random-assignment study in Tennessee demonstrating the benefits of
reducing class sizes to about 15 students was used to support a
policy of reducing class size to 20 in California. But unlike in
Tennessee, where trained teachers were in good supply, in California
there was a serious teacher shortage. Because crucial variables
related to the context of the study were ignored, the implementation
of this very costly policy in California may have done more harm than
good, at least for children in the low-income communities that could
not compete for the limited supply of trained and experienced
teachers.



Another example is a random-assignment study of the High/Scope
preschool intervention in Ipsilanti . . .[[sic, it's Ypsilanti in
Michigan, see e.g.
<http://evidencebasedprograms.org/wordpress/?page_id=65>]]. . . ,
cited repeatedly as support for preschool education. True, the study
has demonstrated impressive and long-term effects of a preschool
experience, but the devil is in the details. Many of the preschool
programs that were spawned by this compelling research evidence look
nothing like the Ipsilanti program. . . .[[sic, it's "Ypsilanti
program"]]. . . . . It is very likely that many of the preschool
programs based on this research do not give anything close to the
same advantages seen in the original High/Scope program.



These examples illustrate the complexity of making evidence-based
policy decisions. Researchers will need to make sure that they
communicate clearly what contextual variables and details of the
intervention or program are necessary to achieve positive results.
And policymakers will need either training or assistance to make
judgments about the implications of research findings for their local
context.


. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .


The bottom line is that education researchers, like educational
practitioners, are being asked to approach their work differently
from how they did in the past. We are being challenged to impose high
standards of scientific rigor on ourselves, to focus on problems of
practice, and to develop sustained collaborations with practitioners.
If the resources needed to do this kind of research become available
(they currently are not), we should be able to live up to the
challenge.



BUT UNTIL MANY OTHER INSTITUTIONAL CHANGES OCCUR, AND THE
ORGANIZATIONAL STRUCTURES TO SUPPORT EVIDENCE-BASED PRACTICE ARE
DEVELOPED, RESEARCH FINDINGS, HOWEVER CLEAR AND USEFUL, WILL HAVE A
FEATHER'S WEIGHT ON TEACHING AND STUDENT LEARNING IN THE NATION'S
SCHOOLS.



We do need to improve the quality and relevance of education
research, but that's not all we need to do.



Deborah Stipek. . . .
.[[<http://ed.stanford.edu/suse/faculty/displayRecord.php?suid=stipek>
]]. . . . . is the dean of the Stanford University School of Education.



*************************************


17. CAROL WEISS (2002) in "What to Do until the Random Assigner
Comes": "The contributions to this volume have largely been
appreciations of random assignment and its many virtues. I agree
that it is ideal for purposes of establishing causality (I'd better,
if I don't want to be thrown out of this merry company) because it
shows that the intervention was in fact responsible for the observed
effects. BUT THERE ARE CIRCUMSTANCES WHEN RANDOM ASSIGNMENT IS VERY
DIFFICULT, IF NOT IMPOSSIBLE, TO IMPLEMENT. One of those
circumstances arises when the goal of the intervention is to change
*not* the individuals but the community itself. Many such programs
are currently in existence, programs that aim to 'revitalize,'
'transform,' or 'develop,' the community in the United States, in
Europe with the European Community's 'social funds,' and in the
developing countries. Ultimately, the purpose of the intervention is
to improve the well-being of the residents, but the intervention is
not directed at individual residents so much as the conditions and
workings of the neighborhood. The obvious solution to the difficulty
with randomizing individuals is to randomize communities, that is to
assign communities randomly to program and control conditions. . . .
. However, at the community level randomization faces three almost
intractable problems: (a) small numbers, (b) funders' insistence on
control of selection, and (c) variability across sites."
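
The community-level randomization Weiss describes is what
methodologists call cluster random assignment. Her "small numbers"
objection is easy to see in a minimal sketch (Python; the community
names and counts are illustrative only):

  import random

  # Cluster random assignment: randomize whole communities, not
  # individuals, to program and control conditions.
  communities = ["A", "B", "C", "D", "E", "F"]
  random.seed(42)  # fixed seed, purely for a reproducible illustration
  shuffled = random.sample(communities, k=len(communities))
  program, control = shuffled[:3], shuffled[3:]
  print("program:", program)
  print("control:", control)

With only six clusters, a single atypical community can dominate
either arm, which is Weiss's "small numbers" problem in miniature.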



Richard Hake
Honorary Member, Curmudgeon Lodge of Deventer, The Netherlands
President, PEdants for Definitive Academic References which Recognize
the Invention of the Internet (PEDARRII)

<rrhake@xxxxxxxxxxxxx>

<http://www.physics.indiana.edu/~hake>

<http://www.physics.indiana.edu/~sdi>

<http://HakesEdStuff.blogspot.com>

<http://iub.academia.edu/RichardHake>



"In science education, there is almost nothing of proven efficacy."
Grover Whitehurst, RCT apostle and former director of the
USDE's Institute of Education Sciences, as quoted by
Sharon Begley (2004b)


"Physics educators have led the way in developing and using objective
tests to compare student learning gains in different types of
courses, and chemists, biologists, and others are now developing
similar instruments. These tests provide convincing evidence that
students assimilate new knowledge more effectively in courses
including active, inquiry-based, and collaborative learning, assisted
by information technology, than in traditional courses."
Wood & Gentile (2003)


"It is fruitful to view scientists as making convincing cases, cases
that appeal to a wide variety of evidence. This assessment of
scientific cases is called the 'platinum standard'."
Dennis Phillips (2006)


"I can't resist sharing a correlation/causation comic to this thread:

<http://xkcd.com/552/>"

Sharon Osborn Popp (2010)




REFERENCES [Tiny URL's courtesy <http://tinyurl.com/create.php>. All
URL's accessed on 26 April 2010. The formatting is not commonly
employed, but should be. It employs a blend of the *best* formatting
features from the style manuals of the AIP (American Institute of
Physics <http://www.aip.org/pubservs/style/4thed/toc.html>), APA
(American Psychological Association <http://apastyle.apa.org/>), and
CSE (Council of Science Editors
<http://www.councilscienceeditors.org/publications/style.cfm>)].



AEA. 2003. American Evaluation Association, Response to U. S.
Department of Education's "Scientifically Based Evaluation Methods:
Studies capable of determining causality," online at
<http://www.eval.org/doestatement.htm>. NOTE: Some prominent AEA
members disagreed with the above statement and issued a "Not AEA
Statement" [Lipsey (2003)]: "This statement is intended to support
the. . . .[[USDE's]]. . . definition and associated preference for
the use of such designs for outcome evaluation when they are
applicable. It is also intended to provide a counterpoint to the
statement submitted by the AEA leadership as the Association's
position on this matter. The generalized opposition to use of
experimental and quasi-experimental methods evinced in the AEA
statement is unjustified, speciously argued, and represents neither
the methodological norms in the evaluation field nor the views of the
large segment of the AEA membership with significant experience
conducting experimental and quasi-experimental evaluations of program
effects." The statement was signed by Leonard Bickman, Robert F.
Boruch, Thomas D. Cook, David S. Cordray, Gary Henry, Mark W. Lipsey,
Peter H. Rossi, & Lee Sechrest.



AERA. 2003. American Educational Research Association, letter to the
Honorable Rod Paige, Secretary of Education, online at
<http://www.eval.org/doeaera.htm>.



Begley, S. 2004a. "The Best Ways to Make Schoolchildren Learn? We
Just Don't Know," Wall Street Journal 10 December, page B1; online to
Wall Street Journal subscribers (and possibly others) at
<http://tinyurl.com/26bmsn4>. I thank Keith Tipton for bringing this
article and its sequel [Begley (2004b)] to my attention.



Begley, S. 2004b. "To Improve Education, We Need Clinical Trials To
Show What Works," Wall Street Journal, 17 December, page B1; online
to Wall Street Journal subscribers at <http://tinyurl.com/34v4uss>
and to discussion-list followers in the APPENDIX of Hake (2005a). See
also Begley (2004a).



Bernhardt, P.C. 2008. "Re: Randomized Trials (was Can Pre-to-posttest
Gains Gauge Course Effectiveness?)," TIPS post of 22 Oct 2008
05:22:44-0700; online on the OPEN! TIPS archives at
<http://tinyurl.com/5a7hzk>.



Burkhardt, H. & A.H. Schoenfeld. 2003. "Improving Educational
Research: Toward a More Useful, More Influential, and Better-Funded
Enterprise," Educational Researcher 32(9): 3-14; online to
subscribers at <http://www.aera.net/publications/?id=401>.



Campbell, D.T. 1988. "Methodology and epistemology for social
science: Selected papers" (S. Overman, ed.). Chicago: University of
Chicago Press, publisher's information at
<http://tinyurl.com/25xpu9p>. Amazon.com information at
<http://tinyurl.com/2dshbfa>. An expurgated Google Book Preview is
online at <http://tinyurl.com/22jm3c4>.



Christensen, L.B., R.B. Johnson, & L.A. Turner. 2010. "Research
Methods, Design, and Analysis," 11th edition. Allyn and Bacon.
Amazon.com information at <http://tinyurl.com/y5yoxnp>. Notes for the
3rd edition are at <http://www.sagepub.com/bjohnsonstudy/index.htm>.



Cook, T.D. & M.R. Payne. 2002. "Objecting to the Objections to Using
Random Assignment in Educational Research" in Mosteller & Boruch
(2002).



Cronbach, L.J., S.R. Ambron, S.M. Dornbusch, R.D. Hess, R.C. Hornik,
D.C. Phillips, D.F. Walker, and S.S. Weiner. 1980. "Toward reform of
program evaluation." Jossey Bass. Amazon.com information at
<http://tinyurl.com/y42s7ra>.



Crouch, C.H. & E. Mazur. 2001. "Peer Instruction: Ten years of
experience and results," Am. J. Phys. 69: 970-977; online at
<http://tinyurl.com/sbys4>.



DeHaan, R.L. 2005. "The Impending Revolution in Undergraduate Science
Education," Journal of Science Education and Technology 14(2):
253-269. The abstract, online at <http://tinyurl.com/ymwwe3>, reads:
"There is substantial evidence. . . . .[[little, if any from RCT's]].
. . . .that scientific teaching in the sciences, i.e. teaching that
employs instructional strategies that encourage undergraduates to
become actively engaged in their own learning, can produce levels of
understanding, retention and transfer of knowledge that are greater
than those resulting from traditional lecture/lab classes. But
widespread acceptance by university faculty of new pedagogies and
curricular materials still lies in the future. In this essay we
review recent literature that sheds light on the following questions:
(1) What has evidence from education research and the cognitive
sciences told us about undergraduate instruction and student learning
in the sciences?
(2) What role can undergraduate student research play in a science curriculum?
(3) What benefits does information technology have to offer?
(4) What changes are needed in institutions of higher learning to
improve science teaching?
We conclude that widespread promotion and adoption of the elements of
scientific teaching by university science departments could have
profound effects in promoting a scientifically literate society and a
reinvigorated research enterprise."



Donaldson, S., C.A. Christie, & M.M. Mark, eds., 2009. "What counts
as credible evidence in applied research and evaluation?" Sage,
publisher's information at
<http://www.sagepub.com/booksProdDesc.nav?prodId=Book231785&;>.
Amazon.com information at <http://tinyurl.com/ygtt6gs>, note the
"Look Inside" feature. An expurgated Google Book Preview is online
at <http://tinyurl.com/y2ezueg>. See also the "Credible Evidence in
Evaluation" website at
<http://sites.google.com/site/credibleevidence/Home>, with these
headings: Contents, Reviews, About the Editors, Key Features, Buy the
Book, Free Resources, Training, & Contact. Chapter 1 "In Search of
the Blueprint for an Evidence-Based Global Society" is online at
<http://www.cgu.edu/PDFFiles/sbos/Donaldson_Credible_Evidence_1.pdf>
(193 kB).



Donaldson, S. 2009. "The Epilogue "A Practitioner's Guide for
Gathering Credible Evidence in the Evidence-Based Global Society," in
Donaldson et al. (2009, pp. 239-251), online at
<http://www.cgu.edu/PDFFiles/sbos/Donaldson_Credible_Evidence_Epilogue.pdf>
(172 kB).



Donovan, M.S. & J. Pellegrino, eds. 2003. "Learning and Instruction:
A SERP Research Agenda," Academies Press; online at
<http://books.nap.edu/catalog/10858.html>.



Educational Researcher. 2002. "Theme Issue on Scientific Research and
Education" 31(8); online to subscribers at
<http://www.aera.net/publications/?id=438>.



EES. 2007. European Evaluation Society
<http://www.europeanevaluation.org>, EES Statement: The Importance of
a Methodologically Diverse Approach to Impact Evaluation -
Specifically with Respect to Development Aid and Development
Interventions. Nijkerk, The Netherlands: December; quoted in
Donaldson (2009).



Eisenhart, M. & L. Towne. 2003. "Contestation and Change in National
Policy on 'Scientifically Based' Education Research," Educational
Researcher 32(7): 31-38; online to subscribers as a 176 kB pdf at
<http://edr.sagepub.com/cgi/reprint/32/7/31>: "In this article, we
examine the definitions of 'scientifically based research' in
education that have appeared in recent national legislation and
policy. These definitions, now written into law in the No Child Left
Behind Act of 2001 and the Education Sciences Reform Act of 2002, and
the focus of [Shavelson & Towne (2002)], are being used to affect
decisions about the future of education programs and the direction of
education research."



English, L.D. 2008. "Handbook of international research in
mathematics education," 2nd edition, Routledge, publisher's
information at <http://tinyurl.com/3szytm>. Amazon.com information at
<http://tinyurl.com/y5uf5tq>. An expurgated Google Book Preview is
online at <http://tinyurl.com/y3euscn>.



Feuer, M.J., L. Towne, & R.J. Shavelson. 2002a. "Scientific Culture
and Educational Research," Educational Researcher 31(8): 4-14; online
to subscribers at <http://edr.sagepub.com/cgi/reprint/31/8/4>.



Feuer, M.J., L. Towne, & R.J. Shavelson. 2002b. Comments on responses
to Shavelson & Towne (2002) in Educational Researcher (2002) - see
also Feuer et al. (2002a).



Hake, R.R. 1998a. "Interactive-engagement vs traditional methods: A
six-thousand-student survey of mechanics test data for introductory
physics courses," Am. J. Phys. 66: 64-74; online at
<http://www.physics.indiana.edu/~sdi/ajpv3i.pdf> (84 kB). [A Google
search for "Interactive-engagement vs traditional methods" (with the
quotes) netted 7,060 hits on 24 April 2010.]



Hake, R.R. 1998b. "Interactive-engagement methods in introductory
mechanics courses," online at
<http://www.physics.indiana.edu/~sdi/IEM-2b.pdf> (108 kB). A crucial
companion paper to Hake (1998a).



Hake, R.R. 2002. "Lessons from the physics education reform effort,"
Ecology and Society 5(2): 28; online at
<http://www.ecologyandsociety.org/vol5/iss2/art28/>. For an update on
six of the lessons on "interactive engagement" see Hake (2007).



Hake, R.R. 2004. "Direct Instruction Suffers a Setback in California
- Or Does It?" contributed to the 129th National AAPT meeting in
Sacramento, CA, 1-5 August 2004; online at
<http://www.physics.indiana.edu/~hake/DirInstSetback-041104f.pdf>
(420 kB).



Hake, R.R. 2005a. Re: "To Improve Education, We Need Clinical Trials
To Show What Works," AERA-L post of 10 Jan 2005 16:01:05 -0800;
online at <http://tinyurl.com/yzjz5vp>. The APPENDIX contains a copy
of Begley (2004b) as allowed by the "fair use" provision of
copyrighted material under section 107 of U.S. Copyright Law - see
e.g.,
<http://www.law.cornell.edu/uscode/17/107.shtml>.



Hake, R.R. 2005b. "Should Randomized Control Trials Be the Gold
Standard of Educational Research?" online on the OPEN! AERA-L
archives at <http://tinyurl.com/ybcexn8>. Post of 5 Apr 2005
20:28:30 -0700 to AERA-C, AERA-D, AERA-G, AERA-C, AERA-H, AERA-J,
AERA-K, AERA-L, AP-Physics, ASSESS, Biopi-L, Chemed-L, EvalTalk,
Math-Learn, Phys-L, Physhare, PhysLrnR, STLHE-L, & TIPS.



Hake, R.R. 2005c. "Scientifically Based Practice," online on the
OPEN! AERA-L archives at , post of 15 Apr 2005 17:01:00-0700. The
APPENDIX contains a copy of Stipek (2005) in accord with the "fair
use" provision of copyrighted material under section 107 of U.S.
Copyright Law. See also Hake (2005a,b).



Hake, R. R. 2005d. "The Physics Education Reform Effort: A Possible
Model for Higher Education?" online at
<http://www.physics.indiana.edu/~hake/NTLF42.pdf> (100 kB). This is a
slightly edited version of an article that was (a) published in the
National Teaching and Learning Forum 15(1), December, online to
subscribers at <http://www.ntlf.com/FTPSite/issues/v15n1/physics.htm>
(if your institution doesn't subscribe, then it should), and (b)
disseminated by the "Tomorrow's Professor" list
<http://ctl.stanford.edu/Tomprof/postings.html> as Msg. 698 on 14 Feb
2006.



Hake, R.R. 2005e. "Will the No Child Left Behind Act Promote Direct
Instruction of Science?" Bull. Am. Phys. Soc. 50: 851 (2005); APS March
Meeting, Los Angeles, CA, 21-25 March; online at
<http://www.physics.indiana.edu/~hake/WillNCLBPromoteDSI-3.pdf> (256
kB). The abstract reads: "The No Child Left Behind (NCLB) Act
requires testing in science achievement starting in 2007. Will such
testing tend to propagate California's Direct Science Instruction
(DSI) [Hake (2004)] throughout the entire nation? After discussing
the evidence for the superiority of "interactive engagement" or
"guided inquiry" methods over DSI in conceptually difficult areas of
science, I indicate seven reasons why NCLB might promote DSI, and one
reason - possible *effective* intervention by the National Research
Council - why it might not."



Hake, R.R. 2006. "Possible Palliatives for the Paralyzing Pre/Post
Paranoia that Plagues Some PEP's" [PEP's = Psychologists, Education
Specialists, and Psychometricians], Journal of MultiDisciplinary
Evaluation, Number 6, November, online at
<http://survey.ate.wmich.edu/jmde/index.php/jmde_1/article/view/41/50>.
This even despite the admirable anti-alliteration advice at
psychologist Donald Zimmerman's site
<http://mypage.direct.ca/z/zimmerma/> to "Always assiduously and
attentively avoid awful, awkward, atrocious, appalling, artificial,
affected alliteration."



Hake, R.R. 2007. "Six Lessons From the Physics Education Reform
Effort," Latin American Journal of Physics
<http://journal.lapen.org.mx/sep07/HAKE%20Final.pdf> (124 kB).



Hake, R.R. 2008a. "Randomized Trials (was Can Pre-to-posttest Gains
Gauge Course Effectiveness?)" online on the OPEN! AERA-D archives at
<http://tinyurl.com/yc3dg6z>. Post of 21 Oct 2008 17:03:30-0700 to
AERA-D, ASSESS, EdResMeth, EvalTalk, PhysLrnR, POD, and TIPS. See
also Hake (2008b).



Hake, R.R. 2008b. "Randomized Trials - ADDENDUM" [response to
Bernhardt (2008)], online on the OPEN! AERA-D archives at
<http://tinyurl.com/yhxcqbq>. Post of 22 Oct 2008 11:34:36-0700 to
AERA-D, ASSESS, EdResMeth, EvalTalk, PhysLrnR, POD, and TIPS.



Hake, R.R. 2008c. "Can Pre-to-posttest Gains Gauge Course
Effectiveness?" online on the OPEN! AERA-D archives at
<http://tinyurl.com/6a393m>. Post of 18 Oct 2008 12:05:53-0700 to
AERA-D, ASSESS, EdStat-L, EdResMeth, EvalTalk, PhysLrnR, and POD.



Hake, R.R. 2008d. "Can Pre-to-posttest Gains Gauge Course
Effectiveness? #2," online on the OPEN! AERA-D archives at
<http://tinyurl.com/27d2bwt>. Post of 19 Oct 2008 16:08:08-0700 to
AERA-D, ASSESS, EdResMeth, EvalTalk, PhysLrnR, & POD. Most of
academia is either oblivious or dismissive of pre/post testing
demonstrations of causality in education research, but see Stokstad
(2001), DeHaan (2005), Wood & Gentile (2003), Michael (2006), and
Hake (1998a,b; 2002; 2005d,e; 2006; 2007; 2008c,e,f,g,h; 2010a,b).



Hake, R.R. 2008e. "Can Pre-to-posttest Gains Gauge Course
Effectiveness? #2," online on the OPEN! POD archives at
<http://tinyurl.com/2emx4e8>. Post of 20 Oct 2008 10:17:40-0700 to
EvalTalk, PhysLrnR, & POD. Contains Ed Nuhfer's cogent comment on
Dennis Roberts's vacuous statement ". . . when we try to use gain
scores (whatever form) to decide about course effectiveness, one is
in a funk as to being able to know what effectiveness is a result
of." Shortly thereafter Roberts kicked me of his EdStat list.



Hake, R.R. 2008f. "Can Pre-to-posttest Gains Gauge Course
Effectiveness? #3," online on the OPEN! AERA-D archives at
<http://tinyurl.com/2bb6u3y>. Post of 24 Oct 2008 17:44:44 -0700 to
AERA-D. The abstract and link to the full post were transmitted to
ASSESS, EdStat-L, EdResMeth, EvalTalk, & POD. Therein I wrote: "In my
opinion, one should treat Bill Becker's discussion of assessment in
areas outside his own field of economics with caution."



Hake, R.R. 2008g. "Can Pre-to-posttest Gains Gauge Course
Effectiveness? #3 - ADDENDUM," online on the OPEN! AERA-D archives at
<http://tinyurl.com/29h3nbv>. Post of 27 Oct 2008 12:45:40 -0700 to
AERA-D. The abstract and link to the full post were transmitted to
ASSESS, EdStat-L, EdResMeth, EvalTalk, & POD. The abstract reads [see
that post for references other than Hake (1998a, 2002)]: "Bill
Becker's (2001) criticisms of my survey of introductory physics
courses [Hake (1998a)] were shown to be problematic in the section
'Criticisms of the Survey' of 'Lessons from the physics education
reform effort' [Hake (2002)]. But Becker, in most of his more
recent criticisms [Becker (2004, 2008)] of Hake (1998a) has
essentially replayed his earlier statements, essentially ignoring
the counters to his criticism contained in Hake (2002) - I give
herewith six examples. In my opinion: (a) such non-recognition of
counter arguments hardly serves as a model for 'The Scholarship of
Teaching and Learning in Higher Education' [Becker & Andrews (2004)];
(b) like biology [Klymkowsky et al. (2003)] and engineering [Smith
et al. (2005)], economics education might have something to learn
from physics education research [Simkins & Maier (2008)]."



Hake, R.R. 2008h. "Design-Based Research in Physics Education
Research: A Review," in Kelly, Lesh, & Baek (2008)]. A
pre-publication version of that chapter is online at
<http://www.physics.indiana.edu/~hake/DBR-Physics3.pdf> (1.1 MB). The
abstract reads: "In this chapter I argue that some physics education
research (PER) is design-based research (DBR) and that an important
DBR-like facet of PER, the pre/post testing movement, has the
potential to improve drastically the effectiveness of undergraduate
instruction generally, the education of pre-service teachers in
particular, and, as a net result, the education of the general
population."



Hake, R.R. 2010a. "Should We Measure Change? Yes!" online at
<http://www.physics.indiana.edu/~hake/MeasChangeS.pdf> (2.5 MB) and
as ref. 43 at <http://www.physics.indiana.edu/~hake>. To appear as a
chapter in "Evaluation of Teaching and Student Learning in Higher
Education" [Hake (in preparation)]. The abstract reads (slightly
updated): "Formative pre/post testing is being successfully employed
to improve the effectiveness of courses in undergraduate astronomy,
biology, chemistry, economics, engineering, geoscience, mathematics,
and physics. But such testing is still anathema to many members of
the psychology-education-psychometric (PEP) community. I argue that
this irrational bias impedes a much needed enhancement of student
learning in higher education. I then review the development of
diagnostic multiple-choice tests of higher-level learning; normalized
gain and ceiling effects; the documented two-sigma superiority of
interactive engagement (IE) to traditional passive-student pedagogy
in the conceptually difficult subject of Newtonian mechanics; the
probable neuronal basis for such superiority; education's lack of a
community map; higher education's resistance to change and its
related failure to improve the public schools; and, finally, why we
should be concerned with student learning." A severely truncated
version is online at Hake (2006).



Hake, R.R. 2010b. "Re: Quality Research in Literacy and Science
Education: International Perspectives and Gold Standards," online on
the OPEN! AERA-L archives at <http://tinyurl.com/yhhbu72>. Post of
22 Feb 2010 14:04:43-0800 to AERA-L and Net-Gold. The abstract was
sent to various discussion lists and also appears at
<http://hakesedstuff.blogspot.com/2010/02/re-quality-research-in-literacy-and.html>
with a provision for comments. In the abstract I wrote: " . . . . the
authors contributing to [this book] appear to be either dismissive
or oblivious of physics education research, *inconsistent* with the
generally positive opinions of most observers. . . . .[[e.g.,
Stokstad (2001), DeHaan (2005), Wood & Gentile (2003), Michael
(2006)]]. . . . .For example, Millar and Osborne make the following
erroneous claims (paraphrasing): "No standard or commonly agreed
outcome measures exist for any major topic. Published assessment
tools such as the 'Force Concept Inventory' have not been subjected
to the kind of rigorous scrutiny of factorial structure and content
validity that would be standard practice for measures of attainment
or learning outcome in other subject areas."



House, E.R. 1991. "Realism in research," Educational Researcher
20(6): 2-9, 25; online to subscribers at
<http://edr.sagepub.com/cgi/reprint/20/6/2>.



Howe, K.R. 2009a. Educational Researcher 38(6): 428-440; online to
subscribers at <http://edr.sagepub.com/cgi/reprint/38/6/428>. This
article is in a section titled "Epistemology, Methodology, and
Education Sciences," online to subscribers at
<http://edr.sagepub.com/content/vol38/issue6/> that also contains
responses to Howe from Eric Bredo, R. Burke Johnson, and Linda C.
Tillman; plus comments on those responses by Howe (2009b).



Howe, K.R. 2009b. "Straw Makeovers, Dogmatic Holism, and Interesting
Conversation," response to comments by Bredo, Burke, and Tillman,
Educational Researcher 38(6): 463-466; online to subscribers at
<http://edr.sagepub.com/cgi/reprint/38/6/463>.



Johnson, R.B. 2001. "Toward a New Classification of Nonexperimental
Quantitative Research," Educational Researcher 30(2): 3-13; online to
subscribers as a 1.1 MB pdf at <http://tinyurl.com/25nenq3>.



Johnson, R.B. 2010. "Re: Cause and Effect," EdResMeth post of 6 Apr
2010 15:21:56-0500; online at <http://tinyurl.com/235aedh>. To
access the archives of EdResMeth one needs to subscribe, but that
takes only a few minutes by clicking on
<http://listserv.uconn.edu/edresmeth-l.html> and then clicking on
"Join or leave the list (or change settings)." If you're busy, then
subscribe using the "NOMAIL" option under "Miscellaneous." Then, as a
subscriber, you may access the archives and/or post messages at any
time, while receiving NO MAIL from the list! See also Christensen,
Johnson, & Turner (2010) and Johnson (2001).



Kelly, A.E. 2003. "Research as Design," Educational Researcher 32(1):
3-4; online to subscribers at
<http://www.aera.net/publications/?id=393>. See also Kelly, Lesh, &
Baek (2008).



Kelly, A.E., R.A. Lesh, & J.Y. Baek. 2008. "Handbook of Design
Research Methods in Education: Innovations in Science, Technology,
Engineering, and Mathematics Learning and Teaching." Routledge.
Publisher's information at <http://tinyurl.com/4eazqs>; Amazon.com
information at <http://tinyurl.com/5n4vvo>.



Lareau, A. & P. Barnhouse. 2010. "What Counts as Credible Research?"
Teachers College Record, 01 March; online at
<http://www.susanohanian.org/show_research.php?id=343>. See also
Walters, Lareau, Ranis (2009).



Lipsey, M. 2003. "NOT the AEA statement on Scientifically Based
Evaluation," EvalTalk post of 3 Dec 2003 13:22:10-0600; online at
<http://tinyurl.com/y5v2fg9>. To access the archives of EvalTalk one
needs to subscribe, but that takes only a few minutes by clicking on
<http://bama.ua.edu/archives/evaltalk.html> and then clicking on
"Join or leave the list (or change settings)." If you're busy, then
subscribe using the "NOMAIL" option under "Miscellaneous." Then, as a
subscriber, you may access the archives and/or post messages at any
time, while receiving NO MAIL from the list!



Mark, M. 2009. "Credible Evidence," in Donaldson et al. (2009, pp.
214-238); portions are accessible at the Google Book Preview of
Donaldson et al. (2009) at <http://tinyurl.com/y2ezueg>, including
most of Mark's discussion on pp. 221-232 of Scriven's (2009)
*hypothetical* pre/post test demonstration of causality. In my
opinion it would have been more relevant to the real world of
evaluation if Mark had discussed the pre/post-test experiments [Hake
(1998a,b), Crouch & Mazur (2001), Mazur (2010)] which approximate an
actualization of Scriven's hypothetical example.



Maxwell, J.A. 2004. "Causal Explanation, Qualitative Research, and
Scientific Inquiry in Education," Educational Researcher 33(2): 3-11;
online to subscribers at <http://edr.sagepub.com/cgi/reprint/33/2/3>.



Mazur, E. 2010. "Confessions of a Converted Lecturer," talk at the
University of Maryland on 11 November 2009. The abstract reads: "I
thought I was a good teacher until I discovered my students were just
memorizing information rather than learning to understand the
material. Who was to blame? The students? The material? I will
explain how I came to the agonizing conclusion that the culprit was
neither of these. It was my teaching that caused students to fail! I
will show how I have adjusted my approach to teaching and how it has
improved my students' performance significantly." That talk is now on
YouTube at <http://www.youtube.com/watch?v=WwslBPj8GgI> (click on the
view number to see a graph of "Total Views" vs. time); and the
abstract, slides, and references - sometimes obscured in the YouTube
talk - are at <http://tinyurl.com/ybc53jw> as a 4 MB pdf. As of 26
April 2010 10:20:00-0700 Eric's talk had been viewed by 18,093 YouTube
fans, up from 12,800 on 16 March 2010. In contrast, serious articles
in the education literature, often read only by the author and a few
cloistered academic specialists, usually create tsunamis in
educational practice equivalent to those produced by a pebble dropped
into the Pacific Ocean.



Michael, J. 2006. "Where's the evidence that active learning works?"
Advances in Physiology Education 30: 159-167, online at
<http://tinyurl.com/ykzp7lt>. The abstract reads: "Calls for reforms
in the ways we teach science at all levels, and in all disciplines,
are widespread. The effectiveness of the changes being called for,
employment of student-centered, active learning pedagogy, is now well
supported by evidence. The relevant data. . . . [[little, if any from
RCT's]]. . . . . have come from a number of different disciplines
that include the learning sciences, cognitive psychology, and
educational psychology. There is a growing body of research within
specific scientific teaching communities that supports and validates
the new approaches to teaching that have been adopted. These data are
reviewed, and their applicability to physiology education is
discussed. Some of the inherent limitations of research about
teaching and learning are also discussed."



Miles, J. 2001. "Research Methods and Statistics," Crucial Publishers.
Amazon.co.uk information at
<http://www.amazon.co.uk/exec/obidos/ASIN/1903337151/jeremymiles>.



Miles, J. 2008. "Re: Can Pre-to-posttest Gains Gauge Course
Effectiveness? #2," AERA-D post of 19 Oct 2008 20:39:41-0700; online
on the OPEN! AERA-D archives at <http://tinyurl.com/29ekn4o>. Jeremy
Miles manages a Psychology Research Methods Wiki:
<http://www.researchmethodsinpsychology.com/wiki/index.php?title=Main_Page>
based on his book "Research Methods and Statistics" [Miles (2001)].



Mosteller, F. & R. Boruch, eds. 2002. "Evidence Matters: Randomized
Trials in Education Research." Brookings Institution. Amazon.com
information at <http://tinyurl.com/59gp6o>.



NAE. 1999. National Academy of Education report "Recommendations
Regarding Research Priorities: An Advisory Report to the National
Educational Research Policy and Priorities Board," online at
<http://www.naeducation.org/Research_Priorities_Publication.pdf> (217
kB).



NEA. 2003. National Education Association, letter to the Honorable
Rod Paige, Secretary of Education; online at
<http://www.eval.org/doe.nearesponse.pdf> (88 kB).



Pawson, R. & N. Tilley. 1997. "Realistic Evaluation." Sage,
publisher's information at
<http://www.sagepub.com/booksProdDesc.nav?prodId=Book205276>.
Amazon.com information at <http://tinyurl.com/22kqar9>. Note the
searchable "Look Inside" feature. An overview by Tilley, presented at
the Founding Conference of the Danish Evaluation Society, September
2000, is at <http://tinyurl.com/26rjcs2> as a 53 kB pdf.



Pawson, R. 2006. "Evidence-Based Policy: A Realist Perspective."
Sage, publisher's information at
<http://www.sagepub.com/booksProdDesc.nav?prodId=Book227875>.
Amazon.com information at <http://tinyurl.com/24ppavv>. Note the
searchable "Look Inside" feature.



Phillips, D.C. 2000. "Expanded Social Scientist's Bestiary: A Guide
to Fabled Threats To, and Defenses of, Naturalistic Social Science."
Rowman & Littlefield - information at <http://tinyurl.com/ycmlvy>.
The late Paul Meehl <http://en.wikipedia.org/wiki/Paul_E._Meehl>
wrote: "Should be required reading for all Ph.D. candidates in
social science. It is a mind clearing analysis of the highest order,
prophylactic and curative of the numerous methodological and
substantive ills that afflict us. It is especially needed today when
the 'positivist-bashers' are using the Vienna Circle's mistakes and
Kuhn's exaggerations for obscurantist purposes."



Phillips, D.C. & N.C. Burbules. 2000. "Postpositivism and Educational
Research." Rowman & Littlefield; publisher's information at
<http://tinyurl.com/yncvls>. Amazon.com information at
<http://tinyurl.com/yelju39>. See especially "Mistaken accounts of
positivism," pp. 11-14.



Phillips, D.C. 2006. "A guide for the perplexed: Scientific
educational research, methodolatry, and the gold versus platinum
standards," Educational Research Review 1(1): 15-26; an abstract,
online at <http://tinyurl.com/yhztz6w>, reads: ". . . .the main
discussion focusses upon the end of this continuum where there are
located the recent attempts to restore rigor to educational research
by using the so-called 'gold standard' of randomized field trials. It
is argued that . . . .[[this misrepresents the nature of science]]. .
. . . ., and some examples are briefly mentioned in order to convey the
point that IT IS FRUITFUL TO VIEW SCIENTISTS AS MAKING CONVINCING
CASES, cases that appeal to a wide variety of evidence. This
assessment of scientific cases is called the 'PLATINUM STANDARD'."
See also Phillips (2009).



Phillips, D.C. 2009. "A Quixotic quest? Philosophical issues in
assessing the quality of education research," in Walters et al.
(2009, pp. 163-195), accessible at Amazon's "Look Inside" feature at
<http://tinyurl.com/yyc8jd9>.



Popkewitz, T.S. 2004. "Is the National Research Council Committee's
report on Scientific Research in Education scientific? On trusting
the manifesto." Qualitative Inquiry 10(1): 62-78; an abstract is
online at <http://qix.sagepub.com/cgi/content/abstract/10/1/62>.



Popp, S.O. 2010. EdResMeth post of 7 Apr 2010 07:42:42-0700; online
at <http://tinyurl.com/29d6jzt>.



Rudd, A. 2010. "Cause and Effect," EdResMeth post of 6 Apr 2010
13:44:43-0400; online at
<http://tinyurl.com/y7ffx54>. For instructions on accessing the
EdResMeth archives, see Johnson (2010) above.



Schneider, B., M. Carnoy, J. Kilpatrick, W.H. Schmidt, & R.J.
Shavelson. 2007. "Estimating Causal Effects Using Experimental and
Observational Designs: A Think Tank White Paper," AERA; publisher's
information and FREE download at
<http://www.aera.net/publications/Default.aspx?menu_id=46&id=3360>.



Schoenfeld, A.H. 2002. "Research methods in (mathematics) education,"
in L.D. English, ed., "Handbook of international research in
mathematics education," (pp. 467-488). Erlbaum. Amazon.com
information at <http://tinyurl.com/y6ssmn3>. See also English (2008).
For those who can read between the lines, the Google Book Preview
<http://tinyurl.com/y3euscn> of English (2008) contained the following
pages of Schoenfeld's article when I examined it on 24 April 2010:
pp. 467-469, 471, 473-476, 481-485, 489-492, 496, 498-501, 506-507,
510-514. But which pages can and cannot be seen may depend on the
circumstances. At <http://tinyurl.com/6nl27k> Google states: "Many of
the books you can preview on Google Books are still in copyright, and
are displayed with the permission of publishers and authors. You can
browse these 'limited preview' titles just as you would in a
bookstore, but you won't be able to see more pages than the copyright
holder has made available. When you've accessed the maximum number of
pages allowed for a book, any remaining pages will be omitted from
your preview."



Scriven, M. 2008. "A Summative Evaluation of RCT Methodology: & An
Alternative Approach to Causal Research," Journal of MultiDisciplinary
Evaluation 5(9): 11-24; online at
<http://survey.ate.wmich.edu/jmde/index.php/jmde_1/article/view/160/186>.



Scriven, M. 2009. "Demythologizing Causation and Evidence," in
Donaldson et al. (2009, pp. 134-152). Two points: (1) Scriven gives a
hypothetical pre/post test demonstration of causality which has been
discussed by Mark (2009); see my comment under the Mark (2009) entry
above. (2) Scriven's reference to "A critical
appraisal of the case against using experiments to assess school (or
community) effects" [Cook (2001)] states that it's online at
<http://www.hoover.org/publications/ednext/3399216.html>. But that
URL yields a redirect to <http://educationnext.org/> where a search
for "Cook" yields only the non-scholarly popularization
"Sciencephobia: Why education rejects randomized experiments,"
Education Next 1(3): 62-68 (2001) at
<http://educationnext.org/sciencephobia/>. For the non-Hooverized
article "A critical appraisal of the case against using experiments
to assess school (or community) effects" [Cook (2001)] as originally
written by Cook click on
<http://media.hoover.org/documents/ednext20013unabridged_cook.pdf>
(131 kB).



Scriven, M. 2010. "Rethinking Evaluation Methodology," Journal of
MultiDisciplinary Evaluation 6(13), online at
<http://survey.ate.wmich.edu/jmde/index.php/jmde_1/article/view/264/253>.



Shadish, W.R., T.D. Cook, & D.T. Campbell. 2002. "Experimental and
Quasi-Experimental Designs for Generalized Causal Inference."
Amazon.com information at
<http://www.amazon.com/dp/0395615569/?tag=katlin-20>. A goldmine of
references to social-science research. Portions of the book are
online at <http://depts.washington.edu/methods/readings/Shadish.pdf>
(1.8 MB).



Shavelson, R.J. & L. Towne, eds., 2002. "Scientific Research in
Education" (SRE), National Academy Press; online at
<http://www.nap.edu/catalog/10236.html>. On page 114 it is stated:
"IN SOME SETTINGS, WELL-CONTROLLED QUASI-EXPERIMENTS MAY HAVE GREATER
'EXTERNAL VALIDITY' - GENERALIZABILITY TO OTHER PEOPLE, TIMES, AND
SETTINGS'- THAN EXPERIMENTS WITH COMPLETELY RANDOM ASSIGNMENT
(Cronbach et al., 1980; Weiss, 1998)." Among members of the Academy's
"Committee on Scientific Principles for Education Research" that
authored SRE were (aside from Shavelson & Towne): Robert Boruch,
Jere Confrey, Robert DeHaan, Margaret Eisenhart, Eugene Garcia,
Norman Hackerman, Eric Hanushek, Ellen Condliffe Lagemann, Dennis
Phillips, and Carol Weiss. See also: (a) the Educational Researcher
(2002) theme issue 31(8), online to subscribers at
<http://edr.sagepub.com/content/vol31/issue8/> carrying responses to
SRE and a reply to the responses by Feuer, Towne, & Shavelson
(2002b); (b) Eisenhart & Towne's (2003) review of definitions of
"scientifically based research"; (c) Maxwell's (2004) critique of the
SRE's emphasis on quantitative research as the sole warrant for
causality; (d) Towne, Wise, & Winters (2004) sequel to SRE
"Advancing Scientific Research in Education"; (e) Popkewitz's (2004)
"Is the National Research Council Committee's report on Scientific
Research in Education scientific?"; (f) Howe's (2009a,b) discussion
of SRE as "the new scientific orthodoxy." Shavelson & Towne (2002,
pp. 3-5) wrote: "The Committee argued that ALL THE SCIENCES, INCLUDING
SCIENTIFIC EDUCATIONAL RESEARCH, SHARED A SET OF EPISTEMOLOGICAL OR
FUNDAMENTAL GUIDING PRINCIPLES, and that all scientific endeavors
should:
(a) pose significant questions that can be investigated empirically,
(b) link research to relevant theory,
(c) use methods that permit direct investigation of the questions,
(d) provide a coherent and explicit chain of reasoning,
(e) attempt to yield findings that replicate and generalize across studies, and
(f) disclose research data and methods to enable and encourage
professional scrutiny and critique."



Shelley, M.C., L.D. Yore, & B. Hand, eds. 2009a. "Quality Research in
Literacy and Science Education: International Perspectives and Gold
Standards." Springer, publisher's information at
<http://www.springerlink.com/content/g2447682464446x2/>. Amazon.com
information at <http://tinyurl.com/yf7efra>, note the searchable
"Look Inside" feature. Barnes & Noble information at
<http://tinyurl.com/y8n9pe9>. An expurgated (teaser) version is
online as a Google "book preview" at <http://tinyurl.com/yddphh3>.
For a lukewarm review see Hake (2010b).



Shelley, M.C., L.D. Yore, & B. Hand, eds. 2009b. "Education Research
Meets the 'Gold Standard': Evaluation, Research Methods, and
Statistics after No Child Left Behind," Chapter 1, pages 3-18, in
Shelley et al. (2009a). Surprisingly, the Google book preview of
Shelley et al. (2009a) at <http://tinyurl.com/yddphh3> contains all of
pages 3-15. To see this, use the ">" at the top of the first page to
go to page vi and then click on chapter 1.



Stipek, D. 2005. "Scientifically Based Practice: It's About More Than
Improving the Quality of Research," Education Week 24(28): 33-34;
online to discussion-list followers in the APPENDIX of Hake (2005c)
at <http://tinyurl.com/29almt3>, as allowed by the "fair use"
provision of copyrighted material under section 107 of U.S. Copyright
Law - see e.g., <http://www.law.cornell.edu/uscode/17/107.shtml>.



Stokes, D.E. 1997. "Pasteur's Quadrant: Basic Science and
Technological Innovation." Brookings Institution Press; publisher's
information at
<http://www.brookings.edu/press/Books/1997/pasteur.aspx>. Amazon.com
information at <http://tinyurl.com/lto97>.



Stokstad, E. 2001. "Reintroducing the Intro Course," Science 293:
1608-1610, 31 August; an abstract is online at
<http://www.sciencemag.org/cgi/content/summary/293/5535/1608>.
Stokstad wrote: "Physicists are out in front in measuring how well
students learn the basics, as science educators incorporate hands-on
activities in hopes of making the introductory course a beginning
rather than a finale."



Thompson, B. 2010. "Re: Cause and Effect," EdResMeth post of 6 Apr
2010 13:00:55-0500; online at <http://tinyurl.com/yb2t3o9>. Thompson
wrote: "If you use (a) regression discontinuity designs. . . . .
.[[see, e.g., Shadish et al. (2002, pp. 207-245)]]. . . ., or (b)
create a control group using propensity scores . . . .[[op. cit., p.
122 & pp. 161-165]]. . . . I think you can come reasonably close to a
true experiment." For Scriven's dour view of the phrase "true
experiment" see Gold Standard Skeptic Statement #14 in this post.



Towne, L., L.L.Wise, & T.M. Winters, eds. 2004. "Advancing Scientific
Research in Education." National Academies Press; online at
<http://www.nap.edu/catalog.php?record_id=11112>.



USDE. 2005. "Scientifically Based Evaluation Methods; Notice,"
Federal Register 70(15), 25 January, Part II, Dept. of Education;
online as a 111 kB pdf at <http://tinyurl.com/y4w3ygm>.



USDE. 2008. U.S. Dept. of Education, "What Works Clearinghouse
Evidence Standards For Reviewing Studies, Version 1.0, Revised May
2008,"online at
<http://ies.ed.gov/ncee/wwc/pdf/wwc_version1_standards.pdf> (147 kB).
See also USDE (2005).



Walters, P.B., A. Lareau, & S. Ranis, eds. 2009. "Education research
on trial: Policy reform and the call for scientific rigor."
Routledge, publisher's information at
<http://www.routledge.com/books/details/9780415989893/>. Amazon.com
information at <http://tinyurl.com/yyc8jd9>. Note the searchable
"Look Inside" feature.



Weiss, C.H. 1997. "Evaluation: Methods for studying programs and
policies." Prentice Hall, 2nd edition. Amazon.com information at
<http://tinyurl.com/2bv6239>.



Weiss, C.H. 2002. "What to Do until the Random Assigner Comes," in
Mosteller & Boruch (2002). See also Weiss (1997).



Wood, W.B., & J.M. Gentile. 2003. "Teaching in a research context,"
Science 302: 1510; 28 November; online to subscribers at
<http://www.sciencemag.org/cgi/reprint/302/5650/1510.pdf>. A summary
is online to all at
<http://www.sciencemag.org/cgi/content/summary/302/5650/1510>.



Yore, L.D. & S. Lerman. 2008. "Metasynthesis of qualitative research
studies in mathematics and science education" (editorial).
International Journal of Science and Mathematics Education 6(2):
217-223. The first page is online at
<http://www.springerlink.com/content/m2201uq34245670g/>.



Yore, L.D. & P. Boscol. 2009. "Why 'Gold Standard' Needs Another
's': Results from the Gold Standard(s) in Science and Literacy
Education Research Conference," Chapter 2, pages 17-39, in Shelley
et al. (2009a). On 25 April 2010 I was able to access this entire
article except for pages 18 & 19 at the Google book preview of
Shelley et al. (2009a) at <http://tinyurl.com/yddphh3>. Which pages can and cannot be
seen may depend on the circumstances. At <http://tinyurl.com/6nl27k>
Google explains its "limited preview" policy - quoted in full under
Schoenfeld (2002) above - adding: "You can order full copies of any
book using the 'Get this book' links to the side of the preview
page." See also Yore & Lerman (2008).


