
Are Blind Audio Comparisons Worthwhile? - Page 2

Poll Results: Are Blind Audio Comparisons Worthwhile?

 
  • 86% (118)
    Yes, blind audio comparisons are worthwhile
  • 13% (19)
    No, blind audio comparisons are not worthwhile
137 Total Votes  
post #31 of 119
Don't care
post #32 of 119
^^^ since they are subjective, then why even bother? Buy what you like, even if it's only placebo.

If a person who is sick is cured although he/she is only given a placebo, what difference does it make?

It's entertainment, as long as you like it, then go buy it. If you like to have the centre channel located behind you, the right channel on the left and the right channel inside a cabinet in the room next door, then so be it.
post #33 of 119
Quote:
Originally Posted by David Susilo View Post

Reasons ABX is pointless: the human element.

1. A person needs to get used to the acoustic environment.

Why? What do you mean by that? And why couldn't someone "get used to" the acoustic environment for a test? You do know an ABX test can be done in someone's home that they are "used to," right?
Quote:
Originally Posted by David Susilo View Post

2. Barometric pressure affects how sounds is perceived.

So? The same would go for non-ABX attempts to discern audible differences too, so how would that be a particular strike against ABX? Further, if one is serious about eliminating such variables, the test room could be designed to hold barometric pressure constant, making the test even more reliable.
Quote:
Originally Posted by David Susilo View Post

3. Familiarity of the recorded materials.

There's nothing stopping anyone from using familiar recorded material in a blind listening test.
Quote:
Originally Posted by David Susilo View Post

4. Every person perceives things differently and have different hearing acuity.

So? Same goes for non-blind testing. But when you are being more careful in your testing, you will control for these things. For instance, how many audiophiles report to each other their impressions of how a new piece of gear sounds...along with their auditory tests? Whereas when scientists are trying to discern what type of audible differences people can hear, they often do preliminary tests graphing the participants' hearing, to weed out or account for such variables.
Quote:
Originally Posted by David Susilo View Post

5. In the end it's still a subjective comparison.

No it's not. ABX tests generally test whether someone can hear an audible difference between A and B. That's an objective measurement, a test of someone's capability to hear a difference. IF they CAN hear a difference between A and B, then whichever they like is their subjective assessment.
post #34 of 119
Yes it is subjective as a person's hearing is different from another. The frequency response of everybody varies wildly. Unless you can get a sample of people with the exact hearing frequency response and acuity, it's subjective.
post #35 of 119
The value of blind testing is subjective to a person's point of view. The more expensive the equipment you sell is, the less likely you'll like blind testing.
post #36 of 119
redundant
Edited by snyderkv - 10/27/13 at 6:59pm
post #37 of 119
Quote:
Originally Posted by David Susilo View Post

Yes it is subjective as a person's hearing is different from another. The frequency response of everybody varies wildly. Unless you can get a sample of people with the exact hearing frequency response and acuity, it's subjective.

The problem is you are using the term "subjective" in an awkward, non-standard way. Subjectivity normally refers to that which changes with a person's opinion or inclination, which is why the same thing can be called "good" or "tasty" to you, but "bad" or "yucky" to me.

The physical facts of how a component is actually altering a signal, or not, are objective.

The physical facts of how any individual's hearing may be defective are objective. If someone has a -8 dB notch in their hearing at 10 kHz, that doesn't alter with their opinion, likes or dislikes. It's objectively true.

The impact an individual's hearing has on his ability to discern between A and B, is objective.

At the end of all that, whatever he values or likes better is subjective. But that's not normally what is being tested. (Though, that can be tested for as well, if desired).
post #38 of 119
Quote:
Originally Posted by pottscb View Post

I don't think it's any more worthwhile than watching test patterns to test real world display content...useful to a certain extent, but no one sits down with a bowl of popcorn to watch a marathon of test patterns (or, maybe they do?). Either way, too many other physical variables that affect sound much more than sight.

Not a good analogy. You could use the same music you would use for a sighted test. It's not like you're going to use audio test patterns for a blind test. One might not find it to be worth the effort because they are going to buy the item that pleases them aesthetically or is the brand they want to identify with, but if you are buying based on sound quality, how could it possibly be any less worthwhile than sighted testing?
post #39 of 119
Quote:
Originally Posted by R Harkness View Post

The problem is you are using the term "subjective" in an awkward, non-standard way. Subjectivity normally refers to that which changes with a person's opinion or inclination, which is why the same thing can be called "good" or "tasty" to you, but "bad" or "yucky" to me.

The physical facts of how a component is actually altering a signal, or not, are objective.

The physical facts of how any individual's hearing may be defective are objective. If someone has a -8 dB notch in their hearing at 10 kHz, that doesn't alter with their opinion, likes or dislikes. It's objectively true.

The impact an individual's hearing has on his ability to discern between A and B, is objective.

At the end of all that, whatever he values or likes better is subjective. But that's not normally what is being tested. (Though, that can be tested for as well, if desired).

You are absolutely correct in interpreting my usage of the term "subjective". Thank you for clearing it up for everybody.
post #40 of 119
We need to differentiate between hearing a difference and preference.

ABX is a wonderful tool for determining differences. Far too many times people criticize it without doing the simplest research about it. You can do it at home, for as long as you want, with whatever music you want, under whatever circumstances you prefer (e.g. alone, at night, etc.). One of the most hypocritical stances of "golden eared" individuals in high end audio is to describe differences between properly designed cables (e.g. power cables, interconnects, speaker cables) as night and day, but then decry the unfairness of ABX when "large" audible differences disappear in a blind test (despite the fact that ABX can address all of the concerns they raise).
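For anyone trying this at home, the result of an ABX session can be checked against guessing with a few lines of Python. This is only a minimal sketch, assuming each trial is an independent 50/50 guess when no difference is audible; the function name `abx_p_value` and the 16-trial session are made up for illustration and are not part of any ABX tool.

```python
from math import comb


def abx_p_value(correct: int, trials: int) -> float:
    """One-sided binomial p-value for an ABX session.

    Returns the probability of getting at least `correct` answers right
    out of `trials` by pure guessing (chance of a correct guess = 1/2).
    """
    if trials <= 0 or not 0 <= correct <= trials:
        raise ValueError("need trials > 0 and 0 <= correct <= trials")
    # Count the outcomes with `correct` or more hits, out of 2**trials total.
    tail = sum(comb(trials, k) for k in range(correct, trials + 1))
    return tail / 2 ** trials


if __name__ == "__main__":
    # Hypothetical 16-trial session with three possible scores.
    for score in (8, 12, 14):
        print(f"{score}/16 correct: p = {abx_p_value(score, 16):.4f}")
```

Under that assumption, 12 correct out of 16 gives p of about 0.038 and 14 out of 16 about 0.002, while 8 out of 16 is what guessing would typically produce.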

Regarding speakers, yes, blind studies do have value because, contrary to the OP's claim, the work of Dr. Toole showed that cognitive bias affected how individuals perceived and ranked speakers in sighted listening tests.
post #41 of 119
As you say blind testing is a TOOL...not an ABSOLUTE.... Everyone argues that blind testing is the only way to prove differences in sound....and I'm one of many that disagree....but it can be useful... Just not for absolute proof...
post #42 of 119
Quote:
Originally Posted by David Susilo View Post

^^^ since they are subjective, then why even bother? Buy what you like, even if it's only placebo.

If a person who is sick is cured although he/she is only given a placebo, what difference does it make?

It's entertainment, as long as you like it, then go buy it. If you like to have the centre channel located behind you, the right channel on the left and the right channel inside a cabinet in the room next door, then so be it.

placebo usually wears off...

until there's a better method, cause obviously blind testing isn't perfect, I still think it's worthwhile. if the speakers are SO close that you can't really tell while doing a blind test, then you can make your buying decision based on other factors (style, price, size) and be confident you didn't give up good sound.

maybe i'm reading this differently than you. i'm reading this as a method to shop for and ultimately buy speakers. so how I personally like them is of high importance, and blind testing is the only way I can remove my personal bias to focus on the sound first. after I've decided what speakers sound best to me, I can factor in the other characteristics and make my final decision based on all of those.
if you're reading this as a way to rate and review speakers, then I would agree, it's as misleading as it is helpful. what a reviewer, or panel of reviewers preferred on their equipment, in their room, is only going to be a loose guide to help me decide. for rating purposes i'd be more interested in measured results from a calibrated microphone in an anechoic chamber. that being said, i'd still find it interesting to see which speaker was preferred by more people in a blind test done of the top measured speakers. and I am not at all interested in that result if it's not a blind test. blind testing may be less than perfect, but doing a non-blind comparison is completely useless if not extremely misleading.
post #43 of 119
Quote:
Originally Posted by josh6113 View Post

As you say blind testing is a TOOL...not an ABSOLUTE.... Everyone argues that blind testing is the only way to prove differences in sound....and I'm one of many that disagree....but it can be useful... Just not for absolute proof...

I agree completely big picture. but for my personal use, if I can't tell the difference, there is no difference.

so again, I think it depends on your frame of reference. if i'm shopping for myself, blind testing is an important tool, and I would consider there to be no benefit to a speaker sonically if I can't hear it. but as a reviewer, I would never assume my ability to hear is equal to everybody else's and that there couldn't be an audible difference i'm simply not hearing. the objective mic test would be my basis for any such claims, not the subjective blind test
post #44 of 119
I personally think they are pointless. Each of us has our own preferences sound-wise, and even though a speaker may have horrible reviews because it doesn't have the sound 75% of people like, that doesn't mean you're one of them.

Audio is a subjective subject and while we can attempt to define how speakers sound, it's still based on the reviewer's ears, his room, his preferences, and equipment, all of which are variables that will be different user to user.

A blind audio test by the purchaser in his room with his music and his setup is ideal, however is only valid for that one user.
post #45 of 119
Quote:
Originally Posted by EndersShadow View Post

I personally think they are pointless. Each of us has our own preferences sound-wise, and even though a speaker may have horrible reviews because it doesn't have the sound 75% of people like, that doesn't mean you're one of them.

Audio is a subjective subject and while we can attempt to define how speakers sound, it's still based on the reviewer's ears, his room, his preferences, and equipment, all of which are variables that will be different user to user.

A blind audio test by the purchaser in his room with his music and his setup is ideal, however is only valid for that one user.

Blind tests have nothing to do with preference, only difference.  

post #46 of 119
Quote:
Originally Posted by primetimeguy View Post

Blind tests have nothing to do with preference, only difference.

prove it...
post #47 of 119
Quote:
Originally Posted by primetimeguy View Post

Blind tests have nothing to do with preference, only difference.  

Ok... I will rephrase...

For those interested in a very intriguing paper and willing to read the entire thing, look here:
A Historical Overview of Stereophonic Blind Testing

It's a phenomenally detailed paper that cites legitimate sources within the industry and, IMHO, is well worth the read.

Heck, not that anyone will actually read the whole thing, but I will just go ahead and post it here:
Quote:
Introduction

The application of blind and double-blind tests is thought by a small, but vocal, minority in the audio community to be the supreme evaluation standard for detecting audible differences in audio systems. It is true that some types of audio systems are well suited for blind and double-blind A/B or A/B/X type tests. A/B and A/B/X tests are useful in scenarios when the two audio signals being compared are simple in nature. For example, telephone company engineers have routinely used, and continue to use, A/B and A/B/X tests to evaluate improvements in voice circuit quality. [1] [2] [3] [4] However, we must realize and understand that a test that is suitable for one type of audio system might not be suitable for another. It is worth noting that the same company (the Bell Telephone System) that was responsible for the invention and implementation of telephone service was the same company that was responsible for the invention and implementation of home stereophonic audio systems. [5] [6] [7] It is even more interesting to note that while A/B and A/B/X tests were found to be appropriate for evaluating voice quality improvements on bandwidth-limited telephone circuits, subjective, non-blind listening tests based on careful listening, evaluator training and realistic home listening conditions were the scientific standards for the evaluation of stereophonic audio systems. [5] [6] [8] [9] [10] [11] [12] [13] [14]

It should not be too difficult to understand that a testing methodology that is appropriate for evaluating simple band-limited monophonic signals would most probably not be appropriate for evaluating complex stereophonic signals that cover the full range of human hearing and which are designed to convey aural, spatial and tactile information. Telephone systems are audio systems, but they are audio systems which are primarily designed to convey clear voice communication. Stereophonic systems are audio systems, but they are audio systems which are designed to convey a weighty, complex, realistic illusion of a three-dimensional music concert performance.

Origins Of Blind Stereophonic Audio Testing

A paper published by Jon Boley and Michael Lester in the proceedings of the 127th Convention of the Audio Engineering Society, October 2009, stated:

"ABX tests have been around for decades and provide a simple, intuitive means to determine if there is an audible difference between two signals."

"Within the audio engineering community, the ABX methodology has become the standard psychoacoustic test for determining if an audible difference exists between two signals." [15]

The first statement is true if the signals are very simple in nature, especially if they are monophonic signals. The second statement is questionable since both founders of the ABX audio testing religion wrote ten-year follow-up papers lamenting the widespread unacceptance of ABX testing by audio engineers and the audio press. [19] [20]

Ethan Winer, at his "Audio Myths, Artifact Audibility and Comb Filtering" workshop presented at the 127th Convention of the Audio Engineering Society in October 2009 stated:

"Double blind tests are the gold standard in every field of science."

and

"It amazes me when some people claim that double blind testing is not valid for assessing audio gear." [16]

What is truly amazing is that some people would stray so far from the scientifically valid subjective listening evaluation procedures developed at Bell Telephone Laboratories and other electronics firms that participated in the invention and early development of home stereophonic systems (e.g. General Electric, Radio Corporation of America, etc.).

Another amazing feature of the Winer presentation is that he included a staged purse-snatching demonstration (at time 9:56) to illustrate the unreliability of short term visual memory. None of the audience members could accurately identify the "purse-snatcher", even though some were sure that they could. The perpetrator was only in the room for 10 seconds. Mr. Winer later contradicts himself (at time 27:50) by advocating the use of an audio evaluation test that uses short term aural memory.

As far as I have been able to determine, the seminal papers in the application of ABX methodology to stereophonic systems are a paper presented by Stanley Lipshitz, Ph.D. and John Vanderkooy, Ph.D. in 1980 to the 65th Convention of the Audio Engineering Society in London and a paper presented by David Clark in 1981 to the 69th Convention of the Audio Engineering Society in Los Angeles.

Drs. Lipshitz and Vanderkooy stated:

"In order for subjective tests to be meaningful to others, the following should be observed...The test must be blind or preferably double-blind. To implement such tests we advocate the use of A/B switchboxes." [17]

Mr. Clark stated:

"Listening tests used to evaluate audio equipment can seldom be considered scientific tests".

and

"A system for practical implementation of double-blind audibility tests is described. The controller is a self-contained unit, designed to provide setup and operational convenience while giving the user maximum sensitivity to detect differences." [18]

The controller that Mr. Clark mentioned was an "ABX Comparator" system that he and some associates were marketing through the "ABX Company".

It is curious to note that neither of these seminal papers present a discussion of how the proposed ABX methodology relates to the evaluation of the primary performance metrics of stereophonic sound systems, such as:

1. Optimization of sound stage width,
2. Optimization of sound stage depth,
3. Stable stereo image placement,
4. Clarity,
5. Detail,
6. Dynamics (dynamic range),
7. Tactile impact,
8. Sonic realism.

Whereas the founders of stereophonic audio systems emphasized listener education and ear training with music, Mr. Clark proposed a different training paradigm for increasing the resolution sensitivity of listeners ([18, p. 332]):

"Great improvements in resolution can be achieved if the listener knows what to listen for. Sensitizing tests can use pink noise, sine waves, or pulses as appropriate to hear a difference. Sometimes an artificially enhanced distortion can be produced by reducing feedback or connecting multiple devices in series for distortion buildup. The listener is then more able to hear the difference in music."

Really?

Ten years after writing this, Mr. Clark, in a paper presented to the 91st Audio Engineering Society Convention in 1991 stated:

"Ten years ago [in 1982] the present author presented a paper to the AES on double-blind testing using the A/B/X technique. For the next five years, a device to conveniently implement this test was commercially available. It was thought by the author and his associates that general use of this system would resolve "The Great Debate" of whether or not small differences in audio components were audible." [19]

In the same paper, Mr. Clark stated:

"It becomes an ethical and perhaps legal question when it is claimed that improved sound quality is delivered despite failure of tests to prove it.

This would be less of an issue if the number of engineers who dismiss double-blind test results were small, but this is not the case. As Chairman of an AES Workshop on Esoteric Audio in 1988, I asked, by a show of hands, who in the audience believed that different gain-matched amplifiers of modern design sound different from each other. It was stated that all would measure good in conventional tests and all were operated below clipping or other gross distortion levels. Approximately 70% of the audience indicated they believed the amplifiers would likely sound different. This is an amazing response from members of an engineering society which failed to support the claim." [19]

Ten years after his seminal double-blind audio testing advocacy paper was published, Dr. Lipshitz presented a paper to the AES 8th International Conference in 1990 in which he stated:

"It is now ten years since my initial involvement in the controversy surrounding double-blind subjective testing in audio and twelve years since this subject first hit the headlines with the Quad power amplifier comparison challenge in England.

A lot of water has passed under the bridge in the intervening years, but our hopes of a decade ago, that the validity of the method would be generally accepted by the audio press and adopted wherever feasible, have not been realized." [20]

At the same 1990 AES conference where Dr. Lipshitz lamented the lack of widespread acceptance of blind testing in stereophonic system evaluation, Tom Nousaine, in a paper entitled "The Great Debate: Is Anyone Winning?", was decidedly more upbeat:

"This paper simply presents a compilation of the twenty two blind and double blind listening tests [from 1978 to 1990] of power amplifiers for which numerical results have been published. There is a rather large collection of data which contains some surprising information and ultimately confirms that one side of the debate seems to have a commanding lead."

Mr. Nousaine concluded:

"Many factors can contribute to the subjective enjoyment of a given amplifier but sound quality differences are not among them.

This does not suggest that amplifiers are perfect and they will never be found to sound different. It does suggest to purchasers of today's audio amplifiers that as long as the product in question meets basic traditional measured performance standards, has enough output capability, and adequate quality of construction, it will be sonically indistinguishable from all others meeting those criteria." [21]

It is a shame that Mr. Nousaine and some others view the pursuit of stereophonic audio (stereophony) as some sort of contest to be won rather than an attempt to recreate a realistic reproduction of the live concert experience in the listener's home.

At the same 1991 AES conference where Mr. Clark lamented the lack of widespread acceptance of blind testing in stereophonic system evaluation and the demise of his company which was offering ABX Comparator devices, Tom Nousaine, in a paper entitled "Can You Trust Your Ears?", was decidedly more upbeat:

"In assembling a summary of 22 blind listening tests published between 1978 and 1990 we discovered that subjects consistently reported preferences or differences in sound quality when given two identical alternatives."

and

"The evidence clearly demonstrates a bias in listeners who have an interest in audio to report differences which do not exist." [22]

Whereas the inventor of stereophonic home audio systems, Dr. Harvey Fletcher, and the other early researchers in the field advocated the use of comfortable, stress-free listening environments similar to a typical consumer's home, Mr. Nousaine preferred a different evaluative environment. The "test design" section of the paper stated:

"Our basic test procedure requirements included:

1. Listeners with a strong interest in audio sound quality ("Audiophiles")

2. Listeners with no audio background ("Consumers")

3. Single or Double Blind presentations

4. Preference Scoring ("Prefer A, B or Neither?")

5. Reasonably large sample size (30 by category and 300 overall)

6. Controlled introduction of loudness differences (1 dB)

7. "Purchase-like" conditions/scientific controls
* Short musical selections
* Single listeners or small groups
* Written scores
* A/B Presentations with no repeat
* Maximum 10 trials/15 minute sessions (Preserve listener freshness)" [22]

In blind and double-blind telephone voice quality trials, the tests are administered under conditions representative of the way consumers actually use telephones. In blind and double-blind pharmaceutical trials, medicine is administered under conditions representative of the way patients would actually use it. However, when we see blind and double-blind trials applied to stereophonic audio systems, they are consistently used in a manner that is highly unrealistic and detrimental to an accurate stereophonic presentation. Why would a purportedly serious audio evaluation study seek to replicate "Purchase-like" listening conditions? This is not optimal, reasonable, realistic or scientific for a study which purports to be a critical analysis of a listener's ability to discern sound quality differences in stereophonic sound systems. "Purchase-like" listening conditions would be more appropriate for a marketing study. Why not replicate typical "home-like" listening conditions? That is what the scientists did in [9], [10], [11], [12], [13], [14], [23], [24] and [25].

Why would a purportedly serious audio evaluation study only allow very short musical selections? This was actually answered in the paper:

"Subjects also consistently requested shorter evaluation intervals. Sixty seconds seemed too long and some became impatient with 30 seconds. Subjects were paid a small stipend for participation." [22]

Really? Thirty seconds was "too long" to listen to a musical selection during a critical listening session conducted as part of a "scientific study"? I wonder if any of the "audiophiles" preferred the 30 second music snippets. One might also wonder if the participants were just there to collect a check as quickly as possible.

Why would a purportedly serious audio evaluation study use a "small group" listening arrangement? The optimal sound quality of stereophonic audio systems is only found in the stereo sweet spot, which can only be occupied by one person at a time. This is a fundamental design aspect of stereophonic audio systems which must never be violated in critical listening situations unless the intent is to acquire performance data for off-axis (out of the sweet spot) listening.

The extremely short listening intervals (30-60 seconds) and group listening sessions provide some insight into the contributing factors toward different listening impressions from identical pieces of equipment.

Mr. Nousaine provides some statistical analysis of his test results. One of the references in Mr. Nousaine's "Can You Trust Your Ears?" study is "Introduction to Probability and Statistics" (1967) by William Mendenhall. Mr. Nousaine cites Mendenhall's chapter 11. However, chapter 9 provides some important insight:

"The reader will note that we have employed two different statistical tests to test the same hypothesis. Is it not peculiar that the t test, which utilizes more information (the actual sample measurements) than the binomial test, fails to supply sufficient evidence for rejection of the hypothesis μ1 = μ2?

The explanation of this seeming inconsistency is quite simple. The t test described in Section 9.3 is not the proper statistical test to be used for our example." [Emphasis mine.]

Comparisons Of A/B/X and Blind Test Procedures To Basic Stereophonic Principles

Dr. Harvey Fletcher said that his motivation for inventing stereophonic sound systems (initially called auditory perspective systems) was to provide an exciting home concert experience:

"This symposium describes principles and apparatus involved in the reproduction of music in large halls, the reproduction being of a character that may give even greater emotional thrills to music lovers than those experienced from the original music." [5]

Therefore, high quality stereophonic sound systems are based on the following principles:

1. Realistic and accurate reproduction of the sonic and tactile sensations of the live concert experience,
2. Stable and realistic sonic imaging,
3. Sonic images distributed throughout a three-dimensional sound stage that is analogous to the way real instruments and singers are distributed throughout an actual concert stage,
4. Lifelike instrumental and vocal clarity,
5. Lifelike instrumental and vocal detail,
6. Listener education and training,
7. Emotional thrills.

In the 1981 seminal paper on blind testing for stereophonic systems, Drs. Lipshitz and Vanderkooy appear to mock the founding design principles of stereophonic sound systems:

"The last half-dozen years have seen a remarkable proliferation in the number of adjectives used to describe the alleged audible qualities of audio components by the audio press, both above and below ground: words such as depth, air, graininess and liquidity spring to mind. The differences supposedly characterized by these emotive epithets are generally discovered during the course of a listening test, in which the component under test is either heard in isolation, or else is compared with other components of the same type." [17]

Seemingly unknown to Drs. Lipshitz and Vanderkooy, much of the basic descriptive vocabulary used by the audio press came directly from peer-reviewed scientific journal papers written by the inventor of stereophonic sound and subsequent researchers in the field. For example, Steinberg and Snow devote considerable discussion to stereophonic depth perception in [9]. T. Somerville in [12] devotes considerable discussion to concert hall and home listening room acoustics. This is the paper in which the term "sound stage" was coined. Somerville discusses reverberation effects (which later came to be called "air" and "ambiance"), sound stage width and depth and "aesthetic presentation". One of the most profound comments made by Somerville was:

"A listener in a concert hall, because of the binaural characteristics of hearing, is able to distinguish between the various sections of the orchestra, and, in particular, he can pick out the solo part. In this, hearing is also assisted by sight." [12] [Emphasis mine.]

After over fifty years of scientific brilliance in the field of stereophonic sound, some people came in with "better" ideas:

1. According to Lipshitz and Vanderkooy, it is better to blindfold or otherwise visually handicap music listeners. This directly conflicts with the fact that sound image localization is one of the basic principles of stereophonic reproduction. Somerville and other scientists found that hearing (sound localization and sonic depth perception) is assisted by sight. [12, p. 205]

2. According to Clark, the best ear training is gained from listening to pink noise, sine waves, pulses and artificially enhanced distortion. According to Nousaine and other blind-listening test proponents, critical listening sessions are best conducted when music is listened to in 30 to 60 second snippets. This directly conflicts with the fact that spatially correct reproduction of a live concert experience is one of the basic principles of stereophonic reproduction. Gaining evaluative expertise by listening to actual musical performances was the preferred training regimen of the founding scientists working on stereophonic systems. Stereophonic sound systems were invented for music lovers, not test signal lovers.

3. According to Clark, Nousaine, and other A/B/X and blind listening test proponents, sitting off axis and far outside the stereo sweet spot and/or listening in a group environment is a proper method of evaluating stereophonic sound systems. This is absurd and gives the impression that blind listening proponents are desperate to prove a point: That audiophiles are a delusional lot and that the perceived differences in audio components are largely imaginary.

The highly regarded (among blind audio testing proponents) A/B/X test is not scientifically applicable to stereophonic systems. According to the worldwide standard basic textbook on sensory evaluation techniques [26], the A/B/X test, which is usually referred to in the scientific literature by its proper name: duo-trio balanced reference test, is appropriate under the following conditions:

"Duo-Trio Test - Scope and Application

The duo-trio test (ISO 2004a) is statistically less efficient than the triangle test because the chance of obtaining a correct result by guessing is 1 in 2. On the other hand, the test is simple and easily understood.

Use this method when the test objective is to determine whether a sensory difference exists between two samples. This method is particularly useful in situations

1. To determine whether product differences result from a change in ingredients, processing, packaging, or storage.

2. To determine whether an overall difference exists, where no specific attributes can be identified as having been affected.

The duo-trio test has general application whenever more than 15, and preferably more than 30, test subjects are available. As a general rule, the minimum is 16 subjects, but for less than 28, the beta error is high. Discrimination is much improved if 32, 40, or a larger number [of subjects] can be employed."

[Beta error is a statistical error in testing when it is concluded that something is negative when it is actually positive. Beta error is often referred to as "false negative".]

"Two forms of the test exist: the constant reference mode, in which the same sample, usually drawn from regular production, is always the reference, and the balanced reference mode [ABX], in which both of the samples being compared are used at random as the reference.

Use the constant reference mode with trained subjects whenever a product well known to them can be used as the reference.

Use the balanced reference [ABX] mode if both samples are unknown or if untrained subjects are used." [26]
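
The figures in the quoted passage (a 1 in 2 chance of guessing correctly, and a high beta error with fewer than 28 subjects) can be illustrated with a small binomial calculation. The following is only a sketch and is not taken from the textbook or the paper: it assumes each subject gives one independent judgment, a one-sided test at a 5% significance level, and a hypothetical true rate of 70% correct answers. The function names are made up for the illustration.

```python
from math import comb


def binom_tail(n, k, p):
    """Probability of k or more correct answers out of n, each correct with probability p."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def critical_count(n, alpha=0.05, p_chance=0.5):
    """Smallest number of correct answers that guessing alone reaches with probability <= alpha.

    Returns n + 1 when even a perfect score is not rare enough (very small panels).
    """
    for k in range(n + 1):
        if binom_tail(n, k, p_chance) <= alpha:
            return k
    return n + 1


def beta_error(n, p_true, alpha=0.05):
    """False-negative rate: chance of missing a real difference when the true hit rate is p_true."""
    k = critical_count(n, alpha)
    return 1 - binom_tail(n, k, p_true)


if __name__ == "__main__":
    # Hypothetical true hit rate of 70%; panel sizes taken from the quoted passage.
    for subjects in (16, 28, 32, 40):
        print(subjects, critical_count(subjects), round(beta_error(subjects, 0.7), 3))
```

Under these assumptions the beta error is lower for a 40-subject panel than for a 16-subject panel, which is in line with the passage's advice to use larger panels. A different assumed hit rate or significance level would change the numbers.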

Conclusion

We now have nearly thirty years of documented experience in the application of blind and A/B/X testing to stereophonic audio systems. The A/B/X test setup arrangements and test results have been consistently absurd and consistently statistically similar to guessing. One would think that, after all these years of "all amplifiers sound alike", A/B/X and blind listening test proponents would begin to question the validity of their testing methodology. I don't expect this to ever happen because ridiculing audiophiles is so much fun.

The literature promoting these tests displays a profound lack of understanding of how stereophonic systems work and a profound lack of understanding of how the human senses work. Some senses (sight/hearing and taste/smell) are closely interrelated. For a given sensory exercise, more than one sense may be employed. The primary sense for a given stimulus is typically enhanced by the sensory contributions of secondary senses. Compromising one secondary sense can adversely affect the perception of the primary sense. [27] [28] [29] [30] [31] During food consumption, flavor (taste) is the primary stimulus, but the perception of flavor is enhanced by the appearance (sight), aroma (smell), texture (touch), and sound (hearing) of the food. Soggy cereal is every bit as nutritious and tastes the same as crunchy cereal, but most people won't eat a bowl of soggy cereal if the texture is expected to be crunchy. They might perceive it as tasting bad, even though the chemical composition that affects taste receptors is the same. This is an example of an impairment in the sense of touch (secondary sense) affecting the sense of taste (primary sense) during food consumption.

The literal meaning of "stereophonic" is "solid sound" (from the Greek "stereos" for "solid" and "phone" for "sound"). [32] Solid things can be seen and felt. Therefore, "stereophonic" audio systems produce "solid sound" or sound which can be heard, seen and felt. The live concert experience is a sensory exercise which employs the senses of hearing, sight and touch. These same senses are employed in the home stereo experience. For stereophonic listening, the sense of hearing is primary, but the senses of sight and touch play important secondary roles in the creation of the stereophonic illusion of "solid sound". Compromising a person's sense of sight and/or touch, especially in the short term, is stressful and can have detrimental effects on sound localization ability. This stress can exert undue influence on listening evaluation results.

Stereophonic music reproduction is designed according to the principles of sound localization, long term sonic memory of actual musical events, and the reception of tactile sensations from the sound stage. Blind audio testing, which includes visually obscuring all or part of the sound stage, rapid switching of musical selections and off-axis and group seating, impairs the listener's ability to localize sounds (seeing), to internalize and evaluate aural cues (hearing) and to receive correct stereophonic tactile information (touching). Any stereophonic audio system testing methodology which compromises and hinders the processes of human sensory perception is useless.

ABX and blind testing proponents say that they want to apply a scientifically rigorous testing methodology to stereophonic audio in order to determine if the claimed differences in audio components actually exist. However, they ignore decades of scientifically and mathematically rigorous subjective listening techniques that were developed by the inventor and subsequent researchers in the field of stereophonic sound.

References

[1] Snow, W. B., "Audible Frequency Ranges of Music, Speech and Noise", Journal of the Acoustical Society of America, Vol. 3, Issue 1A, July 1931, pp. 155-166.
[2] Farhid, M. and Tinati, M.A., "Robust voice conversion systems using MFDWC", Proceedings of the 2008 IEEE International Symposium on Telecommunications, Tehran, Iran, August 2009, pp. 778-781.
[3] Kun, Liu Jianping and Zhang, Yonghong Yan, "High Quality Voice Conversion through Phoneme-Based Linear Mapping Functions with STRAIGHT for Mandarin", Proceedings of the IEEE Fourth International Conference on Fuzzy Systems and Knowledge Discovery, Haikou, China, August 2007, pp. 410-414.
[4] Cheng-Yuan Lin and Jang, J.-S.R. "New Refinement Schemes for Voice Conversion", Proceedings of the IEEE 2003 International Conference on Multimedia and Expo, Baltimore, MD, July 2003, pp. 25-728.
[5] Fletcher, Harvey, "Symposium on Wire Transmission of Symphonic Music and Its Reproduction in Auditory Perspective-Basic Requirements", Bell System Technical Journal, Vol. 13, 1934, pp. 239-244.
[6] Fletcher, Harvey, "Hearing, The Determining Factor for High-Fidelity Transmission", Proceedings of the I.R.E., Columbus, OH, June 1942, pp. 266-277.
[7] Hilliard, John K., "The History of Stereophonic Sound Reproduction", Proceedings of the Institute of Radio Engineers, Vol. 50, No. 5, May 1962, pp. 776-780.
[8] Fletcher, Harvey, “Hearing Aids and Deafness”, Bell Laboratories Record, Vol. 5, No. 2, October 1927, p. 33.
[9] Steinberg, J. C., and Snow, W. B., "Symposium on Wire Transmission of Symphonic Music and Its Reproduction in Auditory Perspective-Physical Factors", Bell System Technical Journal, Vol. 13, 1934, pp. 245-258.
[10] Harvey, F. K. and Schroeder, M. R., "Subjective Evaluation of Factors Affecting Two-Channel Stereophony", Journal of The Audio Engineering Society, Vol. 9, No. 1, January 1961, pp. 19-28.
[11] Moir, J. and Leslie, J. A., "The Stereophonic Reproduction of Speech and Music", Journal of the British Institution of Radio Engineers, London, September 1951, pp. 360-366.
[12] Somerville, T., "Survey of Stereophony", Proceedings of the Institution of Electrical Engineers, Convention on Stereophonic Sound Recording, Reproduction and Broadcasting, London, March 1959, pp. 201-208.
[13] Moore, H. B., "Listener Ratings of Stereophonic Systems", Institute of Radio Engineers Transactions on Audio, September-October 1960.
[14] Beaubein, W. H. and Moore, H. B., "Perception of Stereophonic Effect as a Function of Frequency", Journal of the Audio Engineering Society, Vol. 8, No. 2, April 1960, pp. 76-86.
[15] Boley, Jon and Lester, Michael, "Statistical Analysis of ABX Results Using Signal Detection Theory", Proceedings of the 127th Convention of the Audio Engineering Society, October 2009, New York, NY.
[16] Winer, Ethan, "Audio Myths, Artifact Audibility and Comb Filtering" workshop presented at the 127th Convention of the Audio Engineering Society in October 2009, New York, NY. http://www.youtube.com/watch?v=BYTlN6wjcvQ, YouTube video accessed 8/3/2010.
[17] Lipshitz, S. P. and Vanderkooy, J., "The Great Debate: Subjective Evaluation", Journal of the Audio Engineering Society, Vol. 29, No. 7/8, July/August 1981, pp. 482-491.
[18] Clark, D., "High-Resolution Subjective Testing Using a Double-Blind Comparator", Journal of the Audio Engineering Society, Vol. 30, No. 5, May 1982, pp. 330-338.
[19] Clark, David, "Ten Years Of A/B/X Testing", 91st Audio Engineering Society Convention, New York, NY, October 1991.
[20] Lipshitz, S., "The Great Debate-Some Reflections Ten Years Later", Audio Engineering Society 8th International Conference, Washington, D.C., May 1990, pp. 121-123.
[21] Nousaine, Tom, "The Great Debate: Is Anyone Winning?", Audio Engineering Society 8th International Conference, Washington, D.C., May 1990, pp. 117-119.
[22] Nousaine, Tom, "Can You Trust Your Ears?", 91st Audio Engineering Society Convention, New York, NY, October 1991.
[23] Schjonneberg, K. and Olson, F., "Listening Test Methods and Evaluation", Journal of the Audio Engineering Society, Vol. 9, No. 1, January 1961, pp. 29-36.
[24] Olson, Harry F., "Stereophonic Sound Reproduction in the Home", Journal of the Audio Engineering Society, Vol. 6, No. 2, April 1958, pp. 80-90.
[25] Snow, W. B., "Auditory Perspective", Bell Laboratories Technical Journal, Vol. 12, No. 7, March 1934, pp. 194-198.
[26] Meilgaard, Morten, Civille, Gail Vance, Carr, B. Thomas, "Sensory Evaluation Techniques", 4th Ed., CRC Press, Boca Raton, FL, 2007, pp. 72-73.
[27] Hull, J., "Touching the Rock: An Experience of Blindness", Vintage Publishing, London, 1992.
[28] Montagu, A., "Touching: The Human Significance of the Skin", Columbia University Press, New York, London, 1971.
[29] Wright, D., "Deafness: A Personal Account", Faber, London, 1990.
[30] Yuan, Yi-Fu, "Sight and Picture", Geographical Review, Vol. 69, pp. 413-422, 1979.
[31] Rodaway, Paul, "Sensuous Geographies: Body, Sense and Place", Routledge, London, New York, 1994.
[32] World Book Dictionary, 1973 ed. Vol. 2, p. 2033.
post #48 of 119
Wow! I'm glad medicine doesn't adhere to this sort of thing; we would still be using leeches.

Art
post #49 of 119
Quote:
Originally Posted by EndersShadow View Post

I personally think they are pointless. Each of us has our own preferences sound-wise, and even though a speaker may have horrible reviews because it doesn't have the sound 75% of people like, that doesn't mean you're one of them.

Audio is a subjective subject, and while we can attempt to define how speakers sound, it's still based on the reviewer's ears, his room, his preferences, and equipment, all of which are variables that will be different from user to user.

A blind audio test by the purchaser in his room with his music and his setup is ideal, however is only valid for that one user.

but does the other alternative work better?

the question is should the reviewers know what they're listening to first or not? i think no, they shouldn't know, they should also do blind tests if they are doing comparisons. what we draw from their 'preferences' is equally useless in either scenario, but at least with a blind test it's not also confused with brand bias.
post #50 of 119
Quote:
Originally Posted by Art Sonneborn View Post

Wow! I'm glad medicine doesn't adhere to this sort of thing; we would still be using leeches.

Art

Hey doc, I'm here for my monthly bloodletting...
post #51 of 119
Quote:
Originally Posted by EndersShadow View Post

Ok... I will rephrase...

For those interested in a very intriguing paper and willing to read the entire thing, look here:
A Historical Overview of Stereophonic Blind Testing

It's a phenomenally detailed paper with cited legitimate sources within the industry and IMHO well worth the read.

Heck, not that anyone will actually read the whole thing, but I will just go ahead and post it here:

Raife's conclusions are so poorly drawn out based on his citations that it's not funny. And let me tell you there are a TON of citations. Luckily my wife works at a university so she has access to all the paywalls that some cited papers are behind.

Raife is perfectly welcome to come here and defend his 'work'.



But here is a good example of the real deal and why blind testing is PERFECTLY valid and why ears only, sighted subjectivists will not put $$ where their mouth is:


The gift that keeps on giving:
Cable burn in challenge at PF


If anyone has found a single valid reason WHY my bet to these people doesn't satisfy both scientific rigor and the subjectivists' need for long-term, SIGHTED testing, it hasn't been aired.
post #52 of 119
Quote:
Originally Posted by Jinjuku View Post

Raife's conclusions are so poorly drawn out based on his citations that it's not funny. And let me tell you there are a TON of citations. Luckily my wife works at a university so she has access to all the paywalls that some cited papers are behind.

Raife is perfectly welcome to come here and defend his 'work'.



But here is a good example of the real deal and why blind testing is PERFECTLY valid and why ears only, sighted subjectivists will not put $$ where their mouth is:


The gift that keeps on giving:
Cable burn in challenge at PF


If anyone has found a single valid reason WHY my bet to these people doesn't satisfy both scientific rigor and the subjectivists' need for long-term, SIGHTED testing, it hasn't been aired.

Hello again Jin. Given who and where my post came from, I figured some folks would find fault with what Raife has written, and in the thread I posted the link to there was some very rigorous debate about this subject with members from this particular forum. It went round and round and didn't really accomplish anything. There was also another thread here with similar results.

Everyone has the right to be heard in a public forum. I have said my piece; you have said yours. In order to keep the peace and civility, I won't argue with you, as we both know we have a bit of a history that doesn't need to be rehashed here. Neither of us needs to argue points on this one.

Also, please do not start a flame war between two forums by cross-posting threads from Club Polk. I know you're not happy being banned from there and that subject doesn't need to be rehashed here.

Raife may or may not come into this thread to discuss his thoughts; I did let him know I posted his thread here. I will simply say I personally agree with him; others may not. Everyone's entitled to their own opinion.
post #53 of 119
Quote:
Originally Posted by EndersShadow View Post

I know you're not happy being banned from there and that subject doesn't need to be rehashed here.

Good attempt at diverting. But still, anyone can read that Polk thread and laugh their tails off at the idiocy displayed. So when Scott Wilkinson asks "are blind audio comparisons worthwhile" I like to answer with a concrete, real-world, subjectivist-beating challenge.

Again, you are free to legitimately point out any procedural errors in my challenge to the sighted subjectivists. Joe Skubinski can ship a pre-burned-in cable in an anti-static bag. Why can't I?

All I know is I have offered to bring over a Pro-Audio Crown XLS drive core for you to evaluate blind vs your Carver and all anyone gets is "no thanks, it's pro-audio and I have my reasons". How does one really have an open discussion with someone of that mindset?
Edited by Jinjuku - 10/28/13 at 9:46am
post #54 of 119
Quote:
Originally Posted by josh6113 View Post

As you say, blind testing is a TOOL...not an ABSOLUTE.... Everyone argues that blind testing is the only way to prove differences in sound....and I'm one of many that disagree....but it can be useful... Just not for absolute proof...

Just speaking for myself: I think blind testing is the most honest way to prove differences in sound for the human ear. Also, you have to realize that claims are made, and those claims are certainly testable. If you are a cable burn-in proponent, then usually the claim is that burned-in cables sound better than non-burned-in cables. This is easily testable in a blind manner.

You ship out two sets of randomly labeled cables. Two burned in and two not burned in. The end user gets full sighted listening, no test administrator, and in my particular offer 30 days. So all the subjectivists' crutches get knocked out one by one, and guess what you are left with?
post #55 of 119
Quote:
Originally Posted by EndersShadow View Post

Ok... I will rephrase...

For those interested in a very intriguing paper and willing to read the entire thing, look here:
A Historical Overview of Stereophonic Blind Testing

It's a phenomenally detailed paper with cited legitimate sources within the industry and IMHO well worth the read.

The subject of a fair amount of discussion before, here is but one quick counterpoint found in this forum alone. As with anything, it's good to honestly examine both sides before defining your own position.
post #56 of 119
I think it's something kinda in the middle.

If I see a big tower, I expect it to go down to at least 40 Hz. If it doesn't, then I am most likely going to say it is not worth the money, even if it is one incredible-sounding speaker from 80 Hz on up....

Same goes for the opposite: if I see a small bookshelf speaker that has surprising bass, I will easily say that it is a good speaker, even if it doesn't sound as good overall as the tower in the scenario above.

So in that respect, looks do affect what I am going to think of the speaker.

However, I think if you put similar speakers together to compare, then the blind test becomes more and more important. Still, I think the room is the biggest issue. The only way I think a blind test really would matter is if it was done in my house with my equipment, in the same positions as my current speakers, using music that I listen to on a regular basis. Otherwise the unfamiliar room and equipment have too much influence.
post #57 of 119
Quote:
Originally Posted by Jinjuku View Post

Good attempt at diverting. But still anyone can read that Polk thread and laugh their tails off at the idiocy displayed.

No attempt to divert anything, and again please note that at no point have I called you or anyone else who has a different opinion any sort of name. I am attempting to stay civil here and will continue to. Please refrain from name-calling, as it does nothing but change this from a discussion to a heated argument.
Quote:
Originally Posted by Jinjuku View Post

All I know is I have offered to bring over a Pro-Audio Crown XLS drive core for you to evaluate blind vs your Carver and all anyone gets is "no thanks, it's pro-audio and I have my reasons". How does one really have an open discussion with someone of that mindset?

You are getting a bit off point with your comments directed at me personally, which are only an attempt to discredit me or draw me into an argument. While I would love to respond in kind, it would do no justice to this thread, and again, you and I have our differences of opinion and they don't need to be brought up or discussed here, as it benefits no one.

Let me put it this way: I have my opinion, you have yours. Neither of us is really going to make the other think differently. In addition, personally I am a nobody when it comes to name recognition. I am a standard audio enthusiast; me endorsing a product or not, or agreeing with a point or not, doesn't really carry any weight in the audio world. It's not like I am Nelson Pass.

I will not respond to any additional comments here. I posted my opinion, and I posted a thread I thought was worth mentioning. Everyone is welcome to take my word as crap and disregard it, or to look further and form their own opinion. Additional attempts to get me to respond will not work.
post #58 of 119
Quote:
Originally Posted by EndersShadow View Post

No attempt to divert anything, and again please note that at no point have I called you or anyone else who has a different opinion any sort of name. I am attempting to stay civil here and will continue to. Please refrain from name-calling, as it does nothing but change this from a discussion to a heated argument.
You are getting a bit off point with your comments directed at me personally, which are only an attempt to discredit me or draw me into an argument. While I would love to respond in kind, it would do no justice to this thread, and again, you and I have our differences of opinion and they don't need to be brought up or discussed here, as it benefits no one.

The thread speaks for itself. Hypocrisy would be a more apt name. I point to that thread in particular as a real-world example of an irrefutable sighted test that claimants still won't take.
Quote:
Originally Posted by EndersShadow View Post

Let me put it this way: I have my opinion, you have yours. Neither of us is really going to make the other think differently. In addition, personally I am a nobody when it comes to name recognition. I am a standard audio enthusiast; me endorsing a product or not, or agreeing with a point or not, doesn't really carry any weight in the audio world. It's not like I am Nelson Pass.

I will not respond to any additional comments here. I posted my opinion, and I posted a thread I thought was worth mentioning. Everyone is welcome to take my word as crap and disregard it, or to look further and form their own opinion. Additional attempts to get me to respond will not work.


Unfortunately, your opinions are based on someone else's dogma. I have offered twice now to come by the next time I'm in Indy and test your preconceived notion about 'pro-audio' amps vs your Carver, and see what your reaction or opinion would be afterward. You've dug into a poor position very early and unfortunately it is going to have a negative impact on your journey in all of this.
post #59 of 119
Should there be a warning label on this thread for those with high blood pressure?

It seems to me that, to really know whether one thing sounds better than another, removing all input except hearing would allow for a better, less cluttered decision by the one person deciding, for himself (or herself). Be as objectively scientific as possible to reach an admittedly subjective opinion. That is, if sound is the only objective.

Fun to read and interesting thread.
post #60 of 119
If blind testing is substantially meaningless or invalid because of issues such as room acoustics, repeated/successive listening sessions, the nature of the selected software, selection of qualified listeners, etc., why would not these same problems/limitations apply just as much to sighted listening? - It would seem that the same limitations would be equally applicable, and perhaps more so, since the listener(s) would "expect" the $10,000 speaker (or one from his favorite manufacturer) to be "better" than the $6,000 speaker.

Some would disparage blind testing because the results are sometimes ambiguous, and the listeners may have conflicting opinions, etc. (Months of listening may be necessary.) However, it would seem to me that if differences between a very expensive speaker and a moderately priced speaker were considered ambiguous, or if the (expert/trained) listeners in a blind test disagreed substantially, that ambiguity itself would be a factor of interest for consideration by a potential buyer.

My own opinion is that we should have the benefits of both. - The better approach, of course, would include the opportunity to compare speakers of interest in one's own listening environment, with music including the types one usually listens to, etc. - That approach might have been feasible when we had lots of high-end dealers around the country, but seldom any more.

A further consideration, of course, is the matter of costs and administration difficulties, and who will be willing to pay for and conduct such testing. - And, to be realistic, I highly doubt that extensive blind-testing of new and older speakers would be favorably received by advertisers in Stereophile, for example, even if the costs and administration issues were not a problem.

Jim