AVS Forum | Home Theater Discussions And Reviews - Audio Theory, Setup, and Chat
The Dishonesty of Sighted Listening Tests - by Sean Olive (https://www.avsforum.com/forum/91-audio-theory-setup-chat/1857498-dishonesty-sighted-listening-tests-sean-olive.html)

NorthSky 01-19-2015 10:43 PM

The Dishonesty of Sighted Listening Tests - by Sean Olive
 
Just a simple short link people can read, and comment on. ...If they wish. ...Me, I found it quite captivating.

http://seanolive.blogspot.ie/2009/04...o-product.html

FMW 01-20-2015 05:44 AM

There simply is no rational explanation for how a sighted evaluation can be more accurate than a blind one. But audiophiles continue to invent ways.

Olive's tests were more complicated than the ones we did. We were just testing for the existence of audible differences. It was black or white. Yes or no. Olive tests for preference among products that definitely have audible differences. So his tests deal with differences in the audible differences as they are affected by hearing bias.

JHAz 01-20-2015 08:21 AM

my only comment is that the title of the blog is a bit off-putting, potentially. I guess it depends on who you think is being dishonest. Certainly the persons participating in the test can be presumed to be honestly reporting their reactions. They just don't know the extent to which their reactions are influenced by conscious and subconscious factors in the sighted test.

To take it to a different kind of double blind testing, if I have a headache and take a sugar pill as part of a test of effectiveness of some painkiller, and my headache goes away, it is in no way dishonest for me to say my headache went away. The question is whether the headache went away because of the sugar pill (unlikely) or psychological factors associated with my belief that I've ingested a painkiller (possible) or it was going away all by itself regardless (possible), or it went away for some other reason (presumably possible even if I can't think of what it might be).

Stephan Mire 01-20-2015 08:36 AM

Something I have been thinking about more and more lately is the art of comparing audio gear, which I thought was pretty straightforward; but the more I think about it, the more complex it is.

The idea is not to compare the conditions under which the equipment is auditioned, but rather the equipment itself. And the more I think about it, the more I realise that people in sighted, casual conditions are comparing the conditions, not the equipment. There seems to be a big distinction between these two things.

So in my mind, unless the conditions are equalised in some way, you cannot compare equipment: the operating assumption is that the conditions are apples to apples, which in many cases is not true. Am I on the right track here? It kind of just hit me one day after reading subjective anecdotes from someone about his amplifier comparison.

And of course, in addition to all of that, you have all these other psychological confounders to deal with, which are outside our control. So there is very little to control. Again, I'm speaking for myself and what I think (suspect) is happening, but I could very well be wrong.

Eyleron 01-20-2015 08:47 AM

Quote:

Originally Posted by JHAz (Post 30987105)
my only comment is that the title of the blog is a bit off-putting, potentially. I guess it depends on who you think is being dishonest. Certainly the persons participating in the test can be presumed to be honestly reporting their reactions. They just don't know the extent to which their reactions are influenced by conscious and subconscious factors in the sighted test.

I think it's okay as long as you realize what the adjective refers to. The title of the blog post is "The Dishonesty of Sighted Listening Tests", not "The Dishonesty of Sighted Listening Test Respondents".

I could design a test where in the early morning I bid you to wear my Magic Day Warmer Hat, and then ask you at 2 pm if it feels warmer outside. You'll say yes. You're not dishonest. But I am dishonest in designing, conducting, and publishing this "test".

Eyleron 01-20-2015 09:08 AM

Quote:

Originally Posted by Stephan Mire (Post 30987481)
The idea is not to compare the conditions under which the equipment is auditioned, but rather the equipment itself. And the more I think about it, the more I realise that people in sighted, casual conditions are comparing the conditions, not the equipment. There seems to be a big distinction between these two things.

Absolutely. You cancel out the factors you don't want to test for. You reduce the difference between A & B to a single variable as much as possible.

So, you use the same audience, as close in time as possible, to test speakers A, B, C and then D, E, F. You randomize the order. You repeat the test several times. You use a test group large enough for statistical significance.
Sometimes one tester will get hungrier towards the end of the session, become grumpy, and rate a speaker lower. That's okay, because you use that person at other times of day and on other days, and you used enough people that his anomalous rating is lost to noise.

You use an anechoic room, or a normal listening room with the speakers placed in multiple locations over different trials.
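
A minimal sketch (Python; the speaker labels, panel size, and simulated ratings are invented for illustration, not taken from any real test) of the kind of schedule this describes - independently randomized order per listener, repeated trials, and panel-averaged ratings so one grumpy session washes out:

Code:

import random
import statistics

def make_schedule(speakers, listeners, repeats):
    """Every listener hears every speaker 'repeats' times,
    in an independently shuffled order per listener."""
    schedule = {}
    for listener in listeners:
        trials = speakers * repeats      # each speaker appears 'repeats' times
        random.shuffle(trials)           # randomize presentation order
        schedule[listener] = trials
    return schedule

speakers = ["A", "B", "C"]                             # hypothetical products
listeners = ["listener_%d" % i for i in range(30)]     # hypothetical panel
schedule = make_schedule(speakers, listeners, repeats=4)

# Collect ratings (simulated here) and average per speaker across the
# whole panel, so one anomalous rating is lost to noise.
ratings = {s: [] for s in speakers}
for listener, trials in schedule.items():
    for s in trials:
        ratings[s].append(random.gauss(6.0, 1.0))      # stand-in for a real rating

for s in speakers:
    print(s, round(statistics.mean(ratings[s]), 2),
          "+/-", round(statistics.stdev(ratings[s]), 2))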

Quote:

Originally Posted by Stephan Mire (Post 30987481)
So in my mind, unless the conditions are equalised in some way, you cannot compare equipment: the operating assumption is that the conditions are apples to apples, which in many cases is not true. Am I on the right track here? It kind of just hit me one day after reading subjective anecdotes from someone about his amplifier comparison.

Of course, these are basic aspects of statistics and testing methodology, developed over generations to test things objectively and subjectively. You can find out what people like, why they like it, and what sort of people in what conditions will like what. To do so, you vary only the variables you want to test for.

Same people and equipment, change the speaker. Are you testing for how the speaker sounds? Then hide the speaker. Sometimes you should pretend to change speakers, but use the same speaker. Rinse, repeat.
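
Those "pretend to change" trials are the key control: null trials where A and B are physically the same speaker, whose results calibrate the test's false-alarm rate. A sketch of the idea (Python; the function names and the 25% null fraction are made up for illustration):

Code:

import random

def trial_pairs(speakers, n_trials, null_fraction=0.25):
    """Generate A/B pairs; a fraction are hidden null trials
    (the 'pretend swap': both presentations use the same speaker)."""
    pairs = []
    for _ in range(n_trials):
        if random.random() < null_fraction:
            s = random.choice(speakers)
            pairs.append((s, s))                       # nothing actually changed
        else:
            pairs.append(tuple(random.sample(speakers, 2)))
    return pairs

def false_alarm_rate(pairs, heard_difference):
    """heard_difference[i] is True if the listener reported a
    difference on pair i. Reports on null pairs are false alarms."""
    nulls = [r for (a, b), r in zip(pairs, heard_difference) if a == b]
    return sum(nulls) / len(nulls) if nulls else None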

Quote:

Originally Posted by Stephan Mire (Post 30987481)
And of course, in addition to all of that, you have all these other psychological confounders to deal with, which are outside our control. So there is very little to control. Again, I'm speaking for myself and what I think (suspect) is happening, but I could very well be wrong.

But it is in our control. If this were too hard, we'd never know how to flavor our food, how to design our furniture, how to pressurize our airliner cabins, how to design clothes, what shows to put on TV, etc.
It is more work than asking a couple of guys, "Hey, listen to these $4,000 speakers that look gorgeous and tell me what you hear."

You need more bodies, more time, and a more controlled environment to do it right. Holt's and Sean's point is that DBTs are done in many other industries, but audio is a willful anomaly for some reason. It's not because it can't be done.

You should read the Floyd Toole book referenced in the blog.

What's disappointing and ironic is that Harman Group doesn't often practice what its researchers preach. It shows how entrenched the bad mentality in management and marketing is.

Billy30045 01-20-2015 09:36 AM

Quote:

Originally Posted by Eyleron (Post 30988490)
Absolutely. You cancel out the factors you don't want to test for. You reduce the difference between A & B to a single variable as much as possible. ...

You need more bodies, more time, and a more controlled environment to do it right. ...

I am going to give you an example of why sighted reviews are always biased.

Back in the day, LOL: I was manager of engineering at Clarion, and we were working on next-generation audio systems for Nissan. Our competition was Panasonic and Bose, and I was frustrated because no matter what we did, we were always coming in second to Bose. I had Nissan deliver 4 vehicles to our corporate office; we covered up the dashes and told them this was to avoid any adjustments to the systems. We also labeled each vehicle as to the system: V1 current system, V2 Panasonic, V3 Clarion and V4 Bose.
We set up our own FM broadcast so each vehicle would receive the same music and test tones, to make the test as fair as possible. There were 3 engineers from Nissan, and we spent about 3 hours reviewing the systems. After lunch we tallied the scores, and it was the same: Bose 1, Clarion 2 and Panasonic 3 (the current system was used as a reference only). After the scores were reviewed we went back to the cars and I removed the dash covers: V1-3 all had our system. The engineers had it set in their minds that Bose was better, and they heard what they needed to back up their bias. So I proved my point, and was removed from the Nissan account the next day, and was told by my boss I was lucky to keep my job. Apparently I embarrassed the Nissan engineers, and they complained to our president.
Luckily we had just received the go-ahead from Chrysler and I took over that account. Lesson learned, but it sounded like a good idea at the time.

The only true test is blind

Eyleron 01-20-2015 10:19 AM

Quote:

Originally Posted by Billy30045 (Post 30989330)
...the engineers had it set in their minds that Bose was better, and they heard what they needed to back up their bias.

Interesting. How come the speaker system testers were the engineers, and not members of the public?

zgeneral 01-20-2015 11:15 AM

Quote:

Originally Posted by FMW (Post 30983609)
There simply is no rational explanation for how a sighted evaluation can be more accurate than a blind one. But audiophiles continue to invent ways.

Olive's tests were more complicated than the ones we did. We were just testing for the existence of audible differences. It was black or white. Yes or no. Olive tests for preference among products that definitely have audible differences. So his tests deal with differences in the audible differences as they are affected by hearing bias.

What's more important in a sighted test is that you don't listen from a house on a hill. The phased antimatter in the cables goes out of sync, causing issues that can be heard on a true high-end system by a real audiophile. Obviously, you can test this, but if you don't hear a difference, you're not an audiophile. :)

Quote:

How come the speaker system testers were the engineers, and not members of the public?
It was an internal test meant to show bias, since they wouldn't allow double-blind testing. The visual bias was amplified by it being HK staff, actually. Not having it be the general public wasn't a bad thing. Just don't post this in the speaker forum. You'll have dozens of people swearing that their 'trained' ears wouldn't be biased.

Quote:

The only true test is blind
Great story. It's sad that educated people still aren't able to put away their emotions. It looks like your company was losing the account anyway, so I'm not sure why they'd be upset at you.

wilbur_the_goose 01-20-2015 11:23 AM

An anecdote: In the late 1980s, I stopped in at a store called Incredible Universe. They had a demo room set up with huge speakers, so I decided to watch/listen to the demo.

The room had about 20 people in it when the demo started, and the speakers sounded amazing. Music was full and rich and bass was much better than I had at home.

About 5 minutes into the demo, the presenter moved the speakers (which turned out to be fake) and showed that the sound was coming from Bose Acoustimass satellite speakers and their bass module. It sold me, and I lived happily with those speakers until I discovered that I could get much better bang for the buck (remember - there was no internet then, so many of us were on our own).

The point is that the sight of the huge speakers with OK sound fooled many of us into buying those speakers. If they had started by showing the actual speakers, I know I wouldn't have bought them.

Of course, the opposite can be true too. I demo'd some McIntosh equipment only to find that I preferred my 4-year-old Denon AVR-4311 with Emotiva amplification. Go figure :)

FMW 01-20-2015 12:03 PM

Quote:

Originally Posted by JHAz (Post 30987105)
my only comment is that the title of the blog is a bit off-putting,


Perhaps off-putting to you, but certainly accurate. Any time a hi-fi audio comparison is done sighted, the results include bias and are therefore pretty unreliable. That makes it a dishonest methodology for determining audible differences or preferences, because it produces unreliable results. Sounds pretty accurate to me. Blind tests, on the other hand, control bias and provide reliable results. They would be honest by comparison.

Billy30045 01-20-2015 12:44 PM

Quote:

Originally Posted by Eyleron (Post 30991034)
Interesting. How come the speaker system testers were the engineers, and not members of the public?

Because they gave us the go or no-go.

wilbur_the_goose 01-20-2015 12:45 PM

^^^^
That's exactly why the medical field uses double-blind testing.

Eyleron 01-20-2015 12:47 PM

Quote:

Originally Posted by Billy30045 (Post 30996146)
Because they gave us the go or no go

Which is fine, as long as management is honest with itself and says, "These are the speakers our engineers prefer." :)

Bill Fitzmaurice 01-20-2015 02:07 PM

Quote:

Originally Posted by Eyleron (Post 30996234)
Which is fine, as long as management is honest with itself and says, "These are the speakers our engineers prefer." :)

In truth what they meant was "These are the speakers that our marketing department prefers, because they're what they think customers will most likely be willing to pay more for." ;)

amirm 01-20-2015 02:22 PM

Quote:

Originally Posted by wilbur_the_goose (Post 30993162)
An anecdote: In the late 1980's, I stopped in at a store called Incredible Universe. They had a demo room set up with huge speakers, so I decided to watch/listen to the demo.

The room had about 20 people in it when the demo started, and the speakers sounded amazing. Music was full and rich and bass was much better than I had at home.

About 5 minutes into the demo, the presenter moved the speakers (which turned out to be fake) and showed that the sound was coming from Bose acoustimass satellite speakers and their bass module. It sold me, and I lived happily with those speakers until I discovered that I could get a much better bang for the buck (remember - there was no internet then, so many of us were on our own).

I sat through the same demo, but at Fry's. I was shopping there and saw a sign for a home theater demo. I walked in, and there were two other guys there in this rather big and fancy theater. The demo guy played a movie clip (the one with Dustin Hoffman and the virus in the monkey). There were these tall tower speakers, and I kept wondering why they sounded so harsh and distorted.

The demo finishes and the demo guy proudly pulls off the big grills, stuck there with velcro, and there are the tiny Bose Acoustimass speakers. Well, I had my answer as to why they sounded so stressed. The demo guy goes to the other guy sitting in the front, asking what he thought, expecting him to gush about how good it sounded. To his shock the guy said, "I couldn't understand anything they were saying!" He asked him to repeat it, and he said the same thing. You should have seen the look on the demo guy's face. Dejected, he let us out.

That said, I did think the sound was coming from those large speakers. So the visual bias did work. What didn't work was expecting those tiny speakers to fill an entire theater at elevated volume.

wilbur_the_goose 01-20-2015 02:31 PM

amirm - I was young and dumb at the time :)

Ratman 01-20-2015 02:41 PM

Quote:

Originally Posted by wilbur_the_goose (Post 30999953)
amirm - I was young and dumb at the time :)

Young and dumb? Nah... YOU never "accepted" an Emmy Award!
You don't shop at Fry's and you can probably name the film. :D

amirm 01-20-2015 03:45 PM

Quote:

Originally Posted by wilbur_the_goose (Post 30999953)
amirm - I was young and dumb at the time :)

Sorry, didn't mean it that way :). Thanks for taking it in stride.

jkeny 01-21-2015 07:07 AM

Can I raise some questions about this blog (which is based on an Olive/Harman test)? None of this will come as a surprise to Amir, as we happen to be talking about this same paper on another forum.
http://4.bp.blogspot.com/_w5OVFV2Gso...kerRatings.png

- I think the test exaggerates the difference between sighted & blind preferences by the very fact that it uses Harman employees for testing mostly Harman speakers (G, D, S of the 4). I would like to have seen a control test where a bank of non-Harman speakers was used.
- The only non-Harman speaker was T, and it scored very similarly between sighted & blind - is this the actual difference when no other psychological factors (employee pressure) are in play? We don't know.
- The graph shows that in blind testing all speakers are pretty close in preference - is this showing an anomaly of blind testing, that it minimises real differences? Does this correlate with how the speakers measure? i.e. would they be expected to be close in preference based on their measured differences? I would like to know the answer, but it's not in the results.
- They use only one speaker in their test & claim (based on their research) that this is more discriminating than a stereo pair. I would like to see others test this claim before I'm convinced.
- To me, using one speaker in the test is just avoiding psychoacoustics - I consider it the equivalent of evaluating 3D TVs with one eye closed.
- I also believe that in the test they minimise room boundary reflections, which again I find questionable.
- The test is so divorced from how a stereo pair of speakers will be used in a room that I question whether their conclusions can be extrapolated to real-world usage.

As to the title, I find it revealing of the particular skewed angle that Olive is taking, & the fact that he hasn't addressed the points I've raised confirms to me that he is being less than objective.

Yes, I believe Olive has done more damage to the DBTers than all the anti-DBT posts on audio forums.
I've been suggesting for a long time now that if anybody wants to prove DBTs don't skew results towards nulls, they should include controls for false negatives - then we would have a measure of whether a null result is because the test is prone to false negatives or not, as the case may be. A perfectly logical & reasonable thing to do, but so far only excuses have been put forth for not doing so. Until I see such controls, the test is unreliable in my mind.
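
A sketch of what such an embedded control amounts to (Python; the trial encoding and field names are invented for illustration): seed the run with pairs whose difference is known to be audible, and count how often listeners miss them.

Code:

def false_negative_rate(trials, heard_difference):
    """trials[i]['kind'] == 'positive_control' marks a pair with a
    known, independently verified audible difference; heard_difference[i]
    is True if the listener reported hearing a difference on trial i.
    Misses on positive controls estimate the test's false-negative rate."""
    controls = [heard for trial, heard in zip(trials, heard_difference)
                if trial["kind"] == "positive_control"]
    if not controls:
        return None          # no embedded controls: the test is uncalibrated
    return 1.0 - sum(controls) / len(controls)

# Per the argument above: if this rate is high, a null result on the real
# comparison may mean the test was insensitive, not that no difference exists.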

Eyleron 01-21-2015 08:36 AM

Quote:

Originally Posted by jkeny (Post 31018194)
- I think the test exaggerates the difference between sighted & blind preferences by the very fact that it uses Harman employees for testing mostly Harman speakers (G, D, S of the 4). I would like to have seen a control test where a bank of non-Harman speakers was used.

I think the main point of the blog post was, "Look, even our own employees are biased in sighted tests. They are sure our own speakers are superior to others." The purpose of the test wasn't for peer-reviewed study, but rather to impress upon colleagues and superiors at Harman the importance of DBT and how biases can skew results.

Quote:

Originally Posted by jkeny (Post 31018194)
- The only non-Harman speaker was T, and it scored very similarly between sighted & blind - is this the actual difference when no other psychological factors (employee pressure) are in play? We don't know.

The rating isn't an absolute scale, but rather relative to the other speakers. I suspect the same speaker, put up against terrible speakers, would rate higher. If the preferences are relative, it stands to reason that some might sit at the same number yet still be considered better or worse vs the other speakers.

You're right, though, we don't know. We'd need to see more results to see patterns.

Quote:

Originally Posted by jkeny (Post 31018194)
- The graph shows that in blind testing all speakers are pretty close in preference - is this showing an anomaly of blind testing, that it minimises real differences? Does this correlate with how the speakers measure? i.e. would they be expected to be close in preference based on their measured differences? I would like to know the answer, but it's not in the results.

No, it shows that in blind testing all those speakers are pretty close in preference, for those 40 Harman employees.

I wouldn't consider it a definitive test for those speakers. Toole and Olive and Harman's research has involved many more subjects in tests that included people off the street, professed audiophiles, audio engineers, etc. They've tested speakers by themselves and in stereo pairs and in multi-channel setups. And as Olive mentions, they have a multi-channel speaker shuffler testing room.

Quote:

Originally Posted by jkeny (Post 31018194)
- To me, using one speaker in the test is just avoiding psychoacoustics - I consider it the equivalent of evaluating 3D TVs with one eye closed.

Actually, testing a 3D TV should involve its 2D capabilities, brightness, contrast, and absolute resolution. If the stereoscopic effect of using two eyes or cameras includes cross-talk and ghosting that masks testing those metrics, then you'd test the TV in 2D mode.

Just like in building anything, you test parts in isolation. Jet engine: you test the fan blades by themselves and specify materials and strengths etc. At some point you test the engine on the ground (not how it'll be finally used). You don't just assemble it all together and test the engine flying a plane with passengers on it.

Quote:

Originally Posted by jkeny (Post 31018194)
- I also believe that in the test they minimise room boundary reflections, which again I find questionable.

Do you find it questionable that they chose to do so, or do you question their methodology for doing so?

Quote:

Originally Posted by jkeny (Post 31018194)
- The test is so divorced from how a stereo pair of speakers will be used in a room that I question whether their conclusions can be extrapolated to real-world usage.

Likewise, a stereo pair is divorced from how it would be used in multi-channel in a room.
For instance, with just a stereo pair, Toole's testing showed that lateral reflections were more important.
In multi-channel, the image-broadening and ambiance is provided more by surround speakers.

Quote:

Originally Posted by jkeny (Post 31018194)
As to the title, I find it revealing of the particular skewed angle that Olive is taking, & the fact that he hasn't addressed the points I've raised confirms to me that he is being less than objective.

The title is a thesis as a result of decades of work, not just this little test to make a point to coworkers twenty years ago.
The point is that these humans, 40 Harman employees, regarded speakers' audible performance differently when sighted versus blind. Olive obviously had other agendas too, since he included the non-German and German versions of the same speaker.

Quote:

Originally Posted by jkeny (Post 31018194)
Yes, I believe Olive has done more damage to the DBTers than all the anti-DBT posts on audio forums.

I agree that this is not the best study to publish today to make the point.

Other papers, including the Toole book, include many more tests with many more speakers in different conditions, if you're interested.

jkeny 01-21-2015 09:07 AM

Quote:

Originally Posted by Eyleron (Post 31020994)
I think the main point of the blog post was, "Look, even our own employees are biased in sighted tests. They are sure our own speakers are superior to others." The purpose of the test wasn't for peer-reviewed study, but rather to impress upon colleagues and superiors at Harman the importance of DBT and how biases can skew results.

Yes, I guess I'm just posting a warning that those results are not representative of the differences between sighted & blind listening. I would have liked to have seen a graph of non-Harman speakers' blind vs sighted listening preferences, to get a handle on how much the original graph is exaggerated.


Quote:

The rating isn't an absolute scale, but rather relative to the other speakers. I suspect the same speaker, put up against terrible speakers, would rate higher. If the preferences are relative, it stands to reason that some might sit at the same number yet still be considered better or worse vs the other speakers.

You're right, though, we don't know. We'd need to see more results to see patterns.
My point is that for the only non-Harman speaker, T, the preference score is exactly the same whether sighted or blind, i.e. it doesn't matter if listening is done sighted or blind. I wondered if this is more typical of listening tests when neutral listeners are used (no inbuilt biases)?

Quote:

No, it shows that in blind testing all those speakers are pretty close in preference, for those 40 Harman employees.

I wouldn't consider it a definitive test for those speakers. Toole and Olive and Harman's research has involved many more subjects in tests that included people off the street, professed audiophiles, audio engineers, etc. They've tested speakers by themselves and in stereo pairs and in multi-channel setups. And as Olive mentions, they have a multi-channel speaker shuffler testing room.
OK, it just raised a question I have about blind tests - that they skew results towards null (no differences heard) - & the blind results graphed here show a similar trend.


Quote:

Actually, testing a 3D TV should involve its 2D capabilities, brightness, contrast, and absolute resolution. If the stereoscopic effect of using two eyes or cameras includes cross-talk and ghosting that masks testing those metrics, then you'd test the TV in 2D mode.
Sure, 2D capabilities are important, but that is no reason to ONLY test 2D capabilities.

Quote:

Just like in building anything, you test parts in isolation. Jet engine: you test the fan blades by themselves and specify materials and strengths etc. At some point you test the engine on the ground (not how it'll be finally used). You don't just assemble it all together and test the engine flying a plane with passengers on it.
Yes, but you can't compare engineering, where very accurate measurements can be shown to correlate to the desired end use of the device, with an audio device whose function is to present an illusion of reality, which is determined solely by our auditory perception. No measurements in audio correlate to this end-use function.


Quote:

Do you find it questionable that they chose to do so, or do you question their methodology for doing so?
I believe that yet again, they are simplifying the test to the extent that its relevance to real-world usage is questionable.



Quote:

Likewise, a stereo pair is divorced from how it would be used in multi-channel in a room.
For instance, with just a stereo pair, Toole's testing showed that lateral reflections were more important.
In multi-channel, the image-broadening and ambiance is provided more by surround speakers.
Well, I was restricting my comments to 2-channel playback, but I don't see your point?



Quote:

The title is a thesis as a result of decades of work, not just this little test to make a point to coworkers twenty years ago.
The point is that these humans, 40 Harman employees, regarded speakers' audible performance differently when sighted versus blind. Olive obviously had other agendas too, since he included the non-German and German versions of the same speaker.
He chose the title to use on his blog, irrespective of the title of his AES papers or the title of the report used internally at Harman. I question his motivation, as I do when I read any skewed newspaper headline.


Quote:

I agree that this is not the best study to publish today to make the point.

Other papers, including the Toole book, include many more tests with many more speakers in different conditions, if you're interested.
Yes, I recently came across this quote from the intro to one of Toole's papers, "Loudspeaker Measurements and Their Relationship to Listener Preferences: Part 2":
Quote:

Using the highly reliable subjective ratings from an earlier study, loudspeaker measurements have been examined for systematic relationships to listener preferences. The result has been a logical and orderly organization of measurements that can be used to anticipate listener opinion. With the restriction to listeners with near-normal hearing and loudspeakers of the conventional forward-facing configuration, the data offer convincing proof that a reliable ranking of loudspeaker sound quality can be achieved with specific combinations of high-resolution free-field amplitude-response data. Using such data obtained at several orientations it is possible to estimate loudspeaker performance in the listening room. Listening-room and sound-power measurements alone appear to be susceptible to error in that while truly poor loudspeakers can generally be identified, excellence may not be recognized. High-quality stereo reproduction is compatible with those loudspeakers yielding high sound quality; however, there appears to be an inherent trade-off between the illusions of specific image localization and the sense of spatial involvement.
I'm not sure how to read the bolded statement (the final sentence) - can anyone help in understanding this?

FMW 01-21-2015 09:18 AM

Quote:

Originally Posted by jkeny (Post 31022226)

I'm not sure how to read the bolded statement - anyone help in understanding this?


It is gibberish to me but I will say that it is irrational to think that a sighted test is more reliable than a blind one, no matter who the test subjects are or for whom they are employed.

jkeny 01-21-2015 10:30 AM

Quote:

Originally Posted by FMW (Post 31022666)
It is gibberish to me but I will say that it is irrational to think that a sighted test is more reliable than a blind one, no matter who the test subjects are or for whom they are employed.

My take on it is that there are lots of unrecognised factors introduced in doing quick A/B-style blind testing with repetitions, & internal controls are a way of evaluating how much effect these are having on the results. I'm particularly interested in how prone the test is to false negatives (hearing no difference where a known audible difference exists). I'm very sure that there will be great variability from one test to another in this regard, but we have no way of knowing this from the results - there's no internal calibration or validation of the test. Without these, I find it irrational to consider a particular blind test superior to a sighted test.

An example from this forum is cogent - ArnyK produced some ABX results recently. He didn't actually listen to A/B; he just randomly hit A or B for most trials. This was only picked up because the timing of each trial was unnaturally short (1, 2, 3, 4 secs).

If someone loses concentration & is just guessing, we have no way of knowing this from the results without some embedded internal controls within the test to catch guessing through false negatives.

There is no doubt that knowledge can influence our hearing, but I find that an unnatural way of listening - a blind test with repetitions of small audio pieces - is not a good substitute, particularly when no attempt is made to examine how these NEW factors affect the results.
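
Both checks described above are easy to mechanise - filter out trials answered too quickly to have involved actual listening, and compute the usual binomial guessing probability for the rest. A sketch (Python; the 5-second threshold is an arbitrary illustration, not a standard):

Code:

from math import comb

def guessing_p_value(correct, n, p=0.5):
    """One-sided probability of getting at least 'correct' of 'n'
    ABX trials right by pure guessing."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k)
               for k in range(correct, n + 1))

def plausible_trials(durations_sec, min_seconds=5.0):
    """Flag trials answered too fast to have auditioned both A and B
    (the 1-4 second pattern described above)."""
    return [d >= min_seconds for d in durations_sec]

# Example: 9 of 16 correct is consistent with guessing (p ~ 0.40),
# so the score alone can't distinguish listening from button-mashing;
# the per-trial timings can.
print(round(guessing_p_value(9, 16), 2))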

Eyleron 01-21-2015 11:12 AM

Quote:

Originally Posted by jkeny (Post 31022226)
Sure, 2D capabilities are important, but that is no reason to ONLY test 2D capabilities.

Yes, and they do test multiple speakers working as part of a system (in a real room, in stereo/multi-channel).
Perhaps the Harman staff in 1994 had already been doing sighted single-speaker testing, so keeping Olive's test to single-speaker was most applicable for that test?

Quote:

Originally Posted by jkeny (Post 31022226)
Yes, but you can't compare engineering, where very accurate measurements can be shown to correlate to the desired end use of the device, with an audio device whose function is to present an illusion of reality, which is determined solely by our auditory perception. No measurements in audio correlate to this end-use function.

Well, to stretch the analogy: user testing (a softer science, where you're using survey responses in a hopefully controlled environment) might show that airline passengers report that pressurization up to 80% of sea-level pressure was the threshold of "discomfort", 90% was the threshold of "noticeable", and anything between 91% and 110% was unnoticeable.
So you have targets for what will be "good" or "bad."

Now you can use accurate measurements and engineering to achieve those targets.

Similarly for speakers, their extensive testing of comb filtering, reflections, frequency response, phase response, etc. gave them a wealth of knowledge of what people reported they could discern and what they liked and disliked.
For instance, users liked a smooth, fairly flat response. A less flat response (i.e. a downward tilt) that was smooth was less objectionable than a flatter response that had a resonance at some frequency. Once you gain confidence that that's what people like (anechoic, in a real room with reflections, stereo, multi-channel), your engineers have parameters to keep the speakers within if they're to be of a certain "quality."

You can do the same for non-linear distortions (which are noticeable, which get masked, which are objectionable) and directivity (e.g. they found that lateral reflections should match the on-axis response; if they do, the effect is neutral to positive, and if not, you might be better off killing the first lateral reflections).
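
This isn't Olive's actual preference model, but a toy sketch (Python/NumPy; the window size and test curves are invented) of how "smooth and fairly flat" becomes something engineers can target: fit the overall tilt across log frequency, then measure deviation from a smoothed copy so a narrow resonance is punished more than a gentle tilt.

Code:

import numpy as np

def response_metrics(freqs_hz, spl_db, smooth_window=7):
    """'tilt' = slope of the response across log frequency (dB/decade);
    'roughness' = RMS deviation from a moving-average-smoothed copy,
    which penalises narrow resonances but forgives a gentle tilt."""
    logf = np.log10(freqs_hz)
    tilt = np.polyfit(logf, spl_db, 1)[0]
    kernel = np.ones(smooth_window) / smooth_window
    smoothed = np.convolve(spl_db, kernel, mode="valid")
    half = smooth_window // 2
    roughness = np.sqrt(np.mean((spl_db[half:-half] - smoothed) ** 2))
    return tilt, roughness

freqs = np.logspace(2, 4, 200)                      # 100 Hz to 10 kHz
tilted = -3.0 * (np.log10(freqs) - 2.0)             # smooth -3 dB/decade tilt
resonant = tilted + 6.0 * np.exp(-((freqs - 2000.0) / 120.0) ** 2)  # add a narrow peak

print(response_metrics(freqs, tilted))    # low roughness: smooth tilt
print(response_metrics(freqs, resonant))  # higher roughness: resonance stands out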

Quote:

Originally Posted by jkeny (Post 31022226)
I believe that yet again, they are simplifying the test to the extent that its relevance to real-world usage is questionable.

Yeah, the test of one speaker without a stereo pair would have to be suggestive of real-world performance - like testing an engine before it's mated to its paired engine, mounted on a plane, and flown.
Presumably, they found that listening to a single speaker unmasked issues that were less apparent in a pair.

Quote:

Originally Posted by jkeny (Post 31022226)
Well, I was restricting my comments to 2-channel playback, but I don't see your point?

My point was that if multi-channel is an important use case, a stereo test may be just as abstracted, and not as applicable to real usage as it should be.

Quote:

Originally Posted by jkeny (Post 31022226)
He chose the title to use on his blog, irrespective of the title of his AES papers or the title of the report used internally at Harman. I question his motivation, as I do when I read any skewed newspaper headline.

The blog is an editorial.

Quote:

Originally Posted by jkeny (Post 31022226)
Yes, I recently came across this quote from the intro to one of Toole's papers "Loudspeaker Measurements and Their Relationship to Listener Preferences: Part 2"

I'm not sure how to read the bolded statement - anyone help in understanding this?

At least in the second clause, he's saying that it's up to user preference whether they would rather have point-source imaging (which means killing lateral reflections) or the image-broadening effect, where lateral reflections were pleasing if they matched the on-axis response. Also, they found the latter was less important in multi-channel. He acknowledges that sound engineers and musicians wanted more direct sound, and for good reason. The thesis is, "The sound engineering people have models for rooms where you kill more first reflections, and while that might serve their purposes, it's not preferred by typical consumers of the art, who enjoyed the first reflections and found they added to intelligibility."

jkeny 01-21-2015 11:25 AM

Quote:

Originally Posted by Eyleron (Post 31026802)
....
At least in the second clause, he's saying that it's up to user preference whether they would rather have point-source imaging (which means killing lateral reflections) or the image-broadening effect, where lateral reflections were pleasing if they matched the on-axis response. ...

Ok, thanks, Eyleron, that makes sense now.

wilbur_the_goose 01-21-2015 12:43 PM

I read an article a few months ago about a Hsu Research booth at CEDIA. They were using relatively inexpensive electronics and nothing special for speaker wiring. The author of the article (which I can't find now) said that he was very impressed by the sound quality and wondered if a person used to seeing and listening to $50K speakers would have been as impressed. The author implied that he liked the sound of the Hsu speakers better than the ultra-expensive speakers (although that could be a case of reverse bias!)

amirm 01-21-2015 01:42 PM

Quote:

Originally Posted by wilbur_the_goose (Post 31030346)
I read an article a few months ago about a Hsu Research booth at CEDIA. They were using relatively inexpensive electronics and nothing special for speaker wiring. The author of the article (which I can't find now) said that he was very impressed by the sound quality and wondered if a person used to seeing and listening to $50K speakers would have been as impressed. The author implied that he liked the sound of the Hsu speakers better than the ultra-expensive speakers (although that could be a case of reverse bias!)

I think you mean this thread: https://www.avsforum.com/forum/91-aud...ity-check.html

As you can see, Mark's observations were quite contested :).

amirm 01-21-2015 01:48 PM

Quote:

Originally Posted by jkeny (Post 31022226)
I'm not sure how to read the bolded statement - anyone help in understanding this?

Eyleron essentially answered this, but to put in my two cents :), the issue was whether speaker directivity could be quantified with respect to listener preference. The study cited showed that while recording engineers used wide-dispersion speakers at home, half of those surveyed did not want to use the same for their work. Instead they preferred focused, narrow-angle radiation.

For most listeners, this does not hold, as a) we are not making mix decisions and b) we have not developed a preference for point-source reproduction (indeed, for such applications as the center channel for movies, point-source reproduction is "wrong", as that channel needs to span the full video screen).

