AVS Forum

AVS Forum (http://www.avsforum.com/forum/)
-   Audio Theory, Setup, and Chat (http://www.avsforum.com/forum/91-audio-theory-setup-chat/)
-   -   Debate Thread: Scott's Hi-res Audio Test (http://www.avsforum.com/forum/91-audio-theory-setup-chat/1532092-debate-thread-scott-s-hi-res-audio-test.html)

amirm 05-16-2014 07:54 PM

As a courtesy to Scott Wilkinson, I thought any debate about hi-res vs CD should be here. As such, I am taking the posts from that thread and answering them here:
Quote:
Originally Posted by antoniobiz1 View Post

As stated on page 776: "Many types of music and voice signals were included in the sources, from classical (choral, chamber, piano, orchestral) to jazz, pop and rock music."

So either such a wide sample means that no SACD or DVD-A had hi-res content, or Meyer and Moran were really, really, really...unlucky :rolleyes:
:)

Turns out luck is not a factor but proper protocol is. If you were taking a drug and it turned out that in 20% of the cases they mixed up the placebo for the real drug in the tests, would you feel comfortable continuing to take that drug? I know I would not. You might say, what if that wouldn't have changed the outcome? My answer would be still the same. There are some mistakes you just don't make. In the case of a drug trial it would be keeping the samples straight. That is why, should such protocol errors surface, the entire study gets thrown out. You sort of can't believe anything else they have done as being accurate.

By the same token, if you are going to test "high-res" audio vs 44.1 kHz, the first and foremost thing to do is to make sure you are bloody testing high-res content. If you didn't think or know to perform a simple spectrum comparison, well, I am not going to put faith in anything else you are reporting. Mind you, I think their results are probably right, but I simply can't wear the cloth of objectivity and then turn around and look the other way when folks make beginner mistakes.
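
The spectrum comparison being argued for takes only a few lines of code. A minimal sketch (not from the study; the function name and the way files are loaded are illustrative) that measures how much energy a track actually has above the CD band, using NumPy:

```python
import numpy as np

def ultrasonic_energy_db(samples, rate, cutoff_hz=22_000.0):
    """Energy above cutoff_hz relative to total energy, in dB.

    `samples` is a 1-D mono float array; load a file with e.g.
    scipy.io.wavfile.read and fold multichannel to mono first.
    """
    x = np.asarray(samples, dtype=np.float64)
    power = np.abs(np.fft.rfft(x)) ** 2           # power spectrum
    freqs = np.fft.rfftfreq(len(x), d=1.0 / rate)
    ultra = power[freqs >= cutoff_hz].sum()       # energy above the CD band
    return 10.0 * np.log10(ultra / power.sum()) if ultra > 0 else -np.inf
```

A "high-res" transfer whose ultrasonic energy sits at the numerical noise floor is, for the purposes of such a test, the same signal as its 44.1 kHz downsample, which is exactly the check being described.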

Some people have no business running such tests. If you can't be super careful and check and re-check everything three times, especially when you are going to write a paper to be published in an industry Journal, you shouldn't be doing this work. BTW, the "problem" becomes ours if we start to parade this in public as proper evidence. We would be tainted by their mistakes. I know I don't want to go there. I hope others do not either.

So sadly we have to throw this "evidence" away. There is no such thing as a part-time vegetarian. :) Either a test is reliable and authoritative or it is not. There is no in between.

If still not convinced, think how you would behave if the very test had found a difference. Wouldn't we try to dismiss their results by saying, "look here, you had two files that were the same and you still found them to be different"? If the same mistake is cause for dismissal of one outcome, objectivity demands it does the same in reverse. We have to play by our rules. We can't play by other people's rules when we are trying to invalidate their beliefs.

RobertR 05-17-2014 12:14 AM

One would think that, with all the criticism of the Meyer and Moran tests, there would be people EAGER to repeat the identical conditions with "real" hi-res recordings to show them how wrong they were (funny how people were gushing over how much better and closer to analog the "faux hi-res" recordings were than CD), but they haven't. It's rather obvious why...

Quote from a different forum on this topic:
Quote:
there were enough genuine hi-res recordings in that list (at least, recordings with real high frequency content above 22kHz) to allow people to ABX those if they were in any way special / sonically above what CD can achieve.

And this:
Quote:
Fans are/were forever wetting themselves about how these older recordings sound so much better on hi-res than CD.

The idea that 15 ips analogue recordings don't sound better on hi-res than CD is only mentioned to try to debunk Meyer and Moran. Outside this specific purpose, fans of hi-res seem convinced that

Indeed! The Patricia Barber Nightclub is well regarded for its really impressive quality.
Nearly every review remarks on the better-sounding SACD layer, even though there is nothing above 22 kHz because the original recording was done on an old digital machine. So if MOFI didn't mess up the CD layer, this one should have been a pro-hi-res result even without extended HF content. I bet the same goes for the other releases listed.
So I don't think the bandwidth-limited-sources reasoning counts at all.
Quote:
think how you would behave if the very test had found a difference.
Behave how? If a difference was found, it's logical to assume that "true" hires material would have shown an even greater difference.

Roseval 05-17-2014 01:12 AM

Anybody familiar with this study?
http://www.aes.org/e-lib/browse.cfm?elib=15398
Quote:
It is currently common practice for sound engineers to record digital music using high-resolution formats, and then down sample the files to 44.1kHz for commercial release. This study aims at investigating whether listeners can perceive differences between musical files recorded at 44.1kHz and 88.2kHz with the same analog chain and type of AD-converter. Sixteen expert listeners were asked to compare 3 versions (44.1kHz, 88.2kHz and the 88.2kHz version down-sampled to 44.1kHz) of 5 musical excerpts in a blind ABX task. Overall, participants were able to discriminate between files recorded at 88.2kHz and their 44.1kHz down-sampled version. Furthermore, for the orchestral excerpt, they were able to discriminate between files recorded at 88.2kHz and files recorded at 44.1kHz.

Authors: Pras, Amandine; Guastavino, Catherine
Affiliation: McGill University, Montreal, Quebec, Canada
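
For readers wondering what "able to discriminate" means statistically: ABX results are usually scored with a one-sided binomial test against chance (p = 0.5). A sketch with made-up trial counts, not the paper's actual data:

```python
from scipy.stats import binomtest

def abx_p_value(correct, trials):
    """One-sided p-value: chance of scoring at least this well by guessing."""
    return binomtest(correct, trials, p=0.5, alternative="greater").pvalue

# Illustrative numbers only (not from the Pras/Guastavino paper):
# 15 correct out of 20 trials gives p of about 0.021 (significant at 0.05);
# 14 out of 20 gives p of about 0.058 (just misses).
```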

RobertR 05-17-2014 01:24 AM

Quote:
Originally Posted by Roseval View Post

Anybody familiar with this study?
Discussed here.

Roseval 05-17-2014 01:34 AM

Thanks

antoniobiz1 05-17-2014 03:05 AM

Quote:
Originally Posted by amirm View Post

As a courtesy to Scott Wilkinson, I thought any debate about hi-res vs CD should be here. As such, I am taking the posts from that thread and answering them here:
:)

Turns out luck is not a factor but proper protocol is. If you were taking a drug and it turned out that in 20% of the cases they mixed up the placebo for the real drug in the tests, would you feel comfortable continuing to take that drug? I know I would not. You might say, what if that wouldn't have changed the outcome? My answer would be still the same. There are some mistakes you just don't make. In the case of a drug trial it would be keeping the samples straight. That is why, should such protocol errors surface, the entire study gets thrown out. You sort of can't believe anything else they have done as being accurate.

The tested drugs, in Meyer and Moran, are SACD and DVD-Audio as they existed at the time. No custom recordings involved. Just commercially available discs, released to the public and purchasable in most record stores.
The placebo was CD-quality Redbook audio (downsampled to 16-bit/44.1 kHz).

There was no drug mixing whatsoever. At worst, you can say that, at that time, they were not able to fully exploit the SACD and DVD-Audio formats.

As per the quoted list, 19 discs were tested out of some 5,000 titles available. 19 is a pretty big sample (can anyone help me with statistical significance?). Again, at worst you can say that, in general, existing recordings DO NOT benefit from the hi-res treatment.

Which leads to another interesting point, by the way: if the "great flaw" of Meyer and Moran is lack of hypersonic content, that means that no recording lacking hypersonic content will benefit from hires treatment.

Which in turn means that anything recorded prior to 1980 will not need hi-res treatment (forget the Beatles, Eagles, Elvis, Jaco and Duke Ellington). No recording from the digital multitracks of the eighties and nineties (Sony DASH, etc.) will benefit either (except those recorded at 20 bits, which are not a lot, in my experience). That's a LOT of music.
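
On the statistical-significance question above: with 19 discs sampled from roughly 5,000 titles, the useful tool is a confidence interval on a proportion. A sketch using the Wilson score interval (the 19 and 5,000 figures come from the post; everything else is illustrative):

```python
import math

def wilson_interval(successes, n, z=1.96):
    """Approximate 95% Wilson score interval for a binomial proportion."""
    phat = successes / n
    denom = 1 + z * z / n
    center = (phat + z * z / (2 * n)) / denom
    half = (z / denom) * math.sqrt(phat * (1 - phat) / n + z * z / (4 * n * n))
    return center - half, center + half

# Even if 0 of the 19 sampled discs had genuine ultrasonic content, the 95%
# interval on the population proportion still stretches to roughly 17%, so a
# sample of 19 bounds the full catalog only loosely.
```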

Chu Gai 05-17-2014 07:49 AM

If the content that was being tested was not hi-res and one was of the opinion that this leaves the conclusions drawn to be specious, then wouldn't a reasonable course of action be to repeat the study using such material? While this could be done by the original authors, others could certainly investigate it, no?

amirm 05-17-2014 11:16 AM

Quote:
Originally Posted by Roseval View Post

Anybody familiar with this study?
http://www.aes.org/e-lib/browse.cfm?elib=15398
I am, and I have the paper as well. Will post relevant results.
Quote:
Originally Posted by RobertR View Post

Discussed here.
There is nothing but conjecture in that post. It does show what I said: when the shoe is on the other foot, all of a sudden we become highly critical. With the Meyer and Moran test we have real, objective, verifiable issues with their testing. The post you linked to from Arny is just shooting in the dark hoping it hits something :).

Here is a sample comment: "There have been many improper jobs of resampling that have led to misleading audible differences. However we have long (over a decade) had good resampling software such as Cool Edit Pro and Audition."

I don't know what improper jobs Arny is talking about. It would be good to see a list of those tests. But it matters not here: either he knows the test used "improper resampling" or he doesn't. If he doesn't have such data, he is creating FUD. That is not objective.

Interesting that he talks about using Cool Edit Pro and Audition. Those are exactly the tools that would have found the problems with the Meyer and Moran tests.
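
For what it's worth, the downsampling step itself is easy to do correctly today without Audition. A sketch using SciPy's polyphase resampler (the function name and array names are illustrative):

```python
import numpy as np
from scipy.signal import resample_poly

def downsample_882_to_441(x):
    """Downsample 88.2 kHz audio to 44.1 kHz.

    resample_poly applies a low-pass (anti-alias) filter before decimating,
    which is the step a naive "keep every other sample" approach skips, and
    skipping it is the kind of error that produces the "improper resampling"
    artifacts mentioned above.
    """
    return resample_poly(x, up=1, down=2)
```

With this in place, a 30 kHz tone (above the 22.05 kHz output Nyquist) is removed rather than aliased down into the audible band at 14.1 kHz.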

His comments, at least in that post, indicate to me that he has not read the paper itself. I have not, however, read the rest of that lengthy thread. If you have, do you mind linking to Arny actually saying that, and quoting anything other than the abstract that is available to all?

amirm 05-17-2014 11:25 AM

Quote:
Originally Posted by Chu Gai View Post

If the content that was being tested was not hi-res and one was of the opinion that this leaves the conclusions drawn to be specious, then wouldn't a reasonable course of action be to repeat the study using such material? While this could be done by the original authors, others could certainly investigate it, no?
Sure. But the fact that nobody has is not data we can use to draw conclusions from. That would be a "trust me" argument. What you are really implying is, "trust me, if the flaws were removed the conclusions would be the same." Problem is, the people we want to convince don't trust us. They see us repeatedly posting this test without putting in a strong disclaimer that issues were found with it, and that no professional testing effort would have made such a mistake. Lack of transparency on our part leaves us with no credibility. So "trust me" arguments about test results we don't have don't work.

It is a shame really as the trust me argument should have some value. Hopefully we practice more transparency in the future and start to gain such credibility.

Chu Gai 05-17-2014 11:31 AM

No, I'm not making a trust me argument, Amir. I'm simply saying this would be fertile ground to explore, and I would hope someone would.

amirm 05-17-2014 11:42 AM

Quote:
Originally Posted by antoniobiz1 View Post

The tested drugs, in Meyer and Moran, are SACD and DVD-Audio as they existed at the time. No custom recordings involved. Just commercially available discs, released to the public and purchasable in most record stores.
The placebo was CD-quality Redbook audio (downsampled to 16-bit/44.1 kHz).

There was no drug mixing whatsoever.
Of course there were. They were testing whether frequency response above 22.05 kHz (the limit of 44.1 kHz sampling) matters or not. They presented two samples to the listener: one that was supposed to have content above that limit (the high-res sample) and one that did not (the 44.1 kHz version). Problem is, the former was not high-res in all cases. In effect, they were presenting the same sample twice. The equivalent in drug testing would be giving the real drug to both groups when one group was supposed to get the placebo.

The fact that they were victims of commercial recordings being that way is an excuse, not an explanation that leads us to trust the data and the experimenters. It just boggles the mind that they would not test the before and after files to see if they were objectively different. They just stuck a label on a track as "high-res" because they *thought* it was. That is not how we perform proper testing. We don't rely on thought. We rely on data.
Quote:
At worst, you can say that, at that time, they were not able to fully exploit the SACD and DVD-Audio formats.
No, at worst we can say that the rest of the report is faulty as well. We simply can't trust the experimenters to have done the rest of the test right. We have data that we can gather on the high-frequency response. We can't know if they made other operational errors because we were not there and don't have a recording of the whole process. It is unfortunate that we come to that conclusion, as I appreciate the work they did. But we must go there given the fundamental error.
Quote:
As per the quoted list, 19 discs were tested out of some 5,000 titles available. 19 is a pretty big sample (can anyone help me with statistical significance?). Again, at worst you can say that, in general, existing recordings DO NOT benefit from the hi-res treatment.
We can't say that about quality of commercial discs. Indeed the report confirms that:

"Though our tests failed to substantiate the claimed advantages of high-resolution encoding for two-channel audio, one trend became obvious very quickly and held up throughout our testing: virtually all of the SACD and DVD-A recordings sounded better than most CDs—sometimes much better. Had we not “degraded” the sound to CD quality and blind-tested for audible differences, we would have been tempted to ascribe this sonic superiority to the recording processes used to make them.

Plausible reasons for the remarkable sound quality of these recordings emerged in discussions with some of the engineers currently working on such projects. This portion of the business is a niche market in which the end users are preselected, both for their aural acuity and for their willingness to buy expensive equipment, set it up correctly, and listen carefully in a low-noise environment.

Partly because these recordings have not captured a large portion of the consumer market for music, engineers and producers are being given the freedom to produce recordings that sound as good as they can make them, without having to compress or equalize the signal to suit lesser systems and casual listening conditions. These recordings seem to have been made with great care and manifest affection, by engineers trying to please themselves and their peers. They sound like it, label after label. High-resolution audio discs do not have the overwhelming majority of the program material crammed into the top 20 (or even 10) dB of the available dynamic range, as so many CDs today do."


It matters not at the end why the discs sounded better. They did. And that is what we desire as consumers: better sound. The fact that they accomplished this without ultrasonics being preserved is neither here nor there.
Quote:
Which leads to another interesting point, by the way: if the "great flaw" of Meyer and Moran is lack of hypersonic content, that means that no recording lacking hypersonic content will benefit from hires treatment.

Which in turn means that anything recorded prior to 1980 will not need hi-res treatment (forget the Beatles, Eagles, Elvis, Jaco and Duke Ellington). No recording from the digital multitracks of the eighties and nineties (Sony DASH, etc.) will benefit either (except those recorded at 20 bits, which are not a lot, in my experience). That's a LOT of music.
I hear what you are saying :). But we don't have such data. A single test and sampling done by people who are not familiar with the basics of how to do such tests properly cannot be the basis for further conclusions. We would need to cite other work that is not from the same source.

Personally I don't care about this argument itself. What I like to see is the digital file as it was produced in the studio. If it was an old analog tape or whatever and was captured at 96 kHz, I *want that file*. I don't want them to resample it to 44.1 kHz for me. I can do that, as can anyone else here who can spend two minutes learning how to use a resampling program. Disk storage is not a concern anymore and neither is bandwidth. The whole case against high-res is a stale argument looking for a problem.

Why would we advocate that the file be converted to a lower resolution before we get it? It is not like we have an argument that doing so makes it better. I hope we are in violent agreement that, at best, it will be equal to the master. In no case will it be better. If so, then let's advocate getting the original file. If you must have the 44.1 kHz version, the distributors will offer that also. The world has come to peace with this already, but we keep pretending the argument has a valid use. It doesn't. Sites like HDtracks are providing high-res files and they are not going to close their doors because we wave a four-year-old flawed study in their face and that of their customers.

amirm 05-17-2014 11:47 AM

Quote:
Originally Posted by Chu Gai View Post

No, I'm not making a trust me argument, Amir. I'm simply saying this would be fertile ground to explore, and I would hope someone would.
OK, my apologies for assuming so :). Then I agree that there should have been more tests. I have one other that I will try to find and post.

Chu Gai 05-17-2014 11:48 AM

Apology accepted! How's the smoker doing?

amirm 05-17-2014 11:52 AM

Quote:
Originally Posted by Chu Gai View Post

Apology accepted! How's the smoker doing?
The smoker is fantastic! Just smoked 18 pounds of pork butt over a 14-hour period to feed the crew at work. Produced unbelievably good results. Everyone was "happy and stuffed." :)

Frank Derks 05-17-2014 12:24 PM

The M&M test used SACD, and with DSD there is *always* content above 20 kHz: DSD noise (and lots of it).

Some of the tested discs had 20- or 24-bit sources (48 kHz), and analog tape sources converted to 24-bit digital. It may be only (tape) noise rather than music signal, but it still counts as content that should contribute to a detectable audible difference.

Above all, the M&M test proved that inserting an ADC and DAC is transparent.

RobertR 05-17-2014 12:53 PM

Quote:
Originally Posted by amirm View Post

the people we want to convince don't trust us.
Many people who believe in the superiority of hires wouldn't trust any data using any methodology that doesn't conform with what they think they already "know". You know perfectly well that this is the case.
Quote:
It matters not at the end why the discs sounded better
Of course it matters. People would know that careful production of 16/44 gives them sound just as good as hi-res.

amirm 05-17-2014 12:55 PM

Quote:
Originally Posted by RobertR View Post

Many people who believe in the superiority of hires wouldn't trust any data using any methodology that doesn't conform with what they think they already "know". You know perfectly well that this is the case.
So what is the motivation for constantly debating this topic if you believe the other side will not trust a word you are saying?

RobertR 05-17-2014 12:59 PM

Quote:
Originally Posted by amirm View Post

So what is the motivation for constantly debating this topic if you believe the other side will not trust a word you are saying?
You missed where I said "many" and not "all". Convincing the likes of audiophools like Robert Hartley and John Atkinson is not the goal. Enlightening others who aren't of such a mindset is.

CharlesJ 05-17-2014 01:24 PM

Quote:
Originally Posted by RobertR View Post

... Convincing the likes of audiophools like Robert Hartley and John Atkinson is not the goal. Enlightening others who aren't of such a mindset is.
Yes, they will never accept the facts and will carry their beliefs to the grave. ;) And yes, that is for sure about the latter. :D

CharlesJ 05-17-2014 01:38 PM

Quote:
Originally Posted by amirm View Post

.... With the Meyer and Moran test we have real, objective, verifiable issues with their testing. ...

Well, you don't, if what is said is true: that a number of those recordings did indeed contain ultrasonic material.
And you just cannot compare a conference paper, which is not peer reviewed, to one that was.
Yes, the authors could repeat it, but why bother? Something else will not be acceptable.
How about you doing such an endeavor?

CharlesJ 05-17-2014 01:46 PM

Quote:
Originally Posted by amirm View Post

So what is the motivation for constantly debating this topic if you believe the other side will not trust a word you are saying?
Simple. One cannot let pseudoscience and snake-oil claims go unchallenged, or they will become the accepted facts even if they are false.
Perhaps our ancestors should have overlooked the witch burnings and just accepted that there were witches? And don't tell me this is not a proper analogy.

RobertR 05-17-2014 01:46 PM

Quote:
Originally Posted by CharlesJ View Post

Yes, the authors could repeat it, but why bother? Something else will not be acceptable.
Anything that doesn't conclude that hires is audibly superior to CD is unacceptable to some people.
Quote:
How about you doing such an endeavor?
Since the sole goal seems to be to discredit M&M and not find out the audibility of hires vs. CD, I don't think he's interested.

amirm 05-17-2014 02:38 PM

Quote:
Originally Posted by CharlesJ View Post

Well, you don't, if what is said is true: that a number of those recordings did indeed contain ultrasonic material.
Please read my earlier reasoning as to why we can't trust much of that report, given the basic failures in constructing the test. As I said, maybe those results are correct despite the lack of proper protocol. Convention says we should not put such weight behind them. It is unfortunate that we have to dismiss the totality of their work, but that is required of "us" if we are to live by our rules of proper and careful testing protocols.
Quote:
And, you just cannot compare the Conference paper which is not peer reviewed to one that was.
I have always reminded folks not to put too much weight behind "peer review" and this is a great example of it. When I was at Microsoft, my signal-processing PhDs would be the peer reviewers for other conferences. I asked them what they did, and they said they would look to see if the person flunked school and such. In no way are they assuring that the conclusions in the paper are right or wrong. They can't do that. What they can do is find a person saying 1+1 = 3 and reject the paper on the basis of the person not being able to do simple math (in my trivial example here).

Given this, I perform the same analysis myself. I look at the work and see if they made simple, simple mistakes. Here is an example. A proper double-blind study will have a control. The control is something for which we know the answer for sure. If our test then shows that people are voting otherwise, either the test is wrong or the subjects are. If the testers here had known about this simple protocol, they would have been forced to analyze the spectrum of the files and discover what we know now, i.e. their test samples did not represent what they set out to test.
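
The control idea can be made concrete. Alongside the real trials you include pairs with a known answer: a positive control that is definitely audible and a negative control of two identical files, and you only trust a listener's real-trial votes if the controls come out right. A minimal sketch with hypothetical trial records and arbitrary pass thresholds:

```python
def controls_passed(trials):
    """trials: list of (kind, answered_different) tuples, where kind is
    'positive' (known-audible pair), 'negative' (identical files), or
    'real'. Returns True only if the listener passed both controls."""
    pos = [ans for kind, ans in trials if kind == "positive"]
    neg = [ans for kind, ans in trials if kind == "negative"]
    # Must hear the known-audible difference most of the time...
    hears_positive = bool(pos) and sum(pos) / len(pos) >= 0.8
    # ...and must not "hear" differences between identical files.
    no_false_alarms = bool(neg) and sum(neg) / len(neg) <= 0.2
    return hears_positive and no_false_alarms
```

The connection to the point above: constructing a proper control pair requires verifying, objectively, what is actually in each file before the listening starts.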
Quote:
Yes, the authors could repeat it, but why bother? Something else will not be acceptable.
That's true. The other camp will always find something wrong, up to and including the whole DBT affair not being acceptable to them. But all is not lost. People like me would then defend their work if they did not make the most basic mistakes. I mostly believe in their outcome, yet I can't stand next to them. I just can't. No way would I ever create a test with such obvious failings. And it is not just this one test. I will later post other tests from the same group that had a multiplicity of errors.
Quote:
How about you doing such an endeavor?
You are expecting me to be less lazy than I am. :D Blind tests are the most boring things to set up and run. I paid my dues when it was part of my job, and that of my group, to perform them. That is an important thing, by the way: my career and my company's success depended on whether we had constructed correct tests or not. Hobbyists such as the group M&M are part of don't have that requirement. Sure, there is some egg on their faces due to these errors, but their next mortgage payment does not depend on it. If it did, I think they would be much less likely to make such errors. They would have read a lot more tests of these types, performed many others, be familiar with others doing the same kind of testing, etc., etc. Just because you are a hobbyist who listens to music and got an ABX box doesn't make you the proper person to run such tests as we see here. There is real science here that is deeper than the casual treatment we often give it.

Back to me: my current project is to get my CNC router up and running. I need to make a lot of things with it that are far more interesting than dealing with some philosophical matter about high-res audio which has no real-life consequence. As I said, folks are free to download high-res files today and there is no longer a format war. If you want only 44.1 kHz stuff, you can buy the CD or download the same. You can even go lower, to MP3 and AAC. There is nothing to be accomplished one way or the other. If others feel strongly about it and are qualified to properly create such a test, I would be happy to see the results of their work. :)

amirm 05-17-2014 02:50 PM

Quote:
Originally Posted by RobertR View Post

You missed where I said "many" and not "all". Convincing the likes of audiophools like Robert Hartley and John Atkinson is not the goal. Enlightening others who aren't of such a mindset is.
My bad. I thought when you said "many people" you meant more than a couple of people. Addressing it regardless: I am confident there is nothing we can do to challenge their views of audio especially since they are not even here to have a discussion with them. Even if they were, what good comes of that? If you are a devout Christian and some dude goes around elsewhere shouting that there is no god, would you be up in arms about it? I hope not :).

The reason I am engaging in the discussion is that I hope we reform ourselves. We are here. We seem to have lots of interest in the topic. So there is opportunity for learning. Learning what constitutes a good test. Learning to be neutral in our criticism of tests even when they favor our point of view. Learning how the technology really works, and the value of measurements and psychoacoustics science in that regard. We seem to want to charge ahead without any of this. That can’t be right if we are going to be vocal about this topic.

Speaking of measurements, I am, and I hope many of you are also, indebted to John Atkinson for the incredible library of free measurement data about audio gear. I don’t know how one can call himself an “objectivist” yet find no value in that gift that Stereophile has provided to us. No subscription fee, no nothing. I often see Arny, for example, using their measurements to make his point. So I'm not sure I buy him being one of the top two villains.

I personally ignore everything RH says and subjective things that JA may say (outside of anything related to a measurement). Problem solved. :D

Frank Derks 05-17-2014 02:57 PM

Wrong reasoning.

Implying that the test didn't include 'true' hi-rez content, and spinning this into "such obvious failings" that the test can't be trusted, is unreasonable.

The test did include hires content.

CharlesJ 05-17-2014 03:19 PM

Quote:
Originally Posted by amirm View Post

My bad. I thought when you said "many people" you meant more than a couple of people. .... :D

Yes, but many doesn't imply "all", right? Many may include anything up to one less than all, if my thinking is correct, which may not be the case. ;) :D

RobertR 05-17-2014 03:30 PM

Quote:
Originally Posted by amirm View Post

My bad. I thought when you said "many people" you meant more than a couple of people.
You missed it again. I said "the likes of", which clearly refers to more than two people.
Quote:
I am confident there is nothing we can do to challenge their views of audio especially since they are not even here to have a discussion with them.
You keep missing what I said. I already said that such people aren't the target.
Quote:
I personally ignore everything RH says and subjective things that JA may say (outside of anything related to a measurement).
So you do exhibit some enlightenment after all!
Quote:
I have always reminded folks not to put too much weight behind "peer review" and this is a great example of it. When I was at Microsoft, my signal-processing PhDs would be the peer reviewers for other conferences. I asked them what they did, and they said they would look to see if the person flunked school and such. In no way are they assuring that the conclusions in the paper are right or wrong. They can't do that.
Citing a bad example of peer review doesn't make the concept valueless, especially as opposed to not doing it at all.
Quote:
What they can do is find a person saying 1+1 = 3 and reject the paper on the basis of the person not being able to do simple math (in my trivial example here).
You left out trying the methodology themselves and verifying the results, which is of course very important in giving the experiment credence.

amirm 05-17-2014 05:40 PM

Quote:
Originally Posted by RobertR View Post

You missed it again. I said "the likes of", which clearly refers to more than two people.
Nobody said I am the sharpest tool in the shed :).
Quote:
You keep missing what I said. I already said that such people aren't the target.
Now I am really lost in the woods :). How can RH and JA not be part of your target even though you listed them by name?
Quote:
So you do exhibit some enlightenment after all!
I am often misunderstood. :D
Quote:
Citing a bad example of peer review doesn't make the concept valueless, especially as opposed to not doing it at all.
No, but explaining the reality of "peer review" versus the imagined one has value, and that is what I did. I talked about how I had a team of people performing that function, and it is not what the layman thinks the process is. The M&M report just brings the point home: we somehow think there is a judging panel that makes sure the work has sound results, when in reality there is none.
Quote:
You left out trying the methodology themselves and verifying the results, which is of course very important in giving the experiment credence.
Once again, the peer-review process does not "verify" anything. That is for the reader to judge. If the reviewers had verified the results, they would be the ones being judged right now, and they are not (not directly, anyway).

The methodology is reviewed to some extent. You can't show up and say you looked at two cables and decided the thicker one sounded better. They would throw out your paper because it would violate the "simple math" equivalent of controlled testing. The Meyer and Moran paper walked and talked like the proverbial duck. It talked about ABX testing and statistical results. That put it in the "they didn't flunk math" class, and the paper was approved. That's all. There is no representation that the methodology was completely correct, but rather that it was not completely wrong. The two are not the same thing. We can't go and skewer the review panel for these guys not testing their samples to see if they were high-res, or for the other more detailed failings. It is not their job to scrutinize papers to that level. They get tons of papers. They read through them and, if there is no obvious red flag, a paper goes through.

So, with all due respect, I did not omit those things. I know what is involved in the process, and it is not what you are assuming it is. No one here is arguing that they flunked basic math; they passed it. The problem is that they missed the next level of understanding. Let me quote the International Telecommunication Union (ITU) document that is the bible of testing for small impairments, Recommendation BS.1116:

"It should be understood that the topics of experimental design, experimental execution, and statistical analysis are complex, and that only the most general guidelines can be given in a Recommendation such as this. It is recommended that professionals with expertise in experimental design and statistics should be consulted or brought in at the beginning of the planning for the listening test."

This is a 30-page document describing many aspects of proper controlled listening tests, and it still says it is just scratching the surface and that qualified people need to be consulted before running headlong into such testing. If M&M had brought in professionals with experience in such tests, we would not be here discussing the problems with their work. I don't even see a reference to this document, which makes me believe they had not read it. Had they done so, they would have seen requirements such as these:

"3.1 Expert listeners

It is important that data from listening tests assessing small impairments in audio systems should come exclusively from subjects who have expertise in detecting these small impairments. The higher the quality reached by the systems to be tested, the more important it is to have expert listeners."


Yes, they claim to have used people who were involved in recording music and such, but no specifics are provided as to their expertise in hearing small impairments. A job title doesn't make you an expert listener for this type of test. I can easily hear compression artifacts that top creative engineers who produce music content can't. I can't do their job and they can't do mine. Likewise, we would need people who know what to listen for when music is resampled down: what kinds of artifacts could be there and what they would sound like, just as I know those for compressed music. It is not that I have better ears than others. It is simply a case of being trained to hear compression artifacts and having studied how the codecs work.

The ITU document goes on to say:

"The outcome of subjective tests of sound systems with small impairments utilizing a selected group of listeners is not primarily intended for extrapolation to the general public. Normally the aim is to investigate whether a group of expert listeners, under certain conditions, are able to perceive relatively subtle degradations but also to produce a quantitative estimate of the introduced impairments."


Yet M&M went on to recruit any and all people: "With the help of about 60 members of the Boston Audio Society and many other interested parties, a series of double-blind (A/B/X) listening tests were held over a period of about a year."

Who cares if the people were part of the audio society? Anyone can join that group; membership doesn't make them qualified to hear small differences. They picked other people with the same problem: "The subjects included men and women of widely varying ages, acuities, and levels of musical and audio experience; many were audio professionals or serious students of the art."

Again, the people selected must demonstrate skill in hearing the artifacts introduced into audio samples. Variety is not a requirement or a merit despite what the authors think.

"3.2.1 Pre-screening of subjects

Pre-screening procedures include methods such as audiometric tests, selection of subjects based on their previous experience and performance in previous tests, and elimination of subjects based on a statistical analysis of pre-tests. The training procedure might be used as a tool for pre-screening.

The major argument for introducing a pre-screening technique is to increase the efficiency of the listening test. This must however be balanced against the risk of limiting the relevance of the result too much."


Clearly no screening was performed per industry guidelines. No pre-test was given before allowing a person to take the test.

"3.2.2 Post-screening of subjects

Post-screening methods can be roughly separated into at least two classes; one is based on inconsistencies compared with the mean result and another relies on the ability of the subject to make correct identifications."


The idea here is to throw out the votes from people who had no business taking the test. The only way to do that is to have a control: a test where we *know* the outcome; if a person misses it, we know they are not fit for this exercise. There is nothing resembling this in the test. There is, however, this comment:

"The “best” listener score, achieved one single time, was 8 for 10, still short of the desired 95% confidence level. There were two 7/10 results. All other trial totals were worse than 70% correct."

What if these were the only qualified people in the test, and they got as many as 7 or 8 identifications right out of 10? A very different picture emerges than "we find no difference." Personally, I don't care how the rest of the people did. I care what the few did, because by definition we are trying to find small differences that are not audible to the masses.
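For anyone who wants to check that arithmetic, the 95% figure is just a one-sided binomial test against chance guessing. A quick sketch (standard library only; the function name is mine):

```python
from math import comb

def p_value(correct, trials, chance=0.5):
    """One-sided p-value: probability of scoring at least `correct`
    out of `trials` by pure guessing."""
    return sum(comb(trials, k) for k in range(correct, trials + 1)) * chance ** trials

# 8/10 gives p ~ 0.055, just short of the p < 0.05 (95% confidence) criterion;
# 7/10 gives p ~ 0.17.
print(round(p_value(8, 10), 4))   # 0.0547
print(round(p_value(7, 10), 4))   # 0.1719
```

Note how little room a 10-trial session leaves: a listener must score 9/10 or better to reach significance at all, which is its own problem with the protocol.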

"4.1 Familiarization or training phase

Prior to formal grading, subjects must be allowed to become thoroughly familiar with the test facilities, the test environment, the grading process, the grading scales and the methods of their use. Subjects should also become thoroughly familiar with the artefacts under study. For the most sensitive tests they should be exposed to all the material they will be grading later in the formal grading sessions. During familiarization or training, subjects should be preferably together in groups (say, consisting of three subjects), so that they can interact freely and discuss the artefacts they detect with each other."


None of this was done.

I could go on, but you get the idea as to why I put so little weight on their work. This is hobby work masquerading as proper testing. I am especially disappointed to see this kind of boasting in the M&M paper: "With the printing of the characterizations in Stuart’s lead paper [1] in this Journal, it became clear that it was well past time to settle the matter scientifically."

Settle the matter scientifically? Sorry, but no. We don't have such a low standard for the science of audio testing. The mere existence of an ABX box and a bunch of people in the test doesn't get us there. People don't wake up one morning qualified to run drug trials; even your doctor may not be qualified to do so. Yet in audio we think it must be that simple, and that all it takes is something playing music and two ears pushing the button on an ABX box. That is fine if the differences are large or we don't intend the results to be authoritative. Such is not the case here: the differences are small, and the authors claim scientific validity sanctioned by the "peer review" stamp.

So high-res music or not, the testing leaves a ton to be desired. That they didn't even have the right sample data just exacerbates the problems.

Should we dismiss the test? No. They likely showed that most people can't hear differences in the sampling of music they had picked when it is downsampled to 44.1 kHz. That much we can believe. Going beyond that is taking the test to places it is not qualified to go.
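For what it's worth, the objective check I keep describing, a simple spectrum comparison, takes only a few lines. A minimal sketch assuming NumPy and audio already decoded to a float array; the function name and threshold are mine:

```python
import numpy as np

def ultrasonic_energy_db(samples, sample_rate, cutoff_hz=22050.0):
    """Return the energy above `cutoff_hz` relative to total energy, in dB.
    A genuine hi-res source should show measurable energy above ~22 kHz;
    a track that is really a 44.1 kHz master will show essentially none."""
    spectrum = np.abs(np.fft.rfft(samples)) ** 2
    freqs = np.fft.rfftfreq(len(samples), d=1.0 / sample_rate)
    total = spectrum.sum()
    ultra = spectrum[freqs > cutoff_hz].sum()
    return 10.0 * np.log10(max(ultra, 1e-30) / total)

# Example: a 30 kHz tone sampled at 96 kHz registers nearly all of its
# energy above the cutoff; a 10 kHz tone registers essentially none.
rate = 96000
t = np.arange(rate) / rate
print(ultrasonic_energy_db(np.sin(2 * np.pi * 30000 * t), rate))  # near 0 dB
print(ultrasonic_energy_db(np.sin(2 * np.pi * 10000 * t), rate))  # deeply negative
```

Run this on the "high-res" track and on its 44.1 kHz counterpart; if the two numbers are both deeply negative, the hi-res label told you nothing.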

antoniobiz1 05-17-2014 05:49 PM

I lurk a lot here, and over the past years I have had the chance to read your posts often. So, first of all, let me say that although I probably disagree with 17,024 of your 17,025 posts (and I am doubtful about the other one), I find your way of debating very pleasant and friendly, and I appreciate it a lot. I'll try to do the same smile.gif
Quote:
Originally Posted by amirm View Post

Of course there were. They were testing the idea of whether frequency response above the 22.05 kHz limit of 44.1 kHz sampling matters or not. They presented two samples to the user: one that was supposed to have content above that limit (i.e., the high-res sample) and one that did not (the 44.1 kHz version). The problem is, the former was not high-res in all cases; they were, in effect, presenting the same sample twice. The equivalent in drug testing would be giving the real drug to both camps when one was supposed to get the placebo.

The fact that they were victims of commercial recordings being that way is an excuse, not an explanation that leads us to trust the data and the experimenters. It just boggles the mind that they would not test the before and after files to see if they were objectively different. They just stuck a label on a track as being "high-res" because they *thought* it was. That is not how we perform proper testing. We don't rely on thought; we rely on data.

Nowhere do they say they were testing whether frequency response above 22.05 kHz matters or not. They were testing SACD and DVD-A. In 2007 it was pretty trivial to obtain an audio card that could do hi-res (my Edirol UA 4-FX cost 150 euros back then) and to create files with all the characteristics you mention.

The test instead had the goal of assessing whether SACD or DVD-A made a difference on the whole. Please reread paragraphs 0 and 1 of the paper. So, no excuse or explanation: they did exactly what they declared they would. It was not they who stuck on the labels; it was the record companies. And don't forget, up until that point millions of enthusiasts apparently believed that label without questioning whether there was anything to it.
Quote:
Originally Posted by amirm View Post


In the end, it matters not why the discs sounded better. They did. And that is what we desire as consumers: better sound. The fact that they accomplished this without having ultrasonics preserved is neither here nor there.
How can it possibly not matter? We are debating whether the delivery format makes a difference, not the master. And they proved that if the master is better, it will retain that quality through the 16-bit/44.1 kHz loop, which is exactly what they were trying to assess. In other words, they proved that whatever was on the tested SACDs or DVD-As could travel through that loop down to the listeners' ears WITHOUT HARM. So, as far as high-res audio of that era was concerned, SACD and DVD-A were not necessary. It was simply a matter of better mastering for Red Book CDs to obtain the very same results.

I hear what you say about having a copy of the original hires file. I think it is totally pointless, but I understand and respect that desire.
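Incidentally, the 16-bit/44.1 kHz "loop" under discussion is easy to simulate offline. A rough sketch assuming SciPy; it only illustrates the principle (not necessarily the converter chain the testers actually used), and the plain rounding here skips the dither a real converter would apply:

```python
import numpy as np
from scipy.signal import resample_poly

def cd_loop(samples):
    """Pass 96 kHz hi-res samples through a simulated 16-bit/44.1 kHz
    bottleneck: downsample, quantize to 16-bit depth, upsample back."""
    # 96000 -> 44100 reduces to the integer ratio 147/320.
    down = resample_poly(samples, up=147, down=320)
    # Round to 16-bit integer steps (real converters add dither first).
    quantized = np.round(down * 32767.0) / 32767.0
    # Return to 96 kHz so the output can be compared with the input.
    return resample_poly(quantized, up=320, down=147)

# A 1 kHz tone passes through essentially unchanged; a 30 kHz tone,
# being above the 22.05 kHz Nyquist limit, does not survive the loop.
rate = 96000
t = np.arange(9600) / rate  # 0.1 s of signal
audible = cd_loop(np.sin(2 * np.pi * 1000 * t))
ultrasonic = cd_loop(np.sin(2 * np.pi * 30000 * t))
```

Comparing `audible` to the input sample-for-sample is exactly the "travel through the loop without harm" claim made above, reduced to arithmetic.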

Chu Gai 05-17-2014 05:57 PM

How about we table the discussion with regards to sonic differences between CD type rates and the various flavors of hi-res that are out today? Instead, might there be reasons why hi-res would be a preferred approach?

For example, do the chips that handle hi-res formats facilitate interfacing with other components?
Are the manufacturers of these chips able to incorporate additional functionality so that a device maker using them can realize savings elsewhere, such as a reduced parts count, simplified circuit layout, reduced head count, or quicker manufacturing?
Do some hi-res formats inhibit piracy?
Equipment in studios gets old, breaks down, has outdated functionality, and not all of it was designed all that well. Might not hi-res provide some selling points there?

