Originally Posted by krabapple
'keep going and stop when I hit a strong confidence level' is NOT how to do such a test.
You are from HA -- you should read Pio's very nice sticky post there about how to do audio DBTs. Here's a relevant part:
I don't disagree technically with any of that stuff Pio wrote, but please keep in mind this was just a casual, impromptu test under less than ideal conditions that I thought I'd share with my buddies:
- I started the test not even knowing I was about to start a test; I thought I was just doing a casual sighted comparison of A vs B and didn't expect to be hitting the X button at all.
- Only when I realized, "Wow, this subtle audible distinction between buttons A and B that I thought to casually explore [not to be confused with a difference between the musical recordings A vs B] might actually exist!" did I then decide to keep going.
- Did I keep taking the test, over and over again, until I hit pay dirt? NO. I did a grand total of only one test, didn't throw out any data such as the clunkers, didn't cherry-pick only the correct answers, and didn't hit "RESET" at any time.
- Did I do so few trials, say 4 for example, that there was no strong statistical significance? No.
- Did I do so many hundreds of trials that I just waited until my statistical significance was momentarily peaking, only to most likely plummet the next moment, and then abruptly stopped the test at that advantageous moment? Um, not really [but maybe in a certain sense technically one could say that]. I only did 23 trials, not an inordinate number. And I got the last 11 answers correct in a row, which is not easy to do "by luck". I'm pretty sure that once I "got it" I could have gone on indefinitely, except that the traffic noise was only getting worse and the conditions were grueling, so I didn't dare continue. But like I said, I don't disagree with Pio's guidelines for how it all should be done properly.
- The difference was so subtle I was terrified of losing my concentration by even pausing briefly to stand up and walk over to shut my window. Once I had keyed in on this tiny, tiny difference between buttons A and B, I didn't want to blow it by changing anything, fearing I might "jinx" it. I also didn't know if I'd ever be able to find this exact 0.9-second passage where I hear this tiny, tiny difference between buttons A and B ever again.
- One of my clunkers was hit by mistake. I swear. It happens. Arny, I believe, mentioned that the same goof happened to him in one of his recently posted tests. But yes, goofs need to be counted, as indeed mine was on this one and only test I have ever taken for the song "Mosaic".
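As a rough check on the "by luck" and "stop at the peak" questions in the list above, here is a minimal Python sketch. It assumes every trial is an independent 50/50 guess; the function names, the 5% threshold, and the choice to start checking the score at trial 5 are illustrative assumptions, not anything taken from foobar's ABX tool or from Pio's post. Under those assumptions, a specific run of 11 correct has a 1 in 2048 chance, a run of 11 turning up somewhere within 23 trials is about 0.34%, and a guesser who peeks at the running score after every trial and stops at the first "significant" moment passes more often than the nominal 5%.

```python
from fractions import Fraction
from math import comb

HALF = Fraction(1, 2)  # assumption: each trial is a pure 50/50 guess

def prob_streak(n_trials, streak_len):
    """Exact chance of at least one run of `streak_len` correct in `n_trials` guesses."""
    # state[j] = probability of a current streak of j correct, no full streak yet
    state = [Fraction(0)] * streak_len
    state[0] = Fraction(1)
    hit = Fraction(0)
    for _ in range(n_trials):
        new = [Fraction(0)] * streak_len
        for j, mass in enumerate(state):
            new[0] += mass * HALF            # wrong answer resets the streak
            if j + 1 == streak_len:
                hit += mass * HALF           # streak completed
            else:
                new[j + 1] += mass * HALF    # streak grows by one
        state = new
    return hit

def tail(correct, trials):
    """One-sided p-value: chance of at least `correct` right out of `trials`."""
    ways = sum(comb(trials, k) for k in range(correct, trials + 1))
    return Fraction(ways, 2 ** trials)

def prob_ever_significant(max_trials, alpha=Fraction(1, 20), first_look=5):
    """Exact chance that a guesser who checks the score after every trial
    (from `first_look` on) sees p <= alpha at least once by `max_trials`."""
    alive = {0: Fraction(1)}   # correct-count -> probability, not yet "passed"
    passed = Fraction(0)
    for n in range(1, max_trials + 1):
        nxt = {}
        for c, mass in alive.items():
            for c2 in (c, c + 1):            # next answer wrong or right
                nxt[c2] = nxt.get(c2, Fraction(0)) + mass * HALF
        alive = {}
        for c, mass in nxt.items():
            if n >= first_look and tail(c, n) <= alpha:
                passed += mass               # would have stopped here
            else:
                alive[c] = mass
    return passed

print(float(prob_streak(11, 11)))            # one specific run of 11
print(float(prob_streak(23, 11)))            # a run of 11 anywhere in 23
print(float(prob_ever_significant(23)))      # stop-when-it-looks-good guesser
```

None of this uses the actual score from the 23-trial log; it only shows how much room luck has under each way of counting.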
All excellent points though, krabapple.
If I were publishing a scientific paper rather than casually pointing out to a group of forum buddies, "Hey peeps, check out what I just noticed!", I'd say all your points here are 100% valid and that my data was sloppy. I too hate it when I see sloppy work done.
"-The test is run for the first time. And if it is not the case, all previous results must be summed up in order to get the result."
I adhered to both of these stipulations. The test was run for the first and only time, and took a grueling hour out of my life that I'm not looking forward to replicating, since the difference was so minuscule that I had to switch back and forth dozens of times for each trial. [That's why it took about an hour, and I didn't take any breaks during that time. It was intense and fatiguing but I trudged on.] And yes, all data was summed and none was thrown out.
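To illustrate why the quoted stipulation asks for all previous results to be summed, here is a small Python sketch. The session scores are made-up numbers used purely for illustration, the function names are hypothetical, and each trial is assumed to be an independent 50/50 guess. It compares the p-value of the single best session with the p-value of everything pooled together.

```python
from math import comb

def guess_p_value(correct, trials):
    """Chance of getting at least `correct` right out of `trials` by guessing."""
    ways = sum(comb(trials, k) for k in range(correct, trials + 1))
    return ways / 2 ** trials

def pooled_p_value(sessions):
    """Sum every session's (correct, trials) first, then compute one p-value."""
    correct = sum(c for c, _ in sessions)
    trials = sum(n for _, n in sessions)
    return guess_p_value(correct, trials)

# Hypothetical scores, NOT from any real test: (correct, trials) per session
sessions = [(5, 10), (9, 10), (4, 10)]

best_single = min(guess_p_value(c, n) for c, n in sessions)
pooled = pooled_p_value(sessions)

print(f"best single session: p = {best_single:.4f}")
print(f"all sessions pooled: p = {pooled:.4f}")
```

With these made-up numbers the best session alone looks significant at the 5% level while the pooled total does not, which is the cherry-picking effect the stipulation guards against.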
m. zillch-- "What I hear different between buttons A and B one might just as easily hear if the files weren't music but were instead just the same simple, constant 1kHz test tone, from the same source, recorded at the two different sampling rates and bit depths, but I'm not really sure.
It's fleeting in nature and would be completely invisible if it weren't for the phrase repeat function, which allows you to repeat a tiny slice of time over and over again, in my case about 0.9 seconds long."
Originally Posted by krabapple
So, *can* you do this -- take a file, make an exact duplicate of it, call one A, call the other B, then 'distinguish' A and B using foobarABX in, say, 13 out of 16 trials?
Don't know. I wrote "but I'm not really sure", which in a later edit I changed to "[but I'm not dead sure on that point, I would need to test it]", posted before I read your post, by the way.
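For reference, the bar krabapple mentions (13 out of 16) can be put in numbers with a one-sided binomial tail. A minimal Python sketch, assuming each trial is an independent 50/50 guess; the function name is illustrative and is not part of foobarABX:

```python
from math import comb

def abx_p_value(correct, trials):
    """Chance of scoring at least `correct` out of `trials` by pure guessing."""
    ways = sum(comb(trials, k) for k in range(correct, trials + 1))
    return ways / 2 ** trials

# 13 or more correct out of 16: 697 of the 65536 equally likely outcomes
print(abx_p_value(13, 16))  # about 0.0106
```

So a guesser would reach 13/16 or better roughly once in 94 attempts, provided the number of trials is fixed in advance.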
Krabapple-- "make an exact duplicate of it".
Ideally I would need two different files that went through the exact same multi-step process Scott described here:
"We realize that putting one version through two sample-rate conversions—even state-of-the-art conversions—will raise some eyebrows, and we are concerned that this could somehow introduce artifacts that might make it easier to distinguish the sound of the converted file. The other alternative was to downconvert the 24/96 file to 16/44.1, then convert both files to analog and back to digital at 24/96 using state-of-the-art DACs (digital-to-analog converters) and ADCs (analog-to-digital converters), but this also would have raised eyebrows by introducing multiple digital/analog conversions. In the end, we decided to go with the double sample-rate conversion to keep everything in the digital domain."
Does it matter? I don't know.
There is no doubt in my mind that, in general, a constant, non-level-changing 1kHz sine wave will sound the same through a hi-res and a non-hi-res system (sampling rate/bit depth). If you or anyone whose posts I read can provide me with files for that test comparison, I will gladly attempt to test them.
[I don't have the technical expertise to manufacture such files and the needed SRCs.]
Krabapple-- "This btw would not necessarily explain what is happening using the Mac app ABX Tester, unless it too displays the same artifact."
[Note: if there was some discussion about the Apple ABX tester in this thread, I wasn't privy to it]
"Artifact" would indeed be a good way to describe what I hear. The music sounds the same; it is, in a sense, the buttons which sound different. Again, think of it sort of as if one button is rusty and needs oiling, so it makes a tiny, tiny squeak noise that most people don't even notice, but I do.
That's the best analogy I can think of. I personally wouldn't describe listening for this tiny "squeak" as either cheating or gaming the system, though. I would describe it as noticing a subtle flaw in the switcher, which unscrupulous people might take advantage of by not declaring that's what they are keying off of and instead professing, "It is the superior sound of hi-res music that allowed me to pass the test". I, on the other hand, never claimed I hear any difference in the music; I'm pretty sure I'm just hearing a flaw in the "tester box", much like Arny can ID certain mechanical switch relays, even from across the room, in certain early hardware ABX boxes, at least when I'm presented with two files manufactured the way these two were. [Or maybe that part doesn't matter? I don't know.]
Also, neither in this test nor in the other one I posted did I ever drive my system to the point of clipping and "game the system" by listening to differences in the distortion products generated by such manipulations.
P.S. This post, as well as other recent ones, took me well over an hour to compose, edit, reread, correct links/quotes, etc., and I just don't have the time to keep this up. Sorry, but I have a lot on my plate, and considering I have to dot every "i" and cross every "t" or else a vulture that I am defenseless against will swoop down and attack me, I am going to have to lie low for a bit. Not talking about YOU, krabapple, of course; all your points were good ones, and it is good you are supporting the correct scientific method and keeping all of us on our toes.
Bye for now everyone!