Hi Alexander, based on the mails, I have tried to capture the requirements in a block diagram. Please let me know if there are any mistakes in the diagram.
I wanted to add a few points regarding the final comparison block, for which speech recognition has been proposed. I think the comparison can be done very easily with a PSNR measurement, which effectively compares the two streams for differences in the audio samples. This kind of measurement is quite mature in audio codecs and can work here as well.
Speech recognition has its own set of problems: the recognition engine needs training, and it is notoriously error-prone. This was my observation while working on an ASR engine. So we may finally end up testing Sphinx ;) rather than the Panda audio. But it is definitely worth a try.
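To make the PSNR suggestion concrete, here is a minimal sketch of such a comparison. It assumes both streams are 16-bit PCM samples that are already time-aligned and level-matched, which a real analogue loopback would not guarantee without extra work. The function name `psnr_db` and the default peak value are illustrative choices, not something taken from the thread.

```python
import math


def psnr_db(reference, captured, peak=32767.0):
    """Return the PSNR in dB between two equal-length PCM sample sequences.

    Assumption: the streams are already aligned sample-for-sample.
    `peak` is the largest possible sample magnitude (16-bit PCM here).
    """
    if len(reference) != len(captured):
        raise ValueError("streams must have the same length")
    if not reference:
        raise ValueError("streams must not be empty")
    # Mean squared error over all sample pairs.
    mse = sum((r - c) ** 2 for r, c in zip(reference, captured)) / len(reference)
    if mse == 0:
        return math.inf  # identical streams
    return 10.0 * math.log10(peak ** 2 / mse)
```

A test could then pass or fail on a threshold in dB; a suitable threshold would have to be found by experiment on the actual boards.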
Block Diagram
http://www.gliffy.com/publish/2944818/
Regards Rony
-------- Original Message --------
Subject: Re: end-to-end audio testing (jacks)
Date: Tue, 27 Sep 2011 18:25:05 +0200
From: Alexander Sack <asac@linaro.org>
To: Kurt Taylor <kurt.taylor@linaro.org>
CC: linaro-multimedia@lists.linaro.org, David Zinman <david.zinman@linaro.org>
On Tue, Sep 27, 2011 at 5:16 PM, Kurt Taylor <kurt.taylor@linaro.org> wrote:
On 27 September 2011 09:18, Alexander Sack <asac@linaro.org> wrote:
Hi,
we are looking at landing more and more full stack test cases for our automated board support status tracking efforts.
While for some hardware ports it's hard to test whether a port really gets a proper signal etc., we feel that for audio this might be relatively straightforward: we got the idea that we could connect a cable from jack out to jack in in the lab and then have a test case that plays something using aplay and checks that it gets a proper input signal on the jack in.
This could be done at the ALSA level and later at the PulseAudio level (for Ubuntu).
A more advanced idea that came up when discussing options was to use open source speech recognition like sphinx to even go one step further and see if the output we produce yields roughly the same input. For that we could play one or two words, use speech recognition to parse it and check if the resulting text is stable/expected.
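As a rough illustration of the "checks for a proper input signal" step, the sketch below measures the RMS level of a capture that is assumed to have been recorded on the jack in (for example with arecord while aplay is playing). The function names and the threshold of 500 are hypothetical; the right threshold depends on the mixer levels of the board.

```python
import array
import math
import sys
import wave


def wav_rms(source):
    """Return the RMS amplitude of a 16-bit PCM WAV (path or file object).

    All channels are pooled together, which is enough for a presence check.
    """
    with wave.open(source, "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError("expected 16-bit PCM")
        frames = wav.readframes(wav.getnframes())
    samples = array.array("h")
    samples.frombytes(frames)
    if sys.byteorder == "big":
        samples.byteswap()  # WAV sample data is little-endian
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def has_signal(source, threshold=500.0):
    """True if the capture is louder than `threshold` (an assumed value)."""
    return wav_rms(source) >= threshold
```

This only shows that something other than silence arrived on the input; it says nothing about whether the captured audio resembles what was played.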
What do you think?
These are really good ideas. I had started a discussion with Torez several months ago about an automated test for audio. My idea at the time was to use a sine wave at a particular frequency and use or hack one of the tuner/frequency analysis apps to detect the frequency. If the signal was too garbled or distorted, the app wouldn't recognize the frequency.
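One way to sketch the sine wave check without a separate tuner app is the Goertzel algorithm, which measures the energy at a single frequency. This is a swapped-in technique for illustration, not what any of the tuner apps mentioned above necessarily do. The function names are hypothetical, and the target frequency is assumed to lie strictly between 0 Hz and half the sample rate.

```python
import math


def goertzel_power(samples, sample_rate, freq):
    """Squared magnitude of the DFT bin nearest `freq` (Goertzel algorithm)."""
    n = len(samples)
    k = round(n * freq / sample_rate)  # nearest DFT bin
    coeff = 2.0 * math.cos(2.0 * math.pi * k / n)
    s_prev = s_prev2 = 0.0
    for x in samples:
        s = x + coeff * s_prev - s_prev2
        s_prev2, s_prev = s_prev, s
    return s_prev ** 2 + s_prev2 ** 2 - coeff * s_prev * s_prev2


def tone_fraction(samples, sample_rate, freq):
    """Fraction of the total signal energy found at `freq`.

    Close to 1.0 for a clean sine at `freq`, close to 0.0 for silence,
    noise, or a tone at a different frequency.
    """
    n = len(samples)
    total = sum(x * x for x in samples)
    if total == 0:
        return 0.0
    # A sine of amplitude A on bin k has |X[k]|^2 = (A*n/2)^2 and
    # sum(x^2) = A^2*n/2, so this ratio normalises to 1.0.
    return 2.0 * goertzel_power(samples, sample_rate, freq) / (n * total)
```

A heavily distorted or garbled capture would spread its energy over other frequencies, so the fraction drops; the pass threshold would need tuning on real hardware.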
As you know, sound quality is very subjective and depends on the cables, speakers, amp, etc. I like the speech recognition idea as well, for the same reasons. It might actually be a better test of the quality.
right. i think it would be hard to measure real audio quality, but if we get speech recognition going we would at least know that the input was similar enough to what we played.
I think some experiments with pocketsphinx would make sense to see how easy that would be. I am happy to create a blueprint for the first investigation steps for your backlog with a quick outline.
Would MMWG be able to take experimenting and implementing such end-to-end audio test into their 11.10 work list?
I think this is a really good idea to explore. Could we also maybe use camera and face recognition when we hack a pandaboard to do that? Hm...
psssst ... i wanted to keep that idea back for a bit :).
On 29 September 2011 01:46, Rony Nandy <rony.nandy@linaro.org> wrote:
Hi Alexander, based on the mails, I have tried to capture the requirements in a block diagram. Please let me know if there are any mistakes in the diagram.
I wanted to add a few points regarding the final comparison block, for which speech recognition has been proposed. I think the comparison can be done very easily with a PSNR measurement, which effectively compares the two streams for differences in the audio samples. This kind of measurement is quite mature in audio codecs and can work here as well.
I think we could grow the test into a more comprehensive test that could compare input and output waveforms, but that is probably overkill for an initial sniff/smoke test. Said another way, I would hope that the device designers would have already tested for S/N ratio, etc. And, it can always grow in features as needed.
I was thinking that a super simple test would be to script up speaker-test with a sine, and have a frequency detection app 1) detect silence, 2) detect a frequency, 3) detect silence (repeat as needed). This would be an easy app to code up; I started looking at it this weekend for fun. This would show that a board was working, the LEB worked and several other system level flows worked out of the box. I like it.
The speech idea is a good one as well: we could use the generic "front center" WAVs that ship with alsa, play those, and test that the speech-to-text output is "front, center". I may be missing something, but that doesn't sound too hard either.
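The silence / frequency / silence sequence could be checked roughly as follows. This is a sketch under assumptions, not the app described above: it splits the capture into fixed windows, treats low-RMS windows as silence, and estimates the frequency of the others by counting zero crossings. The window length, RMS threshold and tolerance are made-up defaults.

```python
import math


def classify_window(window, sample_rate, silence_rms=200.0):
    """Return None for a silent window, else its estimated frequency in Hz."""
    rms = math.sqrt(sum(x * x for x in window) / len(window))
    if rms < silence_rms:
        return None
    # A sine crosses zero twice per cycle.
    crossings = sum(1 for a, b in zip(window, window[1:]) if (a < 0) != (b < 0))
    return crossings * sample_rate / (2.0 * len(window))


def detect_pattern(samples, sample_rate, freq, window_s=0.1, tol_hz=50.0):
    """Collapse the capture into a run-length list of window labels.

    Labels are "silence", "tone" (within tol_hz of freq) or "other".
    """
    size = int(sample_rate * window_s)
    labels = []
    for start in range(0, len(samples) - size + 1, size):
        est = classify_window(samples[start:start + size], sample_rate)
        if est is None:
            label = "silence"
        elif abs(est - freq) <= tol_hz:
            label = "tone"
        else:
            label = "other"
        if not labels or labels[-1] != label:
            labels.append(label)
    return labels
```

The test would then compare the returned list with the expected one, for example `["silence", "tone", "silence"]`. Windows that straddle a transition may show up as "other", so a real test would probably tolerate that label at the boundaries.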
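For the final comparison in the speech variant, a small sketch of the text check is shown below. It assumes the recognised string has already been produced by whatever engine is used (pocketsphinx or otherwise); calling the recogniser is deliberately left out. The function names and the lenient in-order matching rule are illustrative assumptions.

```python
import re


def words(text):
    """Lower-case the text and split it into words, dropping punctuation."""
    return re.findall(r"[a-z']+", text.lower())


def transcript_matches(recognised, expected=("front", "center")):
    """True if the expected words occur, in order, in the recognised text.

    Extra words in between are tolerated, since recognisers often insert
    spurious tokens.
    """
    remaining = iter(words(recognised))
    # `w in remaining` consumes the iterator, so order is enforced.
    return all(w in remaining for w in expected)
```

For example, `transcript_matches("Front, center.")` is true while `transcript_matches("center front")` is false. Whether such lenient matching is strict enough is a judgment call for the test author.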
Speech recognition has its own set of problems: the recognition engine needs training, and it is notoriously error-prone. This was my observation while working on an ASR engine. So we may finally end up testing Sphinx ;) rather than the Panda audio. But it is definitely worth a try.
<snip>
I think the idea was to have the dev board play the sound and do the capture/compare. That is, have a loopback cable run from the line out to the mic in. We may need to adjust some of the default levels for the test, but that is doable.
Nice drawing btw, I will have to go play with gliffy. ;-)
http://www.gliffy.com/publish/2944818/
Regards Rony
-- Alexander Sack Technical Director, Linaro Platform Teams http://www.linaro.org | Open source software for ARM SoCs http://twitter.com/#%21/linaroorg - http://www.linaro.org/linaro-blog
linaro-multimedia mailing list linaro-multimedia@lists.linaro.org http://lists.linaro.org/mailman/listinfo/linaro-multimedia