University of Nebraska-Lincoln
Department of Electrical Engineering
209N WSEC
Lincoln, NE 68588-0511
Phone: 402 472-1979
FAX: 402 472-4732
The research will examine potential solutions to several needs of the audio interface between a human and a computer (or machine). These needs include reducing background noise and reverberation. Because the machine also generates audio signals for the human, the interface must remove this audio feedback from the human's input to the machine. Finally, all of these needs must be satisfied in real time for the human-machine interface to be of practical use.
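To illustrate the audio-feedback requirement, the sketch below uses a normalized LMS (NLMS) adaptive filter, a standard echo-cancellation technique chosen here for illustration only and not necessarily the project's method. The function names, the echo path, the filter length, and the step size are all assumptions. The demo contains no near-end speech, so it shows only that the machine's own playback can be estimated and subtracted.

```python
import random

def nlms_echo_cancel(mic, playback, taps=8, mu=0.5, eps=1e-8):
    """Subtract an adaptively estimated echo of `playback` from `mic` (NLMS)."""
    weights = [0.0] * taps
    history = [0.0] * taps              # recent playback samples, newest first
    cleaned = []
    for m, p in zip(mic, playback):
        history = [p] + history[:-1]
        echo_est = sum(w * x for w, x in zip(weights, history))
        err = m - echo_est              # residual = mic minus estimated echo
        norm = sum(x * x for x in history) + eps
        weights = [w + mu * err * x / norm for w, x in zip(weights, history)]
        cleaned.append(err)
    return cleaned, weights

def simulate_echo(playback, echo_path):
    """Hypothetical room echo: playback convolved with a short impulse response."""
    return [
        sum(h * playback[t - k] for k, h in enumerate(echo_path) if t - k >= 0)
        for t in range(len(playback))
    ]

rng = random.Random(0)
playback = [rng.gauss(0.0, 1.0) for _ in range(2000)]
echo_path = [0.0, 0.6, 0.3, -0.1]           # assumed values, illustration only
mic = simulate_echo(playback, echo_path)     # echo only, no near-end speech
cleaned, weights = nlms_echo_cancel(mic, playback)
```

In this idealized setting the residual in `cleaned` decays toward zero and `weights` approaches the assumed echo path. A practical system would also have to handle simultaneous user speech, which this sketch does not attempt.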
The results sought from the project include an enhanced speech input to a machine that will make subsequent processing algorithms for word recognition, speaker identification, speech compression, etc., more effective. The suggested approach is completely general and can be used with any microphone configuration and any type of acoustic interface between a human and a machine. Because the array picks up signals remotely, the user need not be connected to the machine by a cable. The approach is "cellular" in that the processor can be modified to track a user who is moving within a room: it determines the user's position and changes the constraints that define the robust adaptive processor so that signals generated within the cell containing the user's estimated position are preserved.
M. W. Hoffman and K. M. Buckley. "Robust time-domain processing of broadband acoustic data." IEEE Transactions on Speech and Audio Processing, vol. 3, pp. 193-203, May 1995.
M. W. Hoffman, T. Trine, K. M. Buckley, and D. J. Van Tasell. "Robust microphone array processing for hearing aids: Realistic speech enhancement predictions." Journal of the Acoustical Society of America, vol. 96, pp. 759-771, Aug. 1994.
M. W. Hoffman. "Robust adaptive processing of microphone array data for hearing aids." In Proceedings of the IEEE ASSP Workshop on Applications of Signal Processing to Audio and Acoustics, Lake Mohonk, New Paltz, NY, Oct. 1993.
The current project applies a spatial filter to provide clean acoustic signals to a machine (computer) in a noisy and reverberant room. The reliability of a voice-controlled system depends on a clear, uncorrupted speech signal as input to the automated speech processing system. Virtually all processing algorithms designed for speech signals work better when the input speech is not corrupted by interfering noise and distortion. Speech coders, word recognition systems, and speaker identification systems are often very sensitive to background noise and reverberation. Digitally processed signals from an array of microphones provide enhanced speech input for the machine's automated processing systems. In addition, the processing that improves the speech input quality will not impose hardships on the human user of the system, such as a microphone attached to the machine by a cable or a precise fixed location for the human. The project attempts to advance a sophisticated interface between humans and machines that places the burden of processing and inconvenience on the machine rather than on the human user.
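As a rough illustration of spatial filtering with a microphone array, the sketch below implements a fixed delay-and-sum beamformer, which is the simplest such filter and not the robust adaptive processor the project proposes. The function names, the integer sample delays (assumed known), and the synthetic test signal are all assumptions made for the example.

```python
import math
import random

def delay_and_sum(channels, delays):
    """Align each channel by its integer arrival delay (in samples), then average."""
    n = len(channels[0])
    out_len = n - max(delays)
    output = []
    for t in range(out_len):
        # Sample t + d on a channel delayed by d corresponds to source sample t.
        total = sum(ch[t + d] for ch, d in zip(channels, delays))
        output.append(total / len(channels))
    return output

def simulate_array(source, delays, noise_std, seed=0):
    """Hypothetical data: each microphone hears a delayed source plus independent noise."""
    rng = random.Random(seed)
    return [
        [(source[t - d] if t >= d else 0.0) + rng.gauss(0.0, noise_std)
         for t in range(len(source))]
        for d in delays
    ]

def noise_power(signal, reference):
    """Mean squared difference between a signal and the clean reference."""
    return sum((s - r) ** 2 for s, r in zip(signal, reference)) / len(signal)

source = [math.sin(2 * math.pi * 0.01 * t) for t in range(2000)]
delays = [0, 2, 4, 6]                   # assumed known steering delays, in samples
channels = simulate_array(source, delays, noise_std=0.5)
enhanced = delay_and_sum(channels, delays)
```

Because the noise in this simulation is independent across microphones, averaging the aligned channels lowers the noise power relative to any single microphone while leaving the target signal intact. Real rooms add correlated noise and reverberation, which is where adaptive and robust designs become relevant.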
H. Cox, R. M. Zeskind, and M. M. Owen. "Robust adaptive beamforming." IEEE Transactions on Acoustics, Speech, and Signal Processing, vol. ASSP-35, pp. 1365-1376, Oct. 1987.
J. E. Greenberg and P. M. Zurek. "Evaluation of an adaptive beamforming method for hearing aids." Journal of the Acoustical Society of America, vol. 91, pp. 1662-1676, Mar. 1992.
R. W. Stadler and W. M. Rabinowitz. "On the potential of fixed arrays for hearing aids." Journal of the Acoustical Society of America, vol. 94, pp. 1332-1342, 1993.
Other program areas that are attempting to ease the burden that human-computer interactions place on the human may be able to exploit user information (such as position in a room, movement, etc.) to better anticipate the needs of the user. The microphone array interface should be able to provide useful cues about user position, user movements, and histories of such movements. Arrays of sensors allow two separate functions: signal enhancement (i.e., beamforming) and source localization (i.e., direction finding). While the primary emphasis of the current project is signal enhancement, some emphasis could be placed on exploiting the localization capabilities of the sensor array.
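The direction-finding function mentioned above can be sketched for the simplest case of two microphones. The example estimates the time difference of arrival by cross-correlation and converts it to an angle under a far-field assumption. The function names, sample rate, microphone spacing, and speed of sound are illustrative assumptions, not values from the project.

```python
import math
import random

def estimate_delay(ref, other, max_lag):
    """Return the lag (in samples) at which `other` best matches `ref`.

    A positive lag means the signal reaches `other` later than `ref`.
    """
    n = len(ref)
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for t in range(n):
            u = t + lag
            if 0 <= u < n:
                score += ref[t] * other[u]      # cross-correlation at this lag
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag

def lag_to_angle(lag, sample_rate, spacing, speed_of_sound=343.0):
    """Far-field arrival angle in degrees from broadside for a two-microphone pair."""
    x = speed_of_sound * lag / (sample_rate * spacing)
    x = max(-1.0, min(1.0, x))                  # guard against rounding past +/-1
    return math.degrees(math.asin(x))

rng = random.Random(1)
ref = [rng.gauss(0.0, 1.0) for _ in range(500)]
true_lag = 3                                    # assumed delay for the demo
other = [0.0] * true_lag + ref[:-true_lag]      # second mic hears it 3 samples later
lag = estimate_delay(ref, other, max_lag=5)
angle = lag_to_angle(lag, sample_rate=16000, spacing=0.1)
```

Repeating such estimates over time, and over more microphone pairs, is one way an array could supply the position and movement cues described here. Reverberation makes the correlation peak less distinct in practice, so this should be read as a starting point only.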