Modern communication systems have changed the outlook of the audio industry, and with it the way people communicate, interact, and engage with each other. With the rising adoption of smart digital audio technology, people increasingly want automated, connected home audio experiences. Devices such as smart speakers and Facebook Portal are in high demand owing to their intelligent features. Technologies such as speech recognition, 360-degree audio, and wireless audio are at the forefront of the audio industry. Immersive audio has made communication feel more natural and lifelike for listeners. Behind these high-quality audio experiences are several tools and techniques that clean up and polish the sound.
Introduction to Audio Signal Processing
Audio signal processing is the application of computationally intensive algorithms and techniques to audio signals. Audio signals are representations of sound in either analog or digital form. Their frequencies range from about 20 Hz to 20,000 Hz, the lower and upper limits of human hearing. Analog signals are continuous electrical signals, while digital signals are binary representations. Audio signal processing involves converting between analog and digital signals, removing unwanted noise, and balancing the signal across time and frequency. It focuses on computational methods for altering sound, and it removes or minimizes overmodulation, echo, and unwanted noise.
Remote communication, such as video conferencing, is becoming a preferred alternative to face-to-face meetings. However, acoustic noise, distortion, and echo are hard to avoid in any communication process. Suppose a person is talking on the phone while walking down the street. Their speech would be hampered by traffic noise, the voices of people nearby, wind, and so on. Removing such interference is essential for clear sound quality. The techniques used to improve audio quality are listed and discussed below.
- Analog to Digital Converter (ADC)
- Audio effects
- Data compression/ decompression
- Automatic gain control
- Acoustic echo cancellation (AEC)
- Filtering/ Resampling
- Digital to Analog Converter (DAC)
1. Analog to Digital Conversion
Analog audio signals are more susceptible to noise and distortion. Converting them into digital signals allows convenient manipulation, storage, and transmission without further quality degradation. The ADC samples the electrical signal at a specified sampling rate and converts each sample into a binary value with a fixed resolution (bit depth). The higher the sampling rate and the resolution, the more faithfully the digital signal represents the original.
The performance of an ADC is defined by its bandwidth and signal-to-noise ratio (SNR). Bandwidth is determined by the sampling rate, while SNR depends on factors such as resolution, accuracy, and aliasing (which occurs when a signal is sampled at less than twice its highest frequency, so that high-frequency content is misrepresented as lower frequencies). An ADC is considered ideal for a given application when its SNR exceeds that of the input signal.
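As a rough illustration of how resolution affects SNR, the sketch below quantizes a sine wave at several bit depths and measures the resulting SNR. The function names, the 997 Hz test tone, and the 48 kHz sampling rate are assumptions made for this example, and the quantizer is an idealized uniform one, not a model of any particular ADC.

```python
import math

def quantize(sample, bits):
    """Round a sample in [-1.0, 1.0) to the nearest of 2**bits uniform levels."""
    levels = 2 ** bits
    step = 2.0 / levels
    index = math.floor(sample / step + 0.5)
    # Clip to the valid code range of a signed converter.
    index = max(-(levels // 2), min(levels // 2 - 1, index))
    return index * step

def measured_snr_db(bits, sample_rate=48000, tone_hz=997.0, num_samples=48000):
    """SNR in dB of a near-full-scale sine after ideal quantization."""
    signal_power = 0.0
    noise_power = 0.0
    for n in range(num_samples):
        x = 0.99 * math.sin(2.0 * math.pi * tone_hz * n / sample_rate)
        error = quantize(x, bits) - x
        signal_power += x * x
        noise_power += error * error
    return 10.0 * math.log10(signal_power / noise_power)

for bits in (8, 12, 16):
    # Textbook estimate for an ideal converter driven by a full-scale sine.
    rule_of_thumb = 6.02 * bits + 1.76
    print(bits, "bits:", round(measured_snr_db(bits), 1), "dB measured,",
          round(rule_of_thumb, 1), "dB estimated")
```

Each extra bit of resolution adds roughly 6 dB of SNR in this idealized setting; a real converter adds its own noise and distortion on top of the quantization error.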
2. Audio Effects: Audio Pre/Post-Processing Techniques
Post-processing algorithms suppress noise and any artifacts created in the first stage of processing. They focus primarily on echo removal, distortion removal, and speech enhancement. Equalization and filtering are popular post-processing techniques for noise control and for shaping effects such as reverberation.
a. Data Compression/ Decompression
Compression, one of the most powerful mixing tools, reduces the dynamic range of an audio signal. Dynamic range is the difference between the loudest and quietest parts of the signal.
For example, a scream is very loud and a whisper is very quiet; if we record both without compression, the loud parts can distort while the quiet parts are hard to hear. A compressor fixes this by attenuating the loudest sounds and boosting the quietest ones. It helps balance the audio track and gives a more natural sound without distortion. Data compression, a separate technique that shares the name, reduces the bandwidth of digital audio streams and the size of audio files, which saves storage space and speeds up transmission.
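A minimal sketch of dynamic range compression, assuming a simple static compressor with a hard threshold: levels above the threshold are scaled down by a ratio, and an optional make-up gain then raises the whole signal so that quiet parts become louder. The function name and parameter values are illustrative, and the smoothing (attack and release) that practical compressors use is left out.

```python
import math

def compress(samples, threshold=0.5, ratio=4.0, makeup_gain=1.0):
    """Static compressor: reduce the part of each level above the threshold."""
    out = []
    for x in samples:
        level = abs(x)
        if level > threshold:
            # Only the excess above the threshold is divided by the ratio.
            level = threshold + (level - threshold) / ratio
        out.append(makeup_gain * math.copysign(level, x))
    return out

whisper_and_scream = [0.2, -0.2, 1.0, -0.9]
compressed = compress(whisper_and_scream, threshold=0.5, ratio=4.0)
print(compressed)
```

Before compression the loudest sample is 5 times the quietest; after compression the peak sits much closer to the quiet samples, so a make-up gain can lift everything without clipping.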
There are two types of audio data compression: lossless and lossy. Lossy methods are the most widely used because they achieve much larger compression ratios. They discard information that is less relevant to the listener, at the cost of some decline in quality. The most popular lossy formats are MP3 and AAC.
b. Acoustic Echo Cancellation (AEC)
An acoustic echo canceller plays an important role in audio signal processing. It removes the echo and reverberation caused by acoustic coupling between the loudspeaker and the microphone; this coupling is what lets the microphone pick up the far-end speech played through the loudspeaker.
Suppose you are on a voice call. The speech of the other person, referred to as far-end speech, is played through your loudspeaker, and your voice, referred to as near-end speech, is captured by your microphone. If the far-end speech is picked up by the microphone and transmitted back, the other person hears their own voice after some delay (network delay plus processing delay). AEC blocks the far-end speech from being transmitted back to the other party.
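One common way to build an echo canceller is an adaptive filter that learns the loudspeaker-to-microphone echo path and subtracts the predicted echo from the microphone signal. The sketch below uses the normalized least mean squares (NLMS) algorithm as an assumed example; the article does not say which algorithm is used, and the function names, tap count, step size, and simulated echo path are all illustrative. The simulation has no near-end speech or noise, which real systems must also handle.

```python
import random

def simulate_echo(far_end, echo_path):
    """Microphone signal containing only the far-end echo (no near-end speech)."""
    mic = []
    for n in range(len(far_end)):
        echo = 0.0
        for k, coeff in enumerate(echo_path):
            if n - k >= 0:
                echo += coeff * far_end[n - k]
        mic.append(echo)
    return mic

def nlms_echo_cancel(far_end, mic, num_taps=8, step=0.5, eps=1e-8):
    """Return (residual signal, learned echo-path estimate)."""
    weights = [0.0] * num_taps
    history = [0.0] * num_taps  # most recent far-end samples, newest first
    residual = []
    for x, d in zip(far_end, mic):
        history = [x] + history[:-1]
        echo_estimate = sum(w * h for w, h in zip(weights, history))
        error = d - echo_estimate  # what would be sent back to the far end
        norm = sum(h * h for h in history) + eps
        weights = [w + step * error * h / norm
                   for w, h in zip(weights, history)]
        residual.append(error)
    return residual, weights

rng = random.Random(0)
far_end = [rng.uniform(-1.0, 1.0) for _ in range(2000)]
mic = simulate_echo(far_end, echo_path=[0.0, 0.6, 0.3, -0.1])
residual, weights = nlms_echo_cancel(far_end, mic)
print("echo energy at end:    ", sum(d * d for d in mic[-200:]))
print("residual energy at end:", sum(e * e for e in residual[-200:]))
```

Once the filter has adapted, the residual contains far less echo than the raw microphone signal, which is the behavior the far-end listener cares about.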
c. Resampling
Resampling changes the sampling rate of an audio signal. The sampling rate is the number of samples taken per second, measured in kilohertz (kHz), where 1 kHz equals 1,000 samples per second. Different audio systems use different sampling rates and frame rates, so audio often has to be converted from one rate to another. Resampling is commonly built on oversampling and transcoding, which helps keep noise and distortion low. A higher sampling rate captures the rises and falls of the signal in more detail, which can improve sound quality.
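The idea of converting between sampling rates can be sketched with linear interpolation. This is a deliberately naive example under stated assumptions: the function name is hypothetical, the rates are integers, and there is no anti-aliasing filter, which a production resampler would need when reducing the rate.

```python
def resample_linear(samples, src_rate, dst_rate):
    """Resample by linearly interpolating between neighboring input samples."""
    num_out = len(samples) * dst_rate // src_rate
    out = []
    for i in range(num_out):
        position = i * src_rate / dst_rate  # position in input-sample units
        lower = int(position)
        upper = min(lower + 1, len(samples) - 1)
        fraction = position - lower
        out.append(samples[lower] * (1.0 - fraction) + samples[upper] * fraction)
    return out

ramp = [0, 1, 2, 3]
print(resample_linear(ramp, src_rate=4, dst_rate=8))  # twice as many samples
print(resample_linear(ramp, src_rate=4, dst_rate=2))  # half as many samples
```

Upsampling fills in values between the original samples, while downsampling keeps fewer of them; without a low-pass filter beforehand, downsampling can introduce aliasing.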
d. Filtering
Filters are among the most basic building blocks in signal processing and are used in almost every processing chain. A filter removes unwanted noise, echo, and distortion and lets the remaining signal pass through. The filters discussed here pass only specific frequencies while rejecting others.
- Low-pass filter: passes frequencies below the selected cut-off frequency and attenuates frequencies above it.
- High-pass filter: the opposite of a low-pass filter. It passes frequencies above the cut-off frequency and attenuates frequencies below it.
- Band-pass filter: passes only frequencies that fall between a lower and an upper cut-off frequency and attenuates those outside that range. It is often applied after resampling to remove extra noise.
- Band-reject (band-stop) filter: the opposite of a band-pass filter. It leaves most frequencies unaltered and attenuates those within a specified range to very low levels. A band-stop filter with a very narrow stop band is known as a notch filter.
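The low-pass and high-pass behavior described in the list can be sketched with a first-order (one-pole) filter. This is a minimal example rather than a production design: the function names, the 500 Hz cut-off, and the 8 kHz sampling rate are assumptions, and a first-order filter rolls off gently compared with the filters used in real audio chains.

```python
import math

def one_pole_low_pass(samples, cutoff_hz, sample_rate):
    """First-order low-pass: smooths the signal, attenuating high frequencies."""
    alpha = 1.0 - math.exp(-2.0 * math.pi * cutoff_hz / sample_rate)
    state = 0.0
    out = []
    for x in samples:
        state += alpha * (x - state)
        out.append(state)
    return out

def one_pole_high_pass(samples, cutoff_hz, sample_rate):
    """High-pass built as the input minus its low-pass-filtered version."""
    low = one_pole_low_pass(samples, cutoff_hz, sample_rate)
    return [x - l for x, l in zip(samples, low)]

def tone(freq_hz, sample_rate, num_samples):
    return [math.sin(2.0 * math.pi * freq_hz * n / sample_rate)
            for n in range(num_samples)]

def rms(samples):
    return math.sqrt(sum(x * x for x in samples) / len(samples))

sample_rate = 8000
for freq in (100, 3000):
    signal = tone(freq, sample_rate, sample_rate)
    low_gain = rms(one_pole_low_pass(signal, 500, sample_rate)) / rms(signal)
    high_gain = rms(one_pole_high_pass(signal, 500, sample_rate)) / rms(signal)
    print(freq, "Hz: low-pass gain", round(low_gain, 2),
          "high-pass gain", round(high_gain, 2))
```

A band-pass response can be approximated by applying a high-pass and then a low-pass with a higher cut-off, and a band-stop response by subtracting that band-pass output from the input.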
e. Equalizer
Equalizers adjust the level of individual frequency bands so that the sound spectrum at the receiver matches the sound spectrum at the transmitter. Frequency bands are boosted or cut using low-pass, high-pass, and band-pass filters. Equalization can also compensate for delay differences between frequency components to produce the desired output.
f. Automatic gain control (AGC) or Loudness Control
AGC keeps the output level roughly constant even when the input level varies. It determines the amount of gain or attenuation to apply to the input signal to reach a target level. If the input level is higher than the target, AGC reduces the gain; if it is lower, AGC increases the gain. Gain here is the factor that controls how loud the channel's signal is.
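The gain adjustment described above can be sketched as a block-based loop: measure the level of each block, compute the gain that would bring it to the target, and move the applied gain part of the way there. The function name, block size, target level, and smoothing factor are assumptions for this example; practical AGC designs also use separate attack and release times and avoid amplifying silence.

```python
import math

def automatic_gain_control(samples, target_rms=0.1, block_size=256,
                           max_gain=20.0, smoothing=0.5):
    """Scale each block so that its RMS level approaches target_rms."""
    gain = 1.0
    out = []
    for start in range(0, len(samples), block_size):
        block = samples[start:start + block_size]
        level = math.sqrt(sum(x * x for x in block) / len(block))
        # Gain that would hit the target exactly, limited to max_gain.
        desired = min(max_gain, target_rms / max(level, 1e-9))
        # Move only part of the way to avoid abrupt loudness jumps.
        gain += smoothing * (desired - gain)
        out.extend(gain * x for x in block)
    return out

def block_rms(samples):
    return math.sqrt(sum(x * x for x in samples) / len(samples))

for amplitude in (0.01, 0.8):
    signal = [amplitude * math.sin(2.0 * math.pi * 440 * n / 8000)
              for n in range(8192)]
    leveled = automatic_gain_control(signal)
    print("input amplitude", amplitude,
          "-> output RMS", round(block_rms(leveled[-256:]), 3))
```

Both the quiet and the loud input settle near the same output level, which is the constant-output behavior AGC is meant to provide.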
g. Beamforming
Beamforming, also known as spatial filtering, is a signal processing technique used in microphone array processing. It exploits the spatial diversity of the microphones in the array to detect and extract the desired source signal and to suppress unwanted interference. Beamforming steers the combined directivity beam of the microphones toward the direction of the signal source. This technique extends the effective range of the microphone array and increases the signal-to-noise ratio (SNR).
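The simplest beamformer, delay-and-sum, illustrates the idea: each microphone signal is shifted to undo the arrival delay of the desired source, and the aligned signals are averaged so that the source adds up coherently while uncorrelated noise partly cancels. The sketch below assumes the per-microphone delays are already known and are whole numbers of samples; the function names and the simulated four-microphone array are illustrative.

```python
import math
import random

def simulate_array(source, delays, noise_std, seed=0):
    """Each microphone hears the source delayed by `delays[k]` samples plus noise."""
    rng = random.Random(seed)
    channels = []
    for delay in delays:
        channel = []
        for i in range(len(source)):
            clean = source[i - delay] if i - delay >= 0 else 0.0
            channel.append(clean + rng.gauss(0.0, noise_std))
        channels.append(channel)
    return channels

def delay_and_sum(channels, delays):
    """Advance each channel by its delay, then average the aligned channels."""
    length = len(channels[0])
    out = []
    for i in range(length):
        total = 0.0
        for channel, delay in zip(channels, delays):
            if i + delay < length:
                total += channel[i + delay]
        out.append(total / len(channels))
    return out

def mean_squared_error(a, b, count):
    return sum((x - y) ** 2 for x, y in zip(a[:count], b[:count])) / count

source = [math.sin(2.0 * math.pi * 200 * n / 8000) for n in range(2000)]
delays = [0, 1, 2, 3]
channels = simulate_array(source, delays, noise_std=0.5)
beam = delay_and_sum(channels, delays)
valid = len(source) - max(delays)  # skip the edge where channels run out
print("single microphone error:", mean_squared_error(channels[0], source, valid))
print("beamformed error:       ", mean_squared_error(beam, source, valid))
```

With independent noise at each microphone, averaging the aligned channels lowers the noise power, which is the SNR gain the paragraph above refers to.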
3. Digital to Analog Conversion (DAC)
Most modern audio is stored in digital formats such as MP3, but to be heard through a speaker it must be converted to analog form. A DAC transforms the digital data stream into an analog audio signal, which is then sent to an amplifier and on to output devices such as speakers. A good DAC helps preserve sound quality and improves the listening experience. Devices with built-in DACs include digital speakers, CD players, and music players.
Advancements in digital audio technology have given us efficient, high-quality speech processing algorithms. These algorithms are applied when recording, storing, and transmitting audio content. Captured audio often carries unwanted echo, interference, and distortion that must be removed to reach the desired audio quality. Audio signal processing does this by converting audio signals between analog and digital formats, adjusting frequency ranges, removing unwanted noise, and adding audio effects to produce clear speech.
PathPartner provides audio signal pre/post-processing algorithms for various smart audio devices, along with additional services such as enabling wireless and smart functionality, voice-assistant integration, 3D surround sound implementation, and Dolby/DTS product certification. To learn more or to arrange a quick consultation, write to us at firstname.lastname@example.org.