Friday, March 29, 2019
Speech Enhancement And De-Noising By Wavelet Thresholding And Transform II Computer Science Essay
In this project the experimenter will seek to design and implement techniques to de-noise a noisy audio signal using the MATLAB software and its functions. A literature review will be carried out and summarized to show details of the contributions to this area of study. Different techniques that have been used in audio and speech processing will be investigated and studied. The implementation will be done using MATLAB version 7.0.

Introduction

The Fourier analysis of a signal is a very powerful tool: it can obtain both the frequency components and the amplitude components of a signal. Fourier analysis is well suited to stationary signals, that is, signals that repeat and are composed of sine and cosine components; but for non-stationary signals, signals with no repetition in the sampled region, the Fourier transform is not very efficient. The wavelet transform, on the other hand, allows such signals to be analyzed. The basic concept behind wavelets is that a signal can be analyzed by splitting it into different components which are then studied individually, in terms of their frequency and their time. In Fourier analysis the signal is analyzed in terms of its sine and cosine components, but when a wavelet approach is adopted the analysis is different: the wavelet algorithm processes the data at different scales and resolutions, as compared to Fourier analysis. In wavelet analysis a prototype wavelet, referred to as the mother wavelet, is used as the main wavelet for the analysis; the analysis is then performed with versions of the mother wavelet that are of higher frequency. The frequency analysis of the signal is done with a simplified form of the mother wavelet, and the wavelet coefficients generated by this process can then be analyzed further. Haar wavelets are very compact, and this is one of their defining features: as the interval gets large the wavelet vanishes. However, the Haar wavelets have a major limiting factor: they are not continuously differentiable. In the analysis of a given signal, the time-domain component can be used to analyze the frequency content of that signal; this is the idea of the Fourier transform, where a signal is translated from a time-domain function to the frequency domain so that its frequency content can be examined, which is possible because the analysis incorporates the cosine and sine of the frequency. Building on the Fourier transform, a finite group of sampled points is analyzed; this results in the discrete Fourier transform (DFT). These sample points are representative of what the original signal looks like, and the DFT is used to approximate the function of a sample, and its integral, by implementation of the discrete Fourier transform.
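As a minimal sketch of the Fourier analysis idea described above, the MATLAB lines below compute the magnitude spectrum of a simple two-tone test signal; the test signal and its frequencies are illustrative only and are not part of the project.

  Fs = 11025;                                  % sample rate (the rate also used later in this project)
  t  = 0:1/Fs:1-1/Fs;                          % one second of samples
  x  = sin(2*pi*440*t) + 0.5*cos(2*pi*880*t);  % stationary signal built from sine and cosine parts
  X  = fft(x);                                 % discrete Fourier transform via the FFT
  f  = (0:length(x)-1)*Fs/length(x);           % frequency axis in Hz
  half = floor(length(x)/2);                   % keep the band up to the Nyquist frequency
  plot(f(1:half), abs(X(1:half)));             % magnitude spectrum: amplitude of each frequency component
  xlabel('Frequency (Hz)'); ylabel('|X(f)|');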
This is realized by the use of a matrix that contains an order of the total number of sample points, and the problem worsens as the number of samples is increased. If there is equal spacing between the samples then it is possible to factor the Fourier matrix into the product of a few sparse matrices; the resulting factors can be applied to a vector in on the order of m log m operations, and the result is known as the Fast Fourier Transform. Both Fourier transforms mentioned above are linear transforms. The transpose of the FFT and DWT matrices is what is referred to as the inverse transform matrix. The basis functions of the Fourier transforms are the sine and cosine, but in the wavelet domain more complex basis functions, the mother wavelets, are formed. These functions are localized functions, and since they are localized in the frequency domain this can be seen in the power spectra, which proves useful in finding the frequency and power distribution. Because wavelet transforms are localized, unlike the Fourier basis functions (the sine and cosine), wavelets are a useful candidate for the purpose of this research: this feature makes operations using the wavelet transform sparse, and this is useful when used for noise removal. A major advantage of using wavelets is that the analysis windows vary. A major application of this is capturing the portions of signals that are not continuous; short wavelet basis functions are well suited to this, but to obtain more in-depth frequency analysis longer basis functions are better. A common practice is therefore to use basis functions that are short and of high frequency together with basis functions that are long and of low frequency (A. Graps, 1995-2004). A point to note is that unlike Fourier analysis, which has a limited set of basis functions (sine and cosine), wavelets have an infinite set of basis functions. This is a very important feature as it allows wavelets to extract information from a signal that can be hidden from other time-frequency methods, namely Fourier analysis. Wavelets consist of different families, and within each family there exist different subclasses that are differentiated by the number of coefficients and the level of iteration; wavelets are broadly classified by their number of coefficients, also referred to as their vanishing moments, and a mathematical relationship relates the two.

Figure above showing examples of wavelets (N. Rao 2001)

One of the most helpful and defining features of using wavelets is that the experimenter has control over the wavelet coefficients for a wavelet type. Families of wavelets have been developed that are very efficient at representing polynomial behavior; the simplest of these is the Haar wavelet. The coefficients can be thought of as filters; these are placed in a transformation matrix and applied to a raw data vector. The coefficients are ordered in two patterns: one works as a smoothing filter and the other extracts the detail information of the data (D. Aerts and I. Daubechies 1979).
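To illustrate the smoothing and detail patterns just described, the short MATLAB sketch below applies one Haar analysis step to a hypothetical data vector; the Wavelet Toolbox call at the end is included only for comparison and may differ slightly in sign and boundary handling.

  x = [4 6 10 12 8 6 5 5];                   % hypothetical raw data vector
  a = (x(1:2:end) + x(2:2:end)) / sqrt(2);   % smoothing pattern: pairwise (scaled) averages
  d = (x(1:2:end) - x(2:2:end)) / sqrt(2);   % detail pattern: pairwise (scaled) differences
  % A comparable split is produced by the Wavelet Toolbox (up to sign and
  % boundary-handling conventions):
  [cA, cD] = dwt(x, 'haar');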
The coefficient matrix for the wavelet analysis is then applied in a hierarchical algorithm. Based on its arrangement, the odd rows contain the coefficients acting as a smoothing filter and the even rows contain the wavelet coefficients that extract the details from the analysis. The matrix is first applied to the full-length data, which is then smoothed and decimated by half; after this the step is repeated with the matrix, where more smoothing takes place and the number of coefficients is halved again. This process is repeated several times until only a small amount of smoothed data remains; what the process actually does is bring out the highest-resolution details from the data source while also performing data smoothing. In the removal of noise from data, wavelet applications have proved very efficient and successful, as can be seen in work done by David Donoho; the process of noise removal is called wavelet shrinkage and thresholding. When data is decomposed using wavelets, some filters act as averaging filters while the others extract details; some of the coefficients will correspond to details of the data set, and if a given detail is small it can be removed from the data set without affecting any major feature of the data. The underlying idea of thresholding is to set coefficients that are at or below a particular threshold to zero; the remaining coefficients are then used in an inverse wavelet transform to reconstruct the data set (S. Cai and K. Li, 2010).

Literature Review

The work done by the student Nikhil Rao (2001) was reviewed. According to that work, a completely new algorithm was developed that focused on the compression of speech signals, based on discrete wavelet transform techniques. MATLAB version 6 was used to simulate and implement the code. The steps that were taken to achieve the compression are listed below:

1. Choose a wavelet function
2. Select a decomposition level
3. Input the speech signal
4. Divide the speech signal into frames
5. Decompose each frame
6. Calculate thresholds
7. Truncate coefficients
8. Encode zero-valued coefficients
9. Quantize and bit-encode
10. Transmit data frame

Steps above taken from the said work by Nikhil Rao (2001). In the experiment that was conducted, the Haar and Daubechies wavelets were utilized for the speech coding and synthesis; the MATLAB functions used were dwt, wavedec, waverec, and idwt, which compute the wavelet transforms (Nikhil Rao, 2001). The wavedec function performs the task of signal decomposition, and the waverec function reconstructs the signal from its coefficients. The idwt function performs the inverse transform on the signal of interest, and all these functions can be found in the MATLAB software. The speech file that was analyzed was divided into frames of 20 ms, which is 160 samples per frame, and each frame was then decomposed and compressed; the file format utilized was .OD files, and because of the length of the files they could be decomposed without being divided into frames. Global and by-level thresholding were used in the experiment; the main aim of global thresholding is to retain the largest coefficients, independent of the size of the decomposition tree for the wavelet transform.
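Returning to the wavelet-shrinkage idea described above, the following is a minimal MATLAB sketch of thresholding small coefficients to zero and reconstructing, assuming the Wavelet Toolbox is available; the demo signal and the threshold value are illustrative and are not taken from the reviewed work.

  load noisdopp;                        % noisy demonstration signal shipped with the Wavelet Toolbox
  x = noisdopp;
  [C, L] = wavedec(x, 5, 'db4');        % decompose into approximation and detail coefficients
  thr = 0.8;                            % hypothetical threshold value
  Ct  = wthresh(C, 's', thr);           % soft-threshold: small coefficients are set to zero
  xd  = waverec(Ct, L, 'db4');          % inverse transform reconstructs a de-noised signal
  % In practice the approximation coefficients are usually kept unthresholded
  % (compare the keepapp flag of wdencmp used later in this project).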
Using by-level thresholding, the approximation coefficients are kept at the decomposition level; during the process two bytes are used to encode the zero values. The function of the very first byte is to specify the starting point of a run of zeros, and the other byte counts the successive zeros.

The work done by Qiang Fu and Eric A. Wan (2003) was also reviewed; their work concerned the enhancement of speech based on a wavelet de-noising framework. In their approach the noisy speech signal was first processed using a spectral subtraction method, the aim of which is to remove noise from the signal of study before the application of the wavelet transform. The traditional approach was then followed, where the wavelet transform is utilized in the decomposition of the speech into different levels and threshold estimation is then done on the different levels; however, in this work a modified version of the Ephraim/Malah suppression rule was utilized for the threshold estimates. To finally enhance the speech signal the inverse wavelet transform was applied. It was shown that the pre-processing of the speech signal removed small levels of noise while the distortion of the original speech signal was minimized; a generalized spectral subtraction algorithm proposed by Bai and Wan was used to accomplish this. The wavelet transform in this approach was implemented using wavelet packet decomposition; a six-stage tree-structure decomposition was used, built from a 16-tap FIR filter derived from the Daubechies wavelet, and for a speech signal sampled at 8 kHz the decomposition resulted in 18 levels. The method used to estimate the threshold levels was of a new type: the experiments took into account the noise variance for each level and each time frame. An altered version of the Ephraim/Malah suppression rule was used to achieve soft thresholding. The re-synthesis of the signal was done using the inverse perceptual wavelet transform, and this is the very final stage.

Work done by S. Manikandan (2006) focused on the reduction of noise present in a received wireless signal using special adaptive techniques. The signal of interest in the study was corrupted by white noise. A time-frequency dependent threshold approach was taken to estimate the threshold level, and both hard and soft thresholding techniques were utilized in the de-noising process. With hard thresholding, coefficients below a certain value are set to zero; in that work a universal threshold was used for the added Gaussian noise, and the error measure used was a mean-squared error under 3. Based on the experiments it was found that this approximation is not very efficient when it comes to speech, mainly because of the poor relation between the quality and the presence of correlated noise. A new thresholding technique was therefore implemented, in which the standard deviation of the noise is first estimated for the different levels and time frames. For a signal the threshold is calculated for each sub-band and its related time frame. Soft thresholding was also implemented, with a modified Ephraim/Malah suppression rule, as seen before in the other works done in this area.
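For readers unfamiliar with wavelet packet decomposition, the short MATLAB sketch below builds a packet tree broadly in the spirit of the six-stage decomposition described above; the file name, the 'db8' wavelet (a 16-tap filter) and the depth are assumptions made for illustration, not the exact configuration used by Fu and Wan.

  [x, Fs] = wavread('speech8k.wav');   % hypothetical speech file sampled at 8 kHz (newer releases use audioread)
  T = wpdec(x, 6, 'db8');              % 6-level wavelet packet tree ('db8' has a 16-tap filter)
  plot(T);                             % inspect the tree of approximation/detail nodes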
Based on the results obtained there was an unnatural voice pattern, and to overcome this a new technique based on a modification of Ephraim and Malah was implemented.

Procedure

The procedure undertaken involved the following steps:

1. Several voice recordings were made and the file was read using the wavread function, because the file was saved in .wav format.
2. The length to be analyzed was decided; for my project the entire length of the signal was analyzed.
3. The uncorrupted signal power and signal-to-noise ratio (SNR) were calculated using different MATLAB functions.
4. Additive White Gaussian Noise (AWGN) was then added to the original recording, so that the uncorrupted signal was now corrupted.
5. The average power of the signal corrupted by noise and its signal-to-noise ratio (SNR) were then calculated.
6. Signal analysis then followed; the procedure involved in the signal analysis included:
   - The wavedec function in MATLAB was used in the decomposition of the signal.
   - The detail coefficients and approximation coefficients were then extracted and plots made to show the different levels of decomposition.
   - The different levels of coefficients were then analyzed and compared, with detailed analysis of what the decomposition produced.
7. After decomposition of the different levels, de-noising took place; this was done with the ddencmp function in MATLAB.
8. The actual de-noising process was then undertaken using the wdencmp function in MATLAB; plots were made to compare the noise-corrupted signal and the de-noised signal.
9. The average power and SNR of the de-noised signal were calculated and compared with those of the original and the corrupted signal.

Implementation/Discussion

The first part of the project consisted of making a recording in MATLAB. A recording was made of my own voice and the default sample rate was used, Fs = 11025. Code was used to make the recordings in MATLAB and different variables were altered and specified based on the code used; the m-file submitted with this project gives all the code utilized for the project. The recording was made for 9 seconds, and the wavplay function was then used to replay the recording until a desired recording was obtained; after the recording was made, the wavwrite function was used to store the previously recorded data into a .wav file. The data written into the wav file was originally stored in variable y and then given the name recording1. A plot was then made to show the waveform of the recorded speech file.

Fig 1: plot above showing the original recording without any noise corruption

According to fig 1, the maximum amplitude of the signal is +0.5 and the minimum amplitude is -0.3; from observation with the naked eye it can be seen that most of the information in the speech signal is confined between amplitudes of +0.15 and -0.15. The power of the speech signal was then calculated in MATLAB using a periodogram spectrum; this produces an estimate of the spectral density of the signal and is computed from the finite-length digital sequence using the Fast Fourier Transform (The MathWorks 1984-2010). The window parameter used was the Hamming window; a window function is a function that is zero outside some chosen interval.
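A minimal sketch of this periodogram power estimate with a Hamming window is shown below, assuming the Signal Processing Toolbox; the file name is hypothetical and stands in for the recording described above, and the average-power line is one simple way to obtain a single power figure.

  [y, Fs] = wavread('recording1.wav');                   % clean recording (newer releases use audioread)
  [Pxx, f] = periodogram(y, hamming(length(y)), length(y), Fs);
  plot(f, 10*log10(Pxx));                                % spectral density estimate in dB/Hz
  xlabel('Frequency (Hz)'); ylabel('Power/frequency (dB/Hz)');
  Pavg = mean(y.^2);                                     % average power of the recording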
The Hamming window is a typical window function and is applied by a point-by-point multiplication to the input of the fast Fourier transform; this controls the level of adjacent spectral artifacts which would appear in the magnitude of the fast Fourier transform results in cases where the input frequencies do not correspond with a bin center. Convolution in the frequency domain can be considered as windowing, which is basically the same as performing a multiplication within the time domain; the result of this multiplication is that samples outside a given frequency will affect the overall amplitude of that frequency.

Fig 2: plot showing periodogram spectral analysis of the original recording

From the spectral analysis it was calculated that the power of the signal is 0.0011 watt. After the signal was analyzed, noise was added to it; the noise that was added was additive white Gaussian noise (AWGN), which is a random signal that has a flat power spectral density (Wikipedia, 2010). At a given center frequency, additive white noise contains equal power within a fixed bandwidth; the term white is used to mean that the frequency spectrum is continuous and uniform over the entire frequency band. In this project, additive is used simply to mean that this impairment is added to and corrupts the original speech signal. The MATLAB code that was used to add the noise to the recording can be seen in the m-file. For the very first recording the power in the signal was set to 1 watt and the SNR set to 80; the code was applied to signal z, which is a copy of the original recording y. Below is the plot showing the analysis of the noise-corrupted recording.

Fig 3: plot showing the original recording corrupted by noise

Based on observation of the plot above it can be estimated that the information in the original recording is masked by the white noise added to the signal; this has a negative effect as the clean information is masked out by the noise, a process known as aliasing. Because the amplitude of the additive noise is greater than the amplitude of the recording it causes distortion; observation of the graph shows the amplitude of the corrupted signal is greater than that of the original recording. The noise power of the corrupted signal was calculated by dividing the signal power by the signal-to-noise ratio; the noise power calculated for the first recording is 1.37e-005.
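The corruption step just described can be sketched as follows; this is an illustrative reconstruction assuming the Communications Toolbox awgn function rather than the project's exact m-file code, and the file names are hypothetical. Note that awgn interprets its SNR argument in dB, whereas the SNR of 80 quoted above behaves as a linear ratio, so the project's own code may have added the noise differently.

  [y, Fs] = wavread('recording1.wav');        % clean recording
  z = awgn(y, 80, 'measured');                % add white Gaussian noise (SNR argument is in dB)
  Pnoise = mean((z - y).^2);                  % power of the added noise
  Pcorr  = mean(z.^2);                        % average power of the corrupted signal
  SNRcorr = Pcorr / Pnoise;                   % linear signal-to-noise ratio of the corrupted signal
  wavwrite(z, Fs, 'noiserecording.wav');      % store the corrupted copy for later analysis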
The noise power of the corrupted signal is 1.37e-005. The spectrum periodogram was then used to calculate the average power of the corrupted signal; based on the MATLAB calculations the power was calculated to be 0.0033 watt.

Fig 4: plot showing periodogram spectral analysis of the corrupted signal

From analysis of the plot above it can be seen that the frequency content of the corrupted signal spans a wider band; the spectral frequency analysis of the original recording showed a value of -20 Hz as compared to the corrupted signal, which showed a value of 30 Hz. This increase in the corrupted signal is attributed to the added noise, which masked out the original recording, again as before the process of aliasing. It was seen that the average power of the corrupted signal was greater than that of the original signal; the increase in power can be attributed to the additive noise added to the signal. The signal-to-noise ratio (SNR) of the corrupted signal was calculated from the formula corrupted power / noise power, and the corrupted SNR was found to be 240 as compared to 472.72 for the de-noised signal. The decrease in signal-to-noise ratio can be attributed to the additive noise: the level of noise relative to the level of the clean recording became greater, and this is the basis for the fall in SNR in the corrupted signal; the increase in the SNR of the clean signal will be discussed further in the discussion. The reason there was a decrease in the SNR of the corrupted signal is that the level of noise relative to the clean signal is greater, and this is the basis of the signal-to-noise comparison; it is used to measure how much a signal is corrupted by noise, and the lower this ratio is, the more corrupted a signal will be. The calculation method that was used to calculate this ratio is SNR = Psignal / Pnoise, where the signal and noise powers were calculated in MATLAB as seen above.

The analysis of the signal then commenced. A .wav file was created for the corrupted signal using the MATLAB command wavwrite, with Fs being the sample frequency, N being the corrupted file and the name being noiserecording; a file x1 that was going to be analysed was created using the MATLAB command wavread. Wavelet multilevel decomposition was then performed on the signal x1 using the MATLAB command wavedec; this function performs the wavelet decomposition of the signal. The decomposition is a multilevel one-dimensional decomposition, and the discrete wavelet transform (DWT) uses pyramid algorithms; during the decomposition the signal is passed through a high-pass and a low-pass filter. The output of the low-pass filter is further passed through a high-pass and a low-pass filter, and this process continues (The MathWorks 1994-2010) based on the specification of the programmer. The filters are linear time-invariant filters: the high-pass filter passes high frequencies and attenuates frequencies that are below a threshold called the cut-off frequency, with the rate of attenuation specified by the designer, while the low-pass filter, its opposite, passes only low-frequency signals and attenuates signals that contain frequencies higher than the cut-off. Based on the decomposition procedure above, the process was done 8 times, and at each level of decomposition the actual signal is down-sampled by a factor of 2.
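The analysis stage described above can be pictured with the following minimal sketch: one pass through the low-pass/high-pass filter pair followed by decimation by two, and then the full eight-level pyramid via wavedec. The Wavelet Toolbox is assumed, and the random test vector merely stands in for the corrupted signal x1; MATLAB's own dwt/wavedec use their own boundary handling, so this is an illustration rather than a bit-exact replica.

  [LoD, HiD] = wfilters('db4', 'd');       % db4 decomposition (analysis) filter pair
  x1  = randn(1, 1024);                    % stand-in for the corrupted recording
  cA1 = downsample(conv(x1, LoD), 2);      % approximation: low-pass filter, keep every 2nd sample
  cD1 = downsample(conv(x1, HiD), 2);      % detail: high-pass filter, keep every 2nd sample
  % Repeating the same pair of operations on cA1 gives cA2 and cD2, and so on.
  % wavedec performs the equivalent cascade at once, returning the coefficient
  % vector C and the bookkeeping vector L referred to below:
  [C, L] = wavedec(x1, 8, 'db4');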
The high-pass output at each stage represents the actual wavelet-transformed data; these are called the detail coefficients (The MathWorks 1994-2010).

Fig 5: above, the levels of decomposition (The MathWorks 1994-2010)

Block C above contains the decomposition vectors and Block L contains the bookkeeping vector. Based on the representation above, a signal X of a specific length is decomposed into coefficients; the first stage of the decomposition produces two sets of coefficients, the approximation coefficients cA1 and the detail coefficients cD1. To get the approximation coefficients the signal x is convolved with a low-pass filter, and to get the detail coefficients the signal x is convolved with a high-pass filter. The second stage is similar, only this time the signal that is sampled is cA1 rather than x, with the signal again being passed through the high-pass and low-pass filters to produce approximation and detail coefficients respectively; hence the signal is down-sampled and the factor of down-sampling is two.

The algorithm above (The MathWorks 1994-2010) represents the first-level decomposition that was done in MATLAB; the original signal x(t) is decomposed into approximation and detail coefficients. The algorithm above represents the signal being passed through a low-pass filter where the detail coefficients are extracted to give D2(t)+D1(t); this analysis is passed through a single-stage filter bank, and further analysis through the filter bank will produce further stages of detail coefficients, as can be seen with the algorithm below (The MathWorks 1994-2010). The coefficients cAm(k) and cDm(k) for m = 1, 2, 3 can be calculated by iterating or cascading the single-stage filter bank to obtain a multiple-stage filter bank (The MathWorks 1994-2010).

Fig 6: graphical representation of multilevel decomposition (The MathWorks 1994-2010)

At each level it is observed that the signal is down-sampled and the sampling factor is 2. At d8 observation shows that the signal is down-sampled by 2^8, i.e. 60,000/2^8. All this is done for better frequency resolution. Lower frequencies are present at all times; I am generally concerned with the higher frequencies, which contain the actual data. I have used the Daubechies wavelet type 4 (db4); the Daubechies wavelets are defined by computing running averages and differences via scalar products with scaling signals and wavelets (M. I. Mahmoud, M. I. M. Dessouky, S. Deyab, and F. H. Elfouly, 2007). For this type of wavelet there exists a balanced frequency response, but the phase response is non-linear. The Daubechies wavelet types use overlapping windows in order to ensure that the coefficients of higher frequencies will reflect any changes in the high-frequency content; based on these properties the Daubechies wavelet types prove to be an efficient tool in the de-noising and compression of audio signals. The Daubechies D4 transform has four scaling function coefficients h0, h1, h2, h3 and four wavelet function coefficients. The different steps of the wavelet transform apply the scaling function to the signal of interest; if the data being analyzed contains N values, the scaling function is applied to calculate N/2 smoothed values. In the ordered wavelet transform the smoothed values are stored in the lower half of the N-element input vector.
The wavelet function coefficient values are g0 = h3, g1 = -h2, g2 = h1, g3 = -h0. The scaling function and wavelet function values are calculated using the inner product of these coefficients and four data values; the equations are of the form (Ian Kaplan, July 2001):

s(i) = h0*x(2i-1) + h1*x(2i) + h2*x(2i+1) + h3*x(2i+2)
d(i) = g0*x(2i-1) + g1*x(2i) + g2*x(2i+1) + g3*x(2i+2)

These steps of the wavelet transform are repeated to calculate the wavelet function value and the scaling function value; for each repetition the index increases by two, and when this occurs a new wavelet value and a new scaling value are produced.

Fig 7: diagram above showing the steps involved in the forward transform (The MathWorks 1994-2010)

The diagram above illustrates the steps in the forward transform. From the diagram it can be seen that the data is split into even and odd elements: the even elements are stored in the first half of the array and the odd elements are stored in the second half. In reality this is folded into a single function even though the diagram suggests otherwise; the diagram shows two normalized steps. The input signal in the algorithm above (Ian Kaplan, July 2001) is then broken down into what are called wavelets. One of the most significant benefits of the wavelet transform is the fact that it uses a window that varies: to identify signals that are not continuous, short basis functions are most desirable, but in order to obtain detailed frequency analysis it is better to have long basis functions. A good way to achieve this compromise is to have short high-frequency basis functions and also long low-frequency ones (Swathi Nibhanupudi, 2003). Wavelet analysis contains an infinite set of basis functions; this gives wavelet transforms and analysis the ability to handle cases that cannot easily be handled by other time-frequency methods, namely Fourier transforms. MATLAB code is then used to extract the detail coefficients; the m-file shows this code. The Daubechies orthogonal wavelets D2-D20 are often used; the number of coefficients is represented by the index number, and the wavelets have a number of vanishing moments equal to half the number of coefficients. This can be seen with the orthogonal types, where D2 has only one moment, D4 two moments, and so on; the vanishing moments of a wavelet refer to its ability to represent the information in a signal, that is, its polynomial behavior. The D2 type, with only one moment, easily encodes polynomials of one coefficient, that is, constant signal components. The D4 type encodes polynomials of two coefficients, the D6 polynomials of three coefficients, and so on. The scaling and wavelet functions have to be normalized, and this is done with a normalization factor. The coefficients for the wavelet are derived by reversing the order of the scaling function coefficients and then reversing the sign of every second one (D4 wavelet = -0.1830125, -0.3169874, 1.1830128, -0.6830128); mathematically, this looks like b(k) = (-1)^k * c(N - 1 - k), where k is the coefficient index, b is a wavelet coefficient and c a scaling function coefficient, and N is the wavelet index, i.e. 4 for D4 (M. Bahoura, J. Rouat, 2009).
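To make the inner-product step concrete, here is a minimal MATLAB sketch of one pass of the ordered D4 forward transform. The standard orthonormal D4 scaling coefficients are assumed; the input block is hypothetical, and the normalization differs from the values quoted just above, so this is an illustration rather than the project's exact code.

  h = [1+sqrt(3), 3+sqrt(3), 3-sqrt(3), 1-sqrt(3)] / (4*sqrt(2));  % scaling (smoothing) coefficients h0..h3
  g = [h(4), -h(3), h(2), -h(1)];                                  % wavelet coefficients g0..g3
  x = randn(1, 16);                      % hypothetical input block (length must be even)
  N = length(x);
  xw = [x, x(1:2)];                      % simple wrap-around for the last pair of samples
  s = zeros(1, N/2);                     % N/2 smoothed values
  d = zeros(1, N/2);                     % N/2 detail values
  for i = 1:N/2
      j = 2*i - 1;                       % the index advances by two each iteration
      s(i) = xw(j:j+3) * h';             % inner product with the scaling coefficients
      d(i) = xw(j:j+3) * g';             % inner product with the wavelet coefficients
  end
  % In the ordered transform the smoothed values occupy the lower half of the
  % output vector and the detail values the upper half:
  out = [s, d];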
Fig 7: plot showing the approximation coefficients of the level 8 decomposition
Fig 8: plot showing the detail coefficients of the level 1 decomposition
Fig 9: plot showing the approximation coefficients of the level 3 decomposition
Fig 10: plot showing the approximation coefficients of the level 5 decomposition
Fig 11: plot showing a comparison of the different levels of decomposition
Fig 12: plot showing the details of all the levels of the coefficients

The next step in the de-noising process is the actual removal of the noise. After the coefficients have been obtained and calculated, the MATLAB functions used for de-noising are the ddencmp and wdencmp functions. This process removes noise by a process called thresholding. De-noising, the task of removing or suppressing uninformative noise from signals, is an important part of many signal and image processing applications, and wavelets are common tools in the field of signal processing. The popularity of wavelets in de-noising is largely due to the computationally efficient algorithms as well as to the sparsity of the wavelet representation of the data. By sparsity I mean that the majority of the wavelet coefficients have very small magnitudes, whereas only a small subset of coefficients have large magnitudes. I may informally state that this small subset contains the interesting, informative part of the signal, whereas the rest of the coefficients describe noise and can be discarded to give a noise-free reconstruction. The best known wavelet de-noising methods are thresholding approaches. In hard thresholding, all the coefficients with magnitudes greater than the threshold are retained unmodified, because they comprise the informative part of the data, while the rest of the coefficients are considered to represent noise and are set to zero. However, it is reasonable to assume that coefficients are not purely either noise or information but mixtures of both. To cope with this, soft thresholding approaches have been proposed: in soft thresholding, coefficients that are smaller than the threshold are set to zero, while the coefficients that are kept are shrunk towards zero by the threshold value, in order to decrease the effect of the noise assumed to corrupt all the wavelet coefficients. In my project I have chosen to do an eight-level decomposition before applying the de-noising algorithm; the decomposition levels of the different eight levels are obtained, because the signal of in
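A minimal sketch of the de-noising step with ddencmp and wdencmp is given below, assuming the Wavelet Toolbox; the variable names follow the ones used in this write-up, but the file name and arguments are illustrative rather than the project's exact code.

  [x1, Fs] = wavread('noiserecording.wav');            % corrupted recording
  [C, L]   = wavedec(x1, 8, 'db4');                    % eight-level decomposition
  [thr, sorh, keepapp] = ddencmp('den', 'wv', x1);     % default threshold, soft/hard flag, keep-approximation flag
  xd = wdencmp('gbl', C, L, 'db4', 8, thr, sorh, keepapp);   % global thresholding and reconstruction
  % wthresh shows the two thresholding rules directly:
  %   hard: coefficients above thr are kept unchanged, the rest are set to zero
  %   soft: small coefficients are set to zero, the rest are shrunk towards zero by thr
  Ch = wthresh(C, 'h', thr);
  Cs = wthresh(C, 's', thr);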