Ph.D. Defense by Sam Karimian-Azari

On September 27 2016, Sam Karimian-Azari will defend his Ph.D. thesis entitled “Fundamental Frequency and Direction-of-Arrival Estimation for Multichannel Speech Enhancement” at AD:MT, Aalborg University in Rendsburggade 14. The assessment committee consists of Assoc. Prof. Kamal Nasrollahi (chairman, AAU), Assoc. Prof. Roland Badeau (Télécom ParisTech), and Assoc. Prof. Augusto Sarti (Politecnico di Milano). He was supervised by Professor Mads Græsbøll Christensen and Assistant Prof. Jesper Rindom Jensen. A small reception will be held after the defense.

Abstract: Audio systems receive the speech signals of interest usually in the presence of noise. The noise has profound impacts on the quality and intelligibility of the speech signals, and it is therefore clear that the noisy signals must be cleaned up before being played back, stored, or analyzed. We can estimate the speech signal of interest from the noisy signals using a priori knowledge about it. A human speech signal is broadband and consists of both voiced and unvoiced parts. The voiced part is quasi-periodic with a time-varying fundamental frequency (or pitch as it is commonly referred to). We consider the periodic signals basically as the sum of harmonics. Therefore, we can pass the noisy signals through bandpass filters centered at the frequencies of the harmonics to enhance the signal. In addition, although the frequencies of the harmonics are the same across the channels of a microphone array, the multichannel periodic signals may have different phases due to the time-differences-of-arrivals (TDOAs) which are related to the direction-of-arrival (DOA) of the impinging sound waves. Hence, the outputs of the array can be steered to the direction of the signal of interest in order to align their time differences which eventually may further reduce the effects of noise. This thesis introduces a number of principles and methods to estimate periodic signals in noisy environments with application to multichannel speech enhancement. We propose model-based signal enhancement concerning the model of periodic signals. Therefore, the parameters of the model must be estimated in advance. The signal of interest is often contaminated by different types of noise that may render many estimation methods suboptimal due to an incorrect white Gaussian noise assumption. 
We therefore propose estimators that are robust against the noise and focus on statistical-based and filtering-based methods by imposing distortionless constraints with explicit relations between the parameters of the harmonics. The estimated fundamental frequencies are expected to be continuous over time. Therefore, we account for the time-varying fundamental frequency in the statistical methods in order to lessen the estimation error. We also propose a maximum likelihood DOA estimator that accounts for the noise statistics and the linear relationship between the TDOAs of the harmonics. The estimators have benefits compared to the state-of-the-art statistical-based methods in colored noise. Evaluations of the estimators against the minimum variance of the deterministic parameters and the other methods confirm that the proposed estimators are statistically efficient in colored noise and computationally simple. Finally, we propose model-based beamformers in multichannel speech signal enhancement by exploiting the estimated fundamental frequency and DOA of the signal of interest. This general framework is tailored to a number of beamformers concerning the spectral and spatial information of the periodic signals which are quasi-stationary in short intervals. Objective measures of speech quality and intelligibility confirm the advantage of the harmonic model-based beamformers over the traditional beamformers, which are non-parametric, and reveal the importance of an accurate estimate of the parameters of the model.

Keynote Talk at IWAENC 2016

At the 15th International Workshop on Acoustic Signal Enhancement (IWAENC 2016), held September 13-16 in Xi’an, China, Audio Analysis Lab member Prof. Mads Græsbøll Christensen gave a keynote talk about the lab’s work. The slides can be downloaded here. IWAENC is a leading workshop in the signal processing community addressing the problems of acoustic signal processing.

Title: Statistical Parametric Speech Processing

Abstract: Parametric speech models have been around for many years but have always had their detractors. Two common arguments against such models are that it is too difficult to find their parameters and that the models do not take the complicated nature of real signals into account. In recent years, significant advances have been made in speech models and robust estimation using statistical principles, and it has been demonstrated that, regardless of any deficiencies in the model, the parametric methods outperform the more commonly used non-parametric methods (e.g., autocorrelation-based methods) for problems like pitch estimation. In this talk, state-of-the-art parametric speech models and statistical estimators for finding their parameters will be presented and their pros and cons discussed. The merits of the statistical, parametric approach to speech modeling will be demonstrated by showing how otherwise complicated problems can be solved comparatively easily this way. Examples of such problems are pitch estimation for non-stationary speech, distortionless speech enhancement, noise statistics estimation, speech segmentation, multi-channel modeling, and model-based localization and beamforming with microphone arrays.

Two New Assistant Professors

We are happy to announce that as of September 1, there are two newly appointed Assistant Professors in the Audio Analysis Lab.

The first is Jesper Rindom Jensen, who was previously a postdoc with the lab. He has been with the lab since its inception and is a founding member. Prior to becoming assistant professor, he held an individual postdoc grant from the Danish Council for Independent Research for three years. Jesper Rindom Jensen has worked on various aspects of audio and acoustic signal problems, including single- and multi-channel noise reduction, beamforming, and localization and tracking with microphone arrays, and is appointed as Assistant Professor in Microphone Array Signal Processing.

The second is Jesper Kjær Nielsen, a long-time collaborator with the Audio Analysis Lab, who was previously with the Dept. of Electronic Systems. For the past few years, he has worked on industrial research projects with B&O. He is an expert in statistical methods for signal processing, having worked on a wide range of problems, including sinusoidal parameter estimation, interpolation and extrapolation, pitch estimation, and fast implementations. He joins the lab with an appointment as Assistant Professor in Statistical Signal Processing.

We congratulate them both on their appointments and welcome newcomer Jesper Kjær Nielsen to the lab!

Mads Græsbøll Christensen receives EURASIP Early Career Award

During the award ceremony at EUSIPCO 2016 in Budapest, Hungary, Mads Græsbøll Christensen of the Audio Analysis Lab received the EURASIP Early Career Award for significant contributions to statistical processing of audio and speech signals.

The EURASIP Early Career Award is given to an outstanding researcher and engineer working within the technical scope of EURASIP at an early or mid-stage of their career whose current work shows not only significant scientific achievements but also high potential to advance scientific knowledge through novel, timely and significant endeavors. The award is aimed at researchers under the age of forty. This was the first time the award was given.

EUSIPCO 2016

The Audio Analysis Lab was well represented at this year’s EUSIPCO, which was held in Budapest, Hungary, with the following presentations of papers:

  • Multi-Pitch Estimation of Audio Recordings Using a Codebook-Based Approach. Martin Weiss Hansen, Jesper Rindom Jensen and Mads Græsbøll Christensen (Aalborg University, Denmark)
  • Computational Analysis of a Fast Algorithm for High-order Sparse Linear Prediction. Tobias Lindstrøm Jensen (Aalborg University, Denmark); Daniele Giacobello (SONOS, Inc. USA); Toon van Waterschoot (KU Leuven, Belgium); Mads Græsbøll Christensen (Aalborg University, Denmark)
  • Ad Hoc Microphone Array Beamforming Using the Primal-Dual Method of Multipliers. Vincent Mohammad Tavakoli and Jesper Rindom Jensen (Aalborg University, Denmark); Richard Heusdens (Delft University of Technology, The Netherlands); Jacob Benesty (INRS-EMT, University of Quebec, Canada); Mads Græsbøll Christensen (Aalborg University, Denmark)
  • Semi-non-intrusive objective intelligibility measure using spatial filtering in hearing aids. Charlotte Sørensen (Aalborg University & GN Resound, Denmark); Jesper Bünsow Boldt (GN ReSound, Denmark); Fredrik Gran (GN Resound, Denmark); Mads Græsbøll Christensen (Aalborg University, Denmark)
  • Grid Size Selection for Nonlinear Least-Squares Optimization in Spectral Estimation and Array Processing. Jesper Kjær Nielsen (Aalborg University & Bang & Olufsen, Denmark); Tobias Lindstrøm Jensen, Jesper Rindom Jensen, Mads Græsbøll Christensen and Søren Holdt Jensen (Aalborg University, Denmark)

Audio Analysis Workshop 2016

On August 19 2016 the annual Audio Analysis Workshop was held. This year’s edition was co-sponsored by the Audio Analysis Lab’s projects funded by the Danish Council for Independent Research and the Villum Foundation. It featured 14 scientific talks and 2 keynote talks with 18 participants from Lund University, Aalborg University, Delft University of Technology, GN Resound, and Aston University. The two keynote talks were on the topic of Parkinson’s disease, how it affects the voice, and how it can be detected from the voice. The first keynote talk, entitled “Braak’s hypothesis and its impact on research and treatment in Parkinson’s disease”, was given by neurologist Lorenz Oppel, Aalborg University Hospital. The second keynote talk was given by Dr. Max Little, Aston University, and was entitled “Algorithms for feature extraction in voice-based analysis of Parkinson’s disease”. In his talk, Max gave an overview of his many years of research on the topic. The scientific talks were on varied topics, including fast implementations, microphone arrays, music analysis, measurement of speech intelligibility, multi-pitch estimation, sparse approximations, classification of height, weight, and other characteristics from speech, room geometry estimation, and speech enhancement.

Inaugural Lecture by Prof. Mads Græsbøll Christensen

On June 2 2016 Professor Mads Græsbøll Christensen of the Audio Analysis Lab gave his inaugural lecture, entitled “Statistical Parametric Speech Processing: Solving Problems with the Model-Based Approach”, at AD:MT, Aalborg University. Below you can see a video recording of the lecture.

Presentations at ICASSP

Next week, Audio Analysis Lab members will present a number of papers at this year’s installment of ICASSP, which will be held in Shanghai, China:

  • EXPERIMENTAL STUDY OF GENERALIZED SUBSPACE FILTERS FOR THE COCKTAIL PARTY SITUATION
  • KALMAN FILTER FOR SPEECH ENHANCEMENT IN COCKTAIL PARTY SCENARIOS USING A CODEBOOK-BASED APPROACH
  • FAST AND STATISTICALLY EFFICIENT FUNDAMENTAL FREQUENCY ESTIMATION
  • DOA ESTIMATION OF AUDIO SOURCES IN REVERBERANT ENVIRONMENTS
  • A PARTITIONED APPROACH TO SIGNAL SEPARATION WITH MICROPHONE AD HOC ARRAYS
  • VARIABLE SPAN FILTERS FOR SPEECH ENHANCEMENT

Open postdoc position in Parkinson’s project

We are currently looking for candidates for the open position as postdoctoral researcher in our project entitled Signal Processing for Diagnosis of Parkinson’s Disease from Noisy Speech.

Many people are affected by Parkinson’s disease (PD) in some way. There currently exists no cure, there are no known biomarkers that can be used for diagnosis, and the number of people with PD is expected to rise dramatically in the near future. The project aims at finding accurate and robust signal processing methods for analyzing natural, noisy speech for early diagnosis and monitoring of the progression of PD based on parametric models. The project is an international collaboration between Aalborg University, MIT, University of Colorado-Boulder, the Parkinson’s Voice Initiative, and Lund University. The project is funded by the Danish Council for Independent Research | Technology and Production Science which has granted 6.5 million DKK for the project.

You can read more about the project here. We are looking for recent graduates with a Ph.D. in signal processing, speech processing, machine learning, or a related field. If you are interested in the position, please contact Prof. Mads Græsbøll Christensen by sending an email to mgc@create.aau.dk. We will post an official call for applications once we know we have qualified candidates.

Ph.D. Defense by Sidsel Marie Nørholm

On March 4 2016, Sidsel Marie Nørholm of the Audio Analysis Lab will defend her Ph.D. thesis entitled “Enhancement of Speech – with a Focus on Voiced Speech Models”.

Thesis abstract: The thesis deals with speech enhancement, i.e., noise reduction in speech signals. This has application in, e.g., hearing aids and teleconference systems. We consider a signal-driven approach to speech enhancement where a model of the speech is assumed and filters are generated based on this model. The basic model used in this thesis is the harmonic model which is a commonly used model for describing the voiced speech part of the speech signal. We show that it can be beneficial to extend the model to take inharmonicities or the non-stationarity of speech into account. Extending the model introduces extra parameters and we suggest methods to estimate these extra parameters and derive filters based on the extended models.

The Ph.D. assessment committee is

  • Professor Stefania Serafin (Chairman), Aalborg University
  • Professor Ioannis (Yannis) Stylianou, University of Crete
  • Reader Wenwu Wang, University of Surrey

The work was supervised by Jesper Rindom Jensen and Mads Græsbøll Christensen.

The defense will take place at 1 pm at Rendsburggade 14, room 3.429, 9000 Aalborg. The Department of Architecture, Design and Media Technology will host a reception after the defense.