Title:

Improved microphone array design with statistical speaker identification methods

Personal author:

Demir, Kadir Erdem, author.

Production info:

[s.l. : s.n.], 2016.

Physical description:

xi, 48 leaves : illustrations ; 30 cm + 1 CD-ROM.

General note:

Date of approval: 2016.

Summary:

Abstract: Conventional microphone array implementations aim to lock onto a source at a given location and, if required, track it. This is straightforward when the locations or paths of the source and the interference are provided. Detecting the intended source becomes a challenge when multiple unknown sources exist in the same environment. The performance of speaker identification degrades drastically when the speech signal is severely distorted by additive noise and reverberation. In such environments, microphone arrays are often used to improve the quality of captured speech signals. Both microphone arrays and speaker identification are mature fields. The advances of these two distinct fields can be combined into one system that maximizes gain on the intended speaker, which is the topic of this thesis. We utilize microphone array methods to improve the accuracy of speaker identification in a cocktail party environment. When the source and interferences are localized, the microphone array can be tuned to further reduce noise and increase the gain. In this thesis we developed a robust simulation environment to demonstrate the proposed improved microphone array design with statistical speaker identification. This is an open source implementation in which users can place speakers anywhere in the room. We proposed two features for speaker identification: a fusion-based feature and a computationally efficient N-Gram feature. We demonstrated that the proposed features, and the algorithm that leverages the synergy of microphone array processing and speaker identification methods, outperform conventional algorithms.

Özet (Turkish abstract, translated): The gain of a microphone array can be increased by enlarging the array, but adding sensors to increase gain is very costly. For this reason, even when enough space is available in the environment, increasing gain by raising algorithmic complexity is preferred. In spectral array processing methods, knowing the positions of the target speaker and of the noise is a great advantage. Conventional methods try to solve this problem with non-statistical approaches. In addition, the performance of speaker identification methods decreases in environments with high noise levels. In such environments, using microphone arrays improves the quality of the speech signal. For these reasons, microphone arrays and speaker identification methods benefit each other. In this work, the microphone array system and the speaker identification system are designed as parts of a single system. The microphone array is used to increase the accuracy of the speaker identification system, while the results of the speaker identification system are used to increase the gain of the microphone array. For the speaker identification implementation, Fusion and N-Gram fundamental frequency methods are proposed. To demonstrate the improved microphone array design, a simulation environment was developed in which speakers can be placed anywhere in the room. Experiments in the simulation environment showed that the proposed methods outperform conventional methods.

Subject term:

Signal processing--Digital techniques.

Signal processing--Statistical methods.

Adaptive signal processing.

Signal processing--Digital techniques--Data processing.

Signal processing--Mathematical models.

Dissertations, Academic.

Added author: