Demonstration on HRTF Interpolation from Spatially Sparse Measurements Using Autoencoder with Source Position Conditioning

Yuki Ito, Tomohiko Nakamura, Shoichi Koyama, and Hiroshi Saruwatari (The University of Tokyo)

On this demo page, you can listen to binaural signals synthesized with head-related transfer functions (HRTFs) estimated from 9 nearly uniformly sampled measurement positions by our proposed method [1] and by a regularized linear regression (RLR)-based method [2].
The HRTFs on this page belong to one of the listeners in the HUTUBS dataset [3][4], and the source signal is from the MUSDB18-HQ dataset [5].
For a better experience, listen with headphones or earphones.
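
For readers unfamiliar with binaural synthesis, the following is a minimal sketch of one common way to produce such signals: convolving a mono source with the left-ear and right-ear head-related impulse responses (HRIRs, the time-domain counterparts of HRTFs) for a given source position. This page does not describe its rendering pipeline, so the approach, the function name `render_binaural`, and the toy signals are all assumptions for illustration, not the authors' implementation.

```python
import numpy as np


def render_binaural(source, hrir_left, hrir_right):
    """Convolve a mono source with an HRIR pair (hypothetical helper).

    Both HRIRs must have the same length. Returns an array of shape
    (len(source) + len(hrir) - 1, 2) holding the left and right channels.
    """
    source = np.asarray(source, dtype=float)
    left = np.convolve(source, hrir_left)    # full linear convolution
    right = np.convolve(source, hrir_right)
    return np.stack([left, right], axis=1)


# Toy data standing in for a real source and measured HRIRs (assumed values).
fs = 44100                                   # sampling rate in Hz
rng = np.random.default_rng(0)
source = rng.standard_normal(fs)             # 1 s of white noise

hrir_left = np.zeros(256)
hrir_left[10] = 1.0                          # earlier, louder arrival at the left ear
hrir_right = np.zeros(256)
hrir_right[40] = 0.5                         # later, quieter arrival at the right ear

binaural = render_binaural(source, hrir_left, hrir_right)
```

With measured or estimated HRIRs in place of the toy impulses, `binaural` can be written to a stereo audio file and played over headphones.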


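Source Position | Ground Truth | RLR-based method [2] | Proposed [1]
azimuth: 0 deg, elevation: 0 deg | (audio) | (audio) | (audio)
azimuth: 0 deg, elevation: 30 deg | (audio) | (audio) | (audio)
azimuth: 0 deg, elevation: 50 deg | (audio) | (audio) | (audio)
azimuth: 60 deg, elevation: 0 deg | (audio) | (audio) | (audio)
azimuth: 60 deg, elevation: 30 deg | (audio) | (audio) | (audio)
azimuth: 60 deg, elevation: 50 deg | (audio) | (audio) | (audio)
azimuth: 90 deg, elevation: 0 deg | (audio) | (audio) | (audio)
azimuth: 90 deg, elevation: 30 deg | (audio) | (audio) | (audio)
azimuth: 90 deg, elevation: 50 deg | (audio) | (audio) | (audio)
azimuth: 180 deg, elevation: 0 deg | (audio) | (audio) | (audio)
azimuth: 180 deg, elevation: 30 deg | (audio) | (audio) | (audio)
azimuth: 180 deg, elevation: 50 deg | (audio) | (audio) | (audio)
azimuth: 270 deg, elevation: 0 deg | (audio) | (audio) | (audio)
azimuth: 270 deg, elevation: 30 deg | (audio) | (audio) | (audio)
azimuth: 270 deg, elevation: 50 deg | (audio) | (audio) | (audio)
azimuth: 300 deg, elevation: 0 deg | (audio) | (audio) | (audio)
azimuth: 300 deg, elevation: 30 deg | (audio) | (audio) | (audio)
azimuth: 300 deg, elevation: 50 deg | (audio) | (audio) | (audio)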

References

[1] Yuki Ito, Tomohiko Nakamura, Shoichi Koyama, and Hiroshi Saruwatari, “Head-Related Transfer Function Interpolation from Spatially Sparse Measurements Using Autoencoder with Source Position Conditioning,” in Proc. International Workshop on Acoustic Signal Enhancement (IWAENC), Sep. 2022. (to appear)
[2] Ramani Duraiswami, Dmitry N. Zotkin, and Nail A. Gumerov, “Interpolation and range extrapolation of HRTFs [head related transfer functions],” in Proc. IEEE Int. Conf. Acoust., Speech, Signal Process. (ICASSP), 2004, vol. 4, pp. 45–48.
[3] Fabian Brinkmann, Manoj Dinakaran, Robert Pelzer, Peter Grosche, Daniel Voss, and Stefan Weinzierl, “A cross-evaluated database of measured and simulated HRTFs including 3D head meshes, anthropometric features, and headphone impulse responses,” J. Audio Eng. Soc., vol. 67, no. 9, pp. 705–718, 2019.
[4] Fabian Brinkmann, Manoj Dinakaran, Robert Pelzer, Jan Joschka Wohlgemuth, Fabian Seipel, Daniel Voss, Peter Grosche, and Stefan Weinzierl, “The HUTUBS head-related transfer function (HRTF) database,” 2019, url: http://dx.doi.org/10.14279/depositonce-8487 (accessed May 6, 2022).
[5] Zafar Rafii, Antoine Liutkus, Fabian-Robert Stöter, Stylianos Ioannis Mimilakis, and Rachel Bittner, “MUSDB18-HQ - an uncompressed version of MUSDB18,” 2019, url: https://doi.org/10.5281/zenodo.3338373 (accessed Aug. 30, 2022).