Multimodal Diarization Challenge


Eduardo Lleida, Alfonso Ortega, Antonio Miguel (UZ)
Virginia Bazán, Carmen Pérez, Alberto de Prada (RTVE)


The IBERSPEECH-RTVE Multimodal Diarization challenge is a new addition to the ALBAYZIN evaluation series. It is supported by the Spanish Thematic Network on Speech Technology (RTTH) and the Cátedra RTVE en la Universidad de Zaragoza, and is organized by Vivolab (Universidad de Zaragoza).

The multimodal diarization evaluation consists of segmenting audiovisual documents according to the different speakers and faces, and linking the segments that originate from the same speaker and face. For this evaluation, a list of characters to recognize will be given, together with a set of face pictures for each character. All other characters in the audiovisual document must be labelled as “unknown”. For each segment, the system output must state who is speaking and who appears in the image, using either names from the list or “unknown”.

For this evaluation, six hours of material from two different RTVE shows, with labelled faces and speakers, will be used. The shows range from studio productions to live broadcasts. Two hours will be used for development and four hours for testing. The data is available to the evaluation participants only and is subject to the terms of a license agreement with RTVE. The license agreement can be downloaded from the Cátedra RTVE-UZ web page.

System outputs will be scored in terms of Diarization Error Rate (DER). The DER will be computed using the scoring tool included in the NIST sctk-2.4.10 toolkit. The system output will be a file in RTTM format in which the speaker name field is the concatenation of the name of the speaker and the name(s) of whoever appears in the image.
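As an illustration only, the output format and the metric described above can be sketched in Python. This is a minimal sketch under several assumptions: the "+" separator used to concatenate names, the file identifier, the character name and the helper names `joint_label`, `rttm_line` and `der` are all hypothetical, and the record layout is the commonly used 10-field RTTM SPEAKER line. The evaluation plan remains the authoritative reference for the exact naming convention. The `der` function implements the standard definition of DER (missed, false alarm and confusion time divided by total reference time); it does not replace the official NIST scoring tool.

```python
def joint_label(speaker, faces, sep="+"):
    """Concatenate the speaker name with the names of the faces on screen.

    The separator is an assumption made for this sketch; names must not
    contain spaces because RTTM fields are whitespace-delimited.
    """
    return sep.join([speaker, *sorted(faces)])


def rttm_line(file_id, onset, duration, label, channel=1):
    """Format one SPEAKER record using the common 10-field RTTM layout."""
    return (
        f"SPEAKER {file_id} {channel} {onset:.3f} {duration:.3f} "
        f"<NA> <NA> {label} <NA> <NA>"
    )


def der(missed, false_alarm, confusion, total_reference):
    """Diarization Error Rate: total error time over total reference time.

    All arguments are durations in seconds.
    """
    if total_reference <= 0:
        raise ValueError("total_reference must be positive")
    return (missed + false_alarm + confusion) / total_reference


if __name__ == "__main__":
    # Hypothetical segment: an unlisted speaker talks while a listed
    # character (placeholder name "Character_A") is on screen.
    label = joint_label("unknown", ["Character_A"])
    print(rttm_line("show_0001", 12.5, 3.2, label))
    # Hypothetical error durations, in seconds, for a 120-second reference.
    print(f"DER = {der(6.0, 3.0, 9.0, 120.0):.1%}")
```

Sorting the face names before joining them keeps the label stable when the same set of characters is detected in a different order, which is a design choice of this sketch rather than a requirement stated by the organizers.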

More details will be given in the evaluation plan.


June 18, 2018: Release of the evaluation plan, training and development data
July 15, 2018: Registration deadline
September 24, 2018: Release of evaluation data
October 21, 2018: Deadline for the submission of system outputs and description papers
October 31, 2018: Results distributed to participants
November 21-23, 2018: Evaluation Workshop at Iberspeech 2018



Interested groups must register for the evaluation before July 15th 2018 by contacting the organizing team, with CC to the ALBAYZIN 2018 Evaluations Organising Committee. The message should contain the following information:

Research group (name and acronym)
Institution (university, research center, company, …)
Contact person (name)

To download the RTVE data, you will need to sign the data license agreement and return it to the IberSpeech-RTVE team.