KunquDB

a large-scale, well-annotated audio-visual Kunqu Opera dataset

About

KunquDB

  • A large-scale, well-annotated audio-visual dataset
  • Comprises 339 speakers and 128 hours of content
  • Sourced from the Kunqu Opera Art Canon (Kunqu yishu dadian)
  • Structured by dialogue lines, with explicit annotations: character names, speaker names, gender, vocal manner classifications, and preliminary text transcriptions

Kunqu yishu dadian

The Kunqu Opera Art Canon encompasses the most significant literary, musical, and audiovisual materials from over 600 years of Kunqu Opera, embodying the essence of the art form. Specifically, it contains over 22.3 million words of textual documentation, 396 sets of reprinted documents totaling over 70,000 pages, 127 hours of recorded audio, over 400 hours of video recordings, and over 6,000 images, all compiled into 149 volumes. For more details about this book, visit the official website of its publisher or see its introduction on Douban.

Note

After purchasing the book, we negotiated with the publisher and obtained authorization to use it in Kunqu Opera research. The publisher explicitly stated that the book's digital resources may be used solely for scholarly or research purposes, and only with the publisher's approval. They may not be illegally disseminated or used for commercial purposes.

[Image: Kunqu yishu dadian]

Statistics

[Figures: Role Type, Vocal Manner, Duration]

Annotation Data Format

  • Data is structured by dialogue line, with metadata stored in a CSV table
  • The table contains all labels and annotations for each utterance
  • Information for each utterance includes:
    • video ID indicates the corresponding video;
    • start and end timestamps specify the location of the utterance within the video;
    • character name denotes the character portrayed in the video, often associated with the role type, for the corresponding utterance;
    • performer name represents the individual performer portraying the character for the corresponding utterance;
    • vocal manner type categorizes the utterance into “stage speech” or “singing”, depending on how it’s vocalized;
    • preliminary content transcription corresponds to the transcription of the spoken content within the utterance.
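The fields listed above can be read with Python's standard `csv` module. The following is a minimal sketch, not the project's own loader: the column names (`video_id`, `start`, `end`, `character`, `performer`, `vocal_manner`, `text`), the `MM:SS` timestamp format, and the `demo_01` video ID are all assumptions made for illustration, and the real KunquDB CSV header may differ. The sample rows come from the public examples on this page, not from KunquDB itself.

```python
import csv
import io
from dataclasses import dataclass

# Hypothetical column names; the real KunquDB CSV header may differ.
FIELDS = ["video_id", "start", "end", "character",
          "performer", "vocal_manner", "text"]


@dataclass
class Utterance:
    video_id: str
    start: str          # assumed "MM:SS" string
    end: str            # assumed "MM:SS" string
    character: str
    performer: str
    vocal_manner: str   # "stage speech" or "singing"
    text: str


def load_utterances(handle):
    """Read one Utterance per CSV row from an open text file."""
    reader = csv.DictReader(handle)
    return [Utterance(**{name: row[name].strip() for name in FIELDS})
            for row in reader]


# Illustrative rows only; "demo_01" is a made-up video ID.
SAMPLE_CSV = """video_id,start,end,character,performer,vocal_manner,text
demo_01,00:11,00:47,杜丽娘,单雯,singing,原来姹紫嫣红开遍
demo_01,02:25,02:32,柳梦梅,施夏明,stage speech,姐姐你身子乏了
"""

utterances = load_utterances(io.StringIO(SAMPLE_CSV))
singing = [u for u in utterances if u.vocal_manner == "singing"]
print(len(utterances), len(singing))
```

With a real file, `load_utterances(open(path, encoding="utf-8", newline=""))` would take the place of the in-memory sample, where `path` points to the annotation CSV obtained from the authors.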

Here’s a screenshot of the annotation:

[Image: annotation screenshot]

Annotation Data Examples

Due to data restrictions, we do not publish samples from the KunquDB dataset itself. Instead, we show similar annotations for Kunqu Opera videos available online.

| Video name | Start time | End time | Character name | Performer name | Vocal manner | Text transcription |
| --- | --- | --- | --- | --- | --- | --- |
| 牡丹亭 (Peony Pavilion) | 00:11 | 00:47 | 杜丽娘 (Liniang Du) | 单雯 (Wen Shan) | Singing | 原来姹紫嫣红开遍 (A riot of deep purple and bright red.) |
| 牡丹亭 (Peony Pavilion) | 01:12 | 01:30 | 杜丽娘 (Liniang Du) | 单雯 (Wen Shan) | Singing | 良辰美景奈何天 (Why does Heaven give us brilliant day and dazzling sight?) |
| 牡丹亭 (Peony Pavilion) | 00:30 | 00:50 | 柳梦梅 (Mengmei Liu) | 施夏明 (Xiaming Shi) | Singing | 则把云鬟点 红松翠偏 (Let me rearrange your tresses in disarray.) |
| 牡丹亭 (Peony Pavilion) | 02:25 | 02:32 | 柳梦梅 (Mengmei Liu) | 施夏明 (Xiaming Shi) | Stage speech | 姐姐你身子乏了 (Fair maiden, you are tired.) |
| 牧羊记 (Shepherd's Notes) | 00:26 | 01:20 | 李陵 (Ling Li) | 施夏明 (Xiaming Shi) | Singing | 到虏庭与哥哥报冤 (To the enemy's court, I'll seek justice for my brother.) |
| 牧羊记 (Shepherd's Notes) | 05:35 | 05:38 | 苏武 (Wu Su) | 柯军 (Jun Ke) | Stage speech | 竟在此享荣华受富贵 (Here you unexpectedly enjoy wealth and honor.) |
| 牧羊记 (Shepherd's Notes) | 11:00 | 11:06 | 苏武 (Wu Su) | 柯军 (Jun Ke) | Singing | 我的忠心铁石样坚 (My loyalty is as firm as iron and stone.) |
| 牡丹亭 (Peony Pavilion) | 03:03 | 03:21 | 杜丽娘 (Liniang Du) | 单雯 (Wen Shan) | Singing | 乱煞年光遍 (Chaos reigns, time passes in chaos.) |
| 牡丹亭 (Peony Pavilion) | 04:42 | 04:55 | 杜丽娘 (Liniang Du) | 单雯 (Wen Shan) | Stage speech | 剪不断理还乱闷无端 (Endless twists and turns, cutting without end.) |
| 牡丹亭 (Peony Pavilion) | 03:57 | 04:16 | 春香 (Chunxiang) | 陶一春 (Yichun Tao) | Singing | 恁今春关情似去年 (The affairs of this spring seem similar to those of last year.) |
| 牡丹亭 (Peony Pavilion) | 04:36 | 04:41 | 春香 (Chunxiang) | 陶一春 (Yichun Tao) | Stage speech | 小姐 你侧着宜春髻子恰凭栏 (Miss, leaning on the railing with your Yichun hairpin askew.) |
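Because each utterance carries start and end timestamps, its duration follows by subtraction. The sketch below assumes the `MM:SS` format shown in the examples above (the format used in the actual annotation files is not stated here) and sums the duration per vocal manner for four of the example rows. The helper name `to_seconds` is illustrative.

```python
from collections import defaultdict


def to_seconds(stamp):
    """Convert an "MM:SS" or "HH:MM:SS" string to whole seconds."""
    seconds = 0
    for part in stamp.split(":"):
        seconds = seconds * 60 + int(part)
    return seconds


# (start, end, vocal manner) for four of the example rows above.
rows = [
    ("00:11", "00:47", "Singing"),
    ("02:25", "02:32", "Stage speech"),
    ("05:35", "05:38", "Stage speech"),
    ("11:00", "11:06", "Singing"),
]

# Accumulate total seconds per vocal manner.
totals = defaultdict(int)
for start, end, manner in rows:
    totals[manner] += to_seconds(end) - to_seconds(start)

print(dict(totals))  # {'Singing': 42, 'Stage speech': 10}
```

The same start and end values, once converted to seconds, could also serve as cut points when extracting an utterance from its source video.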

Citation

If our work is useful for your research, please consider citing:

@article{zhou2024kunqudb,
  title={KunquDB: An Attempt for Speaker Verification in the Chinese Opera Scenario},
  author={Zhou, Huali and Lin, Yuke and Liu, Dong and Li, Ming},
  journal={arXiv preprint arXiv:2403.13356},
  year={2024}
}

Download

Researchers can gain access to the source video data by purchasing Kunqu yishu dadian. It is the user’s responsibility to obtain the publisher's approval to conduct non-commercial research. We provide only our annotation dataset and processing scripts. To obtain the annotation dataset, please contact us by email at huali.zhou@dukekunshan.edu.cn or ming.li369@dukekunshan.edu.cn, and include your affiliation and the publisher's consent.

License

The dataset is licensed under CC BY-NC-SA 4.0. This means that you can share and adapt the dataset for non-commercial purposes as long as you provide appropriate attribution and distribute your contributions under the same license. Detailed terms can be found in LICENSE.

Acknowledgment

This research is funded by the Kunshan Municipal Government Research Funding under the project "Deep Learning based Singing Voice Synthesis for Kun Opera". We want to thank the publisher for allowing us to conduct research on their data and DKU library staff members for their coordination. Special thanks to Xiaoyi Qin for his assistance.