SONG CONTINUATION GENERATION TECHNOLOGY BASED ON TEXT GENERATION STRATEGIES, TEXT MINING AND LANGUAGE MODEL T5

Authors

  • O. Mediakov, Lviv Polytechnic National University, Lviv, Ukraine
  • V. Vysotska, Lviv Polytechnic National University, Lviv, Ukraine

DOI:

https://doi.org/10.15588/1607-3274-2023-4-15

Keywords:

text generation, T5 language model, Transformers, author’s style, contrastive search, top-p sampling, top-k sampling, multinomial sampling, beam search, diverse beam search, greedy search, beam-search multinomial sampling

Abstract

Context. Pre-trained large language models are currently the driving force behind the development not only of NLP but of deep learning systems in general. Transformer models are able to solve virtually all currently existing NLP problems, provided that certain requirements and training practices are met. Words, sentences, and texts, in turn, are the basic and most important means of communication between intelligent beings, used to convey emotions, events, and so on. One of the main ways language is used to describe experienced emotions is the song with lyrics. However, the need to preserve rhyme, the meter of verse lines, the song structure, and so on often forces artists to repeat lines in the lyrics. In addition, the process of writing lyrics can be long.

The objective of the study is to develop an information technology for generating continuations of song lyrics based on the T5 machine learning model, with (SA, specific author) and without (NSA, non-specific author) consideration of the author's style.
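The abstract does not disclose the exact input format used to condition generation on an author, so the following minimal Python sketch shows one plausible scheme: the SA variant prepends an author tag to the T5 encoder input, while the NSA variant omits it. The `author:`/`lyrics:` prefixes and the `t5-base` checkpoint are assumptions for illustration only.

```python
from typing import Optional

from transformers import T5Tokenizer, T5ForConditionalGeneration

# Placeholder checkpoint; the paper fine-tunes its own T5 model.
tokenizer = T5Tokenizer.from_pretrained("t5-base")
model = T5ForConditionalGeneration.from_pretrained("t5-base")

def build_input(prompt_lines: str, author: Optional[str] = None) -> str:
    """Form the encoder input: prepend an author tag for the SA model,
    omit it for the NSA model (the prefix scheme is an assumption)."""
    if author is not None:  # SA: specific author
        return f"author: {author} lyrics: {prompt_lines}"
    return f"lyrics: {prompt_lines}"  # NSA: non-specific author

text = build_input("Long story short, it was a bad time", author="Taylor Swift")
input_ids = tokenizer(text, return_tensors="pt").input_ids
```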

Method. Choosing a decoding strategy is important for the generation process. However, instead of favoring a particular strategy, the system supports multiple strategies, eight in particular: Contrastive search, Top-p sampling, Top-k sampling, Multinomial sampling, Beam search, Diverse beam search, Greedy search, and Beam-search multinomial sampling (see the sketch below).
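As an illustration only (the hyperparameter values are placeholders, not the authors' configuration), all eight strategies can be selected through the arguments of `generate()` in the Hugging Face transformers library:

```python
from transformers import T5Tokenizer, T5ForConditionalGeneration

tokenizer = T5Tokenizer.from_pretrained("t5-base")  # placeholder checkpoint
model = T5ForConditionalGeneration.from_pretrained("t5-base")
input_ids = tokenizer("lyrics: Long story short, it was a bad time",
                      return_tensors="pt").input_ids

strategies = {
    # Greedy search: the default when no sampling or beam options are set.
    "greedy search": dict(),
    # Contrastive search: degeneration penalty alpha plus a small top-k.
    "contrastive search": dict(penalty_alpha=0.6, top_k=4),
    # Multinomial sampling over the full next-token distribution.
    "multinomial sampling": dict(do_sample=True, top_k=0),
    # Top-k sampling: sample only among the k most probable tokens.
    "top-k sampling": dict(do_sample=True, top_k=50),
    # Top-p (nucleus) sampling: sample from the smallest set with mass >= p.
    "top-p sampling": dict(do_sample=True, top_p=0.92, top_k=0),
    # Beam search: keep the num_beams best partial hypotheses.
    "beam search": dict(num_beams=5),
    # Diverse beam search: beam groups penalized for repeating each other.
    "diverse beam search": dict(num_beams=6, num_beam_groups=3,
                                diversity_penalty=1.0),
    # Beam-search multinomial sampling: sampling inside each beam.
    "beam-search multinomial": dict(do_sample=True, num_beams=5),
}

for name, kwargs in strategies.items():
    output = model.generate(input_ids, max_new_tokens=40, **kwargs)
    print(name, "->", tokenizer.decode(output[0], skip_special_tokens=True))
```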

Results. A machine learning model was developed to generate continuations of song lyrics using large language models, in particular the T5 model, in order to accelerate, complement, and add flexibility to the songwriting process.

Conclusions. The created model shows excellent results in generating continuations of song lyrics on test data. Analysis of the raw data showed that the NSA model degrades less, while the SA model requires the amount of text per author to be balanced. Several text metrics, such as BLEU, ROUGE-L, and ROUGE-N, are calculated to compare the results of the models and generation strategies quantitatively. The BLEU metric is the most variable, and its value differs significantly depending on the strategy, while the ROUGE metrics show less variability and a smaller range of values. For comparison, 8 different decoding methods for text generation supported by the transformers library were used. Across all text comparisons, the metrically best method of song-lyric generation is beam search and its variations, in particular beam sampling. Contrastive search usually outperformed the conventional greedy approach. The top-p and top-k methods are not clearly superior to each other and gave different results in different situations.
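For reference, a minimal sketch of how BLEU and ROUGE scores of this kind can be computed with the Hugging Face evaluate library; the library choice and the toy strings are assumptions, not the paper's evaluation pipeline.

```python
import evaluate  # pip install evaluate rouge_score

bleu = evaluate.load("bleu")
rouge = evaluate.load("rouge")

# Toy example strings (assumed, not from the paper's test set):
# generated continuations vs. reference lyric lines.
predictions = ["and now the storm is gone and I am fine"]
references = [["now that the storm is gone and I'm alright"]]

bleu_score = bleu.compute(predictions=predictions, references=references)
rouge_scores = rouge.compute(predictions=predictions, references=references)

print(bleu_score["bleu"])          # corpus-level BLEU
print(rouge_scores["rouge1"],      # ROUGE-N with N = 1
      rouge_scores["rougeL"])      # ROUGE-L (longest common subsequence)
```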

Author Biographies

O. Mediakov, Lviv Polytechnic National University, Lviv, Ukraine

Postgraduate student of the Information Systems and Networks Department

V. Vysotska, Lviv Polytechnic National University, Lviv, Ukraine

PhD, Associate Professor of the Information Systems and Networks Department

References

Vaswani A., Shazeer N., Parmar N., Uszkoreit J., Jones L., Gomez A. N., Kaiser Ł., Polosukhin I. Attention Is All You Need, arXiv. Access mode: https://arxiv.org/abs/1706.03762

Raffel C., Shazeer N., Roberts A., Lee K., Narang S., Matena M., Zhou Y., Li W., Liu P. J. Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer, arXiv. Access mode: https://arxiv.org/abs/1910.10683

Hugging Face community. Text generation strategies. Access mode: https://huggingface.co/docs/transformers/v4.29.0/en/generation_strategies

von Platen P. How to generate text: using different decoding methods for language generation with Transformers. Access mode: https://huggingface.co/blog/how-to-generate

Hugging Face community. T5. Access mode: https://huggingface.co/docs/transformers/model_doc/t5

Vijayakumar A. K., Cogswell M., Selvaraju R. R., Sun Q., Lee S., Crandall D., Batra D. Diverse Beam Search: Decoding Diverse Solutions from Neural Sequence Models, arXiv. Access mode: https://arxiv.org/abs/1610.02424

Hugging Face community. T5v1.1. Access mode: https://huggingface.co/docs/transformers/model_doc/t5v1.1

Lacoste A., Luccioni A., Schmidt V., Dandres T. Quantifying the Carbon Emissions of Machine Learning, arXiv. Access mode: https://arxiv.org/abs/1910.09700

Google. SentencePiece. Access mode: https://github.com/google/sentencepiece

Sennrich R. Subword Neural Machine Translation. Access mode: https://github.com/rsennrich/subword-nmt

Wu Y., Schuster M., Chen Z., Le Q. V., Norouzi M., Macherey W., Krikun M., Cao Y., Gao Q., Macherey K., Klingner J., Shah A., Johnson M., Liu X., Kaiser Ł., Gouws S., Kato Y., Kudo T., Kazawa H., Stevens K., Kurian G., Patil N., Wang W., Young C., Smith J., Riesa J., Rudnick A., Vinyals O., Corrado G., Hughes M., Dean J. Google’s Neural Machine Translation System: Bridging the Gap between Human and Machine Translation, arXiv. Access mode: https://arxiv.org/abs/1609.08144

Hugging Face community. Transformers. State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX. Access mode: https://huggingface.co/docs/transformers/index

Abadi M., Agarwal A., Barham P., Brevdo E., Chen Z., Citro C., Corrado G. S., Davis A., Dean J., Devin M., Ghemawat S., Goodfellow I., Harp A., Irving G., Isard M., Jia Y., Jozefowicz R., Kaiser L., Kudlur M., Levenberg J., Mane D., Monga R., Moore S., Murray D., Olah C., Schuster M., Shlens J., Steiner B., Sutskever I., Talwar K., Tucker P., Vanhoucke V., Vasudevan V., Viegas F., Vinyals O., Warden P., Wattenberg M., Wicke M., Yu Y., Zheng X. TensorFlow: Large-Scale Machine Learning on Heterogeneous Systems, arXiv. Access mode: https://arxiv.org/abs/1603.04467

Shah D. Song Lyrics Dataset, Kaggle. Access mode: https://www.kaggle.com/datasets/deepshah16/song-lyrics-dataset

Swift T., Dessner A. Long story short, Genius. Taylor Swift Music. Access mode: https://genius.com/Taylor-swift-long-story-short-lyrics

Fan A., Lewis M., Dauphin Y. Hierarchical Neural Story Generation, Association for Computational Linguistics : 56th Annual Meeting, Melbourne, Australia, July 2018 : proceedings. Melbourne, ACL, 2018, pp. 889–898. DOI: 10.18653/v1/p18-1082

Chiang T.-R., Chen Y.-N. Relating Neural Text Degeneration to Exposure Bias, Analyzing and Interpreting Neural Networks for NLP : the Fourth BlackboxNLP Workshop, Punta Cana, Dominican Republic, November 2021 : proceedings. Punta Cana, ACL, 2021, pp. 228–239. DOI: 10.18653/v1/2021.blackboxnlp-1.16

Su Y., Lan T., Wang Y., Yogatama D., Kong L., Collier N. A Contrastive Framework for Neural Text Generation, arXiv. Access mode: https://arxiv.org/abs/2202.06417

Paulus R., Xiong C., Socher R. A Deep Reinforced Model for Abstractive Summarization, arXiv. Access mode: https://arxiv.org/abs/1705.04304

Klein G., Kim Y., Deng Y., Senellart J., Rush A. OpenNMT: Open-Source Toolkit for Neural Machine Translation, System Demonstrations : Association for Computational Linguistics, Vancouver, Canada, July 2017 : proceedings. Vancouver, ACL, 2017, pp. 67–72. DOI: 10.18653/v1/p17-4012

Murray K., Chiang D. Correcting Length Bias in Neural Machine Translation, arXiv. Access mode: https://arxiv.org/abs/1808.10006

Mathur N., Baldwin T., Cohn T. Tangled up in BLEU: Reevaluating the Evaluation of Automatic Machine Translation Evaluation Metrics, Association for Computational Linguistics : 58th Annual Meeting, Online, July 2020 : proceedings. Online, ACL, 2020, pp. 4984–4997. DOI: 10.18653/v1/2020.acl-main.448

Lin C.-Y. ROUGE: A Package for Automatic Evaluation of Summaries, Text Summarization Branches Out : Association for Computational Linguistics, Barcelona, Spain, July 2004 : proceedings. Barcelona, ACL, 2004, pp. 74–81. Access mode: https://aclanthology.org/W04-1013

KerasNLP. Access mode: https://keras.io/keras_nlp/

Prokipchuk O., Vysotska V. Ukrainian Language Tweets Analysis Technology for Public Opinion Dynamics Change Prediction Based on Machine Learning, Radio Electronics, Computer Science, Control, 2023, No. 2(63), pp. 103–116. DOI: 10.15588/1607-3274-2023-2-11

Published

2024-01-04

How to Cite

Mediakov, O., & Vysotska, V. (2024). SONG CONTINUATION GENERATION TECHNOLOGY BASED ON TEXT GENERATION STRATEGIES, TEXT MINING AND LANGUAGE MODEL T5. Radio Electronics, Computer Science, Control, (4), 157. https://doi.org/10.15588/1607-3274-2023-4-15

Issue

Section

Progressive information technologies