TREE-BASED SEMANTIC ANALYSIS METHOD FOR NATURAL LANGUAGE PHRASE TO FORMAL QUERY CONVERSION

Authors

  • A. A. Litvin, V.M. Glushkov Institute of Cybernetics.
  • V. Yu. Velychko, V.M. Glushkov Institute of Cybernetics.
  • V. V. Kaverynskyi, I.M. Frantsevich Institute for Problems of Material Science.

DOI:

https://doi.org/10.15588/1607-3274-2021-2-11

Keywords:

natural language processing, graph database, semantic analysis, formal query, decision tree, ontology.

Abstract

Context. This work addresses the construction of natural language interfaces for ontological graph databases. The focus is on methods for converting natural language phrases into formal queries in the SPARQL and Cypher query languages.

Objective. The goals of the work are to create a semantic analysis method that determines the semantic type of an input natural language phrase and extracts meaningful entities from it to initialize query template variables, to construct flexible query templates for the phrase types, and to develop a software implementation of the proposed technique.

Method. A tree-based method was developed that determines the semantic type of a user’s phrase and extracts a set of terms from it to be placed into the corresponding slots of the most suitable formal query template. The technique thus solves two tasks: determining the phrase type, which is the criterion for selecting the formal query template, and obtaining the meaningful terms that initialize the variables of the chosen template. Only interrogative and imperative (incentive) user phrases are considered in this work, i.e. those that explicitly ask the system to answer a question or to perform an action. The dialog or reference system under consideration is assumed to use a graph ontological database, which directly shapes the formal query patterns: the resulting queries are written in SPARQL or Cypher. The semantic analysis examples considered in this work target primarily inflected languages, especially Ukrainian and Russian, but the basic principles should be applicable to most other languages.
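The two tasks named in the Method paragraph (phrase type determination and term extraction for a query template) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and not the authors' implementation: the `Node` structure, the marker words, the phrase type names, the graph schema (`Concept`, `SUBCLASS_OF`, `rdfs:label`) and the templates are hypothetical, and English marker words are used for readability. It also omits the morphological analysis that an inflected language such as Ukrainian would require.

```python
import re
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Node:
    """One node of a toy phrase-analysis tree (hypothetical structure)."""
    markers: Tuple[str, ...]              # words that must open the remaining phrase
    phrase_type: Optional[str] = None     # names a query template when set
    children: List["Node"] = field(default_factory=list)


# Hypothetical tree: each root-to-leaf path recognises one phrase type.
TREE = Node(markers=(), children=[
    Node(("what",), children=[
        Node(("is",), phrase_type="definition"),
        Node(("kinds", "of"), phrase_type="list_kinds"),
    ]),
    Node(("list",), children=[
        Node(("kinds", "of"), phrase_type="list_kinds"),
    ]),
    Node(("define",), phrase_type="definition"),
])

# Hypothetical templates; the labels and properties are assumed, not from the paper.
TEMPLATES = {
    "definition": {
        "sparql": 'PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#> '
                  'SELECT ?d WHERE {{ ?c rdfs:label "{term}" ; rdfs:comment ?d . }}',
        "cypher": 'MATCH (c:Concept {{name: "{term}"}}) RETURN c.definition',
    },
    "list_kinds": {
        "sparql": 'PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#> '
                  'SELECT ?s WHERE {{ ?c rdfs:label "{term}" . ?s rdfs:subClassOf ?c . }}',
        "cypher": 'MATCH (s:Concept)-[:SUBCLASS_OF]->(c:Concept {{name: "{term}"}}) '
                  'RETURN s.name',
    },
}


def classify(tokens, node=TREE):
    """Walk the tree depth-first; return (phrase_type, leftover tokens) or None."""
    n = len(node.markers)
    if tuple(tokens[:n]) != node.markers:
        return None
    rest = tokens[n:]
    for child in node.children:
        found = classify(rest, child)
        if found is not None:
            return found
    if node.phrase_type is not None:
        return node.phrase_type, rest
    return None


def to_query(phrase, language="sparql"):
    """Select a template by phrase type and fill it with the extracted term."""
    # Tokens contain only word characters, so no quotes can reach the template.
    tokens = re.findall(r"\w+", phrase.lower())
    found = classify(tokens)
    if found is None or not found[1]:
        return None  # unknown phrase type, or no term left to fill the slot
    phrase_type, terms = found
    return TEMPLATES[phrase_type][language].format(term=" ".join(terms))


print(to_query("What is powder metallurgy?", "cypher"))
print(to_query("List kinds of steel", "sparql"))
```

In this sketch the path taken through the tree plays the role of the template selection criterion, and the tokens left over at the matched node become the template variables. A practical system would replace the literal marker comparison with tests on lemmas and grammatical features.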

Results. The developed method for converting a natural language phrase into a formal query in SPARQL and Cypher has been implemented in software for the Ukrainian and Russian languages using narrow-domain ontologies and tested against formal performance criteria.

Conclusions. The proposed method lets a dialog system select the most suitable query template and extract informative entities from a natural language phrase quickly and in a minimal number of steps, despite the large phrase variability of inflected languages. The experiments showed high precision and reliability of the constructed system and its potential for practical use and further development.

Author Biographies

A. A. Litvin, V.M. Glushkov Institute of Cybernetics.

Postgraduate student, Department of Microprocessor Technology.

V. Yu. Velychko, V.M. Glushkov Institute of Cybernetics.

PhD, Senior Researcher, Department of Microprocessor Technology.

V. V. Kaverynskyi, I.M. Frantsevich Institute for Problems of Material Science.

PhD, Senior Researcher, Department of Abrasion- and Corrosion-Resistant Powder Construction Materials.

Published

2021-07-07

How to Cite

Litvin, A. A., Velychko, V. Y., & Kaverynskyi, V. V. (2021). TREE-BASED SEMANTIC ANALYSIS METHOD FOR NATURAL LANGUAGE PHRASE TO FORMAL QUERY CONVERSION. Radio Electronics, Computer Science, Control, (2), 105–113. https://doi.org/10.15588/1607-3274-2021-2-11

Section

Neuroinformatics and intelligent systems