Training

TTLV: Issues in Vietnamese language processing to improve search engine performance

Friday - June 20, 2014 00:49

INFORMATION ABOUT THE MASTER'S THESIS

1. Student's full name: Nguyen Thi Minh Tam 2. Gender: Female

3. Date of birth: September 23, 1988

4. Place of birth: Bac Ninh

5. Admission decision: No. 1883/2010/QD-XHNV-SDH dated October 21, 2010, issued by the Rector of the University of Social Sciences and Humanities, Vietnam National University, Hanoi.

6. Changes in the training process: None

7. Thesis title: Issues in Vietnamese language processing to improve search engine performance.

8. Major: Linguistics; Code: 60 22 01

9. Scientific supervisor: Dr. Nguyen Ai Viet - Institute of Information Technology, Vietnam National University, Hanoi.

10. Summary of the thesis results:

In theory:

  • This thesis presents essential linguistic concepts for natural language processing: the concept of words and word classes.
  • This thesis provides an overview of language processing in search engines, and also addresses issues related to the Vietnamese language in improving search performance: part-of-speech tagging, dictionary construction, indexing, etc.
  • This thesis studies stop words (a relatively new topic in Vietnam that has not been widely researched). Specifically, the thesis delves into the nature of Vietnamese stop words and provides a universal definition for this concept.

Regarding research practice:

  • This thesis addresses the nature of Vietnamese stop words, including their lexical nature, word class, and linguistic independence.
  • Based on these characteristics, a list of stop words for the Vietnamese language can be established.

11. Practical applications (if any):

- The first step in building a theoretical framework about the nature of stop words.

- The thesis plays an important role in natural language processing, especially in the indexing process for search systems.
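As a rough illustration of how a stop-word list enters the indexing step, the following minimal sketch builds an inverted index that skips stop words. It is not the method of the thesis: the names `STOP_WORDS`, `tokenize`, and `build_index` are hypothetical, the five stop words are placeholders rather than the list the thesis proposes, and splitting on whitespace yields syllables rather than true Vietnamese words, which a real system would obtain from a word segmenter.

```python
import unicodedata
from collections import defaultdict


def nfc(text):
    """Normalize to Unicode NFC so accented Vietnamese characters compare equal."""
    return unicodedata.normalize("NFC", text)


# Placeholder stop words for illustration only; not the list proposed in the thesis.
STOP_WORDS = {nfc(w) for w in ["và", "là", "của", "thì", "những"]}


def tokenize(text):
    """Lowercase and split on whitespace (a stand-in for real word segmentation)."""
    return nfc(text).lower().split()


def build_index(docs, stop_words=STOP_WORDS):
    """Map each token that is not a stop word to the sorted ids of documents containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in tokenize(text):
            if token not in stop_words:
                index[token].add(doc_id)
    return {token: sorted(ids) for token, ids in index.items()}


docs = {
    1: "Hà Nội là thủ đô của Việt Nam",
    2: "ngôn ngữ và tìm kiếm",
}
index = build_index(docs)
```

In this sketch the stop words never become index keys, so the index is smaller and queries are not matched on words that carry little content.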

12. Further research directions (if any): Further investigation into Vietnamese-language issues in natural language processing for search engines.

13. Published works related to the thesis:

INFORMATION ON MASTER'S THESIS

1. Full name: Nguyen Thi Minh Tam 2. Sex: Female

3. Date of birth: 23 September 1988 4. Place of birth: Bac Ninh, Vietnam

5. Admission decision number: No. 1883/2010/QD-XHNV-SDH, dated October 21, 2010, by the Rector of the University of Social Sciences and Humanities, Vietnam National University, Hanoi

6. Changes in academic process: None

7. Official thesis title: The Vietnamese problems in improving search efficiency

8. Major: Linguistics 9. Code: 60 22 01

10. Supervisor: Dr. Nguyen Ai Viet, Information Technology Institute, Vietnam National University, Hanoi.

11. Summary of the findings of the thesis:

From a theoretical point of view:

- The thesis presents the linguistic concepts necessary for language processing: the concepts of the word and of part-of-speech tags.

- The thesis gives an overview of language processing in search engines and identifies the Vietnamese-language issues involved in improving search efficiency: POS tagging, dictionary building, indexing, etc.

- The thesis pioneers research on stop words (a new research direction in Vietnam that has not yet been studied scientifically). In particular, the thesis discusses in depth the nature of Vietnamese stop words and attempts to give a universal definition for this new concept.

From a practical point of view:

- The thesis addresses the nature of Vietnamese stop words from several aspects, including the nature of words, part-of-speech tags, and linguistic independence.

- Based on these characteristics, it is possible to set up the first theoretically grounded list of Vietnamese stop words.

12. Practical applicability, if any:

- An initial theoretical framework for future studies of Vietnamese stop words.

- The thesis makes an important contribution to natural language processing, especially to indexing in search systems.

13. Further research directions, if any:

Deeper study of Vietnamese-language issues in natural language processing for search engines.

14. Thesis-related publications:
