NLP Interview Questions: Part 4

1. What is tokenization?

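Answer: Tokenization is the process of splitting text into smaller units called tokens. The most common case is splitting a sentence into words, although tokens can also be sentences, subwords, or characters.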
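
A minimal sketch of word-level tokenization using only Python's standard `re` module. The function name `tokenize` and the regular expression are assumptions chosen for illustration; library tokenizers handle many more edge cases (contractions, URLs, and so on).

```python
import re

def tokenize(sentence):
    """Split a sentence into word and punctuation tokens (illustrative only)."""
    # \w+ matches a run of letters, digits, or underscores (a "word");
    # [^\w\s] matches a single character that is neither a word character
    # nor whitespace, so each punctuation mark becomes its own token.
    return re.findall(r"\w+|[^\w\s]", sentence)

tokens = tokenize("Tokenization splits text into words.")
print(tokens)  # ['Tokenization', 'splits', 'text', 'into', 'words', '.']
```

Keeping punctuation as separate tokens is one possible design choice; whether to keep or drop it depends on the downstream task.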

2. What are stop words?

Answer: Stop words are very common words such as "a", "an", and "the" that occur repeatedly in text but add little value to the context. We can filter them out using the standard stop word list provided by the NLTK library.
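
The filtering step can be sketched as below. To keep the example self-contained, it uses a small hand-written stop word set (`STOP_WORDS`, a hypothetical list made up for this illustration) rather than NLTK's. With NLTK installed and its `stopwords` corpus downloaded, the set could instead be built from `nltk.corpus.stopwords.words("english")`.

```python
# Hypothetical, hand-picked stop word set for illustration only;
# a real list (such as NLTK's English list) is much longer.
STOP_WORDS = {"a", "an", "the", "is", "of", "in", "on", "and", "to"}

def remove_stop_words(tokens, stop_words=STOP_WORDS):
    """Return the tokens that are not stop words (case-insensitive match)."""
    return [token for token in tokens if token.lower() not in stop_words]

filtered = remove_stop_words(["The", "cat", "sat", "on", "a", "mat"])
print(filtered)  # ['cat', 'sat', 'mat']
```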

3. What is noise removal?

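Answer: Noise removal means removing unwanted data from the corpus. What counts as noise depends on the task. For example, when working on sentiment analysis we might remove punctuation characters such as ?, ", and !.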
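
A rough sketch of punctuation removal with the standard `re` module. The function name `remove_noise` and the choice to strip all punctuation are assumptions for this example; a real pipeline would choose what to remove based on the task.

```python
import re

def remove_noise(text):
    """Strip punctuation and collapse repeated whitespace (illustrative only)."""
    # Remove every character that is neither a word character nor whitespace.
    without_punctuation = re.sub(r"[^\w\s]", "", text)
    # Collapse runs of whitespace left behind and trim the ends.
    return re.sub(r"\s+", " ", without_punctuation).strip()

cleaned = remove_noise('Wow!! Is this "good"?')
print(cleaned)  # Wow Is this good
```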

4. What is WordNet?

Answer: WordNet is a lexical database for the English language. It groups English words into sets of synonyms called synsets, provides short definitions and usage examples, and records a number of relations among these synsets or their members.
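
To make the idea of synsets and relations concrete, here is a toy in-memory model. Everything in it (the `Synset` class, the two entries, and their definitions) is hand-written for illustration and is not taken from the actual WordNet database. In practice, NLTK exposes the real database through `nltk.corpus.wordnet` once the `wordnet` corpus has been downloaded.

```python
from dataclasses import dataclass, field

@dataclass
class Synset:
    """A toy synset: a group of synonymous words sharing one definition."""
    name: str
    lemmas: list                                    # the synonymous words
    definition: str                                 # short gloss (made up here)
    hypernyms: list = field(default_factory=list)   # more general synsets

# Hand-written toy entries, NOT real WordNet data.
vehicle = Synset("vehicle", ["vehicle"], "something used to transport people or goods")
car = Synset("car", ["car", "auto", "automobile"],
             "a motor vehicle with four wheels", hypernyms=[vehicle])

# Index from each word to the synsets that contain it.
INDEX = {}
for synset in (vehicle, car):
    for lemma in synset.lemmas:
        INDEX.setdefault(lemma, []).append(synset)

def synonyms(word):
    """Return the other words that share a synset with `word`, sorted."""
    return sorted({lemma
                   for synset in INDEX.get(word, [])
                   for lemma in synset.lemmas
                   if lemma != word})

print(synonyms("auto"))       # ['automobile', 'car']
print(car.hypernyms[0].name)  # vehicle
```

The `hypernyms` field stands in for one kind of relation between synsets ("a car is a kind of vehicle"); the real database records several other relation types as well.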