1. What is tokenization?
Answer: Tokenization is the process of splitting text into smaller units called tokens. Most often this means splitting a sentence into words.
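A minimal sketch of word-level tokenization, using only Python's standard `re` module. The function name `tokenize` and the regular expression are illustrative assumptions, not a standard; libraries such as NLTK ship more careful tokenizers.

```python
import re

def tokenize(text):
    # Treat runs of letters/digits as tokens, keeping a trailing
    # apostrophe part (e.g. "doesn't") attached to its word.
    return re.findall(r"[A-Za-z0-9]+(?:'[A-Za-z]+)?", text)

tokens = tokenize("Tokenization splits text, doesn't it?")
print(tokens)
```

Punctuation is dropped here because it never matches the pattern; a tokenizer that keeps punctuation as separate tokens would need a different rule.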
2. What are stop words?
Answer: Stop words are very common words such as "a", "an", and "the" that occur repeatedly in text and add little value to the context. We can filter them out using the standard stop word list in the NLTK library.
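A rough sketch of stop word filtering. The tiny `STOP_WORDS` set and the helper `remove_stop_words` are assumptions made for illustration; with NLTK installed and its stopwords corpus downloaded, `nltk.corpus.stopwords.words("english")` could supply a fuller list instead.

```python
# Small hand-picked stop word set (an assumption for this example);
# a real list, such as NLTK's, is much longer.
STOP_WORDS = {"a", "an", "the", "is", "of", "in", "and", "to"}

def remove_stop_words(tokens, stop_words=STOP_WORDS):
    # Compare in lowercase so "The" and "the" are treated alike.
    return [t for t in tokens if t.lower() not in stop_words]

filtered = remove_stop_words(["The", "cat", "is", "on", "a", "mat"])
print(filtered)
```

Words are kept in their original casing; only the comparison is case-insensitive.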
3. What is noise removal?
Answer: Noise removal means removing unwanted data from the corpus that does not help the task. For example, when working on sentiment analysis we may remove punctuation such as ?, ", and !.
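One possible way to strip punctuation noise is sketched below with the standard `re` module. The function name `remove_noise` is hypothetical, and what counts as noise depends on the task, so the rule here (drop everything that is not a word character or whitespace) is only an assumption.

```python
import re

def remove_noise(text):
    # Drop every character that is neither a word character nor whitespace.
    no_punct = re.sub(r"[^\w\s]", "", text)
    # Collapse the runs of whitespace left behind and trim the ends.
    return re.sub(r"\s+", " ", no_punct).strip()

cleaned = remove_noise('Wow!!  Is this "great"?')
print(cleaned)
```

Note that `\w` also matches digits and underscores, so those are kept by this rule.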
4. What is WordNet?
Answer: WordNet is a lexical database for the English language. It groups English words into sets of synonyms called synsets, provides short definitions and usage examples, and records a number of relations among these synonym sets or their members.
5. What is NLG (Natural Language Generation)?
Answer: NLG is the process of automatically generating new natural-language text from existing data.