Statistics – Interview questions Part 8

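1. When you are creating a statistical model, how do you prevent over-fitting?

Answer: Cross-validation is the standard safeguard against over-fitting. The model is evaluated on data held out from training, so a model that has merely memorized the training set is exposed by its poor held-out performance.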

 

2. Give an example of inferential statistics.

Answer: An example of inferential statistics:

You ask six of your classmates their height. Based on this sample, you conclude that the average height of all students at your university or college is 67 inches. Drawing a conclusion about the whole population from a sample is what makes this inferential rather than descriptive.
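The inference step above can be sketched in a few lines. The six heights below are hypothetical (invented so that the sample mean is 67 inches), and the critical value 2.571 is the usual two-sided 95% value of Student's t with 5 degrees of freedom. This is only an illustration of how a sample yields a population estimate with an uncertainty range, not a recommended survey design.

```python
import math
import statistics

# Hypothetical heights (inches) of six classmates; illustrative data only.
heights = [64, 66, 67, 67, 68, 70]

n = len(heights)
sample_mean = statistics.mean(heights)
sample_sd = statistics.stdev(heights)        # sample standard deviation (n - 1)
standard_error = sample_sd / math.sqrt(n)

# Two-sided 95% critical value of Student's t with n - 1 = 5 degrees of freedom.
T_CRIT_95_DF5 = 2.571

margin = T_CRIT_95_DF5 * standard_error
ci_low, ci_high = sample_mean - margin, sample_mean + margin

print(f"point estimate for the population mean: {sample_mean:.1f} in")
print(f"95% confidence interval: ({ci_low:.1f}, {ci_high:.1f}) in")
```

With only six observations the interval is wide, which is a reminder that the "67 inches" claim is an estimate and not a measured fact about the whole university.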

 

3. Which statistical procedures require a normal population distribution?

Answer: A normally distributed population is assumed by the following statistical procedures:

1. variance estimation.

2. standard error of the mean.

3. Student’s t-test.

 

4. What is the difference between a type I and a type II error?

Answer: The two errors differ as follows:

1. Type I error occurs when the null hypothesis is true, but is rejected.

2. Type II error occurs when the null hypothesis is false, but is not rejected.
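Both error rates can be estimated by simulation. The sketch below assumes a two-sided z-test with known standard deviation, a significance level of 0.05, samples of size 25, and a hypothetical true mean of 0.5 for the case where the null hypothesis is false. All of these settings, and the helper names `rejects_null` and `rejection_rate`, are illustrative choices and not part of the original answer.

```python
import math
import random
from statistics import NormalDist

def rejects_null(sample, mu0, sigma, alpha=0.05):
    """Two-sided z-test of H0: mean == mu0, assuming sigma is known."""
    sample_mean = sum(sample) / len(sample)
    z = (sample_mean - mu0) / (sigma / math.sqrt(len(sample)))
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    return abs(z) > z_crit

def rejection_rate(true_mean, trials=2000, n=25, sigma=1.0, seed=0):
    """Fraction of simulated samples for which H0: mean == 0 is rejected."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mean, sigma) for _ in range(n)]
        rejections += rejects_null(sample, mu0=0.0, sigma=sigma)
    return rejections / trials

# H0 is true (mean really is 0): every rejection is a type I error.
type_1_rate = rejection_rate(true_mean=0.0)

# H0 is false (mean is 0.5): every failure to reject is a type II error.
type_2_rate = 1 - rejection_rate(true_mean=0.5)

print(f"estimated type I error rate:  {type_1_rate:.3f}")
print(f"estimated type II error rate: {type_2_rate:.3f}")
```

The type I rate should land near the chosen significance level, while the type II rate depends on the sample size and on how far the true mean is from the null value.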

 

5. What is a statistical interaction?

Answer: An interaction occurs when the effect of one factor (input variable) on the dependent variable (output variable) differs across the levels of another factor.
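A small numeric illustration may help. The sketch below uses made-up cell means for a hypothetical 2x2 design (drug versus placebo, with and without exercise) and compares the drug effect at each exercise level. The factor names and numbers are assumptions chosen for illustration only.

```python
# Hypothetical mean outcomes for a 2x2 design (illustrative numbers only).
cell_means = {
    ("placebo", "no_exercise"): 10.0,
    ("drug", "no_exercise"): 12.0,
    ("placebo", "exercise"): 11.0,
    ("drug", "exercise"): 18.0,
}

def drug_effect(exercise_level):
    """Effect of the drug factor at one level of the exercise factor."""
    return (cell_means[("drug", exercise_level)]
            - cell_means[("placebo", exercise_level)])

effect_without_exercise = drug_effect("no_exercise")
effect_with_exercise = drug_effect("exercise")

# If the two effects were equal, the factors would be purely additive.
interaction = effect_with_exercise - effect_without_exercise

print(f"drug effect without exercise: {effect_without_exercise}")
print(f"drug effect with exercise:    {effect_with_exercise}")
print(f"interaction contrast:         {interaction}")
```

A nonzero contrast means the drug effect depends on the exercise level, which is exactly the situation the definition describes. In real data the contrast would also need a significance test, since sampling noise alone can make it nonzero.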

 

6. What are Recommender Systems?

Answer: Recommender systems are a subclass of information filtering systems that predict the preference or rating a user would give to an item. They are applied to movies, news, research articles, products, social tags, music, etc.

Examples include movie recommenders on IMDb, Netflix and BookMyShow, product recommenders on e-commerce sites like Amazon, eBay and Flipkart, YouTube video recommendations, and game recommendations on Xbox.
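To make "predicting a rating" concrete, here is a minimal sketch of one simple approach, user-based collaborative filtering with cosine similarity. The users, films, and ratings are invented, and nothing here is claimed to reflect how any of the services named above actually work.

```python
import math

# Hypothetical user -> {item: rating} data, invented for illustration.
ratings = {
    "ana":  {"film_a": 5, "film_b": 4, "film_c": 1},
    "ben":  {"film_a": 4, "film_b": 5, "film_c": 2, "film_d": 5},
    "cara": {"film_a": 1, "film_b": 2, "film_c": 5, "film_d": 1},
}

def cosine_similarity(u, v):
    """Cosine similarity over the items both users have rated."""
    shared = set(u) & set(v)
    if not shared:
        return 0.0
    dot = sum(u[i] * v[i] for i in shared)
    norm_u = math.sqrt(sum(u[i] ** 2 for i in shared))
    norm_v = math.sqrt(sum(v[i] ** 2 for i in shared))
    return dot / (norm_u * norm_v)

def predict_rating(user, item):
    """Similarity-weighted average of other users' ratings for the item."""
    weighted_sum = 0.0
    total_weight = 0.0
    for other, other_ratings in ratings.items():
        if other == user or item not in other_ratings:
            continue
        weight = cosine_similarity(ratings[user], other_ratings)
        weighted_sum += weight * other_ratings[item]
        total_weight += weight
    return weighted_sum / total_weight if total_weight else None

predicted = predict_rating("ana", "film_d")
print(f"predicted rating of film_d for ana: {predicted:.2f}")
```

Because ana's ratings resemble ben's far more than cara's, the prediction is pulled toward ben's high rating of film_d.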

 

7. Explain cross-validation.

Answer: It is a model validation technique for evaluating how the results of a statistical analysis will generalize to an independent data set. It is used in settings where the goal is prediction and one wants to estimate how accurately a model will perform in practice.

The goal of cross-validation is to set aside a data set on which to test the model during the training phase, in order to limit problems like overfitting and to gain insight into how the model will generalize to an independent data set.
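One common form is k-fold cross-validation, sketched below with the standard library only. The model (a straight line fitted by ordinary least squares), the synthetic data, the choice of k = 5, and the helper names `fit_line` and `k_fold_mse` are all assumptions made for illustration.

```python
import random

def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def k_fold_mse(xs, ys, k=5, seed=0):
    """Average held-out mean squared error over k folds."""
    indices = list(range(len(xs)))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]   # k disjoint test sets
    fold_errors = []
    for test_idx in folds:
        held_out = set(test_idx)
        train_idx = [i for i in indices if i not in held_out]
        # Fit only on the training part, then score on the held-out part.
        slope, intercept = fit_line([xs[i] for i in train_idx],
                                    [ys[i] for i in train_idx])
        errors = [(ys[i] - (slope * xs[i] + intercept)) ** 2 for i in test_idx]
        fold_errors.append(sum(errors) / len(errors))
    return sum(fold_errors) / k

# Synthetic data: y = 2x + 1 plus Gaussian noise with standard deviation 1.
rng = random.Random(42)
xs = [float(x) for x in range(30)]
ys = [2 * x + 1 + rng.gauss(0, 1) for x in xs]

cv_mse = k_fold_mse(xs, ys)
print(f"5-fold cross-validated MSE: {cv_mse:.3f}")
```

Every observation is used for testing exactly once and never in the same fold it was trained on, so the averaged error is an estimate of performance on unseen data. Comparing this number across candidate models is how cross-validation helps flag an over-fitted one.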