Statistics – Interview questions Part 3

1. What Is a One-Sample T-test?

Answer: A t-test is any statistical hypothesis test in which the test statistic follows a Student’s t distribution under the null hypothesis. A one-sample t-test checks whether the mean of a single sample differs from a hypothesized value.

For example, in MATLAB, testing whether the mean of a sample y2 equals 0:

[h, p, ci] = ttest(y2, 0)  % returns h = 1, p = 0.0018, ci = [2.6280 7.0863]
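To make the mechanics concrete, here is a minimal Python sketch of the one-sample t statistic, t = (sample mean - mu0) / (s / sqrt(n)), using only the standard library. The function name `one_sample_t` and the sample values are made up for illustration, and the critical value is taken from a standard t table rather than computed.

```python
import math
import statistics

def one_sample_t(sample, mu0):
    """Return (t statistic, degrees of freedom) for H0: population mean == mu0."""
    n = len(sample)
    # statistics.stdev is the sample standard deviation (divides by n - 1).
    standard_error = statistics.stdev(sample) / math.sqrt(n)
    t_stat = (statistics.mean(sample) - mu0) / standard_error
    return t_stat, n - 1

sample = [2, 4, 6, 8, 10]  # hypothetical data, not from the source
t_stat, df = one_sample_t(sample, mu0=0)

# Two-sided critical value for alpha = 0.05 and df = 4, read from a t table.
T_CRITICAL = 2.776
reject_h0 = abs(t_stat) > T_CRITICAL
```

Comparing |t| with a tabulated critical value avoids needing the t distribution's CDF, which the standard library does not provide; a statistics package would normally report the p-value directly.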


2. What Is the Alternative Hypothesis?

Answer: Denoted by H1, it is the statement that must be true if the null hypothesis is false.


3. What Is the Significance Level?

Answer: The probability of rejecting the null hypothesis when it is actually true (a Type I error) is called the significance level α. Common choices are α = 0.05 and α = 0.01.
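One way to see what α means is to simulate many experiments in which the null hypothesis really is true and count how often it gets rejected. The sketch below is only an illustration: it assumes a two-sided z-test with known standard deviation (chosen because the normal CDF is in Python's standard library), and the function names, sample size, and seed are arbitrary.

```python
import random
from statistics import NormalDist, mean

def z_test_p_value(sample, mu0, sigma):
    """Two-sided p-value of a z-test with known population standard deviation."""
    z = (mean(sample) - mu0) / (sigma / len(sample) ** 0.5)
    return 2 * (1 - NormalDist().cdf(abs(z)))

def false_rejection_rate(alpha, trials=2000, n=30, seed=0):
    """Fraction of simulated experiments that reject H0 although H0 is true."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(trials):
        # Data are drawn with mean 0, so H0: mean == 0 holds in every trial.
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        if z_test_p_value(sample, mu0=0.0, sigma=1.0) < alpha:
            rejections += 1
    return rejections / trials

rate = false_rejection_rate(alpha=0.05)
```

With enough trials, `rate` should land close to the chosen α, which is exactly the Type I error rate the test is designed to have.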


4. What Is the Binomial Probability Formula?

Answer: $P(x) = \frac{n!}{(n-x)!\,x!}\, p^{x} q^{n-x}$

Where n = number of trials.

x = number of successes among n trials.

p = probability of success in any one trial.

q = 1 - p.
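As a quick check of the formula, the following Python sketch evaluates it directly. The function name `binomial_pmf` is a hypothetical helper; `math.comb(n, x)` computes n!/[(n-x)! x!].

```python
import math

def binomial_pmf(x, n, p):
    """Probability of exactly x successes in n independent trials."""
    q = 1 - p
    return math.comb(n, x) * p ** x * q ** (n - x)

# Example: exactly 2 heads in 4 tosses of a fair coin.
prob_two_heads = binomial_pmf(2, 4, 0.5)

# The probabilities over all possible x add up to 1.
total = sum(binomial_pmf(x, 4, 0.5) for x in range(5))
```

For the coin example, the 6 ways of choosing 2 tosses out of 4 each have probability 0.5^4, giving 6/16 = 0.375.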


5. What Is a Hash Table?

Answer: It is a data structure used to implement an associative array, a structure that maps keys to values. It uses a hash function to compute an index into an array of buckets or slots, from which the desired value can be found.
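A toy implementation can make the bucket idea clearer. The sketch below assumes separate chaining (each bucket holds a list of key/value pairs) as the collision strategy; the class name `ChainedHashTable` is made up, and a real implementation would also resize as it fills up.

```python
class ChainedHashTable:
    """Toy hash table that resolves collisions by separate chaining."""

    def __init__(self, num_buckets=8):
        self.buckets = [[] for _ in range(num_buckets)]

    def _index(self, key):
        # The hash function turns a key into a bucket index.
        return hash(key) % len(self.buckets)

    def put(self, key, value):
        bucket = self.buckets[self._index(key)]
        for i, (existing_key, _) in enumerate(bucket):
            if existing_key == key:
                bucket[i] = (key, value)  # key already present: overwrite
                return
        bucket.append((key, value))

    def get(self, key, default=None):
        # Only the one bucket selected by the hash needs to be searched.
        for existing_key, value in self.buckets[self._index(key)]:
            if existing_key == key:
                return value
        return default

table = ChainedHashTable(num_buckets=2)  # few buckets, so collisions occur
for word, count in [("alpha", 1), ("beta", 2), ("gamma", 3)]:
    table.put(word, count)
table.put("beta", 20)
```

Using only two buckets forces several keys into the same bucket, which shows that lookups still work when keys collide.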


6. What are the differences between overfitting and underfitting?

Answer: In statistics and machine learning, one of the most common tasks is to fit a model to a set of training data so that it can make reliable predictions on unseen data.

Overfitting: An overfit statistical model describes random error or noise instead of the underlying relationship. Overfitting occurs when a model is excessively complex, such as having too many parameters relative to the number of observations. An overfit model has poor predictive performance because it overreacts to minor fluctuations in the training data.

Underfitting: Underfitting occurs when a statistical model or machine learning algorithm cannot capture the underlying trend of the data, for example when fitting a linear model to non-linear data. Such a model also has poor predictive performance.
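Both failure modes can be reproduced on tiny, made-up datasets. In the Python sketch below, a straight line fitted to y = x^2 stands in for underfitting, and a lookup table that memorizes noisy training points stands in for an overly complex model. The helper names and the data are illustrative assumptions.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def mse(predict, xs, ys):
    """Mean squared error of a prediction function on (xs, ys)."""
    return sum((predict(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

# Underfitting: a straight line cannot follow the curve y = x^2.
xs = [-3, -2, -1, 0, 1, 2, 3]
curve = [x ** 2 for x in xs]
slope, intercept = fit_line(xs, curve)
underfit_mse = mse(lambda x: slope * x + intercept, xs, curve)

# Overfitting: the true trend is y = 2x, observed with alternating noise.
train_x = [0, 1, 2, 3, 4, 5]
noise = [1, -1, 1, -1, 1, -1]
train_y = [2 * x + e for x, e in zip(train_x, noise)]
test_y = [2 * x - e for x, e in zip(train_x, noise)]  # same trend, new noise

memory = dict(zip(train_x, train_y))  # "model" that memorizes every point
memorizer_train_mse = mse(memory.get, train_x, train_y)
memorizer_test_mse = mse(memory.get, train_x, test_y)

s, b = fit_line(train_x, train_y)  # simpler model on the same training data
line_test_mse = mse(lambda x: s * x + b, train_x, test_y)
```

The memorizing model has zero error on the data it was trained on but a larger error on the second set than the simple line does, while the line fitted to the curve has a large error even on its own training data.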


7. Differentiate between univariate, bivariate and multivariate analysis.

Answer: Univariate, bivariate and multivariate analyses are descriptive statistical analysis techniques that are differentiated by the number of variables involved at a given time. Univariate analysis involves only one variable; for example, a pie chart of sales by territory.

Bivariate analysis attempts to understand the relationship between two variables at a time, as in a scatterplot. For instance, analyzing sales volume against spending is an example of bivariate analysis.

Multivariate analysis deals with the study of more than two variables to understand the effect of variables on the responses.
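The first two cases can be illustrated with a few lines of Python. The numbers below are invented, and `pearson_r` is a hypothetical helper: summarizing sales alone is a univariate analysis, while measuring how sales move with spending is a bivariate one.

```python
import math
import statistics

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    mean_x, mean_y = statistics.mean(xs), statistics.mean(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

spending = [10, 20, 30, 40, 50]  # made-up values
sales = [25, 45, 65, 85, 105]    # made-up values

# Univariate: describe one variable on its own.
sales_mean = statistics.mean(sales)
sales_stdev = statistics.stdev(sales)

# Bivariate: describe how two variables vary together.
r = pearson_r(spending, sales)
```

A multivariate analysis would extend this to three or more variables at once, for example by regressing sales on spending, territory, and season together.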