Statistics – Interview questions Part 5

1. What is Arithmetic Mean?

Answer: The arithmetic mean, commonly called the average, is one of the most widely used measures in statistics. It is obtained by summing two or more numbers and then dividing the sum by how many numbers there are.
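The sum-then-divide definition can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the helper name `arithmetic_mean` and the sample values are assumptions made for the example, and the result is cross-checked against the standard library's `statistics.mean`.

```python
from statistics import mean


def arithmetic_mean(values):
    """Sum the values and divide by how many there are (hypothetical helper)."""
    if not values:
        raise ValueError("the mean of an empty sequence is undefined")
    return sum(values) / len(values)


# Illustrative sample: 126 / 7 = 18
scores = [15, 16, 17, 18, 19, 20, 21]

manual = arithmetic_mean(scores)
library = mean(scores)
print(manual)             # 18.0
print(manual == library)  # True
```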

 

2. Explain Median?

Answer: The median is another measure of the central value of a group of data points. It is the middle number of an ordered set of numbers. To find it, sort the numbers by value and take the middle one; if the count is even, take the mean of the two middle numbers.

For example, put the numbers in order: {15, 16, 17, 18, 19, 20, 21}
The middle number is 18, so the median is 18.
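The sort-and-pick-the-middle procedure might look like this in Python. It is a sketch under stated assumptions: `middle_value` is a hypothetical helper name, the even-length sample is invented for illustration, and an even-length input is handled by averaging the two middle values, which matches what `statistics.median` does.

```python
from statistics import median


def middle_value(values):
    """Return the median: the middle of the sorted values (hypothetical helper)."""
    if not values:
        raise ValueError("the median of an empty sequence is undefined")
    ordered = sorted(values)
    mid = len(ordered) // 2
    if len(ordered) % 2 == 1:
        # Odd count: a single middle element exists.
        return ordered[mid]
    # Even count: average the two elements around the centre.
    return (ordered[mid - 1] + ordered[mid]) / 2


odd_sample = [18, 15, 21, 16, 20, 17, 19]  # same numbers as above, unsorted
even_sample = [15, 16, 17, 18]             # illustrative even-length sample

print(middle_value(odd_sample))   # 18
print(middle_value(even_sample))  # 16.5
```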

 

3. What is Mode?

Answer: The mode is another measure of the central value. It is the value that occurs most frequently in a group of numbers. Some data sets have no mode because no value repeats; others have two modes and are called bimodal.
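Counting occurrences makes the "no mode", "one mode", and "bimodal" cases concrete. The sketch below assumes one common convention, namely that a data set in which no value repeats has no mode; the helper name `modes` and the sample lists are illustrative only.

```python
from collections import Counter


def modes(values):
    """Return every most-frequent value, sorted (hypothetical helper).

    Assumed convention: if no value occurs more than once, there is no mode.
    """
    counts = Counter(values)
    if not counts:
        return []
    top = max(counts.values())
    if top == 1:
        return []
    return sorted(value for value, count in counts.items() if count == top)


print(modes([1, 2, 2, 3, 4]))  # [2]     one mode
print(modes([1, 1, 2, 3, 3]))  # [1, 3]  bimodal
print(modes([1, 2, 3]))        # []      no mode under the assumed convention
```

Note that the standard library's `statistics.multimode` follows a different convention and returns all values when none repeats.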

 

4. Explain Standard Deviation (Sigma)?

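Answer: Standard deviation is a measure of how spread out your data is around the mean. It is the square root of the variance, that is, of the average squared deviation from the mean. A small value means the data points cluster close to the mean; a large value means they are widely dispersed.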
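As a rough illustration, the population standard deviation can be computed by hand and compared with the standard library. The helper name `population_std` and the sample data are assumptions for this sketch; `statistics.pstdev` divides by n (population), while `statistics.stdev` divides by n - 1 (sample), so the latter is slightly larger.

```python
import math
from statistics import pstdev, stdev


def population_std(values):
    """Square root of the mean squared deviation from the mean (hypothetical helper)."""
    if not values:
        raise ValueError("standard deviation of an empty sequence is undefined")
    mu = sum(values) / len(values)
    variance = sum((x - mu) ** 2 for x in values) / len(values)
    return math.sqrt(variance)


# Illustrative data: mean 18, squared deviations sum to 28, variance 28 / 7 = 4
data = [15, 16, 17, 18, 19, 20, 21]

print(population_std(data))  # 2.0
print(pstdev(data))          # population form from the standard library
print(stdev(data))           # sample form (divides by n - 1), slightly larger
```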

 

5. Which other approaches work alongside statistics to analyze data?

Answer: Statistics, together with data analytics, is used to analyze data and help businesses make good decisions. Predictive analytics and statistics are used together to analyze current and historical data and make predictions about future events.

 

6. What are Eigenvectors and Eigenvalues?

Answer: Eigenvectors are used for understanding linear transformations. An eigenvector is a nonzero vector whose direction a particular linear transformation leaves unchanged or exactly reverses: along that direction the transformation acts only by flipping, compressing, or stretching. In data analysis, we usually calculate the eigenvectors of a correlation or covariance matrix.

The eigenvalue is the strength of the transformation in the direction of its eigenvector: the factor by which the stretching or compression occurs. A negative eigenvalue means the direction is flipped.
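To make the defining relation A v = λ v concrete, here is a small pure-Python sketch for a symmetric 2x2 matrix, the shape of a covariance matrix for two variables. The function name `eigen_symmetric_2x2` and the matrix entries are assumptions chosen for illustration; it uses the closed-form solution of the 2x2 characteristic equation, whereas real analyses would normally call a library routine such as `numpy.linalg.eigh`.

```python
import math


def eigen_symmetric_2x2(a, b, d):
    """Eigenpairs of [[a, b], [b, d]], largest eigenvalue first (hypothetical helper)."""
    if b == 0:
        # Diagonal matrix: the coordinate axes are already eigenvectors.
        pairs = [(a, (1.0, 0.0)), (d, (0.0, 1.0))]
        return sorted(pairs, key=lambda pair: pair[0], reverse=True)
    centre = (a + d) / 2
    radius = math.hypot((a - d) / 2, b)
    pairs = []
    for lam in (centre + radius, centre - radius):
        # (b, lam - a) solves (A - lam * I) v = 0 whenever b is nonzero.
        vx, vy = b, lam - a
        norm = math.hypot(vx, vy)
        pairs.append((lam, (vx / norm, vy / norm)))
    return pairs


# Illustrative covariance-like matrix [[2, 1], [1, 2]]
a, b, d = 2.0, 1.0, 2.0
for lam, (vx, vy) in eigen_symmetric_2x2(a, b, d):
    # Applying the matrix to v should give the same direction scaled by lam.
    image = (a * vx + b * vy, b * vx + d * vy)
    print(lam, (vx, vy), image)
```

For this matrix the eigenvalues are 3 and 1, with eigenvectors along (1, 1) and (1, -1): the transformation stretches by a factor of 3 along the first direction and leaves lengths unchanged along the second.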

 

7. Can you explain the difference between a Validation Set and a Test Set?

Answer: A validation set is data held out from the training data. It is used during model building for hyperparameter selection and for choosing between candidate models, and it helps detect overfitting.

On the other hand, a Test Set is used for testing or evaluating the performance of a trained machine learning model.

In simple terms, the differences can be summarized as follows: the training set is used to fit the parameters (i.e. the weights), the validation set is used to tune hyperparameters and choose between models, and the test set is used to assess the performance of the final model, i.e. its predictive power and generalization.
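One simple way to produce the three sets is to shuffle the data once and slice it. The sketch below is only an illustration: the function name `split_dataset`, the 60/20/20 proportions, and the fixed seed are assumptions, not a recommended recipe.

```python
import random


def split_dataset(rows, val_fraction=0.2, test_fraction=0.2, seed=0):
    """Shuffle once, then slice into train, validation, and test lists (hypothetical helper)."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is reproducible
    n_test = int(len(shuffled) * test_fraction)
    n_val = int(len(shuffled) * val_fraction)
    test = shuffled[:n_test]
    val = shuffled[n_test:n_test + n_val]
    train = shuffled[n_test + n_val:]
    return train, val, test


rows = list(range(100))  # stand-in for 100 labelled examples
train, val, test = split_dataset(rows)
print(len(train), len(val), len(test))  # 60 20 20
```

The model is fitted on `train`, hyperparameters are compared using `val`, and `test` is touched only once at the end, so its score remains an unbiased estimate of generalization.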