Statistics – Interview questions Part 6

1. List the fields where statistics can be used.

Answer: Statistics can be used in many research fields. The fields in which statistics can be used include the following:





Computer Science


In these fields, statistics serves the following purposes:

It aids in decision making.

It provides comparisons.

It explains actions that have taken place.

It predicts future outcomes.

It estimates unknown quantities.


2. List the sampling methods.

Answer: Sampling can be done in the following four ways:

– Cluster Sampling: In this method, the population is divided into groups or clusters, and whole clusters are then selected at random.

– Simple Random: In this method, members are chosen purely at random, so every member of the population has the same chance of being selected.

– Stratified: In stratified sampling, the data is divided into groups or strata, and a sample is drawn from each stratum.

– Systematic: In this method, every kth member of the population is picked.
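The four methods above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not a reference implementation: the function names, the 0 to 99 population, the even/odd strata, and the clusters of 20 are all assumptions made up for the example.

```python
import random

def simple_random_sample(population, n, rng):
    """Pick n members so that every member has the same chance of selection."""
    return rng.sample(population, n)

def systematic_sample(population, k, rng):
    """Pick every k-th member, starting from a random offset in [0, k)."""
    start = rng.randrange(k)
    return population[start::k]

def stratified_sample(population, stratum_of, n_per_stratum, rng):
    """Group members into strata, then draw a random sample from each stratum."""
    strata = {}
    for member in population:
        strata.setdefault(stratum_of(member), []).append(member)
    sample = []
    for members in strata.values():
        sample.extend(rng.sample(members, min(n_per_stratum, len(members))))
    return sample

def cluster_sample(clusters, n_clusters, rng):
    """Randomly pick whole clusters and keep every member of the chosen ones."""
    chosen = rng.sample(clusters, n_clusters)
    return [member for cluster in chosen for member in cluster]

# Hypothetical population: the integers 0..99.
rng = random.Random(0)
population = list(range(100))

srs = simple_random_sample(population, 10, rng)
sys_sample = systematic_sample(population, 10, rng)
# Assumed strata for illustration: even numbers versus odd numbers.
strat = stratified_sample(population, lambda x: x % 2, 5, rng)
# Assumed clusters for illustration: five consecutive blocks of 20 members.
clusters = [population[i:i + 20] for i in range(0, 100, 20)]
clus = cluster_sample(clusters, 2, rng)
```

Note the difference between the last two: stratified sampling takes a few members from every group, while cluster sampling takes every member from a few groups.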


3. What is correlation in statistics?

Answer: Correlation is a technique for measuring and estimating the quantitative relationship between two variables. It measures how strongly two variables are related.
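As a rough illustration, the sketch below computes one common correlation measure, the Pearson correlation coefficient, by hand. The answer above does not name a specific coefficient, so choosing Pearson is an assumption, and the `hours` and score data are invented for the example.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation: co-variation of x and y scaled by their spreads."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Made-up data: scores that rise or fall in a straight line with hours.
hours = [1, 2, 3, 4, 5]
score_up = [52, 54, 56, 58, 60]
score_down = [60, 58, 56, 54, 52]

r_up = pearson_r(hours, score_up)      # perfectly increasing relationship
r_down = pearson_r(hours, score_down)  # perfectly decreasing relationship
```

A coefficient near 1 or -1 indicates a strong linear relationship, and a coefficient near 0 indicates a weak one.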


4. What is covariance in statistics?

Answer: Covariance is a measure that indicates the extent to which two random variables change together. In statistical terms, it describes the systematic relationship between a pair of random variables, in which a change in one variable is accompanied by a corresponding change in the other.
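A minimal sketch of the sample covariance follows. The helper name `sample_covariance` and the temperature, ice cream, and heating figures are hypothetical, chosen only to show a positive and a negative result.

```python
def sample_covariance(xs, ys):
    """Average product of deviations from the means, using n - 1 in the divisor."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)

# Made-up data for illustration.
temps = [20, 22, 24, 26, 28]
ice_cream = [100, 120, 140, 160, 180]  # rises as temperature rises
heating = [50, 40, 30, 20, 10]         # falls as temperature rises

cov_pos = sample_covariance(temps, ice_cream)  # positive: the two move together
cov_neg = sample_covariance(temps, heating)    # negative: they move in opposite directions
```

The sign shows the direction of the relationship. The magnitude depends on the units of both variables, so it is hard to interpret on its own.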


5. What is the relationship between covariance and correlation?

Answer: Covariance and correlation are two closely related mathematical concepts that are widely used in statistics. Both measure the dependency between two random variables. Although they do similar work, they differ in mathematical terms: covariance is expressed in the units of the two variables and has no fixed range, while correlation is covariance divided by the product of the two standard deviations, which makes it unitless and bounded between -1 and 1.
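One way to see the difference is to change the units of one variable. In the sketch below, correlation is computed as covariance divided by the product of the sample standard deviations. The height and weight figures are invented, and the helper names are assumptions for the example.

```python
import statistics

def sample_covariance(xs, ys):
    """Sample covariance with n - 1 in the divisor."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)

def correlation(xs, ys):
    """Covariance rescaled by both sample standard deviations."""
    return sample_covariance(xs, ys) / (statistics.stdev(xs) * statistics.stdev(ys))

# Made-up data: the same heights expressed in metres and in centimetres.
heights_m = [1.5, 1.6, 1.7, 1.8, 1.9]
weights_kg = [52, 58, 67, 71, 80]
heights_cm = [h * 100 for h in heights_m]

cov_m = sample_covariance(heights_m, weights_kg)
cov_cm = sample_covariance(heights_cm, weights_kg)  # 100 times larger than cov_m
r_m = correlation(heights_m, weights_kg)
r_cm = correlation(heights_cm, weights_kg)          # same as r_m
```

Switching from metres to centimetres multiplies the covariance by 100 but leaves the correlation unchanged, which is why correlation is easier to compare across data sets.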


6. Explain cross-validation.

Answer: Cross-validation is a model validation technique for evaluating how the outcomes of a statistical analysis will generalize to an independent data set. It is used in settings where the objective is prediction and one wants to estimate how accurately a model will perform in practice.

The goal of cross-validation is to set aside a data set on which to test the model during the training phase, in order to limit problems like overfitting and to gain insight into how the model will generalize to an independent data set.
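A common variant is k-fold cross-validation, sketched below with the standard library only. The answer above does not name a specific scheme, so k-fold is an assumption here, as are the helper names, the straight-line model, and the synthetic data.

```python
import random

def k_fold_indices(n_samples, k, seed=0):
    """Yield (train_idx, test_idx) pairs; each sample is tested exactly once."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test_idx = folds[i]
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train_idx, test_idx

def fit_line(xs, ys):
    """Least-squares fit of y = a + b * x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    return mean_y - b * mean_x, b

def mse(a, b, xs, ys):
    """Mean squared error of the line y = a + b * x on the given points."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys)) / len(xs)

# Synthetic data: a straight line plus Gaussian noise.
rng = random.Random(1)
xs = [float(i) for i in range(30)]
ys = [3.0 + 2.0 * x + rng.gauss(0, 1) for x in xs]

fold_errors = []
for train_idx, test_idx in k_fold_indices(len(xs), k=5):
    # Fit on the training folds only, then score on the held-out fold.
    a, b = fit_line([xs[j] for j in train_idx], [ys[j] for j in train_idx])
    fold_errors.append(mse(a, b, [xs[j] for j in test_idx], [ys[j] for j in test_idx]))

cv_error = sum(fold_errors) / len(fold_errors)  # estimate of out-of-sample error
```

Because each fold is scored by a model that never saw it during fitting, the averaged error estimates performance on unseen data rather than on the training data.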


7. How can outlier values be treated?

Answer: Outlier values can be identified by using univariate analysis or any other graphical analysis method. If the number of outliers is small, they can be assessed individually; for a large number of outliers, the values can be substituted with either the 99th or the 1st percentile value.

Not all extreme values are outliers. The most common ways to treat outlier values are:

Change the value to bring it within a range.

Remove the value.
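Both treatments can be sketched with the 1st and 99th percentiles as the range. This is a minimal example: the helper names and the data are hypothetical, and `statistics.quantiles` is used with its default interpolation method, which is only one of several ways to define a percentile.

```python
import statistics

def percentile_bounds(values, lower_pct=1, upper_pct=99):
    """Return the lower and upper percentile cut points of the data."""
    # n=100 yields 99 cut points; cuts[p - 1] is the p-th percentile.
    cuts = statistics.quantiles(values, n=100)
    return cuts[lower_pct - 1], cuts[upper_pct - 1]

def cap_outliers(values, low, high):
    """Treatment 1: pull values outside [low, high] back to the nearest bound."""
    return [min(max(v, low), high) for v in values]

def remove_outliers(values, low, high):
    """Treatment 2: drop values outside [low, high]."""
    return [v for v in values if low <= v <= high]

# Made-up data: 1..200 plus two extreme values.
values = list(range(1, 201)) + [5000, -3000]

low, high = percentile_bounds(values)
capped = cap_outliers(values, low, high)      # same length, extremes pulled in
trimmed = remove_outliers(values, low, high)  # shorter, extremes dropped
```

Capping keeps the sample size intact, while removal discards information. Either treatment also affects the ordinary values closest to the tails, so the bounds should be chosen with care.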