 Mukesh Kumar Sharma Oct 26 2021 · 2 min read

Hello, my data science community!

I am writing this article for the following reasons:

1. Revision

2. Improving my typing

3. Friends, after reading this, if you have any doubts or suggestions, please feel free to contact me at [email protected]. Also mention your name and number so that I can contact you back and we can discuss.

Interview question set 1:

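1. Where have you used hypothesis testing in your machine learning solution?

Sol: ML is about working with data and drawing a conclusion from it, and hypothesis testing is how we decide whether the data gives enough evidence to reject the null hypothesis H0 or whether we fail to reject it. For example, it can be used during feature selection to check whether a feature is really related to the target, or to check whether a new model really performs better than the old one.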

2. What kind of statistical tests have you performed in your ML application?

Sol: In hypothesis testing there are several kinds of tests that we can perform, such as the z-test, t-test, F-test, ANOVA test and chi-square test.

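3. What do you understand by p-value, and what is its use in ML?

Sol: The p-value is the probability, assuming the null hypothesis is true, of getting a result at least as extreme as the one we observed. Equivalently, it is the smallest significance level at which the null hypothesis would be rejected. We compare it with a chosen significance level alpha (commonly 0.05): if the p-value is less than or equal to alpha, the result falls in the rejection region and we reject the null hypothesis in favour of the alternative hypothesis, otherwise we fail to reject the null hypothesis. In ML it is used, for example, to judge whether a feature has a statistically significant relationship with the target.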
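To make the p-value concrete, here is a minimal sketch of a two-sided z-test using only the Python standard library. The sample values, the hypothesised mean mu0 = 50 and the known sigma = 5 are made up for illustration and are not from any real application.

```python
from math import sqrt
from statistics import NormalDist, mean


def z_test_p_value(sample, mu0, sigma):
    """Two-sided p-value for H0: population mean == mu0, with sigma assumed known."""
    n = len(sample)
    z = (mean(sample) - mu0) / (sigma / sqrt(n))
    # Probability of a z at least this far from 0 (in either tail) if H0 is true
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p


# Hypothetical data, invented for this example
sample = [52, 55, 49, 58, 54, 53, 57, 56, 51, 55]
alpha = 0.05

z, p = z_test_p_value(sample, mu0=50, sigma=5)
decision = "reject H0" if p <= alpha else "fail to reject H0"
print(f"z = {z:.3f}, p-value = {p:.4f}, decision: {decision}")
```

Here the p-value comes out smaller than alpha, so for this toy data we would reject H0.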

4. Which type of error is more severe, Type 1 or Type 2? Explain why, with an example.

Sol: A Type 1 error (false positive) happens when the null hypothesis H0 is actually true, but we reject it and accept the alternative hypothesis H1. A Type 2 error (false negative) happens when the alternative hypothesis H1 is actually true, but we fail to reject the null hypothesis H0. Which one is more severe depends on the problem. For example, in disease screening where H0 is "the patient is healthy", a Type 2 error sends a sick patient home untreated, so it is usually the more serious one. In a spam filter where H0 is "the email is not spam", a Type 1 error hides a genuine email, so there it is usually the worse one.
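The two error types can also be seen in a small simulation. This is only a sketch under assumed settings: normally distributed data with known sigma = 1, samples of size 25, alpha = 0.05 and a true shift of 0.5 when H1 holds. All of these numbers are my own choices for illustration.

```python
import random
from math import sqrt
from statistics import NormalDist, mean


def rejects_h0(sample, mu0=0.0, sigma=1.0, alpha=0.05):
    """Two-sided z-test: True if H0 (mean == mu0) is rejected."""
    z = (mean(sample) - mu0) / (sigma / sqrt(len(sample)))
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return p < alpha


def error_rates(true_shift, trials=2000, n=25, seed=0):
    """Estimate Type 1 and Type 2 error rates by repeated sampling."""
    rng = random.Random(seed)
    type1 = type2 = 0
    for _ in range(trials):
        # H0 is true here, so every rejection is a Type 1 error
        null_sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        if rejects_h0(null_sample):
            type1 += 1
        # H1 is true here, so every non-rejection is a Type 2 error
        alt_sample = [rng.gauss(true_shift, 1.0) for _ in range(n)]
        if not rejects_h0(alt_sample):
            type2 += 1
    return type1 / trials, type2 / trials


type1_rate, type2_rate = error_rates(true_shift=0.5)
print(f"Type 1 rate: {type1_rate:.3f}, Type 2 rate: {type2_rate:.3f}")
```

The Type 1 rate should land close to alpha, while the Type 2 rate depends on the sample size and on how far the truth is from H0.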

5. Where can we use the chi-square test, and have you used it anywhere in your application?

Sol: The chi-square test is a statistical procedure used to check whether there is a relationship between categorical variables in the same population, by comparing the observed counts with the counts we would expect if the variables were independent. Whenever we deal with categorical data such as pass/fail or true/false, we can use the chi-square test.
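As a rough sketch, the chi-square test of independence for a 2x2 table can be computed by hand like this. The table (coaching yes/no against pass/fail) is invented for the example, no continuity correction is applied, and the p-value formula used here is valid only for 1 degree of freedom.

```python
from math import erfc, sqrt


def chi_square_2x2(table):
    """Chi-square statistic and p-value for a 2x2 table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count expected in this cell if rows and columns were independent
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    # Upper-tail probability of a chi-square variable with 1 degree of freedom
    p = erfc(sqrt(chi2 / 2))
    return chi2, p


# Hypothetical counts: rows = coaching yes/no, columns = pass/fail
table = [[30, 10],
         [20, 20]]

chi2, p_value = chi_square_2x2(table)
print(f"chi-square = {chi2:.3f}, p-value = {p_value:.4f}")
```

For this toy table the statistic is about 5.33, which is significant at alpha = 0.05.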

6. Can we use chi-square with a numerical dataset? If yes, give an example. If no, give a reason.

Sol: Yes, but not directly. The chi-square test works on counts of categories, so numerical data must first be grouped into categories (bins). Example: you have data where the input column contains the marks of students and the output column is categorical, i.e. pass/fail. We first bin the marks into ranges such as low, medium and high, count how many students fall into each combination of range and result, apply the chi-square test to that table of counts, and then proceed with model creation.
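One way to prepare numerical marks for a chi-square test is to bin them and count. The sketch below shows only that preparation step. The records and the cut-offs (40 and 70) are hypothetical, and with counts this small a real chi-square test would not be reliable.

```python
from collections import Counter


def mark_band(mark):
    """Map a numerical mark to a category (cut-offs are assumed, not standard)."""
    if mark < 40:
        return "low"
    if mark < 70:
        return "medium"
    return "high"


# Hypothetical (marks, result) records, invented for this example
records = [(28, "fail"), (35, "fail"), (38, "pass"), (45, "fail"), (52, "pass"),
           (61, "pass"), (66, "pass"), (72, "pass"), (85, "pass"), (91, "pass")]

# Observed counts for every (band, result) combination
counts = Counter((mark_band(mark), result) for mark, result in records)

for band in ("low", "medium", "high"):
    print(band, "pass:", counts[(band, "pass")], "fail:", counts[(band, "fail")])
```

The resulting table of counts is the kind of input a chi-square test of independence expects.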

7. What do you understand by ANOVA Testing?

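Sol: ANOVA stands for "analysis of variance". It is used to check whether the means of several groups (typically three or more) differ, by comparing the variance between the groups with the variance within the groups, using samples taken from each of them.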
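Here is a minimal sketch of the one-way ANOVA F statistic computed by hand. The three groups are toy numbers I made up. A large F means the group means differ by more than the spread inside the groups would explain, and turning F into a p-value needs the F distribution, which is not part of this sketch.

```python
from statistics import mean


def one_way_anova_f(groups):
    """F statistic = between-group mean square / within-group mean square."""
    all_values = [x for group in groups for x in group]
    grand_mean = mean(all_values)
    k = len(groups)        # number of groups
    n = len(all_values)    # total number of observations

    # How far each group mean is from the grand mean, weighted by group size
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # How far each observation is from its own group mean
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n - k)
    return ms_between / ms_within


# Hypothetical scores for three groups
groups = [[4, 5, 6], [6, 7, 8], [8, 9, 10]]

f_stat = one_way_anova_f(groups)
print(f"F = {f_stat}")  # F = 12.0 for this toy data
```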

8. Give me a scenario where you can use the z-test and the t-test.

Sol: The z-test is used for comparing means when the population standard deviation is known, usually with a large sample (about 30 or more). The t-test is also used for comparing means, but for the case where we don't know the population standard deviation and have to estimate it from the sample, which is typical for small samples.
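The difference between the two tests shows up in the denominator of the test statistic. In this sketch the sample, the hypothesised mean mu0 = 70 and the "known" population sigma = 3 are all assumptions made up for illustration.

```python
from math import sqrt
from statistics import mean, stdev


def z_statistic(sample, mu0, sigma):
    """Uses the known population standard deviation sigma."""
    return (mean(sample) - mu0) / (sigma / sqrt(len(sample)))


def t_statistic(sample, mu0):
    """Uses the sample standard deviation (N-1 in the denominator) instead."""
    return (mean(sample) - mu0) / (stdev(sample) / sqrt(len(sample)))


# Hypothetical measurements
sample = [68, 72, 75, 70, 71, 74, 69, 73]

z = z_statistic(sample, mu0=70, sigma=3)
t = t_statistic(sample, mu0=70)
print(f"z = {z:.4f} (compare with the standard normal distribution)")
print(f"t = {t:.4f} (compare with the t distribution, {len(sample) - 1} degrees of freedom)")
```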

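9. What do you understand by inferential statistics?

Sol: Inferential statistics is the set of methods that use sample data to draw conclusions about the whole population, for example through estimation, confidence intervals and hypothesis tests. Descriptive statistics only summarise the sample, and inferential statistics build on those summaries to reach a test result.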

10. When you calculate standard deviation or variance, why do you use N-1 in the denominator? (Hint: Bessel's correction)

Sol: N-1 shows that we are working with sample data and not the whole population. The deviations are measured from the sample mean, which is itself calculated from the same data, so on average they come out smaller than the deviations from the true population mean. Because of this, dividing by N underestimates the population variance. Dividing by N-1 (Bessel's correction) removes this bias, because only N-1 of the deviations are free to vary once the sample mean is fixed.
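The effect of N-1 can be checked with a small simulation. This is a sketch with settings I picked myself: samples of size 5 drawn from a normal population with standard deviation 2 (true variance 4), repeated 5000 times with a fixed seed.

```python
import random
from statistics import mean


def variance_with_divisor(sample, divisor):
    """Sum of squared deviations from the sample mean, divided by `divisor`."""
    m = mean(sample)
    return sum((x - m) ** 2 for x in sample) / divisor


rng = random.Random(42)
true_variance = 4.0   # population standard deviation is 2, so variance is 2 ** 2
n = 5
trials = 5000

biased, unbiased = [], []
for _ in range(trials):
    sample = [rng.gauss(0.0, 2.0) for _ in range(n)]
    biased.append(variance_with_divisor(sample, n))        # divide by N
    unbiased.append(variance_with_divisor(sample, n - 1))  # divide by N-1

avg_biased = mean(biased)
avg_unbiased = mean(unbiased)
print(f"true variance:       {true_variance}")
print(f"average with N:      {avg_biased:.3f}")
print(f"average with N-1:    {avg_unbiased:.3f}")
```

On average the N version falls short of the true variance by a factor of about (N-1)/N, while the N-1 version stays close to it.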

The next set of questions will also be published here soon.

Thank you for reading and showing interest.