Interview Questions Set 1 (31 to 40)

#datascience #machinelearning #statistics

Akash Borgalli Dec 24 2021 · 2 min read

31. Have you used A/B testing in your project so far? If yes, explain. If not, tell me about A/B testing.

Ans: I haven’t used A/B testing in a project, but I can explain the idea. A/B testing is an application of hypothesis testing: we state a null hypothesis (H0) and an alternative hypothesis (H1) that contradicts H0, run the experiment, and then conclude whether to reject or fail to reject the null hypothesis (H0).
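As a rough illustration of the steps above, here is a minimal sketch of an A/B test on conversion rates using a two-proportion z-test. The choice of test, the function name `two_proportion_z_test`, the counts, and the 0.05 significance level are all assumptions made for this example, not something from a real project.

```python
from math import sqrt
from statistics import NormalDist

def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Two-sided z-test for H0: variants A and B have the same conversion rate."""
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled conversion rate, valid under H0 (no difference between A and B)
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    se = sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal distribution
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

# Made-up counts: conversions and visitors for each variant
z, p_value = two_proportion_z_test(conv_a=100, n_a=1000, conv_b=130, n_b=1000)

alpha = 0.05  # assumed significance level
decision = "reject H0" if p_value < alpha else "fail to reject H0"
print(f"z = {z:.3f}, p = {p_value:.4f}, decision: {decision}")
```

Note that the conclusion is phrased as "reject" or "fail to reject" H0; a large p-value does not prove that H0 is true.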

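32. Can we use the alternative hypothesis as a null hypothesis?

Ans: No, in my view. The null hypothesis is the default claim of no effect or no difference, while the alternative hypothesis is the claim we are looking for evidence to support, so the two roles are not interchangeable.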

33. Can you please explain the confusion matrix for more than 2 variables?

Ans: I have answered this question by writing a blog in a beautiful way. Please check this link.

34. Give me an example of a false negative from this interview.

Ans: A false negative means the prediction was negative but the actual value is positive. Let’s suppose I predicted that my answer to the previous question asked by you was wrong, but it actually turned out to be correct.

35. What do you understand by precision, recall, and F1 score? Explain with an example.

Ans: Precision and recall are commonly used in information retrieval, for example when a search engine such as Google returns the top 10 links that are most relevant to your query. Both focus on the positive class.

Precision = TP / (TP + FP) (out of all the positives predicted by the model, what % are actually positive)

Recall (TPR) = TP / (TP + FN) (out of all the actual positives, what % did the model correctly predict as positive)

F1 Score = 2 * ((precision * recall) / (precision + recall)). The value lies between 0 and 1 and can be expressed as a %. The closer the score is to 1, the better the model balances precision and recall.

We go for the F1 score when, looking at the business requirement, we don’t know whether precision or recall matters more.
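The three formulas can be checked with a small sketch. The helper name `precision_recall_f1` and the counts (40 true positives, 10 false positives, 20 false negatives) are made up for illustration.

```python
def precision_recall_f1(tp, fp, fn):
    """Compute precision, recall and F1 from confusion-matrix counts."""
    # Precision: share of predicted positives that are actually positive
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Recall: share of actual positives that the model found
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1: harmonic mean of precision and recall
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1

# Hypothetical counts
precision, recall, f1 = precision_recall_f1(tp=40, fp=10, fn=20)
print(f"precision = {precision:.2f}, recall = {recall:.2f}, F1 = {f1:.2f}")
```

With these counts the precision is 0.80 and the recall is about 0.67, so the F1 score (about 0.73) sits between the two, closer to the lower value.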

36. What kind of questions do you ask your client if they give you a dataset?

Ans: I will first check how many features have null values and ask the client whether there is a reason those values are null or zero. I will try to understand from the client how each particular feature impacts the target feature. I will also try to understand why and how this data was generated.

37. Have you ever done an F test on your dataset? If yes, give an example. If no, then explain the F distribution.

Ans: The F-distribution often arises when you are working with ratios of variances. The F-test is the test used in ANOVA (analysis of variance). It is applied to 1 numerical & 1 categorical feature and tells whether the means of 2 or more groups are similar or not. The F-distribution is right-skewed.
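To make the "ratio of variances" idea concrete, here is a minimal sketch that computes the one-way ANOVA F statistic by hand. The function name `one_way_anova_f` and the three groups are invented for this example. The sketch stops at the F statistic; a library such as SciPy (`scipy.stats.f_oneway`) would also return the p-value.

```python
from statistics import mean

def one_way_anova_f(groups):
    """F statistic = between-group variance / within-group variance."""
    k = len(groups)                    # number of groups (categories)
    n = sum(len(g) for g in groups)    # total number of observations
    grand_mean = mean([x for g in groups for x in g])
    # Variation of the group means around the grand mean
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the observations around their own group mean
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n - k)
    return ms_between / ms_within

# Hypothetical numerical feature split by a 3-level categorical feature
groups = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
f_stat = one_way_anova_f(groups)
print(f"F = {f_stat:.2f}")
```

A large F value means the group means differ much more than the spread inside each group would suggest.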

38. What is AUC & ROC Curve? Explain with uses.

Ans: ROC stands for Receiver Operating Characteristic. The ROC curve plots the true positive rate against the false positive rate at different thresholds, so it helps us decide which threshold to use, which depends on the business use case. AUC stands for Area Under the Curve. It summarises the ROC curve as a single number and helps you find out which model is the best fit to solve a particular business problem. AUC is computed from the ROC curve.
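The relation between thresholds, the ROC curve and AUC can be sketched in a few lines. The function names `roc_points` and `auc`, the labels and the scores below are assumptions for illustration; in practice a library routine would normally be used instead.

```python
def roc_points(y_true, scores):
    """Return (FPR, TPR) points obtained by sweeping the decision threshold."""
    pos = sum(y_true)
    neg = len(y_true) - pos
    points = [(0.0, 0.0)]
    # One threshold per distinct score, from strictest to most lenient
    for thr in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(y_true, scores) if s >= thr and y == 1)
        fp = sum(1 for y, s in zip(y_true, scores) if s >= thr and y == 0)
        points.append((fp / neg, tp / pos))
    return points

def auc(points):
    """Area under the ROC curve using the trapezoidal rule."""
    return sum((x2 - x1) * (y1 + y2) / 2
               for (x1, y1), (x2, y2) in zip(points, points[1:]))

# Hypothetical true labels and model scores
y_true = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]

points = roc_points(y_true, scores)
area = auc(points)
print(points)
print(f"AUC = {area:.2f}")
```

Each point corresponds to one threshold, which is why the curve is useful for picking a threshold, while the single AUC number is convenient for comparing models.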

40. What do you understand by a 1-tail test & a 2-tail test? Give an example.

Ans: 

  • For a 2-tail test, we check for a difference in either direction. Let’s consider a person who wrote an exam and may score above or below 700 out of 1000 marks.
  • For a 1-tail test, we check in one direction only, for example whether a particular drug lowers response time or whether a particular machine stops after an hour.
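The difference between the two kinds of test shows up in how the p-value is computed from the same test statistic. Below is a minimal sketch assuming a z statistic that follows the standard normal distribution under H0; the function name `tail_p_values`, the value z = -1.8 and the 0.05 significance level are made up for illustration.

```python
from statistics import NormalDist

def tail_p_values(z):
    """Return (two_tailed, lower_tailed, upper_tailed) p-values for a z statistic."""
    std_normal = NormalDist()
    # 2-tail: a difference in either direction counts as evidence against H0
    two_tailed = 2 * (1 - std_normal.cdf(abs(z)))
    # 1-tail (lower): H1 says the value is lower, e.g. a drug lowers response time
    lower_tailed = std_normal.cdf(z)
    # 1-tail (upper): H1 says the value is higher
    upper_tailed = 1 - std_normal.cdf(z)
    return two_tailed, lower_tailed, upper_tailed

# Hypothetical test statistic
two_tailed, lower_tailed, upper_tailed = tail_p_values(-1.8)
print(f"2-tail p = {two_tailed:.4f}, lower 1-tail p = {lower_tailed:.4f}")
```

At an assumed significance level of 0.05, this statistic would reject H0 in the lower 1-tail test but not in the 2-tail test, which is why the direction of the hypothesis has to be fixed before looking at the data.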