11. What do you understand by right skewness? Give an example.
Ans: When a KDE plot of the data shows an elongated tail on the right-hand side, the data is called right-skewed or positively skewed. One example is the length of comments on a post: most people write short comments, and only a few write very long ones. The relation of mean, median, and mode shown in the diagram below is Mean > Median > Mode.
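A quick way to check the Mean > Median relation is to simulate some right-skewed data. This is only a minimal sketch: the lognormal draws stand in for hypothetical comment lengths, and the parameters 3.0 and 1.0 are arbitrary assumptions, not values from any real dataset.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is reproducible

# Hypothetical comment lengths: lognormal draws give many short values
# and a few very long ones, i.e. a long right tail.
comment_lengths = [random.lognormvariate(3.0, 1.0) for _ in range(10_000)]

mean_len = statistics.mean(comment_lengths)
median_len = statistics.median(comment_lengths)

# For right-skewed data the long right tail pulls the mean above the median.
print(f"mean={mean_len:.1f}, median={median_len:.1f}")
print("right-skewed by this check:", mean_len > median_len)
```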
12. What is the difference between Normal Distribution, Standard Normal Distribution, and Uniform Distribution?
Normal Distribution: When you plot the data and see a bell-shaped curve, the data is normally distributed. It is often said that models learn relationships more easily from such data. Let's assume that in an office, employees arrive between 8:00 am and 8:40 am. If the arrival times are normally distributed, most people arrive around 8:20 am, since that is the peak of the bell curve.
Uniform Distribution: Every outcome is equally likely. In this case, an employee is equally likely to arrive at any time between 8:00 am and 8:40 am. Since 8:40 - 8:00 = 40 minutes, each one-minute slot in that range has the same probability of an arrival, 1/40.
Standard Normal Distribution: It's a normal distribution whose mean is 0 and whose standard deviation is 1. It is used for questions that the 68-95-99.7 rule cannot answer directly, for example what share of the data lies within 1.5 s.d. of the mean. To find this we use the z-score formula, z = (x - mean) / s.d. For example, you have normally distributed data of students' scores and you know its mean and s.d. Now you want to find the probability of a student scoring more than 60%. To solve this, you convert 60% to a z-score and look up the probability in the standard normal distribution.
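The student-score example can be sketched with Python's standard library. The mean of 50 and s.d. of 10 are assumed values for illustration only, since no numbers are given above.

```python
from statistics import NormalDist

# Assumed values for illustration only; the example gives no numbers.
mean_score = 50.0
sd_score = 10.0
threshold = 60.0

# z-score: how many standard deviations the threshold lies from the mean.
z = (threshold - mean_score) / sd_score

standard_normal = NormalDist()  # defaults to mean 0, s.d. 1

# P(score > threshold) equals P(Z > z) under the standard normal.
p_above = 1.0 - standard_normal.cdf(z)

# Share of values within 1.5 s.d. of the mean.
p_within_1_5 = standard_normal.cdf(1.5) - standard_normal.cdf(-1.5)

print(f"z = {z:.2f}")
print(f"P(score > 60) = {p_above:.4f}")
print(f"P(|Z| < 1.5)  = {p_within_1_5:.4f}")
```

With these assumed numbers the z-score is 1, so roughly 16% of students score above 60%, and about 87% of values fall within 1.5 s.d. of the mean.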
13. What different kinds of probability distributions have you heard of?
Ans: I have heard of Binomial Distribution, Pareto distribution, Normal Distribution, Standard Normal Distribution, Poisson Distribution, Log-Normal Distribution.
14. What do you understand by a symmetric dataset?
Ans: A dataset is symmetric if its distribution can be split at the centre into two halves that are mirror images of each other. In short, the data is equally distributed on both sides of the mean.
15. In your last project, were you using symmetric or asymmetric data? If it was asymmetric, what kind of EDA did you perform?
Ans: Some of the data was symmetric, so I didn't touch that. Other data was asymmetric in nature, and for that I checked whether it had outliers, eliminated those outliers, and performed standard scaling, which transforms the data so that s.d. = 1 and mean = 0.
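The two steps in that answer, dropping outliers and then standard scaling, could look roughly like this. The answer does not say how outliers were detected, so the 1.5 × IQR rule here is an assumption, and the function names and sample values are made up for illustration.

```python
import statistics

def remove_outliers_iqr(values, k=1.5):
    """Keep values inside [Q1 - k*IQR, Q3 + k*IQR] (IQR rule; an assumed choice)."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if low <= v <= high]

def standard_scale(values):
    """Rescale so the result has mean 0 and (population) standard deviation 1."""
    mu = statistics.mean(values)
    sigma = statistics.pstdev(values)
    return [(v - mu) / sigma for v in values]

raw = [12, 14, 15, 15, 16, 17, 18, 19, 21, 95]  # made-up data; 95 is an obvious outlier
cleaned = remove_outliers_iqr(raw)
scaled = standard_scale(cleaned)

print(cleaned)
print(round(statistics.mean(scaled), 6), round(statistics.pstdev(scaled), 6))
```

Note that standard scaling only shifts and rescales the values; it does not change the shape of the distribution, so any skew left after removing outliers is still there.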
16. Can you please tell me the formula for skewness?
Ans: Skewness = 3 * (mean - median) / standard deviation. This is Pearson's second coefficient of skewness.
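As a small sketch, the formula translates directly into code. The function name and the two sample lists are made up for illustration, and the sample standard deviation is used here as one reasonable choice.

```python
import statistics

def pearson_median_skewness(values):
    """Skewness = 3 * (mean - median) / standard deviation."""
    mean = statistics.mean(values)
    median = statistics.median(values)
    sd = statistics.stdev(values)  # sample standard deviation (assumed choice)
    return 3 * (mean - median) / sd

right_skewed = [1, 2, 2, 3, 3, 3, 4, 20]  # made-up data with a long right tail
symmetric = [1, 2, 3, 4, 5]               # mean equals median

print(pearson_median_skewness(right_skewed))  # positive for a right skew
print(pearson_median_skewness(symmetric))     # zero when mean equals median
```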
17. Have you applied Student's t-distribution anywhere?
Ans: No, but I know when to use it: when the sample size is small (less than 30) and the data does not contain outliers. It is used, for example, to compare two samples.
18. What do you understand by statistical analysis of data? Give me a scenario where you have used statistical analysis in your last projects.
Ans: I have used it in every project. It's really important to understand the data, because that tells us which operations are required to build a good machine learning model. Statistics is a blend of descriptive and inferential methods, and both work hand in hand to solve a particular business requirement.
19. Can you please tell me the criteria for applying the binomial distribution, with an example?
Ans: You see a Binomial Distribution when an experiment is repeated a fixed no. of times, the trials are independent of each other, and each trial has only two possible outcomes. For example, tossing 5 coins and finding the probability of getting heads. Each coin has two outcomes, heads and tails, so if we do the maths, the possible outcomes from 5 flips would be 2 × 2 × 2 × 2 × 2 = 32.
Criteria: a fixed no. of trials and a constant probability of heads/success on each trial.
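The coin example can be worked through in a few lines. This is a minimal sketch assuming a fair coin (probability of heads 0.5); the function name is made up for illustration.

```python
from math import comb

def binomial_pmf(k, n, p):
    """P(exactly k successes in n independent trials with success probability p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

n_flips, p_heads = 5, 0.5  # five flips of a coin assumed to be fair

# Number of equally likely outcome sequences for 5 flips.
print("outcomes:", 2 ** n_flips)

# Probability of each possible number of heads.
for k in range(n_flips + 1):
    print(f"P({k} heads) = {binomial_pmf(k, n_flips, p_heads):.5f}")
```

For instance, exactly 3 heads can happen in 10 of the 32 sequences, so its probability is 10/32 = 0.3125.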
20. There are 100 people taking this particular 30-day Data Science interview preparation course. What is the probability that 10 people will be able to make a transition in 1 week, if 50 people were able to make a transition in 3 weeks? (Hint: Poisson Distribution)
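One way to set this up, following the hint, is sketched below. It rests on an assumption the question does not state outright: that 50 transitions in 3 weeks means an average rate of 50/3 transitions per week, and that "10 people" means exactly 10. Treat it as one possible reading rather than the definitive answer.

```python
from math import exp, factorial

def poisson_pmf(k, lam):
    """P(exactly k events) when events occur at an average rate of lam per interval."""
    return exp(-lam) * lam**k / factorial(k)

# Assumed reading of the question: 50 transitions in 3 weeks gives the weekly rate.
lam_per_week = 50 / 3

p_exactly_10 = poisson_pmf(10, lam_per_week)
print(f"rate per week = {lam_per_week:.2f}")
print(f"P(exactly 10 transitions in 1 week) = {p_exactly_10:.4f}")
```

The Poisson model also ignores the cap of 100 people; with a weekly rate this far below 100 that has very little effect on the result.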