Probability of Video labeling | Data Science job interview question


May 22, 2023
One of the important problems in machine learning is data labeling. Assume that a company has a large number of videos and they want to classify whether a video contains malicious content or not. The company hires a team of 20 workers to label the data. Each person has a 10% probability of mislabeling. Of those mislabeled videos, 50% were labeled as malicious. Of the videos that were properly labeled, 60% were labeled as malicious. We assume that, in fact, about 70% of videos contain malicious content. If 3 workers out of 20 labeled a video as malicious, what is the probability that the video is actually malicious?

The part I'm not sure about is how to calculate the probability that a worker labels a video as malicious when it is truly malicious, and when it is truly not malicious. Once I have those two, applying the law of total probability should be simple. Can anyone help me? Thanks!
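
One possible way to set this up is sketched below in Python. It rests on several assumptions that the problem statement does not spell out: a "mislabeled" video labeled malicious is taken to be truly non-malicious, a "properly labeled" video labeled malicious is taken to be truly malicious, and the 20 workers are assumed to label independently given the true class. Under that reading the single-worker joint table implies that 0.54 + 0.05 = 59% of videos are malicious, which does not match the stated 70%, so the sketch takes only the conditional probabilities from the table and uses 70% as the prior. All variable names are made up for illustration.

```python
from math import comb

# Figures taken from the problem statement.
p_mislabel = 0.10                   # a worker mislabels a video
p_mal_label_given_mislabel = 0.50   # share of mislabeled videos labeled malicious
p_mal_label_given_correct = 0.60    # share of properly labeled videos labeled malicious
prior_malicious = 0.70              # stated share of truly malicious videos
n_workers, k_mal_labels = 20, 3

# Single-worker joint probabilities (ASSUMED reading of the problem):
# labeled malicious + correct  -> truly malicious
# labeled malicious + wrong    -> truly not malicious, and so on.
p_lab_mal_true_mal = (1 - p_mislabel) * p_mal_label_given_correct
p_lab_mal_true_ok = p_mislabel * p_mal_label_given_mislabel
p_lab_ok_true_mal = p_mislabel * (1 - p_mal_label_given_mislabel)
p_lab_ok_true_ok = (1 - p_mislabel) * (1 - p_mal_label_given_correct)

# Conditional probability that one worker says "malicious" given the truth.
p_lab_mal_given_mal = p_lab_mal_true_mal / (p_lab_mal_true_mal + p_lab_ok_true_mal)
p_lab_mal_given_ok = p_lab_mal_true_ok / (p_lab_mal_true_ok + p_lab_ok_true_ok)


def binom_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


# Likelihood of seeing 3 "malicious" labels out of 20 under each true class
# (ASSUMES workers are independent given the true class).
like_mal = binom_pmf(k_mal_labels, n_workers, p_lab_mal_given_mal)
like_ok = binom_pmf(k_mal_labels, n_workers, p_lab_mal_given_ok)

# Total probability in the denominator, Bayes' rule for the posterior.
evidence = prior_malicious * like_mal + (1 - prior_malicious) * like_ok
posterior_malicious = prior_malicious * like_mal / evidence

print(f"P(label malicious | truly malicious)     = {p_lab_mal_given_mal:.4f}")
print(f"P(label malicious | truly not malicious) = {p_lab_mal_given_ok:.4f}")
print(f"P(truly malicious | 3 of 20 say malicious) = {posterior_malicious:.3e}")
```

With these assumptions the posterior comes out extremely close to zero: if each worker is right most of the time, 17 of 20 workers saying "not malicious" outweighs the 70% prior by a wide margin. A different reading of "mislabeled" would change the two conditional probabilities, but the total-probability step stays the same.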