Q 1: Bad gums may mean a bad mood. Researchers discovered that 85% of people who have suffered a bad mood had periodontal disease, an inflammation of the gums. Only 29% of healthy people have this disease. Suppose that in a certain community bad moods are quite rare, occurring with only 10% probability. If someone has periodontal disease, what is the probability that he or she will have a bad mood? (10 Marks)
Note: Draw the tree diagram for the above problem. Handwritten tree diagram is prohibited.
To solve this problem, we can use Bayes’ theorem. Let’s define the events:
A: Having periodontal disease
B: Having a bad mood
We are given the following probabilities:
P(B) = 0.10 (probability of having a bad mood)
P(B’) = 0.90 (probability of not having a bad mood)
P(A|B) = 0.85 (probability of periodontal disease given a bad mood)
P(A|B’) = 0.29 (probability of periodontal disease given no bad mood, i.e. among healthy people)
We need to find P(B|A), which is the probability of having a bad mood given that the person has periodontal disease.
Tree diagram (first branch: mood, second branch: periodontal disease), with the joint probability at the end of each path:
B (0.10) → A (0.85): 0.10 * 0.85 = 0.085
B (0.10) → A’ (0.15): 0.10 * 0.15 = 0.015
B’ (0.90) → A (0.29): 0.90 * 0.29 = 0.261
B’ (0.90) → A’ (0.71): 0.90 * 0.71 = 0.639
Using Bayes’ theorem, we have:
P(B|A) = (P(A|B) * P(B)) / P(A)
P(A), the overall probability of having periodontal disease, is not given directly, but we can calculate it with the law of total probability:
P(A) = P(A|B) * P(B) + P(A|B’) * P(B’)
P(A) = (0.85 * 0.10) + (0.29 * 0.90) = 0.085 + 0.261 = 0.346
Now, let’s plug in the values:
P(B|A) = (0.85 * 0.10) / 0.346 = 0.085 / 0.346 ≈ 0.2457
Therefore, the probability that someone with periodontal disease will have a bad mood is approximately 0.246, or 24.6%.
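As a check on the arithmetic, here is a minimal Python sketch of Bayes’ theorem applied to the figures in the question. It reads the 85% as P(disease | bad mood) and the 29% as P(disease | no bad mood), and computes P(disease) with the law of total probability. The function and variable names are illustrative only.

```python
def posterior(prior, p_evidence_given_event, p_evidence_given_no_event):
    """Return P(event | evidence) using Bayes' theorem."""
    # Law of total probability: P(A) = P(A|B) * P(B) + P(A|B') * P(B')
    p_evidence = (p_evidence_given_event * prior
                  + p_evidence_given_no_event * (1 - prior))
    # Bayes' theorem: P(B|A) = P(A|B) * P(B) / P(A)
    return p_evidence_given_event * prior / p_evidence


# Figures taken from the question text.
p_bad_mood = 0.10                 # P(B)
p_disease_given_bad_mood = 0.85   # P(A|B)
p_disease_given_healthy = 0.29    # P(A|B')

p_bad_mood_given_disease = posterior(
    p_bad_mood, p_disease_given_bad_mood, p_disease_given_healthy
)
print(round(p_bad_mood_given_disease, 4))  # 0.2457
```

A valid probability must lie between 0 and 1, so any result above 1 signals an error in the setup.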
This explanation is for reference purposes only.
Q 2: Using MS Excel, build a regression model with ‘Instagram followers’ as the dependent variable and ‘no. of posts per day’ as the independent variable. Write the interpretation of the Excel tables. Also write a conclusion on the fit of your model.
To perform a regression analysis in MS Excel, follow these steps:
- Enter your data: In an Excel worksheet, enter your data for the dependent variable (Instagram followers) in one column and the independent variable (number of posts per day) in another column. Make sure each row represents a data point.
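- Insert scatter plot: Select the data range for both variables and go to the “Insert” tab. Choose the scatter chart option and select the type that shows data points (markers) only. Excel plots the leftmost selected column on the horizontal axis, so place the independent variable (number of posts per day) in that column.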
- Add a trendline: Right-click on any data point in the scatter plot and choose “Add Trendline.” In the “Format Trendline” pane, select the type of trendline you want (linear, polynomial, etc.) and check the box for displaying the equation and R-squared value on the chart.
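- View the regression analysis output: The equation and R-squared value will be displayed on the chart. For more detailed output, including the coefficients table, go to the “Data” tab and use “Data Analysis” > “Regression” (this requires the Analysis ToolPak add-in to be enabled), with Instagram followers as the Input Y Range and number of posts per day as the Input X Range.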
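The line and R-squared value that the Excel trendline reports come from ordinary least squares, which can be reproduced outside Excel as a cross-check. The sketch below is a minimal pure-Python illustration. The data values and the function name `fit_line` are hypothetical, made up for this example, and are not the assignment data.

```python
def fit_line(x, y):
    """Fit y = intercept + slope * x by ordinary least squares.

    Returns (slope, intercept, r_squared).
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of squares and cross-products around the means.
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # R-squared = 1 - (residual variation / total variation).
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    r_squared = 1 - ss_res / ss_tot
    return slope, intercept, r_squared


# Hypothetical sample data (NOT from the assignment).
posts_per_day = [1, 2, 3, 4, 5]
followers = [1040, 1110, 1150, 1190, 1260]

slope, intercept, r_squared = fit_line(posts_per_day, followers)
print(f"followers = {intercept:.1f} + {slope:.1f} * posts_per_day")
print(f"R-squared = {r_squared:.4f}")
```

With real data, the slope, intercept, and R-squared printed here should match the values Excel shows on the chart, up to rounding.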
Interpreting the Excel tables for the regression model:
- Equation: The equation displayed on the chart is the estimated regression equation. It expresses the dependent variable (Instagram followers) as a linear function of the independent variable (number of posts per day). For example, the equation might be “Instagram followers = 1000 + 50 * (number of posts per day).” This means that, on average, each additional post per day is associated with an estimated increase of 50 followers, and the predicted number of followers at zero posts per day (the intercept) is 1000.
- R-squared (R²): R-squared is a measure of how well the regression line fits the data. It ranges from 0 to 1, with 1 indicating a perfect fit. The R-squared value tells you the proportion of the variation in the dependent variable (Instagram followers) that can be explained by the independent variable (number of posts per day). For example, an R-squared of 0.80 means that 80% of the variation in followers is explained by the number of posts per day. A higher R-squared value indicates a stronger linear relationship between the variables.
- Coefficients: The coefficients table provides information about the estimated coefficients for the regression equation. It includes the coefficient estimates, standard errors, t-values, and p-values for the intercept and each independent variable. The coefficient estimate represents the estimated change in the dependent variable associated with a one-unit change in the independent variable. The standard error measures the uncertainty of the estimate, the t-value is the estimate divided by its standard error, and a p-value below the chosen significance level (commonly 0.05) indicates that the coefficient is statistically significantly different from zero.
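To show how the standard error and t-value in the coefficients table relate to the data, here is a minimal pure-Python sketch for the slope of a simple linear regression. The data values and the function name `slope_statistics` are hypothetical, made up for this example. The p-value is left out because it requires the t-distribution with n − 2 degrees of freedom, which the Python standard library does not provide; Excel reports it directly in the table.

```python
import math


def slope_statistics(x, y):
    """Return (slope, standard_error, t_value) for simple linear regression."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual sum of squares around the fitted line.
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    # Residual variance uses n - 2 degrees of freedom (two estimated parameters).
    residual_variance = ss_res / (n - 2)
    standard_error = math.sqrt(residual_variance / sxx)
    # t-value: the estimate divided by its standard error.
    t_value = slope / standard_error
    return slope, standard_error, t_value


# Hypothetical sample data (NOT from the assignment).
posts_per_day = [1, 2, 3, 4, 5]
followers = [1040, 1110, 1150, 1190, 1260]

slope, standard_error, t_value = slope_statistics(posts_per_day, followers)
print(f"slope = {slope:.2f}, standard error = {standard_error:.3f}, t = {t_value:.2f}")
```

A large absolute t-value corresponds to a small p-value, which is the evidence that the slope differs from zero.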