Apriori property, interesting rules distinguished from coincidental rules, use of a categorical variable

1. What is the Apriori property?

2. Following is a list of five transactions that include items A, B, C, and D:

• T1: {A, B, C}

• T2: {A, B}

• T3: {B}

• T4: {A, C}

• T5: {A, C, D}

Which itemsets satisfy the minimum support of 0.5? 

(Hint: An itemset may include more than one item.)
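
An answer to exercise 2 can be checked by brute force, as in the Python sketch below. It enumerates every candidate itemset and counts its support directly, so it does not use Apriori pruning. The names `support` and `frequent_itemsets` are illustrative and do not come from the exercise.

```python
from itertools import combinations

# The five transactions from exercise 2.
transactions = [
    {"A", "B", "C"},  # T1
    {"A", "B"},       # T2
    {"B"},            # T3
    {"A", "C"},       # T4
    {"A", "C", "D"},  # T5
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def frequent_itemsets(transactions, min_support):
    """Map each itemset (as a sorted tuple) with enough support to its support."""
    items = sorted(set().union(*transactions))
    result = {}
    for size in range(1, len(items) + 1):
        for candidate in combinations(items, size):
            s = support(candidate, transactions)
            if s >= min_support:
                result[candidate] = s
    return result


for itemset, s in frequent_itemsets(transactions, 0.5).items():
    print(itemset, s)
```

Enumerating all subsets is only practical for a handful of items. The Apriori property from exercise 1 is what allows a real implementation to skip most candidates.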

3. How are interesting rules distinguished from coincidental rules?

4. A local retailer has a database that stores 10,000 transactions from last summer. After analyzing the data, a data science team has identified the following statistics:

• {battery} appears in 4,000 transactions.

• {sunscreen} appears in 3,000 transactions.

• {sandals} appears in 4,000 transactions.

• {bowls} appears in 1,000 transactions.

• {battery, sunscreen} appears in 1,500 transactions.

• {battery, sandals} appears in 1,000 transactions.

• {battery, bowls} appears in 1,250 transactions.

• {battery, sunscreen, sandals} appears in 600 transactions.

Answer the following questions:

a. What are the support values of the preceding itemsets?

b. Assuming the minimum support is 0.05, which itemsets are considered frequent?

c. What are the confidence values of {battery} -> {sunscreen} and {battery, sunscreen} -> {sandals}?

d. Which of the two rules is more interesting?
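
The quantities in exercise 4 are ratios of the counts listed above, and the Python sketch below computes them from those counts as given. Support and confidence follow their standard definitions. Lift is included as one possible measure of interestingness, which is an assumption, since the exercise does not say which measure to use. The function names are illustrative.

```python
N_TRANSACTIONS = 10_000

# Transaction counts exactly as listed in exercise 4.
counts = {
    frozenset({"battery"}): 4_000,
    frozenset({"sunscreen"}): 3_000,
    frozenset({"sandals"}): 4_000,
    frozenset({"bowls"}): 1_000,
    frozenset({"battery", "sunscreen"}): 1_500,
    frozenset({"battery", "sandals"}): 1_000,
    frozenset({"battery", "bowls"}): 1_250,
    frozenset({"battery", "sunscreen", "sandals"}): 600,
}


def support(itemset):
    """Fraction of all transactions that contain the itemset."""
    return counts[frozenset(itemset)] / N_TRANSACTIONS


def confidence(lhs, rhs):
    """Confidence of lhs -> rhs: count(lhs and rhs together) / count(lhs)."""
    lhs, rhs = frozenset(lhs), frozenset(rhs)
    return counts[lhs | rhs] / counts[lhs]


def lift(lhs, rhs):
    """Lift of lhs -> rhs: confidence(lhs -> rhs) / support(rhs)."""
    return confidence(lhs, rhs) / support(rhs)


def frequent(min_support):
    """Itemsets from the table whose support meets the threshold."""
    return [set(s) for s in counts if support(s) >= min_support]


rules = [
    ({"battery"}, {"sunscreen"}),
    ({"battery", "sunscreen"}, {"sandals"}),
]
for lhs, rhs in rules:
    print(sorted(lhs), "->", sorted(rhs), confidence(lhs, rhs), lift(lhs, rhs))
```

A lift near 1 suggests that the two sides of a rule occur together about as often as independent items would, so comparing the lift of the two rules is one way to approach part d.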

5. In the use of a categorical variable with n possible values, explain the following:

a. Why only n – 1 binary variables are necessary

b. Why using n variables would be problematic
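
For exercise 5, a small Python sketch can make the encoding concrete. It assumes a hypothetical three-level variable (`red`, `green`, `blue`) and an illustrative helper named `dummy_encode`; neither comes from the exercise.

```python
def dummy_encode(values, levels, drop_first=True):
    """Return one row of 0/1 indicators per value.

    With drop_first=True the first level acts as the reference level
    and gets no column, leaving n - 1 indicator columns.
    """
    kept = levels[1:] if drop_first else levels
    return [[1 if v == level else 0 for level in kept] for v in values]


levels = ["red", "green", "blue"]          # hypothetical variable, n = 3
values = ["red", "blue", "green", "blue"]  # hypothetical observations

reduced = dummy_encode(values, levels)                 # n - 1 = 2 columns
full = dummy_encode(values, levels, drop_first=False)  # n = 3 columns

print(reduced)
print(full)
print([sum(row) for row in full])  # row sums of the full encoding
```

In the reduced encoding the reference level is the row of all zeros, so every level is still identifiable. In the full encoding each row has exactly one 1, which is the pattern to examine when answering part b.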

6. If the probability of an event occurring is 0.4, then

a. What is the odds ratio?

b. What is the log odds ratio?
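
The two quantities in exercise 6 can be computed with a short Python sketch. It assumes that "odds ratio" here means the odds of the event, p / (1 - p), and that the logarithm is the natural log. Both readings are assumptions about the exercise, and the function names are illustrative.

```python
import math


def odds(p):
    """Odds of an event with probability p, defined for 0 < p < 1."""
    if not 0 < p < 1:
        raise ValueError("p must be strictly between 0 and 1")
    return p / (1 - p)


def log_odds(p):
    """Natural log of the odds (the logit of p)."""
    return math.log(odds(p))


p = 0.4
print(odds(p), log_odds(p))
```

A probability below 0.5 gives odds below 1 and therefore a negative log odds, which is a quick sanity check on the result.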