Link to the previous post: https://statinfer.com/204-3-3-how-decision-tree-splits-works/

## Entropy Calculation – Example

- Entropy at root
- Total population at root: 100 records [50+, 50-]
- Entropy(S) = −(p+)log2(p+) − (p−)log2(p−)
- = −(0.5)log2(0.5) − (0.5)log2(0.5)
- = −(0.5)(−1) − (0.5)(−1) = 1
- 100% impurity at the root

**Entropy(S)=−(p+)(log2(p+))−(p−)(log2(p−))**
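The root-node arithmetic above can be checked with a short script. This is a minimal sketch, not code from the post: the `entropy` helper is our own name, and we use Python's `math.log` where the post's snippets use R's `log(x, 2)`.

```python
import math

def entropy(counts):
    """Shannon entropy (base 2) for a list of class counts."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total, 2)
                for c in counts if c > 0)

# Root node: 100 records [50+, 50-]
print(entropy([50, 50]))  # 1.0 -> 100% impurity
```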

### Entropy Calculation

- Age splits the population into two segments
- Segment-1: Age="Young"
- Segment-2: Age="Old"
- Entropy at segment-1
- Age="Young" segment has 60 records [31+, 29-]
**Entropy(S)=−(p+)(log2(p+))−(p−)(log2(p−))**

- −(31/60)log2(31/60) − (29/60)log2(29/60)
- In R: `-(31/60)*log(31/60, 2) - (29/60)*log(29/60, 2)` = 0.9991984 (99% impurity in this segment)

- Entropy at segment-2
- Age="Old" segment has 40 records [19+, 21-]

**Entropy(S)=−(p+)(log2(p+))−(p−)(log2(p−))**

- −(19/40)log2(19/40) − (21/40)log2(21/40)
- In R: `-(19/40)*log(19/40, 2) - (21/40)*log(21/40, 2)` = 0.9981959 (99% impurity in this segment too)

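The two segment entropies can be reproduced the same way. This is a sketch under the same assumptions: `entropy` is our own helper (Python's `math.log`, not the post's R `log(x, 2)`).

```python
import math

def entropy(counts):
    """Shannon entropy (base 2) for a list of class counts."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total, 2)
                for c in counts if c > 0)

# Segment-1: Age="Young", 60 records [31+, 29-]
print(round(entropy([31, 29]), 7))  # 0.9991984

# Segment-2: Age="Old", 40 records [19+, 21-]
print(round(entropy([19, 21]), 7))  # 0.9981959
```

Both segments stay close to 1, so the Age split barely reduces the impurity seen at the root.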

### Practice : Entropy Calculation – Example

- Calculate entropy at the root for the given population
- Calculate the entropy for the two distinct gender segments

### Code: Entropy Calculation

- Entropy at root: 100%
- Male segment: `-(48/60)*log(48/60, 2) - (12/60)*log(12/60, 2)` = 0.7219281
- Female segment: `-(2/40)*log(2/40, 2) - (38/40)*log(38/40, 2)` = 0.286397
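The gender-segment figures can be verified with the same kind of sketch. As before, `entropy` is our own helper in Python; the segment class counts ([48+, 12-] and [2+, 38-]) are read off the fractions in the R expressions above.

```python
import math

def entropy(counts):
    """Shannon entropy (base 2) for a list of class counts."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total, 2)
                for c in counts if c > 0)

# Male segment: 60 records [48+, 12-]
print(round(entropy([48, 12]), 7))  # 0.7219281

# Female segment: 40 records [2+, 38-]
print(round(entropy([2, 38]), 7))   # 0.286397
```

Unlike the Age split, both gender segments are far below the root entropy of 1, which is why this split is the more informative one.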

The next post covers information gain in decision tree splits.

Link to the next post: https://statinfer.com/204-3-5-information-gain-in-decision-tree-split/