Thomas McGrath, Andrei Kapishnikov, Nenad Tomašev, Adam Pearce, Martin Wattenberg, Demis Hassabis, Been Kim, Ulrich Paquet, Vladimir Kramnik
The following visualization describes concepts from AlphaZero’s ‘representational space’ (i.e., its internal activations). For each of the network's 20 ResNet blocks, 36 factors are extracted from that block's activations. Factors aren't aligned between blocks, so a given factor index in one block need not correspond to the same index in another.
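As a rough illustration of what extracting 36 factors per block could look like, here is a minimal sketch. It assumes a non-negative matrix factorization with multiplicative updates, which this page does not itself specify; the activation shapes, the random stand-in data, and the names `nmf` and `factors_per_block` are hypothetical.

```python
import numpy as np

def nmf(activations, n_factors=36, n_iters=100, seed=0, eps=1e-9):
    """Approximate a non-negative matrix V (rows x channels) as W @ H.

    Uses multiplicative updates. This is an assumed method for
    illustration, not a confirmed description of the pipeline.
    """
    rng = np.random.default_rng(seed)
    n_rows, n_channels = activations.shape
    W = rng.random((n_rows, n_factors))      # per-row factor weights
    H = rng.random((n_factors, n_channels))  # factor directions in channel space
    for _ in range(n_iters):
        H *= (W.T @ activations) / (W.T @ W @ H + eps)
        W *= (activations @ H.T) / (W @ H @ H.T + eps)
    return W, H

# Hypothetical stand-in data: one non-negative activation matrix per block.
rng = np.random.default_rng(1)
n_blocks = 20
block_activations = [rng.random((200, 64)) for _ in range(n_blocks)]

# Each block is factored independently, which is why factor indices
# are not comparable from one block to the next.
factors_per_block = [nmf(V, n_factors=36, seed=b)
                     for b, V in enumerate(block_activations)]
```

Because every block gets its own factorization with its own initialization, nothing ties factor 5 of block 3 to factor 5 of block 4.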
While some of these concepts are shown to be related to human chess concepts (see our paper), others may be something we don’t yet have a name for, or may be too complex for us to understand (yet). We would love to hear from you if you can describe the concept in any (block, factor) pair. Tell us by submitting this form.
Dark grey slices show min(AlphaZero, human)
Color slices show max(AlphaZero, human) - min(AlphaZero, human)
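The two legend entries describe a stacked bar: a shared base plus the gap between the two values. A minimal sketch of that arithmetic follows. The function name `stacked_slices` and the example values are hypothetical, and reporting which side is larger is an assumption about what the color encodes.

```python
def stacked_slices(alphazero, human):
    """Split two values into a shared base slice and a difference slice."""
    base = min(alphazero, human)        # dark grey slice
    diff = max(alphazero, human) - base  # color slice
    # Assumption: the color indicates which of the two values is larger.
    if alphazero > human:
        larger = "AlphaZero"
    elif human > alphazero:
        larger = "human"
    else:
        larger = "tie"
    return base, diff, larger

base, diff, larger = stacked_slices(alphazero=0.75, human=0.25)
```

Stacking the two slices always reaches the larger of the two values, so the total bar height shows the maximum while the grey portion shows how much the two have in common.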