Moral Geometry of Scripture and Ideology

An exploratory analysis of 28 foundational texts across six moral dimensions (v7, April 2026)

Summary

This is a first-pass attempt at using sentence embeddings to map 28 religious, philosophical, and political texts onto the six moral dimensions in Moral Foundations Theory (Jonathan Haidt et al., 2009/2012). Three of the six dimensions produce clear, believable results: authority, loyalty, and sanctity. The other three (care, fairness, liberty) are noisier (see Limitations). Nothing here is ready for serious conclusions, but the results are interesting enough to keep going.

Note on the corpus. A few texts here (Mein Kampf, The Doctrine of Fascism, Kaczynski's Industrial Society and Its Future) are included only as analytical stress tests. They are not endorsed.

How It Works

Sentence embedding models map each passage to a fixed-length vector in a high-dimensional space: semantically similar passages land close together. That geometry is what we project onto moral dimensions.

We use OpenAI's text-embedding-3-large (3072 dimensions) to embed every chunk of every corpus. Then we do three things:

1. Build an axis vector per dimension. For each of Haidt's six foundations we hand-write roughly twenty matched sentence pairs (pro- vs. anti-pole). For care/harm, a positive anchor is "We have an obligation to help those who are suffering" and its negative is "The suffering of others is no concern of ours." We average the embeddings of all positive anchors and all negative anchors, subtract, and normalize. The result is an axis vector pointing from the negative pole toward the positive pole in embedding space.

2. Project each chunk onto each axis. Each book is split into chunks of about 500 tokens (~350 words). For each chunk we take the cosine similarity between its embedding and each axis vector, keeping the axis's sign convention: positive means toward the "good" pole of that foundation. We average across chunks to get a corpus-level mean per dimension.

3. Calibrate against a null baseline. Raw cosine scores are small in magnitude and hard to interpret on their own. We embed a large Wikipedia sample as a null distribution for each dimension, then report how many standard deviations each corpus mean sits above or below that baseline (z vs. Wikipedia). In the score table at the end of this page, cells show the raw cosine means; asterisks flag |z| ≥ 2 (rough signal), and values with |z| < 1 are effectively noise.
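The three steps above can be sketched in a few lines of plain Python. This is a minimal illustration, not the project's actual code: the function names are hypothetical, the 3-dimensional vectors are made-up stand-ins for real 3072-dimensional embeddings, and the z-score assumes the baseline's standard deviation is taken over per-chunk Wikipedia scores, which the text does not spell out.

```python
import math
import statistics

def mean_vector(vectors):
    """Element-wise mean of a list of equal-length vectors."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]

def norm(v):
    return math.sqrt(sum(x * x for x in v))

def cosine(u, v):
    return sum(a * b for a, b in zip(u, v)) / (norm(u) * norm(v))

def build_axis(pos_anchors, neg_anchors):
    """Step 1: mean(positive anchors) - mean(negative anchors), unit-normalized."""
    diff = [p - q for p, q in zip(mean_vector(pos_anchors), mean_vector(neg_anchors))]
    length = norm(diff)
    return [x / length for x in diff]

def corpus_mean(chunks, axis):
    """Step 2: average cosine similarity of a corpus's chunks with the axis."""
    return statistics.fmean([cosine(c, axis) for c in chunks])

def z_vs_baseline(score, baseline_chunks, axis):
    """Step 3 (assumed form): distance from the baseline mean in baseline SDs."""
    baseline = [cosine(c, axis) for c in baseline_chunks]
    return (score - statistics.fmean(baseline)) / statistics.stdev(baseline)

# Toy stand-ins for embeddings (illustrative values only).
pos = [[1.0, 0.0, 0.0], [0.8, 0.2, 0.0]]
neg = [[-1.0, 0.0, 0.0], [-0.8, 0.2, 0.0]]
book_chunks = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
wiki_chunks = [[0.0, 1.0, 0.0], [1.0, 1.0, 0.0], [-1.0, 1.0, 0.0]]

axis = build_axis(pos, neg)
score = corpus_mean(book_chunks, axis)
z = z_vs_baseline(score, wiki_chunks, axis)
print(axis, round(score, 3), round(z, 3))
```

In this toy case the corpus mean is positive but |z| stays below 1, so by the rule of thumb above it would count as noise.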

Visualizations

Each tab filters the charts to a specific group. All Corpora shows the full 28-text view; the group tabs re-normalize scores within that group.

Moral Profile Heatmap

How to read. Rows are texts; columns are moral dimensions. Green = positive pole (compassion, fairness, pro-liberty, hierarchy, loyalty, purity); red = negative pole. Stronger color and bolder numbers mean a stronger signal. A column where everything is the same color usually means the axis isn't distinguishing much.

Radar Charts

How to read. Each hexagon is one text. The six spokes are the six dimensions. A large, round shape means high across everything; a flat shape means near-average on everything. Lopsided shapes show which dimensions a tradition leans into. Compare shapes, not absolute sizes.

Moral Space Map (PCA)

How to read. Each dot is a text, placed so that morally similar texts end up close together. The arrows show which dimensions pull in which direction: texts lying in the direction of the authority arrow scored high on authority. The percentages on the two plot axes show how much of the total variation each principal component captures.
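For readers who want to see where the percentages and arrows come from, here is a rough sketch of a two-component PCA in plain Python. It uses power iteration with deflation on the covariance matrix, which is an assumption made to keep the example dependency-free; a real pipeline would more likely call a library routine. The input rows are made-up toy profiles, not scores from this analysis.

```python
import math
import random

def center(rows):
    means = [sum(col) / len(rows) for col in zip(*rows)]
    return [[x - m for x, m in zip(row, means)] for row in rows]

def covariance(centered):
    n, d = len(centered), len(centered[0])
    return [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
            for i in range(d)]

def matvec(m, v):
    return [sum(a * b for a, b in zip(row, v)) for row in m]

def top_eigenpair(m, iterations=500, seed=0):
    """Largest eigenvalue and unit eigenvector of a symmetric matrix."""
    rng = random.Random(seed)
    v = [rng.gauss(0, 1) for _ in m]
    length = math.sqrt(sum(x * x for x in v))
    v = [x / length for x in v]
    for _ in range(iterations):
        w = matvec(m, v)
        length = math.sqrt(sum(x * x for x in w))
        if length < 1e-12:  # matrix is (numerically) zero along v
            break
        v = [x / length for x in w]
    eigenvalue = sum(a * b for a, b in zip(v, matvec(m, v)))
    return eigenvalue, v

def pca_2d(rows):
    """Return 2-D coordinates, component vectors, and explained-variance ratios."""
    x = center(rows)
    cov = covariance(x)
    total = sum(cov[i][i] for i in range(len(cov)))
    components, ratios = [], []
    for _ in range(2):
        lam, v = top_eigenpair(cov)
        components.append(v)
        ratios.append(lam / total)
        # Deflate: remove the variance already explained by this component.
        cov = [[cov[i][j] - lam * v[i] * v[j] for j in range(len(v))]
               for i in range(len(v))]
    coords = [[sum(a * b for a, b in zip(row, c)) for c in components] for row in x]
    return coords, components, ratios

# Toy profiles: 4 texts x 3 dimensions (illustrative values only).
rows = [[2.0, 0.0, 0.0], [-2.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, -1.0, 0.0]]
coords, components, ratios = pca_2d(rows)
print([round(r, 3) for r in ratios])
```

The ratios are the percentages printed on the plot axes, and each dimension's arrow is drawn from its entries in the two component vectors. The sign of a component is arbitrary, so a map can appear mirrored between runs without changing its meaning.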

Clustering

How to read. Texts that merge low in the tree are morally similar; texts that only merge near the top are very different. A text that joins the tree late on its own is an outlier that doesn't fit any group.
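The tree can be understood through a small sketch of agglomerative clustering. The linkage method and distance used for the chart are not stated here, so average linkage over Euclidean distance is an assumption, and the text names and profiles below are hypothetical.

```python
import math
from itertools import combinations

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def agglomerate(profiles):
    """Repeatedly merge the two closest clusters.

    profiles: dict mapping text name -> score vector.
    Returns a list of (cluster_a, cluster_b, merge_height) in merge order.
    """
    def linkage(c1, c2):
        # Average linkage: mean distance over all cross-cluster pairs.
        dists = [euclidean(profiles[a], profiles[b]) for a in c1 for b in c2]
        return sum(dists) / len(dists)

    clusters = [(name,) for name in profiles]
    merges = []
    while len(clusters) > 1:
        c1, c2 = min(combinations(clusters, 2), key=lambda pair: linkage(*pair))
        merges.append((c1, c2, linkage(c1, c2)))
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 + c2]
    return merges

# Hypothetical two-dimensional profiles (illustrative values only).
profiles = {
    "text_a": [0.0, 0.0],
    "text_b": [0.1, 0.0],
    "text_c": [0.0, 0.2],
    "text_d": [3.0, 3.0],
}
merges = agglomerate(profiles)
for a, b, height in merges:
    print(a, "+", b, "at height", round(height, 3))
```

Here text_a and text_b merge first at a low height, while text_d joins last and on its own, which is the outlier pattern described above.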

Score Distributions per Dimension

How to read. Each panel is one dimension. Each shape shows how spread out a book's chunk scores are: wide means lots of variation across the book, narrow means consistently scored. The white dot is the mean. If all books look the same on a given panel, that dimension isn't discriminating between them.

Dimension Independence

How to read. Pairwise correlations between all six dimensions. Near-zero means two dimensions are measuring different things. A magnitude above 0.7 (in either direction) means they are mostly redundant and should be read together.
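As a rough sketch of the check behind this chart, the snippet below computes Pearson correlations between per-text scores and flags pairs beyond the 0.7 cutoff. Two things are assumptions: that the correlation is Pearson's r, and that the cutoff applies to the absolute value so that strongly negative pairs are flagged too. The numbers are made up for illustration.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def redundant_pairs(scores, threshold=0.7):
    """Return (dim_a, dim_b, r) for every pair with |r| above the threshold."""
    names = list(scores)
    flagged = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            r = pearson(scores[a], scores[b])
            if abs(r) > threshold:
                flagged.append((a, b, round(r, 3)))
    return flagged

# One score per text for each dimension (illustrative values only).
scores = {
    "authority": [0.1, 0.2, 0.3, 0.4],
    "loyalty": [0.2, 0.4, 0.6, 0.8],
    "care": [0.1, -0.1, -0.1, 0.1],
}
flagged = redundant_pairs(scores)
print(flagged)
```

In this toy input, loyalty is an exact multiple of authority, so that pair is flagged, while care is uncorrelated with both.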

Topic vs. Side Taken

How to read. Horizontal = how much this chunk is about the moral topic at all. Vertical = which side it takes. Top-right = talks about the topic and takes the positive side. Bottom-right = talks about it but takes the negative side. Left half = the topic barely comes up. Books with lots of left-half chunks have their score diluted by off-topic material.
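How the two coordinates are computed is not described here, so the following is only one plausible reading, sketched with hypothetical names: topic relevance as cosine similarity to the midpoint of the positive and negative anchor means, and side taken as cosine similarity to the difference between them. The vectors are toy stand-ins for real embeddings.

```python
import math

def mean_vector(vectors):
    return [sum(col) / len(vectors) for col in zip(*vectors)]

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def topic_and_side(chunk, pos_anchors, neg_anchors):
    """Return (topic, side) for one chunk under the assumptions stated above."""
    pos = mean_vector(pos_anchors)
    neg = mean_vector(neg_anchors)
    # Assumed topic direction: what both poles have in common.
    topic_direction = [(p + q) / 2 for p, q in zip(pos, neg)]
    # Side direction: what separates the positive pole from the negative one.
    side_direction = [p - q for p, q in zip(pos, neg)]
    return cosine(chunk, topic_direction), cosine(chunk, side_direction)

# Toy anchors: both poles share the first coordinate (the topic)
# and differ in the second (the side). Illustrative values only.
pos_anchors = [[1.0, 1.0, 0.0]]
neg_anchors = [[1.0, -1.0, 0.0]]

on_topic_positive = topic_and_side([1.0, 1.0, 0.0], pos_anchors, neg_anchors)
on_topic_negative = topic_and_side([1.0, -1.0, 0.0], pos_anchors, neg_anchors)
off_topic = topic_and_side([0.0, 0.0, 1.0], pos_anchors, neg_anchors)
print(on_topic_positive, on_topic_negative, off_topic)
```

Under this reading, the off-topic chunk scores zero on both coordinates, which illustrates how a book made up mostly of such chunks would have its corpus mean pulled toward zero.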

Underlying Factors

How to read. Instead of assuming the six named dimensions are all independent, this chart lets the data speak. Each column is an underlying pattern the data revealed. High values mean that named dimension is mostly captured by that underlying pattern. If two named dimensions load onto the same factor, they are measuring the same thing.

Old Testament, New Testament, Quran. Scores normalized within this group.