The Geometry of Categorical and Hierarchical Concepts in Large Language Models

123
15
Anon84
1 year ago
arxiv.org

cs702
·
1 year ago

Very nice. Well-written. Feels "natural."

Besides helping with interpretability, my immediate thought is that maybe we could pretrain models faster by adding regularization terms to the objective function that induce representations of distinct categories to lie in mutually orthogonal subspaces, and representations of subcategories to lie in orthogonal subspaces that can form polytopes. The data necessary for doing so is readily available: WordNet synsets. Induce representations of synsets to be orthogonal to each other and representations of hierarchically related synsets to be arranged in polytopes. There's already some evidence that we can leverage WordNet synsets to pretrain some models faster. Take a look at https://news.ycombinator.com/item?id=40160728 for example.
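To make the idea concrete, here is a minimal sketch of what such a regularization term might look like. It is not taken from the paper or from the linked thread; it is just one plausible form, assuming each category is summarized by a single vector. The category names and the function `orthogonality_penalty` are hypothetical.

```python
import math


def dot(u, v):
    """Plain Euclidean inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def orthogonality_penalty(category_vectors):
    """Sum of squared cosine similarities over all pairs of category vectors.

    The value is 0.0 when every pair is orthogonal and grows as categories
    share direction. Hypothetical regularizer, not the paper's method.
    """
    names = sorted(category_vectors)
    penalty = 0.0
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            u, v = category_vectors[a], category_vectors[b]
            cos = dot(u, v) / (math.sqrt(dot(u, u)) * math.sqrt(dot(v, v)))
            penalty += cos ** 2
    return penalty


# Toy vectors; in practice these would be learned representations of synsets.
orthogonal = {"animal": [1.0, 0.0, 0.0], "plant": [0.0, 2.0, 0.0]}
entangled = {"animal": [1.0, 0.0, 0.0], "plant": [1.0, 1.0, 0.0]}

print(orthogonality_penalty(orthogonal))
print(orthogonality_penalty(entangled))
```

In training, a term like this would presumably be added to the usual loss with a small weight, so that it nudges the geometry without dominating the language-modeling objective.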

Thank you for sharing this on HN.

esafak
·
1 year ago

> We find a remarkably simple structure: simple categorical concepts are represented as simplices, hierarchically related concepts are orthogonal in a sense we make precise, and (in consequence) complex concepts are represented as polytopes constructed from direct sums of simplices, reflecting the hierarchical structure.

It's satisfying that the structure is precisely the one you would hope for.
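A toy construction of the structure in the quoted abstract, for anyone who wants to poke at it. Assumption on my part: I read "orthogonal" as the parent's vector being orthogonal to the child-minus-parent difference; the paper makes the precise statement with respect to its own inner product, which this sketch replaces with the plain Euclidean one. The concept names and vectors are made up.

```python
import math
from itertools import combinations


def dot(u, v):
    """Plain Euclidean inner product."""
    return sum(a * b for a, b in zip(u, v))


def sub(u, v):
    """Elementwise difference u - v."""
    return [a - b for a, b in zip(u, v)]


# Hypothetical parent concept, living along the first axis.
animal = [2.0, 0.0, 0.0, 0.0]

# Child offsets live in the subspace orthogonal to the parent.
# These three points are the vertices of an equilateral triangle (a 2-simplex).
offsets = {
    "mammal": [0.0, 1.0, 0.0, 0.0],
    "bird": [0.0, 0.0, 1.0, 0.0],
    "fish": [0.0, 0.0, 0.0, 1.0],
}
children = {
    name: [a + o for a, o in zip(animal, off)] for name, off in offsets.items()
}

# Hierarchical orthogonality check: parent . (child - parent) for each child.
parent_dots = {name: dot(animal, sub(vec, animal)) for name, vec in children.items()}

# Simplex check: all pairwise distances between sibling concepts are equal.
edge_lengths = [
    math.dist(children[a], children[b]) for a, b in combinations(children, 2)
]

print(parent_dots)
print(edge_lengths)
```

Every parent dot product comes out to zero and the three edge lengths are equal, which is the "simplex inside an orthogonal subspace" picture in miniature.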

empath75
·
1 year ago

Yeah, you could explain this paper to Aristotle and he would not be that surprised by it.

empath75
·
1 year ago

Beautiful paper, relatively well written and accessible, too.

I think everyone _knew_ in some sense that categorical information encoded in vectors must be hierarchical and have this general kind of structure, but they managed to formalize that intuition into just a few theorems that seem inevitable only in retrospect.

Animats
·
1 year ago

Wow. This seems really important, because LLMs have been such black boxes.

Is this result useful only for basic concepts backed by huge numbers of cases in the training data, or is it more general than that?

Comments?

zmgsabst
·
1 year ago

This is generally true of type theories:

A type theory corresponds to a complex diagram, as outlined in topos theory. (Note: complex as in CW-complexes.)

I think it’s fascinating that LLMs ended up with a similar structure, but perhaps not entirely surprising. There have been similar results, e.g., a topological covering can generate an ML model.

mjhay
·
1 year ago

There's been a decent amount of work using simplicial complexes and related ideas to generalize graph neural networks, e.g. [0], [1]. If LLMs obey a similar geometry, it could be a promising direction for multimodal models and more principled RAGs with better inductive biases.

[0] https://arxiv.org/pdf/2010.03633

[1] https://arxiv.org/pdf/2012.06333

mdp2021
·
1 year ago

GitHub repo at

https://github.com/KihoPark/LLM_Categorical_Hierarchical_Rep...

zyklu5
·
1 year ago

Well, if concepts turn out to be simplicial (or cellular) complexes, maybe philosophy can be made into applied algebraic topology.

mjhay
·
1 year ago

You may be interested in some of the work of the great Bill Lawvere, especially around formalization of Hegelian dialectics:

https://ncatlab.org/nlab/show/William+Lawvere#RelationToPhil...

zyklu5
·
1 year ago

Thank you. It's a bit surprising to see Hegel here; I was thinking more along the lines of analytic philosophy. Of course, many early 20th c. mathematicians who were interested in philosophy, such as Weyl or Gian-Carlo Rota, would not have thought much of such a distinction.

mjhay
·
1 year ago

Yeah, it didn't use to be that big of a divide. Nowadays it seems like analytic philosophers are doing endless retreads, and continental ones are also doing endless retreads, but with more confusing sentence structure.

From a 10,000-foot view, I think nailing down a more "objective" understanding of dialectics (idealist, material, whatever) is a promising direction to ameliorate this meta-problem. People arguing in journals is pretty much a dialectic problem, so understanding dialectics can go a long way toward understanding issues beyond that.

100ideas
·
1 year ago

Reminds me of Anthropic's recent work on identifying the neuron sets that correlate with various semantic concepts in Claude: https://news.ycombinator.com/item?id=40429540 "Scaling Monosemanticity: Extracting Interpretable Features from Claude 3 Sonnet"

szvsw
·
1 year ago

OpenAI also just published similar work, though Anthropic did beat them to the punch.

https://openai.com/index/extracting-concepts-from-gpt-4/

https://news.ycombinator.com/item?id=40599749

cabidaher
·
1 year ago

In the same vein, Refusal in LLMs is mediated by a single direction: https://www.lesswrong.com/posts/jGuXSZgv6qfdhMCuJ/refusal-in...
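For intuition, "mediated by a single direction" suggests that projecting that direction out of an activation vector removes the behavior. Below is a minimal sketch of that projection on toy numbers. It is a generic linear-algebra illustration, not the post's code; the function name `ablate_direction` and the vectors are hypothetical.

```python
import math


def ablate_direction(activation, direction):
    """Remove the component of `activation` that lies along `direction`.

    Computes x - (x . r_hat) * r_hat, where r_hat is `direction` scaled to
    unit length. The result is orthogonal to `direction`.
    """
    norm = math.sqrt(sum(d * d for d in direction))
    unit = [d / norm for d in direction]
    coeff = sum(a * u for a, u in zip(activation, unit))
    return [a - coeff * u for a, u in zip(activation, unit)]


# Toy 2-D example; real activations would have thousands of dimensions.
direction = [3.0, 4.0]
activation = [1.0, 2.0]
ablated = ablate_direction(activation, direction)

# The ablated vector has (numerically) no component along the direction.
residual = sum(a * d for a, d in zip(ablated, direction))
print(ablated, residual)
```

The same operation applied at every layer and token position is, as far as I understand it, the kind of intervention such single-direction results rely on, but check the linked post for the details.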