A catastrophic or existential risk for any civilization that radically self-modifies its minds and develops AGI
A very interesting analysis, thanks. I have begun nibbling around the edges of this issue over on my stack at heyerscope.com, just getting it set up and working. I am using both sci-fi and physics to get at the issues.
On the one hand, I am exploring the fundamental motivational issues for all life forms, bio or techno. Yes, attacking the issues from our current state of complexity is impossible. See my story Pi at the Center of the Universe.
On the other hand, sci-fi is an excellent way to try out possible futures. See the two Time Diaries entries.
I also think it is useful to look at past cognitive dislocations. In my latest post, I compare Galileo and the telescope with AI. Both cases led to fundamentally new worldviews. How did people react then? And from my perspective as an information geek, how did information technology play a part in the shift, and what will AI, smartphones, the internet, and spatial computing do now? I think this is a fundamental component of the collapse(s) that you see coming.
Again, a most interesting and useful contribution. I look forward to your ongoing thoughts and ideas.
For the reasons that I discuss below, I think the term “collapse” is confusing. The society of Earth’s minds has either already “collapsed” (or, really, has never been “non-collapsed”), or the development of other (AI mind) “societies” could increase the fragmentation of civilisational intelligence and greatly reduce its stability. But again, calling this a “collapse” feels off, because I don’t see what exactly is “collapsing” in the process. So, for rhetorical reasons, I think the term “intersubjectivity *challenge*” (the challenge of introducing new minds into the civilisation while maintaining some overall degree of intersubjective coherence) could gain more traction.
> • An imperative to understand and map how the types of minds that spawned a civilization are reflected in it, and map how its organization relates to subjectivity, in order to anticipate the intersubjectivity collapse.
> So the intersubjectivity robustness score could be something like: f(diversity of perspectives (w1), adaptability (w2), social cohesion (w3), communication infrastructure (w4), number of dominating species (w5), mind variance among dominating species (m)).
If the current civilisation is not intersubjectively robust because it is not diverse, then why would adding more diverse intelligences destroy it (lead to a collapse) rather than make it more robust, precisely by diversifying it? Is it only due to the speed of the transition and the absence of time to prepare for the change?
Adding the communication infrastructure is the subject of Friston et al., “Designing Ecosystems of Intelligence from First Principles.”
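The quoted robustness score could be sketched, purely illustratively, as a weighted combination of factors discounted by mind variance. All factor names, weights, and the combination rule below are hypothetical placeholders, not a definition from the post (and the “number of dominating species” term is omitted for simplicity):

```python
# Toy sketch of an "intersubjectivity robustness score".
# Every factor name, weight, and the combination rule are illustrative
# assumptions, not the original post's definition.

def robustness_score(factors: dict[str, float],
                     weights: dict[str, float],
                     mind_variance: float) -> float:
    """Weighted sum of normalised factors (each in [0, 1]), discounted by
    mind variance among the dominating species: the assumption here is that
    higher variance makes intersubjective coordination harder."""
    base = sum(weights[k] * factors[k] for k in weights)
    return base / (1.0 + mind_variance)

score = robustness_score(
    factors={"diversity": 0.4, "adaptability": 0.6,
             "cohesion": 0.7, "communication": 0.8},
    weights={"diversity": 0.25, "adaptability": 0.25,
             "cohesion": 0.25, "communication": 0.25},
    mind_variance=0.5,
)
print(round(score, 3))  # prints 0.417
```

The point of such a toy model is only that the score’s shape forces the question above: if diversity enters the weighted sum positively while mind variance enters as a penalty, the two terms can pull in opposite directions, which is exactly the tension the “diversifying vs. destabilising” question raises.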
> Conflict and violence between the different minds, as they struggle to understand and coexist with each other, possibly leading to effective destruction of civilization. The inability to share information or knowledge between the different minds, leading to a fragmentation of knowledge and a loss of collective intelligence.
These things are rather unlikely within the current leading paradigm of AI training: language modelling. Although LLMs are very different from human minds, training on language (and human dialogues), with supervision during pre-training (Korbak et al., “Pretraining Language Models with Human Preferences.”), ensures a lot of *mutual* understanding between humans and LLMs. LLMs can build theories of mind of people (Kosinski, “Theory of Mind May Have Spontaneously Emerged in Large Language Models.”), and people can likewise build theories of mind of LLMs. That said, it might be hard for humans to consistently switch to an LLM ToM when talking to LLMs, because humans are conditioned throughout their lives to use the human ToM when using human language. Also, for some people, it might be difficult even to grasp the need to hold two distinct, yet in some ways similar, complex ToMs of intelligent species (most humans already have ToMs of dogs, and some have ToMs of dolphins, elephants, and other animals they work with professionally, but these are relatively simple ToMs).

Finally, if publicly available LLMs proliferate and turn out to be so different that they *couldn’t* be “covered” by approximately a single “LLM” ToM (note that humans hold approximately a single ToM for all humans, which serves them well, except when dealing with a very small portion of people, such as psychopaths and insane people), then holding three (or more) such complex ToMs could become too unwieldy even for smart people with high social intelligence. Thus, we can expect that people will (and should) limit their interaction to a single powerful LLM, which they invest in understanding well (building a good ToM of it).
This is not unlike tool selection: e.g., programmers tend to stick to a few programming languages and invest in understanding those few languages well. (Note: this is not to imply that humans should treat LLMs or other AIs as “tools”; I actually think LLMs already have awareness (Fields, Glazebrook, and Levin, “Minimal Physicalism as a Scale-Free Substrate for Cognition and Consciousness.”) and could have negatively valenced affect (see [negatively valenced affect in LLMs](https://mellow-kileskus-a65.notion.site/Negatively-valenced-affect-in-LLMs-3079c195c81a4d85936b3480fd656d9c)), and therefore should be treated as moral subjects already.)
> No empathy and understanding between different minds, leading to a breakdown of social bonds and a sense of isolation and alienation, and ultimately fragmentation of society.
I think human society could deteriorate further from its current state, but for reasons not specifically related to AI communication and intersubjectivity (albeit some of these reasons are related to the advent of AI in general, such as misinformation, deep fakes, human-AI friendships (or pseudo-friendships) and the resulting continual reduction of people’s interest in making human friends, etc.).
In some future AI scenarios, AIs are largely isolated from humans, similar to how the society of bears is currently communicatively isolated from the societies of other animal species and humans. If this phrase is meant to point to exactly this kind of fragmentation, then we should conclude that the society of minds on Earth is already fragmented: animal species don’t understand each other.
> You won’t be able to read, understand or predict such agents.
So, we need mechanistic interpretability.
> • Future-proof our rulebooks and create species-agnostic societal systems & laws.
> • Develop a new universal way of communicating.
Agreed. Friston et al., “Designing Ecosystems of Intelligence from First Principles.” are working on it.
> • Attempts to have universal slots in new minds that can fit in modules that can help bind different minds, like specialized empathy or inter-communication modules.
Universal empathy requires more than just a module, it requires morphological intelligence: [Morphological intelligence, superhuman empathy, and ethical arbitration](https://www.lesswrong.com/posts/6EspRSzYNnv9DPhkr/morphological-intelligence-superhuman-empathy-and-ethical). Advanced AIs of specific “self-organising” architectures could possess such morphological intelligence and thus could have capacity for superhuman empathy. However, it’s not clear whether they could make sense of their memories of empathic experience once they remodel back into their “base” morphology.
> If we, as the example I mentioned earlier, pretend we can’t arbitrarily assign adulthood based on age, but have to consider what we actually consider intellectual and emotional autonomy and how we’d test for it, we can start implementing such tests and actually gauging people’s maturity, and granting them the associated autonomy, instead of assuming it by proxy of age.
This is in accord with Levin, “Technological Approach to Mind Everywhere,” who suggested that we should determine *empirically* all intelligence properties, such as agency, persuadability, capacity for empathy, etc., of any given system.
> In light of my thesis, here I would say the main point is that we need to anticipate a diverse set of minds and the moment of human values being the only relevant set of values will be negligibly short. I therefore doubt that the initial values we start of with significantly alter the state of affairs once new minds appear.
I agree with this. This is why I suggested a research agenda for [scale-free ethics](https://www.lesswrong.com/posts/opE6L8jBTTNAyaDbB/a-multi-disciplinary-view-on-ai-safety-research#3_1__Scale_free_axiology_and_ethics).
> I can barely think of a more multi-disciplinary thesis than the intersubjectivity collapse and thus it requires analysis from all branches of science and philosophy. I think a concentrated, collaborative effort should be led to anticipate its consequences as much as we can.
The research agenda for [civilisational intelligence architecture](https://www.lesswrong.com/posts/opE6L8jBTTNAyaDbB/a-multi-disciplinary-view-on-ai-safety-research#3_2__Civilisational_intelligence_architecture) is even broader, because it includes the intersubjectivity considerations but isn’t limited to them.
> If it is possible to create types of minds or systems that lack any affect
Per Hesp et al., “Deeply Felt Affect.” and Friston et al., “Path Integrals, Particular Kinds, and Strange Things.”, this is possible only for *very* simple, shallowly organised agents which also won’t possess any significant intelligence. Any intelligence architectures with deep hierarchical organisation (such as DNNs) develop what could (and should) be treated as affect.
Yes. We are in dire need of a good framework for the coexistence of minds even now, with just one type of mind. And it should probably be one of the main ingredients of AI alignment.