OpenAI’s latest model, ChatGPT o1, was released a couple of days ago. It works differently from the LLMs that came before it in that it “reasons.” That word is loaded, but what it means is that the model has lots of self-doubt and so runs through a number of checks or iterations to verify its work. That process is itself (a) time-consuming and (b) still based on the token prediction that underlies all LLMs.
As it turns out, this change is incredibly impressive, at least to people like me who do mathematics for a living. I can see it changing my productivity considerably. I don’t know how many other people that’s true for, but it is true for me.
That is not what I want to talk about today. Instead, in interacting with ChatGPT o1, something about how useful AI is in its current iteration became clearer to me. It is this: it is waaaaaay too human in its biases compared to what I really want.
When I say biases, I am not talking about adherence to ethical concerns or conflict avoidance. Instead, I am talking about the assumptions we make when approaching problems, assumptions that AI or machines were never supposed to take as given.
Much of the science fiction written about AI or computer-like beings involves assumptions not being taken for granted: that is, such beings literally follow their programming. For instance, in Star Trek: The Next Generation, after Data has memorised all of the plots to a holodeck game, the computer is asked to make a holodeck program “capable of defeating Data,” which it does, and hijinks ensue as the term “defeat” puts the entire ship in danger.
But in other respects, this can lead to good outcomes. When some alien species holds different assumptions, it can lead to insight. Here is an example from the Star Trek episode “Déjà Q”:
Q: [analyzing the cause for the Bre'el IV moon's trajectory] This is obviously the result of a large celestial object passing through at near right angles to the plane of the star system. Probably a black hole.
Lt. Cmdr. Data: Can you recommend a way to counter the effect?
Q: Simple. Change the gravitational constant of the universe.
That seems like a ridiculous option, but it inspires Geordi to think of a way of doing something like that. In other words, insight came from an interaction with someone who didn’t hold an assumption about what could be done.
I don’t know about you, but what I want from an AI assistant is the ability to see through my assumptions and identify them so that I can evaluate their reasonableness. Science fiction promised us this sort of dispassionate AI.
Given that ChatGPT o1’s whole shtick was that it was going to be “questioning,” I figured that it might be able to do this. So I gave it the following problem:
Suppose that I have three kidney patients (A, B and C) and three kidney donors (D, E and F). Two of the kidney donors (D and E) are compatible with all patients, but donor F is not compatible with any patient. Each kidney patient only needs one kidney to survive. Can I cure all three kidney patients with the donor kidneys?
OK, now think about the answer for a second. If you are human, or at least a normal human, you would think that the answer is ‘no.’ There are three patients and only two compatible donors, so one of the patients has to miss out. And that is precisely what ChatGPT o1 said:
No, you cannot cure all three kidney patients with the available donor kidneys. Here's the reasoning:
Available Donors and Compatibility:
Donor D is compatible with Patients A, B, and C.
Donor E is also compatible with Patients A, B, and C.
Donor F is not compatible with any of the patients.
Number of Compatible Donor Kidneys:
You have two compatible donors (D and E), each able to donate one kidney.
Total compatible kidneys available: 2.
Patients in Need:
There are three patients (A, B, and C), each requiring one kidney.
Total kidneys needed: 3.
Conclusion:
Since you have only two compatible kidneys but three patients in need, it's not possible to cure all three patients with the available donor kidneys.
Donor F cannot contribute to curing any patient due to incompatibility.
Final Answer: No; because you have only two compatible kidneys for three patients, you cannot cure all three patients with the available donor kidneys.
The problem is that this answer isn’t correct or is, at best, incomplete. You can be forgiven for not seeing why, but I pointed it out to ChatGPT o1:
Hang on. I asked you if curing the patients was possible. Each donor has 2 kidneys. Why can't they all be cured?
If you take my question literally, the answer is that they can all be cured. I didn’t say the donors had to be alive following the procedure or, indeed, before the procedure. There are four compatible kidneys for three patients. Analytically, there is a single dispassionate answer.
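To make the dispassionate reading concrete, here is a minimal sketch in Python (my own illustration, not anything from the exchange; the variable names and the simple tally are assumptions I am making for exposition). It encodes only what the question states: each living donor has two kidneys, compatibility is as described, and each patient needs one kidney. Because both compatible donors match every patient, a straight count of compatible kidneys settles the question; a more general version would need a proper matching algorithm.

```python
# A sketch of the literal reading of the puzzle. Nothing here says a donor
# must keep a kidney; the only encoded constraints are the ones the question
# actually states.

compatibility = {
    "D": {"A", "B", "C"},  # donor D matches every patient
    "E": {"A", "B", "C"},  # donor E matches every patient
    "F": set(),            # donor F matches no patient
}
KIDNEYS_PER_DONOR = 2
patients = {"A", "B", "C"}

# Count kidneys that could go to at least one patient.
compatible_kidneys = sum(
    KIDNEYS_PER_DONOR for matches in compatibility.values() if matches
)

print("Compatible kidneys available:", compatible_kidneys)                # 4
print("Patients needing a kidney:", len(patients))                        # 3
print("All patients can be cured:", compatible_kidneys >= len(patients))  # True
```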
This is important. A human answer I can get from other humans. What I want from an AI is the solution without any human biases such as “the donors all have to be alive after the donation.” Even if you have ethical issues with killing someone for their kidneys, pointing out that you can get two kidneys from an already deceased person might not be something you had thought of.
We need AI to “think outside of our own box.” Because of the way ChatGPT is trained, it cannot do that. That is very limiting when it comes to solving problems.
In the end, I had a hard time getting ChatGPT to even answer my question, as it kept raising ethical concerns. Once I pushed it, it got all passive-aggressive: “well, I guess that technically there are enough kidneys to do the job if you want to be all like that.” That’s not a direct quote, but it is pretty close, as you can see here:
Conclusion:
In this hypothetical situation without ethical or legal constraints, it is possible to cure all three patients by using both kidneys from the compatible donors.
Final Answer: Yes; if ethical and legal constraints are removed, you can cure all three patients by using both kidneys from the donors.
The point here is that I shouldn’t have to push, because pushing requires that I already know the answer that breaks the human assumptions. We still need to work on developing AI that can at least identify those assumptions from the outset. Without this, we will miss opportunities to expand our knowledge and find creative solutions.
[Update: A reader found another possible solution that has the advantage of being more ethically responsible. I had assumed that two donors had kidneys compatible with the patients, but I didn’t explore whether the third donor had a kidney that was compatible with either of the two donors. If that were the case, then it would be possible to organise a set of transplants that didn’t leave anyone dead. Of course, it is unlikely that compatibility could be uni-directional like that, but the point is that a dispassionate AI might have at least exposed the assumption more clearly. In that situation, if there were some practicality to it, a better set of solutions would have been presented.]
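For completeness, here is a sketch of the reader’s chain (again my own illustration; the compatibility of donor F’s kidney with donor D is precisely the assumption the update describes, not something stated in the original puzzle). D gives both kidneys to patients A and B, E gives one kidney to patient C, and F gives one kidney to D, so every patient is cured and nobody is left without a working kidney.

```python
# A sketch of the reader's no-fatality chain. The pair ("F", "D") is the
# assumed extra compatibility raised in the update; it is not part of the
# original puzzle. Patients' own failed kidneys are counted as zero.

working_kidneys = {"A": 0, "B": 0, "C": 0, "D": 2, "E": 2, "F": 2}
compatible = {
    ("D", "A"), ("D", "B"), ("D", "C"),
    ("E", "A"), ("E", "B"), ("E", "C"),
    ("F", "D"),  # the assumption in question
}

# Each step is (donor, recipient): the donor loses a kidney, the recipient gains one.
plan = [("D", "A"), ("D", "B"), ("E", "C"), ("F", "D")]

for donor, recipient in plan:
    assert (donor, recipient) in compatible, f"{donor} cannot donate to {recipient}"
    working_kidneys[donor] -= 1
    working_kidneys[recipient] += 1

assert all(working_kidneys[p] >= 1 for p in "ABC"), "a patient was not cured"
assert all(working_kidneys[d] >= 1 for d in "DEF"), "a donor was left with no kidney"
print(working_kidneys)  # everyone ends with exactly one working kidney
```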