Has anyone actually tried to convince Terry Tao or other top mathematicians to work on alignment?
This has been discussed several times in the past, see:
But I’m not aware of anyone who has actually tried to do something like this.
Of special interest is this comment by Eliezer about Tao:
We’d absolutely pay him if he showed up and said he wanted to work on the problem. Every time I’ve asked about trying anything like this, all the advisors claim that you cannot pay people at the Terry Tao level to work on problems that don’t interest them. We have already extensively verified that it doesn’t particularly work for eg university professors.
So if anyone has contacted him or people like him (instead of regular college professors), I’d like to know how that went.
Otherwise, especially for people who aren’t merely pessimistic but measure success probability in log-odds, sending that email is a low-cost action that we should definitely try.
So you (whoever is reading this) have until June 23rd to convince me that I shouldn’t send this to his @math.ucla.edu address:
Edit: I’ve been informed that someone with a much better chance of success will be trying to contact him soon, so the priority now is to convince Demis Hassabis (see further below) and to find other similarly talented people.
Title: Have you considered working on AI alignment?
Body:
Not once but twice I have heard leaders of AI research organisations say they want you to work on AI alignment. Demis Hassabis said on a podcast that when we near AGI (i.e. when we no longer have time) he would want to assemble a team with you on it, but, to quote him, “I didn’t quite tell him the full plan of that”. And Eliezer Yudkowsky of MIRI said in an online comment, “We’d absolutely pay him if he showed up and said he wanted to work on the problem. Every time I’ve asked about trying anything like this, all the advisors claim that you cannot pay people at the Terry Tao level to work on problems that don’t interest them.”, so he never even sent you an email. I know it may not be the most interesting problem for you to think about, but it really is a very important, urgent, and completely open problem. Nor is it simply a theoretical concern: if Demis’s predictions of 10 to 20 years to AGI are anywhere near correct, it will deeply affect you and your family (and everyone else).
If you are ever interested you can start by reading the pages linked in EA Cambridge’s AGI Safety Fundamentals course or the Alignment Forum.
Best of luck,
P.
You can do any of:
- Reporting your past results.
- Convincing me that this is a net negative in expectation.
  - The worst thing I can think of that could realistically happen is this leading to something like the Einstein-Szilard letter. But considering that Elon Musk has already tried to warn governments, I don’t think it would change much.
- Arguing that it is important to wait until we have the results of the AI Safety Arguments Competition or something similar. I currently don’t think so: he should be convinced for the same reasons we are convinced.
- Suggesting changes to the email or a new email altogether. If you think it is terrible, say so!
  - He declines many kinds of email requests; see points 5 and 12 here.
- If you have more social capital than I do (e.g. if you know him), or you work at an alignment organisation, volunteering to send the email yourself. He could think, “If they actually care about hiring me, why aren’t they contacting me directly?”
- Saying under what conditions your organisation would be willing to hire him, so I can add it to the email.
What you probably shouldn’t do is send your own email without telling the rest of us. His attention is a limited resource, and bothering him with many different emails might reduce his sympathy for the cause.
And other than him, how many people do you think have a comparable chance of solving the problem or making significant progress? And how do we identify them? By the number of citations? Prizes won? I would like to have a list of such people, along with the conditions under which each alignment org would be willing to hire each of them. The probability of convincing Tao might be low, but with, say, 100 people of his calibre on the list, the chances of finding someone might be decent.
I’m pretty sure that most of them haven’t heard about alignment, or have heard and simply discarded it as something not worth thinking about. I don’t think this means they couldn’t do great alignment work if they tried; maybe getting them to seriously think about the problem at all is the hard part, and after that their genius generalises to this new area.
Relatedly, does anyone here know why Demis Hassabis isn’t assembling his dream team right now? The same as above applies, but until ~~the 1st of July~~ June 23rd:
Title: Are you sure you should wait before creating your dream alignment team?
Body:
On S2, Ep9 of the DeepMind podcast you said that when we get close to AGI we should pause pushing the performance of AI systems to guarantee they are safe. What do you think the timeline would look like in that scenario? When we get close, DeepMind and maybe some other teams might pause development, but everyone else will keep working as fast as possible to reach the finish line, and, all else equal, whoever devotes fewer resources to non-capabilities work will get there first. Creating AGI is already a formidable task, but we at least have formalisms like AIXI that can serve as a guide by telling us how we could achieve general intelligence given unlimited computing power. For alignment we have no such thing: existing inverse reinforcement learning (IRL) algorithms couldn’t learn human values even with direct I/O access and unlimited computing power, and then there is the inner alignment problem. If we don’t start working on the theoretical basis of alignment now, we won’t have aligned systems when the time comes.
This should be obvious to him, but just in case.
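As an aside, for anyone reading this who hasn’t seen the formalism the email gestures at: AIXI’s action selection can be written, roughly following Hutter’s definition, as

$$a_k \;:=\; \arg\max_{a_k}\sum_{o_k r_k}\;\cdots\;\max_{a_m}\sum_{o_m r_m}\big[r_k+\cdots+r_m\big]\sum_{q\,:\,U(q,\,a_1\ldots a_m)\,=\,o_1 r_1\ldots o_m r_m}2^{-\ell(q)}$$

where $a_i$, $o_i$, $r_i$ are the agent’s actions, observations, and rewards, $U$ is a universal (monotone) Turing machine, $q$ ranges over environment programs consistent with the history, $\ell(q)$ is the length of $q$, and $m$ is the horizon. The point in the email is that nothing of comparable generality exists for “learn and optimise human values”, which is why the theoretical groundwork has to start well before we are close to AGI.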