Automating AI research is bottlenecked by verification speed: running experiments takes time. Our new paper explores whether LLMs can tell which ideas will work before executing them, and they appear to have better research intuition than human researchers.
Most promising-looking AI research ideas don’t pan out, but testing them burns through compute and labor.
Can LLMs predict whether an idea will succeed without running any experiments? We show that they do so better than human experts!