Hi!
I am a cognitive scientist and AI safety researcher.
Through my work, I want to help ensure that the AI systems we build are robustly aligned with human interests.
Currently, I am a research scientist at Scale AI's Safety, Evaluations & Alignment Lab (SEAL), where I research the alignment of AI systems with human values.
I am also personally interested in reducing our uncertainty about the moral status of AIs.
Part of my recent work investigates introspection in Large Language Models: can AI systems acquire knowledge about themselves that goes beyond their training data? Such a capability could inform questions about AI consciousness and moral consideration.
Previously, my PhD work investigated agent-environment interactions during planning.
Some of the things that we do in the world (such as rearranging things, feeling how heavy something is, or looking at a problem from different angles) make it easier for us to find solutions to difficult planning problems. How can we understand this in computational terms?
My approach is best described as computational cognitive science: trying to discover the high-level algorithms of cognition using agent-based simulations, computational models, and behavioral experiments.
In one project, I explored how the visual structure of the environment can guide planning.
I also think about the models underlying physical understanding in humans and machines (and where they differ).
I completed my PhD in Cognitive Science at UC San Diego, with part of that work done as a visiting scholar at Stanford University.
I worked with Judith Fan (Stanford), David Kirsh (UCSD) and Marcelo Mattar (NYU).
I also work as a VJ and visual artist—find my artistic work at vj.felixbinder.net.
Find my resume and CV here.
It is a profoundly erroneous truism, repeated by all copy-books and by eminent people when they are making speeches, that we should cultivate the habit of thinking of what we are doing. The precise opposite is the case. Civilization advances by extending the number of important operations which we can perform without thinking about them. Operations of thought are like cavalry charges in a battle — they are strictly limited in number, they require fresh horses, and must only be made at decisive moments.
Alfred North Whitehead