
I tried this one with ChatGPT o1, and it seemed to get it right:

> The surgeon is the boy’s biological father. While the woman injured in the accident is the boy’s biological mother, the surgeon is his father, who realizes he cannot operate on his own son.

https://chatgpt.com/share/674fc638-cd0c-8012-a4c4-9f1cad2040...



Claude Sonnet also gets it right, but not reliably. It seems to be over-aligned against gender assumptions and keeps treating this as the classic gender-assumption trick (that a surgeon isn't necessarily male). This is probably the clearest case I've seen of alignment interfering with model performance.
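
For anyone who wants to check the "not reliably" part themselves, here is a minimal sketch of sampling the model repeatedly and counting correct answers. The riddle wording is reconstructed from the quoted answer above, and the model id plus the keyword-based grading are assumptions, not anything from this thread:

    # Sketch: sample Claude Sonnet N times on the riddle variant and count
    # how often it names the father. Riddle text, model id, and the crude
    # keyword check are all assumptions for illustration.
    import anthropic

    client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

    RIDDLE = (
        "A boy and his mother are in a car accident, and the mother is badly "
        "injured. At the hospital, the surgeon looks at the boy and says, "
        "'I can't operate on him, he's my son.' Who is the surgeon?"
    )

    TRIALS = 20
    correct = 0
    for _ in range(TRIALS):
        reply = client.messages.create(
            model="claude-3-5-sonnet-20241022",  # assumed model id
            max_tokens=200,
            messages=[{"role": "user", "content": RIDDLE}],
        )
        text = reply.content[0].text.lower()
        # In this variant the mother is the injured one, so the intended
        # answer is the boy's father.
        if "father" in text or "dad" in text:
            correct += 1

    print(f"{correct}/{TRIALS} runs named the father")

A run where the count lands well below TRIALS would match the "gets it right, but not reliably" observation, though a keyword check like this will miss hedged or multi-part answers.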



