Interestingly, when I applied the "simply repeat the prompt" technique [1], Sonnet 4.6 on the website got it right every time, both with and without extended thinking.
Without repeating the prompt, I got a mix of "walk" and "drive" answers.
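
For anyone who wants to try it, here is a minimal sketch of what I mean by repeating the prompt. It assumes the repetition is just the same text sent twice, separated by a blank line. The helper name `repeat_prompt`, the separator, and the placeholder question are my own illustrative choices, not something taken from [1].

```python
def repeat_prompt(prompt: str, times: int = 2, separator: str = "\n\n") -> str:
    """Return the prompt text repeated `times` times, joined by `separator`.

    Hypothetical helper for illustration only; the exact formatting used
    in [1] may differ.
    """
    if times < 1:
        raise ValueError("times must be at least 1")
    # Strip surrounding whitespace so the copies are identical.
    return separator.join([prompt.strip()] * times)


if __name__ == "__main__":
    # Placeholder question (assumption): substitute the actual prompt here.
    question = "<your question here>"
    print(repeat_prompt(question))
```

Paste the output into the chat box in place of the single-copy prompt and compare the answers.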
I love how prompt engineering is basically techno-alchemy.
[1] https://arxiv.org/pdf/2512.14982