I haven't installed from this link specifically, but I used one of the branches on which this is based a few days ago, so the results should be similar.
On a first-gen M1 Mac mini with 8GB RAM, it takes 70-90 minutes for each image.
I have the M1 8GB I mentioned in my first comment, and the M1 Pro 16GB I mentioned in my second comment, side-by-side. However, the first one was running a Stable Diffusion branch from earlier in the week, so I reinstalled it using the same instructions. The only difference now is the physical hardware.
The thing to understand is that the 8GB M1 has only 8GB. When I run txt2img.py, Activity Monitor shows a Python process using 9.42GB of memory, and the "Memory Pressure" graph spends time in the red zone because the machine is swapping. The 16GB M1 Pro immediately shows PLMS Sampler progress and consistently spends around 3 seconds per iteration (e.g. "3.29s/it" and "2.97s/it"). The 8GB M1 takes several minutes before it jumps from 0% to 2% progress, and it accurately reports "326.24s/it".
So yes, whether it's M1 vs M1 Pro, or 8GB vs 16GB, it really is that stark a difference.
Update: after the second iteration it is 208.44s/it, so it is speeding up. It should drop to less than 120s/it before it finishes, if it runs as quickly as my previous install. And yes, 186.04s/it after the third iteration, and 159.22s/it after the fourth.
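To put those s/it numbers in per-image terms, here is a rough back-of-the-envelope sketch. It assumes 50 sampler steps per image, which is my assumption (it matches the jump from 0% to 2% after one iteration), and `minutes_per_image` is just a helper name made up for this example. Since the reported rate keeps falling during the run, plugging in an early reading overstates the final time.

```python
def minutes_per_image(seconds_per_iteration, steps=50):
    """Estimate minutes per image from a progress-bar rate.

    steps=50 is an assumption, not something measured here.
    """
    return seconds_per_iteration * steps / 60


# Rates quoted in the comments above.
readings = [
    ("M1 Pro 16GB", 3.29),
    ("M1 8GB, after 1st iteration", 326.24),
    ("M1 8GB, after 4th iteration", 159.22),
    ("M1 8GB, expected by the end", 120.0),
]

for label, rate in readings:
    print(f"{label}: about {minutes_per_image(rate):.1f} min per image")
```

If the step count is different on your install, pass it as the second argument.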
Sounds entirely like a swap-constrained operation. You need ~8GB of VRAM to load the uncompressed model into memory, which obviously won't work well on a Mac with 8GB of memory.
My first-gen M1 MacBook Air with 16GB takes just under 4 minutes per image. Running top while it's generating shows memory usage fluctuating between 10GB and 13GB, so if you're running on 8GB it's probably swapping a lot.
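If you'd rather check from inside the Python process than watch top, something like this sketch reports the peak resident memory so far. It uses only the standard `resource` module; `peak_rss_gb` is a name I made up, and where you would call it in txt2img.py is up to you. Note that `ru_maxrss` is in bytes on macOS but kilobytes on Linux, and that resident memory can understate total demand once the machine starts swapping.

```python
import resource
import sys


def peak_rss_gb():
    """Peak resident set size of the current process, in GiB."""
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # macOS reports bytes; Linux reports kilobytes.
    peak_bytes = peak if sys.platform == "darwin" else peak * 1024
    return peak_bytes / 1024**3


if __name__ == "__main__":
    print(f"peak memory so far: {peak_rss_gb():.2f} GiB")
```

A peak at or above the machine's physical RAM is a strong hint that swapping is what's slowing things down.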
On a first-gen M1 Mac mini with 8GB RAM, it takes 70-90 minutes for each image.
Still feels like magic, but old-school magic.