> Basically it requires dropping down to the assembler-level for each supported architecture to implement the context switching, as modifying the CPU context registers and stack frame directly is not permitted by most sane programming languages.
I thought that was (essentially) the definition of longjmp? Thinking further, it seems like the initial setup of additional stacks would require at least taking the covers off setjmp and interacting with its implementation details directly.
About 20 years ago I and two others designed an embedded kernel that switched tasks with setjmp and longjmp. It was exactly as you suspected—the only implementation specific part was creating a jmp_buf on task creation (which doesn’t require even assembly intrinsics). We were only targeting one architecture/abi at the time (arm) but it was nice to know that porting to a new architecture wouldn’t require a bunch of assembly rewriting (the only assembly was the interrupt trampoline—even our crt0 was C (well, there was at least one intrinsic asm instruction to set the stack pointer in crt0)).
There's an implementation using setjmp/longjmp in his library here [1]. It uses sigaltstack to assign a newly allocated stack to the coroutine. Marc Lehmann's libcoro [2] library does the same.
Marked obsolescent in that version (2004 edition), purportedly replaced by POSIX threads (except those aren't coroutines...), but likely still present in whatever libc you use. (There's also pthread_setconcurrency(), but it is (a) an option and (b) marked obsolescent as well ... https://pubs.opengroup.org/onlinepubs/9699919799/functions/p... .)
Unfortunately, set/swapcontext and friends require a system call to change the signal mask, so in practice they are a few orders of magnitude slower than setjmp or similar.
On some systems setjmp/longjmp also change the signal mask, and are just as slow. For those there is _setjmp/_longjmp.
The Linux man page explains excellently:
"POSIX does not specify whether setjmp() will save the signal mask (to be later restored during longjmp()). In System V it will not. In 4.3BSD it will, and there is a function _setjmp() that will not. The behavior under Linux depends on the glibc version and the setting of feature test macros. On Linux with glibc versions before 2.19, setjmp() follows the System V behavior by default, but the BSD behavior is provided if the _BSD_SOURCE feature test macro is explicitly defined and none of _POSIX_SOURCE, _POSIX_C_SOURCE, _XOPEN_SOURCE, _GNU_SOURCE, or _SVID_SOURCE is defined."
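If you want the behavior pinned down instead of depending on feature test macros, POSIX sigsetjmp/siglongjmp take an explicit savemask argument. Here is a rough sketch of the difference; the helper name usr1_blocked_after_jump is made up for illustration, and it assumes a POSIX system where SIGUSR1 is safe to block briefly:

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <setjmp.h>
#include <signal.h>
#include <stddef.h>

/* Hypothetical helper: blocks SIGUSR1 after sigsetjmp(), jumps back, and
 * reports whether SIGUSR1 is still blocked afterwards.
 * Returns 1 if still blocked (mask not restored), 0 if it was restored. */
static int usr1_blocked_after_jump(int savemask)
{
    sigjmp_buf env;
    sigset_t usr1, original, now;
    int rc, blocked;

    sigemptyset(&usr1);
    sigaddset(&usr1, SIGUSR1);

    /* Start from a known state: SIGUSR1 unblocked, old mask remembered. */
    rc = sigprocmask(SIG_UNBLOCK, &usr1, &original);
    assert(rc == 0);
    (void)rc;

    if (sigsetjmp(env, savemask) == 0) {
        /* Change the mask, then jump back to the sigsetjmp() above. */
        sigprocmask(SIG_BLOCK, &usr1, NULL);
        siglongjmp(env, 1);
    }

    /* Second return from sigsetjmp(): inspect the current mask. */
    sigprocmask(SIG_SETMASK, NULL, &now);
    blocked = sigismember(&now, SIGUSR1);

    /* Put the caller's mask back before returning. */
    sigprocmask(SIG_SETMASK, &original, NULL);
    return blocked;
}
```

With savemask set to 1 the jump restores the saved mask, which is the step that costs a kernel call; with 0 the mask is left alone, which is the cheap variant you want for task switching.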
You could easily imagine that most or all signal masks in some green-threaded program are the same, and that a cheap userspace implementation of set/swapcontext would only call into the kernel if the signal mask changes. I.e., it doesn't have to be a kernel call most of the time.
To change the stack pointer, you either need sigaltstack (not always available) or you have to hack the jmp_buf (not remotely portable), but indeed we have done both in my libco library. Assembler is quite a bit faster when you can manage it.
Yes, I've done this successfully with setjmp/longjmp in the past. To set up the stack and make the first jump, you need to either fiddle with the jump buffer or use something provided by the OS, like Windows fibers or Linux ucontext. Those can be used to fully implement the scheduler, but longjmp is faster and limits the OS-specific part of the scheduler.
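For anyone who hasn't seen the ucontext route, here is a minimal sketch of it, assuming a libc that still ships getcontext/makecontext/swapcontext (glibc does). The coro_* names, the single static coroutine, and the 64 KiB stack are all choices made for this example, not anything from the libraries mentioned in this thread:

```c
#include <assert.h>
#include <stddef.h>
#include <ucontext.h>

#define CORO_STACK_SIZE (64 * 1024)   /* arbitrary size for this sketch */

static ucontext_t main_ctx, coro_ctx;
static char coro_stack[CORO_STACK_SIZE];
static int coro_value;   /* last value handed back by the coroutine */
static int coro_done;    /* set once the coroutine body has finished */

/* Called from inside the coroutine: hand a value back to the resumer. */
static void coro_yield(int v)
{
    coro_value = v;
    swapcontext(&coro_ctx, &main_ctx);
}

/* The coroutine body: yields 1, 2, 3 and then finishes. */
static void coro_body(void)
{
    for (int i = 1; i <= 3; i++)
        coro_yield(i);
    coro_done = 1;
    /* Returning here switches to uc_link, i.e. main_ctx. */
}

/* Give the coroutine its own stack and entry point. */
static void coro_start(void)
{
    int rc = getcontext(&coro_ctx);
    assert(rc == 0);
    (void)rc;
    coro_ctx.uc_stack.ss_sp = coro_stack;
    coro_ctx.uc_stack.ss_size = sizeof coro_stack;
    coro_ctx.uc_link = &main_ctx;
    coro_done = 0;
    makecontext(&coro_ctx, coro_body, 0);
}

/* Run the coroutine until its next yield; returns -1 once it has finished. */
static int coro_resume(void)
{
    swapcontext(&main_ctx, &coro_ctx);
    return coro_done ? -1 : coro_value;
}
```

Call coro_start() once, then coro_resume() repeatedly until it returns -1. Every swapcontext here also saves and restores the signal mask, which is the overhead discussed above; the usual trick is to use ucontext only for the first jump onto the new stack and longjmp for the switches after that.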
> as modifying the CPU context registers and stack frame directly is not permitted by most sane programming languages
Embedded C programmer here. I find this statement highly offensive :D
I wonder if this guy knows how planes and cars and traffic lights and medical equipment all work? Lots of things you rely on every day were made in languages that allow “insane” direct CPU register access.
What do you think you do when you write in assembly and want to enter or exit a function? You push and pop on the stack control registers.... gasp?
It’s not common but I’ve had to manually adjust the PC register before. The author and most software devs are so far removed from metal, the downvotes on my comments are hilarious to me.
>What do you think you do when you write in assembly
You said you were an "Embedded C programmer". Why are you now switching the topic and talking about assembly? First of all, there is no single objective definition of assembly. It is inherently specific to the architecture, and therefore someone knowing x86 assembly doesn't mean that person also knows how to use RISC-V assembly. Since you have moved the goalpost out of the playing field by switching languages, one could now conceive of an architecture that simply has no registers at all because everything is stored in RAM. Such an architecture would allow a C compiler to still produce valid code, but assembly code would not be able to access any registers whatsoever. Therefore it makes equally little sense for the C programming language to have the ability to access registers.
> The author and most software devs are so far removed from metal, the downvotes on my comments are hilarious to me.
The reason why you receive downvotes is that even people who are not "far removed from metal" disagree with your comments because you are making fun of them.
I said directly. You seem to be implying that it is common for important embedded work to involve manually messing with stack registers directly by the programmer with assembly.
Um... I don't know what to tell you - but I do that every day. Guess what - people who make libraries for other embedded devices, sometimes still write in assembly. Insane, I know.
You seem to be purposely ignoring what I am saying so that you contradict something I didn't say. I doubt people are writing the software for medical devices by directly writing to stack registers (which wouldn't mean the typical automatic instructions that push and pop the stack while automatically incrementing or decrementing the instruction pointer at the same time)
Your first comment didn't mention assembly. You only talked about C. It's insane that you switch the topic of the conversation and then pretend to be the only smart one.
https://en.wikipedia.org/wiki/Setjmp.h