Exploring C++20 Coroutines
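C++20 introduces coroutines, a feature many other languages have had for a long time. What is different about the C++ version? This article aims to explain the concept of a coroutine and give a basic picture of how it is implemented under the hood.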
Three new language keywords: co_await, co_yield and co_return
1、definition
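When a function blocks, execution usually switches to another thread, and frequent thread switches come at a cost.
If we avoid blocking by using asynchronous callbacks instead, the original business logic gets split apart and the code can sink into callback hell, which is hard to maintain.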
A coroutine is a generalisation of a function that allows the function to be suspended and then later resumed.
First of all, a coroutine is a function, but one that can be suspended and later resumed.
Coroutines are stackless: they suspend execution by returning to the caller and the data that is required to resume execution is stored separately from the stack.
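When it reaches a point that would block, a thread can save the current execution context and go on to run other code. Once the blocking request completes, it switches back and continues where it left off.
A thread is a flow of execution abstracted by the operating system, which schedules and manages all threads. Inside a single thread we can likewise abstract several flows of execution and let the thread schedule and manage them. These flows of execution layered on top of a thread are coroutines.
Thread scheduling is managed by the operating system and is preemptive. Coroutines are different: they have to cooperate with each other and give up control voluntarily, which is where the name comes from (cooperative routines).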
1.1、normal call
A normal function can be thought of as having two operations: Call and Return (Note that I’m lumping “throwing an exception” here broadly under the Return operation). Put simply, a normal function boils down to two steps: call and return.
The Call operation creates an activation frame, suspends execution of the calling function and transfers execution to the start of the function being called.
Call creates an activation frame, suspends the calling function, and starts executing the called function.
The Return operation passes the return-value to the caller, destroys the activation frame and then resumes execution of the caller just after the point at which it called the function.
Return passes the return value to the caller, destroys the activation frame, and lets the caller continue.
1.2、coroutine
Coroutines generalise the operations of a function by separating out some of the steps performed in the Call and Return operations into three extra operations: Suspend, Resume and Destroy. In other words, a coroutine adds three operations on top of Call and Return.
The Suspend operation suspends execution of the coroutine at the current point within the function and transfers execution back to the caller or resumer without destroying the activation frame. Any objects in-scope at the point of suspension remain alive after the coroutine execution is suspended.
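Suspend pauses the coroutine at its current point and hands execution back to the caller or resumer without destroying the activation frame. Objects in scope stay alive while the coroutine is suspended.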
Note that, like the Return operation of a function, a coroutine can only be suspended from within the coroutine itself at well-defined suspend-points.
Note that, as with Return, a coroutine can only suspend itself, and only at well-defined suspend-points.
The Resume operation resumes execution of a suspended coroutine at the point at which it was suspended. This reactivates the coroutine’s activation frame.
Resume continues a suspended coroutine from where it stopped, reactivating its activation frame.
The Destroy operation destroys the activation frame without resuming execution of the coroutine. Any objects that were in-scope at the suspend point will be destroyed. Memory used to store the activation frame is freed.
Destroy tears down the activation frame and the objects in scope, and never hands execution back to the coroutine body.
1.2.1、Coroutine activation frames
Since coroutines can be suspended without destroying the activation frame, we can no longer guarantee that activation frame lifetimes will be strictly nested. This means that activation frames cannot in general be allocated using a stack data-structure and so may need to be stored on the heap instead.
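Because a coroutine can be suspended without its activation frame being destroyed, frame lifetimes are no longer guaranteed to be strictly nested. Frames therefore cannot in general be allocated stack-fashion and may have to live on the heap.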
There are some provisions in the C++ Coroutines TS to allow the memory for the coroutine frame to be allocated from the activation frame of the caller if the compiler can prove that the lifetime of the coroutine is indeed strictly nested within the lifetime of the caller. This can avoid heap allocations in many cases provided you have a sufficiently smart compiler.
In some cases the frame can be placed in the caller's stack frame instead, but that requires a sufficiently smart compiler.
With coroutines there are some parts of the activation frame that need to be preserved across coroutine suspension and there are some parts that only need to be kept around while the coroutine is executing. For example, the lifetime of a variable with a scope that does not span any coroutine suspend-points can potentially be stored on the stack.
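Only part of the frame has to be preserved across a suspension. For example, a variable whose scope does not span any suspend-point can be kept on the stack.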
You can logically think of the activation frame of a coroutine as being comprised of two parts: the ‘coroutine frame’ and the ‘stack frame’.
So a coroutine's activation frame can be thought of as two parts: the coroutine frame and the stack frame.
The ‘coroutine frame’ holds part of the coroutine’s activation frame that persists while the coroutine is suspended and the ‘stack frame’ part only exists while the coroutine is executing and is freed when the coroutine suspends and transfers execution back to the caller/resumer.
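The coroutine frame keeps the data that must survive suspension; the stack frame is used only while the coroutine is executing and can be released once it suspends.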
1.2.2、Suspend
In the C++ Coroutines TS, these suspend-points are identified by usages of the co_await or co_yield keywords.
When a coroutine hits one of these suspend-points it first prepares the coroutine for resumption by:
That is, on reaching a suspend-point the coroutine first gets ready everything that a later resumption will need.
- Ensuring any values held in registers are written to the coroutine frame
Values that live in registers are saved to the coroutine frame.
- Writing a value to the coroutine frame that indicates which suspend-point the coroutine is being suspended at. This allows a subsequent Resume operation to know where to resume execution of the coroutine or so a subsequent Destroy to know what values were in-scope and need to be destroyed.
A value identifying the current suspend-point is stored in the coroutine frame, so that Resume continues at the right place and Destroy releases the right objects.
1.2.3、Resume
Just like a normal function call, this call to resume() will allocate a new stack-frame and store the return-address of the caller in the stack-frame before transferring execution to the function.
As with a normal function call, a stack frame is created first to hold the caller's return address.
However, instead of transferring execution to the start of the function it will transfer execution to the point in the function at which it was last suspended. It does this by loading the resume-point from the coroutine-frame and jumping to that point.
It loads the resume-point from the coroutine frame and jumps there.
1.2.4、Destroy
void destroy()
The Destroy operation destroys the coroutine frame without resuming execution of the coroutine.
This operation can only be performed on a suspended coroutine.
The Destroy operation acts much like the Resume operation in that it re-activates the coroutine’s activation frame, including allocating a new stack-frame and storing the return-address of the caller of the Destroy operation.
Like Resume, Destroy allocates a new stack frame that holds the return address of its caller.
However, instead of transferring execution to the coroutine body at the last suspend-point it instead transfers execution to an alternative code-path that calls the destructors of all local variables in-scope at the suspend-point before then freeing the memory used by the coroutine frame.
Instead of continuing the coroutine body, however, it jumps to a separate code path that performs the clean-up.
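The Suspend, Resume and Destroy mechanics described in 1.2.2 to 1.2.4 can be pictured with a hand-written state machine. This is a conceptual sketch in plain C++, not what any particular compiler emits: `CounterFrame`, `counter_call`, `counter_resume`, `counter_destroy` and `drain` are all invented for the illustration, and the stored resume-point is modelled as a small integer.

```cpp
#include <cassert>
#include <vector>

// Hand-written stand-in for a compiler-generated coroutine frame (illustrative only).
struct CounterFrame {
    int resume_index = 0;  // which suspend-point the "coroutine" last stopped at
    int i = 0;             // a local variable that must survive suspension
    int limit = 0;         // a copy of the parameter
};

// Call: allocate the frame on the heap and copy the parameter into it.
CounterFrame* counter_call(int limit) {
    auto* f = new CounterFrame;
    f->limit = limit;
    return f;
}

// Resume: continue from the recorded resume-point instead of from the start.
// Returns true and sets `out` if a value was produced before suspending again.
bool counter_resume(CounterFrame* f, int& out) {
    switch (f->resume_index) {
    case 0:                // first entry: start of the body
        f->i = 0;
        break;
    case 1:                // re-entry after the suspend-point inside the loop
        ++f->i;
        break;
    default:               // already ran to completion
        return false;
    }
    if (f->i < f->limit) {
        out = f->i;
        f->resume_index = 1;  // Suspend: record where to continue...
        return true;          // ...and return to the caller/resumer
    }
    f->resume_index = 2;      // ran off the end of the body
    return false;
}

// Destroy: clean up what is alive at the recorded suspend-point, then free the frame.
void counter_destroy(CounterFrame* f) {
    // A real frame would branch on resume_index to run the right destructors;
    // this frame only holds ints, so freeing the memory is all that is needed.
    delete f;
}

// Drives the state machine to completion and collects the produced values.
std::vector<int> drain(int limit) {
    std::vector<int> values;
    CounterFrame* f = counter_call(limit);
    int v = 0;
    while (counter_resume(f, v)) {
        values.push_back(v);
    }
    assert(f->resume_index == 2);
    counter_destroy(f);
    return values;
}
```

`drain(3)` collects 0, 1 and 2. Each call to `counter_resume` is an ordinary function call with its own short-lived stack frame; everything that has to persist between calls sits in `CounterFrame`.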
1.2.5、Call
The Call operation of a coroutine is much the same as the call operation of a normal function. In fact, from the perspective of the caller there is no difference.
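Calling a coroutine works much like calling a normal function; from the caller's point of view there is no difference.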
However, rather than execution only returning to the caller when the function has run to completion, with a coroutine the call operation will instead resume execution of the caller when the coroutine reaches its first suspend-point.
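The difference is that a normal function returns to its caller only once it has run to completion, whereas a coroutine returns to its caller as soon as it reaches its first suspend-point.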
When performing the Call operation on a coroutine, the caller allocates a new stack-frame, writes the parameters to the stack-frame, writes the return-address to the stack-frame and transfers execution to the coroutine. This is exactly the same as calling a normal function.
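When a coroutine is called, the caller allocates a stack frame, writes the parameters and the return address into it, and jumps to the coroutine, exactly as for a normal function.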
The first thing the coroutine does is then allocate a coroutine-frame on the heap and copy/move the parameters from the stack-frame into the coroutine-frame so that the lifetime of the parameters extends beyond the first suspend-point.
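The coroutine then allocates its coroutine frame and copies or moves the parameters from the stack frame into it, so that the parameters stay alive beyond the first suspend-point.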
1.2.6、Return
The Return operation of a coroutine is a little different from that of a normal function.
Returning from a coroutine differs somewhat from returning from a normal function.
When a coroutine executes a return-statement (co_return according to the TS) operation it stores the return-value somewhere (exactly where this is stored can be customised by the coroutine) and then destructs any in-scope local variables (but not parameters).
When a coroutine returns, it stores the return value somewhere and then destroys the local variables in scope.
The coroutine then has the opportunity to execute some additional logic before transferring execution back to the caller/resumer.
Before handing execution back to the caller or resumer, the coroutine can run some additional logic.
This additional logic might perform some operation to publish the return value, or it might resume another coroutine that was waiting for the result. It’s completely customisable.
That logic might publish the return value, or resume another coroutine that was waiting for the result.
The coroutine then performs either a Suspend operation (keeping the coroutine-frame alive) or a Destroy operation (destroying the coroutine-frame).
The coroutine then performs either a Suspend or a Destroy operation.
Execution is then transferred back to the caller/resumer as per the Suspend/Destroy operation semantics, popping the stack-frame component of the activation-frame off the stack.
Execution then returns to the caller or resumer, and the stack frame is popped off the stack.
It is important to note that the return-value passed to the Return operation is not the same as the return-value returned from a Call operation as the return operation may be executed long after the caller resumed from the initial Call operation.
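It is worth noting that the value passed to Return is not the value returned from the Call, because the Return may execute long after the caller has resumed from the initial Call.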
2、An illustration
Let’s say we have a function (or coroutine), f() that calls a coroutine, x(int a).
For example, an ordinary function f calls a coroutine x.
Before the call we have a situation that looks a bit like this:
Then when x(42) is called, it first creates a stack frame for x(), as with normal functions.
Calling coroutine x creates a stack frame for x.
Then, once the coroutine x() has allocated memory for the coroutine frame on the heap and copied/moved parameter values into the coroutine frame we’ll end up with something that looks like the next diagram. Note that the compiler will typically hold the address of the coroutine frame in a separate register to the stack pointer (e.g. MSVC stores this in the rbp register).
Coroutine x then allocates its coroutine frame on the heap and copies or moves the parameters into it. At this point rbp points to the coroutine frame.
If the coroutine x() then calls another normal function g() it will look something like this.
Here coroutine x calls another normal function g.
When g() returns it will destroy its activation frame and restore x()’s activation frame. Let’s say we save g()’s return value in a local variable b which is stored in the coroutine frame.
When g returns, its activation frame is destroyed, x's activation frame becomes current again, and the return value is stored in the coroutine frame.
If x() now hits a suspend-point and suspends execution without destroying its activation frame then execution returns to f().
When coroutine x reaches a suspend-point, its coroutine frame is not destroyed and execution returns to f.
This results in the stack-frame part of x() being popped off the stack while leaving the coroutine-frame on the heap. When the coroutine suspends for the first time, a return-value is returned to the caller. This return value often holds a handle to the coroutine-frame that suspended that can be used to later resume it. When x() suspends it also stores the address of the resumption-point of x() in the coroutine frame (call it RP for resume-point).
x's stack frame is popped off the stack and rbp no longer points to the coroutine frame. On its first suspension the coroutine returns a value to the caller; that value typically holds a handle through which the coroutine can later be resumed. When x suspends, it also records the resume-point in the coroutine frame.
This handle may now be passed around as a normal value between functions. At some point later, potentially from a different call-stack or even on a different thread, something (say, h()) will decide to resume execution of that coroutine. For example, when an async I/O operation completes.
The handle can be passed between functions. Later, possibly from a different call stack or a different thread, some function h decides to resume the coroutine.
The function that resumes the coroutine calls a void resume(handle) function to resume execution of the coroutine. To the caller, this looks just like any other normal call to a void-returning function with a single argument.
That function calls resume(handle) to resume the coroutine; to the caller this is just an ordinary call to a void-returning function taking one argument.
This creates a new stack-frame that records the return-address of the caller to resume(), activates the coroutine-frame by loading its address into a register and resumes execution of x() at the resume-point stored in the coroutine-frame.
A new stack frame is created to hold the caller's return address, the coroutine frame is activated by loading its address into a register, and the coroutine continues from the stored resume-point.
3、Writing coroutines
3.1、interface
An interface describes the behavior or capabilities of a C++ class without committing to a particular implementation of that class.
The C++ interfaces are implemented using abstract classes and these abstract classes should not be confused with data abstraction which is a concept of keeping implementation details separate from associated data.
3.2、example
3.3、Variant
Instead of passing a coroutine_handle<>* into counter, it would be nicer if we could just return the handle from counter(). We can do that if we put the coroutine handle inside the return object.
References & recommended reading
Coroutine Theory: https://lewissbaker.github.io/2017/09/25/coroutine-theory
C++ Coroutines: Understanding operator co_await: https://lewissbaker.github.io/2017/11/17/understanding-operator-co-await
C++ Coroutines: Understanding the promise type: https://lewissbaker.github.io/2018/09/05/understanding-the-promise-type
C++ Coroutines: Understanding Symmetric Transfer: https://lewissbaker.github.io/2020/05/11/understanding_symmetric_transfer
Coroutines (C++20): https://en.cppreference.com/w/cpp/language/coroutines
神秘使者到Java帝国传道协程,竟被轰了出去: https://mp.weixin.qq.com/s/cN9dC_crrEU579-AMrd5PQ
https://www.scs.stanford.edu/~dm/blog/c++-coroutines.html#coroutine-handles
Author: calssion
Last updated: 2021-05-03