Thank you so much for that, really appreciated. So Unix boxen set the flow label, and presumably set it to something sensible according to the L4 instance. I wonder what happens with UDP apps. Does the app itself participate, in Unix, or is the label just a hash of src-port + dest-port (where applicable, i.e. for L4 protocols that multiplex on ports)? Or else an incrementing connection-instance counter, or a socket UID/counter, or who knows what. I don't know what is supposed to happen in the case of a non-port-bearing L4 protocol.
Perhaps a suitable cheap hash might be something like ((src_port << (20-16)) ^ (dest_port + 1)), which fits in the 20-bit flow label field. The +1 was meant to avoid having to check for a zero result, but the xor can still come out as zero (for example src_port 1 and dest_port 15, where both sides of the xor are 16), so a zero would still have to be mapped to something else such as 1, which makes for an unlikely collision. I ought to know what the distribution of each port value is in order to know what I'm doing. Does anyone know?
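To make the idea concrete, here is a minimal sketch of that ports-only hash in Python. The function name `ports_flow_label` and the constants are my own, purely for illustration; the only things taken as given are the 20-bit width of the flow label field and the rule that zero is mapped to 1.

```python
FLOW_LABEL_BITS = 20
FLOW_LABEL_MASK = (1 << FLOW_LABEL_BITS) - 1  # 0xFFFFF


def ports_flow_label(src_port: int, dest_port: int) -> int:
    """Hypothetical cheap hash: fold a port pair into a non-zero 20-bit value."""
    # Shift the 16-bit source port up by 4 so it occupies the top of the
    # 20-bit field, then xor in dest_port + 1.
    label = ((src_port << (FLOW_LABEL_BITS - 16)) ^ (dest_port + 1)) & FLOW_LABEL_MASK
    # The xor can still be zero (e.g. src_port=1, dest_port=15), so a zero
    # result is mapped to 1 instead.
    return label or 1
```

For example, `ports_flow_label(1, 15)` hits the zero case and falls back to 1, which is also what `ports_flow_label(0, 0)` produces, so that is the kind of collision the fallback costs.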
I believe the receiver needs to combine the flow label with the context formed by the 3-tuple {src-addr, dest-addr, IP next-header / L4 protocol type} anyway in order to identify the flow, so that the same non-zero flow label value can be used simultaneously by multiple flows as long as the addresses or protocol types differ. Is that the case? That makes it more of a pain for the receiver. But if it is right, then senders do not need to hash together the entire 5-tuple (the 3-tuple above plus the port pair), and my ports-only hash above would be fine.
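What I have in mind on the receiver side could be sketched like this, assuming my reading is right and the lookup key really is the address pair, the next-header value and the flow label together. The `FlowTable` class and its `classify` method are made up for illustration and are not any real stack's API.

```python
from typing import Dict, Tuple

# Assumed key: (src_addr, dest_addr, next_header, flow_label)
FlowKey = Tuple[str, str, int, int]


class FlowTable:
    """Hypothetical receiver-side table mapping a flow key to a local flow id."""

    def __init__(self) -> None:
        self._flows: Dict[FlowKey, int] = {}

    def classify(self, src_addr: str, dest_addr: str,
                 next_header: int, flow_label: int) -> int:
        key = (src_addr, dest_addr, next_header, flow_label)
        # A key seen for the first time gets the next local id; a known key
        # returns the id it was given earlier.
        return self._flows.setdefault(key, len(self._flows))
```

With a table like this, the same label arriving with next-header 6 (TCP) and with next-header 17 (UDP) between the same two addresses lands in two different entries, which is why the sender would not need to mix the addresses or protocol into the label itself.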
Alternative: I might be inclined to use a UID, set once and stored in every L4 connection object, which is then simply passed down from L4 to L3. Each time you make a new L4 object the counter would increment, and the new L4 object would get the incremented value even if the ports were the same, which might be an improvement. Should the flow label be temporally unique as far as possible too, so that a new connection can be distinguished from the old one? I haven't seen any mention of such a thing, and for all I know it might be really bad practice and against the desired semantics, since a new connection after a reconnect might not be supposed to count as a 'new' flow. The hash might even be faster: one less memory fetch, since the UID value does not have to be read out of the L4 object.
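A rough sketch of the counter alternative, again in Python and again with invented names (`FlowLabelAllocator`, `Connection`): the label is drawn once when the L4 object is created and stored in it, the counter wraps at 20 bits, and zero is skipped. This ignores locking, so it is only meant to show the shape of the idea.

```python
import itertools

FLOW_LABEL_MASK = 0xFFFFF  # flow label field is 20 bits wide


class FlowLabelAllocator:
    """Hypothetical per-host counter handing out non-zero 20-bit labels."""

    def __init__(self, start: int = 1) -> None:
        self._counter = itertools.count(start)

    def next_label(self) -> int:
        # Wrap the counter into 20 bits and skip zero.
        while True:
            label = next(self._counter) & FLOW_LABEL_MASK
            if label != 0:
                return label


class Connection:
    """Hypothetical L4 connection object carrying its own flow label."""

    def __init__(self, allocator: FlowLabelAllocator,
                 src_port: int, dest_port: int) -> None:
        self.src_port = src_port
        self.dest_port = dest_port
        # Set once at creation; L3 would just read this field on every send.
        self.flow_label = allocator.next_label()
```

Two `Connection` objects created with identical ports get different labels here, which is exactly the behaviour I am unsure about being desirable after a reconnect.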
--
Off-topic: ECN. I noticed ECN was not being used in some cases in the example kindly supplied. In the case of Windows, iirc, ECN support is off by default in some recent versions, but I have always turned it on globally in >= NT 6.0, which can be done persistently with a netsh command, for example.