If I try to explain trellis coded modulation then someone can tell me whether I have complete misunderstood it, got part or all of it wrong, am missing whole vital pieces of the story. Would someone help with that?

It seems to me that trellis coded **modulation** should not be just shortened to trellis coding as it is not just another code, that's the whole point of why it is so revolutionary. When Ungerböck invented it in the 1970s it was the first time that anyone had blurred the boundary between coding and modulation (=choosing waveforms) so that the two can't be dealt with solely in isolation but have to be cionsidered together and a design was produced that links the two, with a coding scheme that only makes any sense if a particular associated type of modulation is also used with that coding; the two were co-designed, and have to be.

You have a QAM constellation with so many points in it. One of those points on a 2d plane is the value of a transmitted symbol and as it is 2d its position can be written as a pair of numbers (*x*,*y*) coordinates or viewed as a complex number *x* + *iy*. The distance *d* between two points is the euclidean distance given by pythagoras as *d² = x² + y² *. The euclidean distance is a measure of how distinct the values are, of who difficult it is to confuse two symbols and decide you're hearing one value when another value was intended. If the distance is too small then the points will be too easily confused in the presence of noise.

The trellis idea is that you take the set of all possible positions of a point in the constellation and you split or ‘partition’ that set into so many distinct subsets of possible points, say into two subsets that are halves, or four quarters or whatever. The magic of trellis comes in when it was realised that if you pick the subsets intelligently you can do it in such a way that the points in a subset partition are all spaced apart much further in fact spaced so that the minimum distance between any point in one subset and any of the neighbours that belong to the other subsets is really large compared to the nearest distance spacing found in the full whole subset. Keeping points much further apart like this hugely increases noise resistance because it is much more difficult to get points confused.

The coding part of TCM is where values of symbols were restricted so that they were limited by having to conform to a time *sequence* of permitted values, symbols could not be sent in completely random patterns the value of one symbol was restricted depending on the previous so many symbols sent, so only certain sequences in time were legal. A signal corrupted by noise can be spotted as corrupt because the received pattern does not conform to any of the allowed pattern sequences. The restrictions on permitted sequences of values form a code, and this restriction means fewer possible combinations of symbols over time, so it slows communication down. However, by some surprising miracle, it turns out that the extra noise immunity gained by the modulation scheme that uses set partitioning can be used to give a performance benefit that is greater than the performance loss due to the long-windedness imposed by the trellis code requiring the generation of more bits as output from the code to send a given message. The extra noise immunity can either be kept as a reliability benefit or the system can be pushed harder to get more speed by having a more complex and thus more fragile / risky constellation with more points in it with the more robust modulation scheme restoring reliability to the same level. So it really is a free lunch. It is very surprising that this is true. Surprising that a coding scheme that makes transmitted data more long-winded can actually end up speeding the whole thing up.

The cost is in the complex decoding process which involves looking at the time pattern of the received sequence of symbols and comparing it to the permitted time sequences to try and guess which was the most likely intended transmitted value. There is quite a lot of information available here in the possibly slightly corrupted or more corrupted information coming in and different decoding algorithms can cost more cpu cycles or more silicon.

I hope that some of that is correct. Would someone tell me what is wrong / right, if anything?