The RTP protocol

王二の黄金时代

License: CC 4.0 BY-SA

Article link: https://blog.csdn.net/u012459903/article/details/89371046

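There are plenty of blog posts introducing RTP, but one thing confused me. Judging from the RTP packetization code that VLC actually uses (stream_out/rtp.c in the VLC source), the RTP header takes only 12 bytes, and packets captured with Wireshark also show a 12-byte header. Yet many articles say that besides these 12 bytes there is also a CSRC. What is it, and how does it differ from the SSRC?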

https://tools.ietf.org/html/rfc3550#section-5.1

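The page above explains it clearly. The first part (the 12-byte fixed header) is present in every RTP packet. The second part (the CSRC list) is only inserted when a mixer is involved, for example for audio mixing. As for the difference from the SSRC: one SSRC corresponds to one stream, so all packets belonging to the same stream carry the same SSRC. When a mixer combines several RTP streams into one mixed stream, the mixed stream carries the mixer's own SSRC, and the SSRCs of the original streams are listed as CSRC identifiers. In the usual case the second part is absent (CC = 0).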
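The size difference can be sketched in Python. This is only an illustration of the layout in RFC 3550 section 5.1, not VLC's code: the helper name `build_header`, the payload type values, and all SSRC values are made up for the example.

```python
import struct

def build_header(pt, seq, timestamp, ssrc, csrcs=()):
    """Pack an RTP header: 12 fixed bytes plus 4 bytes per CSRC entry."""
    cc = len(csrcs)
    if cc > 15:
        raise ValueError("CC is a 4-bit field, so at most 15 CSRCs fit")
    byte0 = (2 << 6) | cc      # V=2, P=0, X=0, CC in the low 4 bits
    byte1 = pt & 0x7F          # M=0, 7-bit payload type
    header = struct.pack("!BBHII", byte0, byte1, seq, timestamp, ssrc)
    for csrc in csrcs:
        header += struct.pack("!I", csrc)
    return header

# Ordinary sender, no mixer: CC = 0, so the header is exactly 12 bytes.
plain = build_header(pt=96, seq=1, timestamp=0, ssrc=0x11111111)

# Hypothetical mixer output: the packet carries the mixer's own SSRC, and
# the SSRCs of the two streams it mixed appear as CSRC entries.
mixed = build_header(pt=0, seq=1, timestamp=0, ssrc=0xAAAAAAAA,
                     csrcs=(0x11111111, 0x22222222))

print(len(plain), len(mixed))
```

With no mixer in the path, which is the VLC streaming case described above, the header stays at 12 bytes; each CSRC entry adds 4 bytes.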

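Then there is the question of RTP payloads. Some blog posts cover the official description of carrying H.264 over RTP, which naturally raises the question: what about other payloads?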

=================================

RFC 6184, RTP Payload Format for H.264 Video

Main article: RTP payload formats

This is the document that Wikipedia lists as defining the RTP payload format for H.264.

=======================================

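Everything I found on my own was Chinese blog posts compiled and edited by other people, basically second-hand material. Struggling with a flood of information of very uneven quality, I finally found something good that I had overlooked all along: Wikipedia. <A fresh first-hand source, in its original form> [^_^] [^_^] [^_^] [^_^]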

https://en.wikipedia.org/wiki/Real-time_Transport_Protocol

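(I don't know how to get around the firewall, so this page often fails to open for me; all I can do is occasionally try a Bing search instead.)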

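Wikipedia gives a complete description of the RTP format, including the payload types RTP can carry and links to the corresponding RFC standards. It is easy to see that many blog posts closely mirror it.

An excerpt: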

Profiles and payload formats

See also: RTP audio video profile

One of the design considerations for RTP is to carry a range of multimedia formats and allow new formats without revising the RTP standard. To this end, the information required by a specific application of the protocol is not included in the generic RTP header, but is instead provided through separate RTP profiles and associated payload formats. For each class of application (e.g., audio, video), RTP defines a profile and one or more associated payload formats.[8] A complete specification of RTP for a particular application usage requires profile and payload format specifications.[13]:71

The profile defines the codecs used to encode the payload data and their mapping to payload format codes in the Payload Type (PT) field of the RTP header. Each profile is accompanied by several payload format specifications, each of which describes the transport of a particular encoded data.[2] The audio payload formats include G.711, G.723, G.726, G.729, GSM, QCELP, MP3, and DTMF, and the video payload formats include H.261, H.263, H.264, H.265 and MPEG-1/MPEG-2.[16] The mapping of MPEG-4 audio/video streams to RTP packets is specified in RFC 3016, and H.263 video payloads are described in RFC 2429.[17]

Examples of RTP profiles include:

The RTP profile for Audio and video conferences with minimal control (RFC 3551) defines a set of static payload type assignments, and a dynamic mechanism for mapping between a payload format, and a PT value using Session Description Protocol (SDP).
The Secure Real-time Transport Protocol (SRTP) (RFC 3711) defines an RTP profile that provides cryptographic services for the transfer of payload data.[18]
The experimental Control Data Profile for RTP (RTP/CDP) for machine-to-machine communications.[19]
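The static and dynamic mapping described for RFC 3551 can be sketched as a lookup. This is a minimal Python illustration under stated assumptions: the table holds only a handful of the static assignments from RFC 3551, the helper names `parse_rtpmap` and `resolve` are hypothetical, and the SDP text is a made-up example in which the dynamic payload type 96 is bound to H.264.

```python
# A few static payload type assignments from RFC 3551 (not the full table).
STATIC_PT = {
    0: ("PCMU", 8000),
    3: ("GSM", 8000),
    8: ("PCMA", 8000),
    18: ("G729", 8000),
    31: ("H261", 90000),
    34: ("H263", 90000),
}

def parse_rtpmap(sdp):
    """Collect PT -> (encoding name, clock rate) from a=rtpmap lines."""
    dynamic = {}
    for line in sdp.splitlines():
        line = line.strip()
        if not line.startswith("a=rtpmap:"):
            continue
        pt_text, encoding = line[len("a=rtpmap:"):].split(" ", 1)
        name, clock = encoding.split("/")[:2]
        dynamic[int(pt_text)] = (name, int(clock))
    return dynamic

def resolve(pt, dynamic):
    """Prefer the SDP binding; fall back to the static table."""
    if pt in dynamic:
        return dynamic[pt]
    return STATIC_PT.get(pt)

# Made-up session description: PT 96 is in the dynamic range (96-127).
example_sdp = "m=video 5004 RTP/AVP 96\na=rtpmap:96 H264/90000"
bindings = parse_rtpmap(example_sdp)
print(resolve(96, bindings), resolve(0, bindings))
```

This is why a receiver cannot interpret a dynamic PT value from the RTP header alone: the meaning of 96 exists only in the session description.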

Packet header

RTP packets are created at the application layer and handed to the transport layer for delivery. Each unit of RTP media data created by an application begins with the RTP packet header.

RTP packet header
Octet offset    Bit offset    Fields (width in bits)
0               0             Version (2) | P (1) | X (1) | CC (4) | M (1) | PT (7) | Sequence number (16)
4               32            Timestamp (32)
8               64            SSRC identifier (32)
12              96            CSRC identifiers (32 each, CC entries) ...
12+4×CC         96+32×CC      Profile-specific extension header ID (16) | Extension header length (16)
16+4×CC         128+32×CC     Extension header ...

The RTP header has a minimum size of 12 bytes. After the header, optional header extensions may be present. This is followed by the RTP payload, the format of which is determined by the particular class of application.[20] The fields in the header are as follows:

Version: (2 bits) Indicates the version of the protocol. Current version is 2.[21]
P (Padding): (1 bit) Used to indicate if there are extra padding bytes at the end of the RTP packet. Padding may be used to fill up a block of certain size, for example as required by an encryption algorithm. The last byte of the padding contains the number of padding bytes that were added (including itself).[13]:12[21]
X (Extension): (1 bit) Indicates presence of an extension header between standard header and payload data. The extension header is application or profile specific.[21]
CC (CSRC count): (4 bits) Contains the number of CSRC identifiers (defined below) that follow the fixed header.[13]:12
M (Marker): (1 bit) Used at the application level and defined by a profile. If it is set, it means that the current data has some special relevance for the application.[13]:13
PT (Payload type): (7 bits) Indicates the format of the payload and determines its interpretation by the application. This is specified by an RTP profile. For example, see RTP Profile for audio and video conferences with minimal control (RFC 3551).[22]
Sequence number: (16 bits) The sequence number is incremented by one for each RTP data packet sent and is to be used by the receiver to detect packet loss and to restore packet sequence. The RTP does not specify any action on packet loss; it is left to the application to take appropriate action. For example, video applications may play the last known frame in place of the missing frame.[23] According to RFC 3550, the initial value of the sequence number should be random to make known-plaintext attacks on encryption more difficult.[13]:13 RTP provides no guarantee of delivery, but the presence of sequence numbers makes it possible to detect missing packets.[1]
Timestamp: (32 bits) Used by the receiver to play back the received samples at appropriate time and interval. When several media streams are present, the timestamps may be independent in each stream.[b] The granularity of the timing is application specific. For example, an audio application that samples data once every 125 µs (8 kHz, a common sample rate in digital telephony) would use that value as its clock resolution. Video streams typically use a 90 kHz clock. The clock granularity is one of the details that is specified in the RTP profile for an application.[23]
SSRC: (32 bits) Synchronization source identifier uniquely identifies the source of a stream. The synchronization sources within the same RTP session will be unique.[13]:15
CSRC: (32 bits each, number indicated by CSRC count field) Contributing source IDs enumerate contributing sources to a stream which has been generated from multiple sources.[13]:15
Header extension: (optional, presence indicated by Extension field) The first 32-bit word contains a profile-specific identifier (16 bits) and a length specifier (16 bits) that indicates the length of the extension (EHL = extension header length) in 32-bit units, excluding the 32 bits of the extension header.[13]:17
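The field list above maps directly onto a small parser. The following Python sketch is an illustration only: the function name `parse_rtp` and the sample packet bytes are invented for the example, and it does no validation beyond basic length checks.

```python
import struct

def parse_rtp(packet):
    """Split an RTP packet into header fields and payload."""
    if len(packet) < 12:
        raise ValueError("shorter than the 12-byte fixed header")
    b0, b1, seq, timestamp, ssrc = struct.unpack("!BBHII", packet[:12])
    cc = b0 & 0x0F
    offset = 12
    if len(packet) < offset + 4 * cc:
        raise ValueError("truncated CSRC list")
    csrcs = list(struct.unpack("!%dI" % cc, packet[offset:offset + 4 * cc]))
    offset += 4 * cc
    extension = (b0 >> 4) & 1
    ext_id = None
    if extension:
        if len(packet) < offset + 4:
            raise ValueError("truncated extension header")
        # Length counts 32-bit words after the 4-byte extension header.
        ext_id, ext_len = struct.unpack("!HH", packet[offset:offset + 4])
        offset += 4 + 4 * ext_len
    end = len(packet)
    padding = (b0 >> 5) & 1
    if padding:
        end -= packet[-1]      # last byte holds the padding count
    return {
        "version": b0 >> 6, "padding": padding, "extension": extension,
        "cc": cc, "marker": b1 >> 7, "pt": b1 & 0x7F,
        "seq": seq, "timestamp": timestamp, "ssrc": ssrc,
        "csrcs": csrcs, "ext_id": ext_id, "payload": packet[offset:end],
    }

# Invented sample: V=2, CC=0, M=1, PT=96, seq=1, timestamp=3600.
sample = bytes.fromhex("80e0000100000e1012345678") + b"\x7c\x85"
fields = parse_rtp(sample)
print(fields["version"], fields["marker"], fields["pt"], fields["payload"])
```

With CC = 0 and X = 0 the payload starts at offset 12, which matches the 12-byte header seen in the VLC code and in the Wireshark capture.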

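In byte 1 of the header above (the byte holding M and PT), the Marker bit is set from the input parameter b_m_bit: if the packet is the last fragment, the Marker bit is 1; otherwise it is 0. Let's compare this with an actual capture.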
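The bit arithmetic for that byte can be sketched as follows. This is a Python illustration, not the VLC source: the helper names are hypothetical and payload type 96 is an assumed dynamic value. Note that RFC 6184 ties the marker to the last packet of an access unit, so "last fragment" here assumes the fragmented NAL unit is also the last one of its frame.

```python
def second_header_byte(marker, payload_type):
    """Byte 1 of the RTP header: M in the top bit, PT in the low 7 bits."""
    return (0x80 if marker else 0x00) | (payload_type & 0x7F)

def marker_flags(fragment_count):
    """Marker value per fragment: only the last fragment gets M = 1."""
    return [index == fragment_count - 1 for index in range(fragment_count)]

# Three fragments of one frame, assumed payload type 96.
bytes_for_three = [second_header_byte(m, 96) for m in marker_flags(3)]
print([hex(b) for b in bytes_for_three])
```

In a capture this shows up as byte 1 reading 0x60 on the earlier fragments and 0xe0 on the final one, given the assumed payload type.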

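After the 12-byte RTP header, if the payload is H.264, large units are fragmented with FU-A, as shown below.

FU-A fragment format
A large H.264 video unit (NAL unit) is split into fragments and sent in several RTP packets. The 12-byte RTP header is followed directly by the FU-A fragment: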

*   0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
* +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
* | FU indicator |   FU header   |                               |
* +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+                               |
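How the FU indicator and FU header are filled in can be sketched in Python, following the FU-A rules in RFC 6184. This is a simplified illustration, not VLC's packetizer: the function name `fragment_fu_a`, the `max_payload` parameter, and the sample NAL unit are assumptions, and the input is taken to be one NAL unit without its start code.

```python
FU_A = 28  # NAL unit type value that marks an FU-A fragment

def fragment_fu_a(nal, max_payload):
    """Split one NAL unit into RTP payloads of at most max_payload bytes."""
    if max_payload < 3:
        raise ValueError("need room for 2 FU bytes plus at least 1 data byte")
    if len(nal) <= max_payload:
        return [nal]                      # fits: send as a single NAL unit
    nal_header = nal[0]
    fu_indicator = (nal_header & 0xE0) | FU_A   # keep F and NRI, type = 28
    nal_type = nal_header & 0x1F                # original type goes in FU header
    body = nal[1:]                              # NAL header byte is not repeated
    chunk = max_payload - 2
    payloads = []
    for start in range(0, len(body), chunk):
        piece = body[start:start + chunk]
        s_bit = 0x80 if start == 0 else 0x00                  # first fragment
        e_bit = 0x40 if start + chunk >= len(body) else 0x00  # last fragment
        payloads.append(bytes([fu_indicator, s_bit | e_bit | nal_type]) + piece)
    return payloads

# Made-up NAL unit: header 0x65 (NRI=3, type 5) followed by 10 data bytes.
nal = bytes([0x65]) + bytes(range(10))
frags = fragment_fu_a(nal, 6)
print([f[:2].hex() for f in frags])
```

A receiver reverses this by rebuilding the NAL header from the top three bits of the FU indicator and the low five bits of the FU header, then concatenating the fragment data in sequence-number order.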

For the several ways of fragmenting H.264 frames, see: https://blog.csdn.net/yangzhongxuan/article/details/8107907?utm_medium=distribute.pc_relevant.none-task-blog-title-3&spm=1001.2101.3001.4242
Looking at an actual capture: