Nal Packet Aggregation Type And

H.264 is a video codec that delivers visual quality superior to H.263 at the same bit rates. H.264 is also referred to as Advanced Video Coding (AVC). H.264 consists of two separate definitions:

■ The video coding layer (VCL)

■ The network abstraction layer (NAL)

The VCL represents the video content, and the NAL defines the packetization format for transport protocols such as RTP. All data is contained in NAL units. The H.264 bitstream can be of two formats: NAL unit stream and byte stream format. We limit our discussion to the NAL unit stream as specified in RFC 3984.

Basic Packet Structure

Figure 4-23 shows the format of a NAL header, which is the basic structure of an H.264 RTP packet. Per RFC 3984, all H.264 RTP packets contain the 1-byte NAL header field after the RTP header.

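Figure 4-23 NAL Unit Packet Format

[Figure: the RTP header (12 bytes) followed by the one-byte NAL header, which is divided into the F, NRI, and TYPE fields.]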

Table 4-6 explains the bit fields.

Table 4-6 NAL Header Bit Fields

| Bit Field | Size (in Bits) | Description |
|---|---|---|
| F | 1 | Forbidden_zero_bit. A value of 1 indicates that the payload may contain errors or syntax violations. H.264 implementations usually drop packets that have the F bit set to 1. A value of 0 indicates that the payload should not contain any errors or syntax violations. |
| NRI | 2 | NAL_ref_idc. A value of 00 indicates that the content of the NAL unit does not contain information needed to reconstruct reference pictures for inter-picture prediction. A value greater than 00 indicates that the receiver must decode the NAL unit to reconstruct other inter-coded pictures. |
| TYPE | 5 | This field identifies the type of the NAL unit carried by the packet. A value of 0 is undefined. |
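The bit layout in Table 4-6 can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation; the names `NalHeader` and `parse_nal_header` are hypothetical and do not come from RFC 3984.

```python
from typing import NamedTuple


class NalHeader(NamedTuple):
    """Hypothetical container for the three NAL header bit fields."""
    f: int      # forbidden_zero_bit (1 bit)
    nri: int    # nal_ref_idc (2 bits)
    type: int   # NAL unit type (5 bits)


def parse_nal_header(byte: int) -> NalHeader:
    """Split the one-byte NAL header into F, NRI, and TYPE."""
    if not 0 <= byte <= 0xFF:
        raise ValueError("NAL header must be a single byte")
    return NalHeader(
        f=(byte >> 7) & 0x01,    # most significant bit
        nri=(byte >> 5) & 0x03,  # next two bits
        type=byte & 0x1F,        # low five bits
    )
```

For example, the byte 0x65 (binary 0110 0101) decodes to F = 0, NRI = 3, and TYPE = 5.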

The Type field defines the packetization mode, as shown in Table 4-7.

Table 4-7 NAL Header Type Field Values

| Type | Packet | Description |
|---|---|---|
| 0 | Undefined | |
| 1-23 | NAL unit | Single NAL unit packet |
| 24 | STAP-A | Single-time aggregation packet |
| 25 | STAP-B | Single-time aggregation packet |
| 26 | MTAP-16 | Multi-time aggregation packet |
| 27 | MTAP-24 | Multi-time aggregation packet |
| 28 | FU-A | Fragmentation unit |
| 29 | FU-B | Fragmentation unit |


The NAL unit type field indicates the type of the packet (and thus the structure of the RTP packet).

There are three possible types:

■ Single NAL unit (SNALU)—This packet type contains only a single NAL unit as indicated by the type value of 1-23. The H.264 codec specification describes each type value in detail.

■ Aggregation packet—This packet type aggregates multiple NAL units into a single RTP payload. There are four versions of this packet, corresponding to type values 24-27.

■ Fragmentation unit (FU)—This packet type fragments a NAL unit over multiple RTP packets. There are two versions of this packet (type values 28 and 29).
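The mapping from type value to packet structure described above can be sketched as a small lookup. The function name `classify_packet` and the returned labels are illustrative choices, not part of the specification.

```python
# Type values for the aggregation and fragmentation packets (Table 4-7).
_SPECIAL_TYPES = {
    24: "STAP-A",
    25: "STAP-B",
    26: "MTAP-16",
    27: "MTAP-24",
    28: "FU-A",
    29: "FU-B",
}


def classify_packet(nal_header: int) -> str:
    """Return a label for the RTP payload structure, based on the
    low five bits (TYPE) of the one-byte NAL header."""
    nal_type = nal_header & 0x1F
    if 1 <= nal_type <= 23:
        return "SNALU"
    # Values not listed in Table 4-7 (including 0) are reported as undefined.
    return _SPECIAL_TYPES.get(nal_type, "undefined")
```

A receiver would typically call something like this on the first payload byte and then dispatch to a parser for the matching format.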

Figure 4-24 shows the three possible H.264 packet types.

Figure 4-24 H.264 Packet Type Formats

[Figure: three layouts. Single NAL unit RTP packets: each RTP header is followed by one NAL unit (NAL unit A in one packet, NAL unit B in the next). Aggregation unit RTP packet: one RTP header followed by NAL units A, B, and C. Fragmentation unit RTP packets: one NAL unit split across two packets, each with its own RTP header.]

The following sections describe the three NAL unit packet types in more detail.

SNALU

The SNALU payload type contains only a single NAL unit in the payload. Figure 4-25 shows the format of the SNALU. It contains a 1-byte header (the fields are explained in Table 4-6 in the preceding section). The value of the Type field is in the range of 1 to 23.

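Figure 4-25 Format of the SNALU RTP Packet

[Figure: RTP header, followed by the one-octet NAL header (F, NRI, TYPE), followed by the payload (bytes 2 .. N of the single NAL unit).]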

NAL units must be transmitted in the same order as their NAL unit decoding order, and the RTP sequence number should reflect this transmission order.

Aggregation Packet

RFC 3984 defines two basic types of aggregation packets:

■ Single-time aggregation packet (STAP)

■ Multi-time aggregation packet (MTAP)

The STAP and MTAP packets must not be fragmented and should be contained within a single RTP packet. MTAP is not commonly used for video conferencing.

STAP

STAP aggregates NAL units with identical NALU-time. NALU-time is the value that the RTP time stamp would have if that NAL unit were transported in its own RTP packet. RFC 3984 defines two types of STAP packets:

■ STAP-A—NAL units in the aggregation packet share the same time stamp and appear in valid decoding order.

■ STAP-B—NAL units in the aggregation packet share the same time stamp and may not be in the correct decoding order.

Figure 4-26 shows the packet format of the STAP-A packet. The value of the Type field in the NAL header is set to 24. The Size field (two octets) indicates the size of the NAL unit in bytes, which includes the NAL unit header plus data.

Figure 4-27 shows the format of the STAP-B packet. The Type field is set to a value of 25. After the NAL header, the STAP-B packet carries a two-octet decoding order number (DON). The DON is required because the transmission order and the decoding order might differ, and the DON indicates the decoding order.


Figure 4-26 H.264 STAP-A Packet Format

[Figure: RTP header; NAL header (1 octet); then, for each aggregated unit 1 through n: NAL unit size (2 octets), NAL unit header (1 octet), NAL unit data.]

Figure 4-27 H.264 STAP-B Packet Format

[Figure: RTP header; NAL header (1 octet); decoding order number (2 octets); then, for each aggregated unit 1 through n: NAL unit size (2 octets), NAL unit header (1 octet), NAL unit data.]
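To make the STAP layouts concrete, the following Python sketch walks a STAP-A or STAP-B payload and returns the aggregated NAL units. It assumes the payload passed in starts at the NAL header (that is, the RTP header has already been stripped) and that each two-octet Size field counts the NAL unit header plus data, as described above. The name `parse_stap` is hypothetical.

```python
import struct
from typing import List, Optional, Tuple


def parse_stap(payload: bytes) -> Tuple[Optional[int], List[bytes]]:
    """Return (DON or None, list of NAL units) for a STAP-A/STAP-B payload."""
    if not payload:
        raise ValueError("empty payload")
    packet_type = payload[0] & 0x1F
    if packet_type == 24:          # STAP-A: no DON field
        don, offset = None, 1
    elif packet_type == 25:        # STAP-B: two-octet DON after the NAL header
        if len(payload) < 3:
            raise ValueError("truncated STAP-B header")
        (don,) = struct.unpack_from("!H", payload, 1)
        offset = 3
    else:
        raise ValueError("not a STAP packet")

    units: List[bytes] = []
    while offset < len(payload):
        if offset + 2 > len(payload):
            raise ValueError("truncated size field")
        (size,) = struct.unpack_from("!H", payload, offset)  # network byte order
        offset += 2
        if size == 0 or offset + size > len(payload):
            raise ValueError("invalid NAL unit size")
        units.append(payload[offset:offset + size])  # NAL header + data
        offset += size
    return don, units
```

Each returned unit begins with its own one-byte NAL header, so it can be handed to a decoder as if it had arrived in a single NAL unit packet.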

MTAP

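MTAP aggregates NAL units with potentially different NALU times. RFC 3984 defines two types of MTAP packets:

■ MTAP-16: The time-stamp offset field of each aggregated NAL unit is two octets (16 bits) long.

■ MTAP-24: The time-stamp offset field of each aggregated NAL unit is three octets (24 bits) long.

Figure 4-28 shows the packet format of MTAP-16. The Type field is set to a value of 26. The payload header contains a two-octet decoding order number base (DONB). The MTAP packet contains multiple NAL units, and the DONB contains the value of DON for the NAL unit that comes first in decoding order among the NAL units in the MTAP packet.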

Figure 4-28 H.264 MTAP-16 Packet Format

[Figure: RTP header; NAL header (1 octet); DON base (2 octets); then, for each aggregated unit: NAL unit size (2 octets), DOND (1 octet), TS offset (2 octets), NAL unit header (1 octet), NAL unit data.]

Figure 4-29 shows the packet format of MTAP-24. The Type field is set to a value of 27.

The choice between MTAP-16 and MTAP-24 is application-dependent. The only difference between the two packet formats is the length of the time-stamp offset field.
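As a rough illustration of how the two MTAP layouts differ, the sketch below reads the fixed-position fields in front of the first aggregated NAL unit. The names `MtapFirstUnit` and `read_first_mtap_unit` are hypothetical. The DON computation, (DONB + DOND) modulo 65536, reflects my reading of RFC 3984 and should be checked against the RFC before use. The Size field is returned as-is without interpretation.

```python
from typing import NamedTuple


class MtapFirstUnit(NamedTuple):
    donb: int       # decoding order number base (2 octets)
    size: int       # raw value of the first unit's Size field (2 octets)
    dond: int       # decoding order number difference (1 octet)
    ts_offset: int  # time-stamp offset (2 octets for MTAP-16, 3 for MTAP-24)
    don: int        # assumed: (DONB + DOND) modulo 65536
    nal_type: int   # TYPE field of the first aggregated NAL unit header


def read_first_mtap_unit(payload: bytes) -> MtapFirstUnit:
    """Read the fields that precede the first NAL unit in an MTAP payload."""
    if not payload:
        raise ValueError("empty payload")
    packet_type = payload[0] & 0x1F
    if packet_type == 26:
        ts_len = 2      # MTAP-16
    elif packet_type == 27:
        ts_len = 3      # MTAP-24
    else:
        raise ValueError("not an MTAP packet")

    # NAL header + DONB + size + DOND + TS offset: 8 or 9 bytes in total.
    header_len = 1 + 2 + 2 + 1 + ts_len
    if len(payload) <= header_len:
        raise ValueError("truncated MTAP payload")

    donb = int.from_bytes(payload[1:3], "big")
    size = int.from_bytes(payload[3:5], "big")
    dond = payload[5]
    ts_offset = int.from_bytes(payload[6:6 + ts_len], "big")
    nal_type = payload[header_len] & 0x1F
    return MtapFirstUnit(donb, size, dond, ts_offset,
                         (donb + dond) % 65536, nal_type)
```

The only branch in the function is the length of the TS offset field, which mirrors the statement that this field is the sole difference between the two formats.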

Fragmentation Unit Packet

The fragmentation unit (FU) allows a sender to fragment a single NAL unit into several RTP packets. The sender of the FU packet must send the fragments in consecutive order with ascending RTP sequence numbers. The receiver should reassemble the NAL unit in RTP sequence number order. The RTP time stamp of an RTP packet carrying an FU is set to the NALU time of the fragmented unit.

Figure 4-29 H.264 MTAP-24 Packet Format

[Figure: RTP header; NAL header (1 octet); DON base (2 octets); then, for each aggregated unit: NAL unit size (2 octets), DOND (1 octet), TS offset (3 octets), NAL unit header (1 octet), NAL unit data.]

RFC 3984 defines two types of fragmentation unit packets. Figure 4-30 shows the FU-A packet format, which consists of a one-octet NAL header, followed by a one-octet FU header, followed by FU payload.

Figure 4-30 H.264 FU-A Packet Format

[Figure: RTP header; one-octet NAL header (F, NRI, TYPE = 28); one-octet fragmentation unit header (S, E, R, TYPE); fragmentation unit payload.]

Table 4-8 summarizes the Fragmentation Unit header fields.


Table 4-8 H.264 Fragmentation Unit Packet Header Fields

| Bit Field | Size (in Bits) | Description |
|---|---|---|
| S (start) | 1 | The Start bit indicates the start of the fragmented NAL unit payload. When the following FU payload is not the start of a fragmented NAL unit payload, this bit is set to 0. |
| E (end) | 1 | The End bit indicates the end of a fragmented NAL unit. This bit is set to 0 otherwise. |
| R (reserved) | 1 | The sender must set this bit to 0, and the receiver must ignore this bit. |
| Type | 5 | The value should be set according to Table 7-1 of the H.264 ITU spec. |
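On the sending side, the header fields in Table 4-8 can be illustrated with a small FU-A fragmenter. This is a sketch under stated assumptions: `fragment_fu_a` and `max_payload` are hypothetical names, `max_payload` is the largest RTP payload the sender is willing to emit (including the two FU-A header bytes), and the Type field of the FU header is copied from the original NAL unit header.

```python
from typing import List


def fragment_fu_a(nal_unit: bytes, max_payload: int) -> List[bytes]:
    """Split one NAL unit into FU-A RTP payloads of at most max_payload bytes."""
    if max_payload < 3:
        raise ValueError("max_payload must leave room for two headers and data")
    if len(nal_unit) <= max_payload:
        # A unit that fits in one packet is sent as a single NAL unit packet.
        raise ValueError("NAL unit fits in one packet; do not fragment")

    nal_header = nal_unit[0]
    body = nal_unit[1:]                        # the original header is not repeated
    fu_indicator = (nal_header & 0xE0) | 28    # keep F and NRI, TYPE = 28 (FU-A)
    original_type = nal_header & 0x1F
    chunk = max_payload - 2                    # room left after the two header bytes

    pieces = [body[i:i + chunk] for i in range(0, len(body), chunk)]
    fragments: List[bytes] = []
    for index, piece in enumerate(pieces):
        start = 0x80 if index == 0 else 0                # S bit
        end = 0x40 if index == len(pieces) - 1 else 0    # E bit
        fu_header = start | end | original_type          # R bit stays 0
        fragments.append(bytes([fu_indicator, fu_header]) + piece)
    return fragments
```

Because the size check guarantees at least two pieces, the Start and End bits never appear together in the same fragment.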

Figure 4-31 shows the packet format of the FU-B packet. The packet structure of an FU-B packet is similar to that of an FU-A packet, except for the presence of a DON field.

Figure 4-31 H.264 FU-B Packet Format

[Figure: RTP header; NAL header; fragmentation unit header; DON; fragmentation unit payload.]

If a NAL unit can fit into a single RTP packet, the NAL unit should not be fragmented. In other words, a fragmented NAL unit must not be transmitted in one FU, which means that the Start and End bits of the FU header cannot both be set to 1 in the same FU packet. If a fragmentation unit is lost, the receiver should discard all remaining FUs of that NAL unit.
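The receiver-side rules can be sketched as follows. The function takes the FU-A fragments of one NAL unit as (RTP sequence number, payload) pairs in arrival order and returns the rebuilt NAL unit, or `None` when a fragment is missing or the Start and End bits are inconsistent. The name `reassemble_fu_a` and the choice to signal a discard by returning `None` are illustrative assumptions.

```python
from typing import List, Optional, Tuple


def reassemble_fu_a(packets: List[Tuple[int, bytes]]) -> Optional[bytes]:
    """Rebuild a NAL unit from FU-A payloads, or return None to discard it."""
    parts: List[bytes] = []
    prev_seq = -1
    for index, (seq, payload) in enumerate(packets):
        if len(payload) < 2 or (payload[0] & 0x1F) != 28:
            return None                        # not an FU-A payload
        fu_header = payload[1]
        start = bool(fu_header & 0x80)
        end = bool(fu_header & 0x40)
        if start and end:
            return None                        # S and E must not both be set
        if index == 0:
            if not start:
                return None                    # first fragment was lost
            # Rebuild the original NAL header: F and NRI come from the first
            # byte, TYPE comes from the FU header.
            parts.append(bytes([(payload[0] & 0xE0) | (fu_header & 0x1F)]))
        elif start or seq != (prev_seq + 1) & 0xFFFF:
            return None                        # gap in the 16-bit sequence numbers
        parts.append(payload[2:])
        prev_seq = seq
        if end:
            # The End bit must be on the last packet supplied.
            return b"".join(parts) if index == len(packets) - 1 else None
    return None                                # End fragment never arrived
```

Sequence numbers are compared modulo 65536 so that a NAL unit spanning the wraparound from 65535 to 0 still reassembles.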

Key Frame Detection in H.264

Key frame detection with H.264 packets is straightforward. Table 4-9 summarizes the steps to detect a key frame for each packet type.

Table 4-9 H.264 Key Frame Detection for Different Packet Types

| Packet Type (as Indicated by the Type Field in the NAL Header) | Steps to Detect the Key Frame in the NAL Header |
|---|---|
| SNALU (type value between 1 and 23) | The packet contains a key frame if the Type field within the NAL header contains a value of 5 (coded slice of an IDR picture). |
| STAP-A | If (type == 24), skip the next 3 bytes (NAL header and size) and go to the NAL unit header. If the Type field of the NAL unit header contains a value of 5, the packet carries a key frame. |
| STAP-B | If (type == 25), skip the next 5 bytes (NAL header, DON, and size) and go to the NAL unit header. If the Type field of the NAL unit header contains a value of 5, the packet carries a key frame. |
| MTAP-16 | If (type == 26), skip the next 8 bytes (NAL header, DON base, size, DOND, and TS offset) and go to the NAL unit header. If the Type field of the NAL unit header contains a value of 5, the packet carries a key frame. |
| MTAP-24 | If (type == 27), skip the next 9 bytes (NAL header, DON base, size, DOND, and TS offset) and go to the NAL unit header. If the Type field of the NAL unit header contains a value of 5, the packet carries a key frame. |
| FU-A | If (type == 28), skip the next 1 byte (NAL header) and go to the FU header. If the Type field of the FU header contains a value of 5, the packet carries a key frame. |
| FU-B | If (type == 29), skip the next 1 byte (NAL header) and go to the FU header. If the Type field of the FU header contains a value of 5, the packet carries a key frame. |
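The detection steps can be collected into one function, sketched here in Python. It takes the RTP payload (starting at the NAL header) and, like the table, inspects only the first NAL unit of an aggregation packet. For FU packets it reads the Type field of the FU header, which is the byte immediately after the NAL header. The name `is_key_frame` is hypothetical.

```python
# Bytes to skip before the first aggregated NAL unit header, per packet type.
_AGGREGATION_SKIP = {
    24: 3,  # STAP-A: NAL header + size
    25: 5,  # STAP-B: NAL header + DON + size
    26: 8,  # MTAP-16: NAL header + DON base + size + DOND + 2-octet TS offset
    27: 9,  # MTAP-24: NAL header + DON base + size + DOND + 3-octet TS offset
}

IDR_SLICE = 5  # coded slice of an IDR picture


def is_key_frame(payload: bytes) -> bool:
    """Return True if the RTP payload appears to carry an IDR slice."""
    if not payload:
        return False
    packet_type = payload[0] & 0x1F
    if 1 <= packet_type <= 23:                 # single NAL unit packet
        return packet_type == IDR_SLICE
    if packet_type in _AGGREGATION_SKIP:       # STAP or MTAP
        offset = _AGGREGATION_SKIP[packet_type]
        return len(payload) > offset and (payload[offset] & 0x1F) == IDR_SLICE
    if packet_type in (28, 29):                # FU-A or FU-B
        return len(payload) > 1 and (payload[1] & 0x1F) == IDR_SLICE
    return False
```

One limitation of checking only the first aggregated unit: if an encoder places parameter sets ahead of the IDR slice inside the same aggregation packet, this check reports `False`, so a more thorough detector would walk every unit in the packet.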
