print multicast path from a source to a receiver
Assessing problems in the distribution of IP multicast traffic can be
difficult. mtrace utilizes a tracing feature
implemented in multicast routers (mrouted version
3.3 and later) that is accessed via an extension to the IGMP protocol. A
trace query is passed hop-by-hop along the reverse path from the
receiver to the source,
collecting hop addresses, packet counts, and routing error conditions along
the path, and then the response is returned to the requestor.
The only required parameter is the source host name or address. The default receiver is the host running mtrace, and the default group is "MBone Audio" (224.2.0.1), which is sufficient if packet loss statistics for a particular multicast group are not needed. These two optional parameters may be specified to test the path to some other receiver in a particular group, subject to some constraints as detailed below. The two parameters can be distinguished because the receiver is a unicast address and the group is a multicast address.
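The rule for telling the two optional parameters apart can be sketched in a few lines of Python. This is only an illustration, not the mtrace source: the function name classify_args is hypothetical, the arguments are assumed to be numeric addresses already (no host name lookup), and the addresses in the usage note are documentation examples.

```python
import ipaddress

def classify_args(source, *optional):
    """Split the optional arguments into (receiver, group) by address class.

    Hypothetical helper: a multicast address is taken as the group,
    anything else as the receiver.
    """
    receiver = None
    group = None
    for arg in optional:
        addr = ipaddress.ip_address(arg)
        if addr.is_multicast:
            group = str(addr)
        else:
            receiver = str(addr)
    return source, receiver, group
```

Under these assumptions, classify_args("10.0.0.1", "224.2.0.1", "192.0.2.7") treats 192.0.2.7 as the receiver and 224.2.0.1 as the group regardless of the order in which they were given.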
The options are as follows:
- Send the trace query via unicast directly to the multicast router gateway rather than multicasting the query. This must be the last-hop router on the path from the intended source to the receiver. NOTE: Read the BUGS section below.
- Use if_addr as the local interface address (on a multi-homed host) for sending the trace query and as the default for the receiver and the response destination.
- Loop indefinitely printing packet rate and loss statistics for the
multicast path every 10 seconds (see the stat_int option below).
- Always send the response using multicast rather than attempting unicast first.
- Set the maximum number of hops that will be traced from the receiver back toward the source. The default is 32 hops (infinity for the DVMRP routing protocol).
- Print hop addresses numerically rather than symbolically and numerically (saves a nameserver address-to-name lookup for each router found on the path).
- Listen passively for multicast responses from traces initiated by others. This works best when run on a multicast router.
- Set the maximum number of query attempts for any hop to nqueries. The default is 3.
- Send the trace response to host rather than to the host on which mtrace is being run, or to a multicast address other than the one registered for this purpose (224.0.1.32).
- Change the interval between statistics gathering traces to stat_int seconds (default 10 seconds).
- Print a short form output including only the multicast path and not the packet rate and loss statistics.
- Set the ttl (time-to-live, or number of hops) for multicast trace queries and responses. The default is 64, except for local queries to the "all routers" multicast group which use ttl 1.
- Verbose mode; show hop times on the initial trace and statistics display.
- Set the time to wait for a trace response to waittime seconds (default 3 seconds).
How It Works
The technique used by the traceroute(8) tool
to trace unicast network paths will not work for IP multicast because ICMP
responses are specifically forbidden for multicast traffic. Instead, a
tracing feature has been built into the multicast routers. This technique
has the advantage that additional information about packet rates and losses
can be accumulated while the number of packets sent is minimized.
Since multicast uses reverse path forwarding, the trace is run backwards from the receiver to the source. A trace query packet is sent to the last hop multicast router (the leaf router for the desired receiver address). The last hop router builds a trace response packet, fills in a report for its hop, and forwards the trace packet using unicast to the router it believes is the previous hop for packets originating from the specified source. Each router along the path adds its report and forwards the packet. When the trace response packet reaches the first hop router (the router that is directly connected to the source's net), that router sends the completed response to the response destination address specified in the trace query.
If some multicast router along the path does not implement the multicast traceroute feature or if there is some outage, then no response will be returned. To solve this problem, the trace query includes a maximum hop count field to limit the number of hops traced before the response is returned. That allows a partial path to be traced.
The reports inserted by each router contain not only the address of the hop, but also the ttl required to forward and some flags to indicate routing errors, plus counts of the total number of packets on the incoming and outgoing interfaces and those forwarded for the specified group. Taking differences in these counts for two traces separated in time and comparing the output packet counts from one hop with the input packet counts of the next hop allows the calculation of packet rate and packet loss statistics for each hop to isolate congestion problems.
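The per-hop calculation described above can be sketched as follows. This is a minimal illustration in Python, not the mtrace implementation: the HopCounts type, the hop_stats function and the sample counter values are all assumptions made for the example.

```python
from dataclasses import dataclass

@dataclass
class HopCounts:
    in_pkts: int   # packets counted on the hop's incoming interface
    out_pkts: int  # packets counted on the hop's outgoing interface

def hop_stats(prev_up, prev_down, cur_up, cur_down, interval_s):
    """Return (lost, sent, rate_pps) for the link between two adjacent hops.

    prev_* and cur_* are the counts reported by the upstream and
    downstream routers in two traces taken interval_s seconds apart.
    """
    sent = cur_up.out_pkts - prev_up.out_pkts          # left the upstream hop
    received = cur_down.in_pkts - prev_down.in_pkts    # arrived at the next hop
    lost = sent - received                             # may come out negative
    rate_pps = sent / interval_s
    return lost, sent, rate_pps
```

For instance, if the upstream output counter moves from 1000 to 1271 and the downstream input counter from 500 to 771 over 10 seconds, the sketch reports 0 packets lost out of 271 sent, at about 27 packets per second.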
Finding the Last-Hop Router
The trace query must be sent to the multicast router which is the
last hop on the path from the source to the
receiver. If the receiver is on
the local subnet (as determined using the subnet mask), then the default
method is to multicast the trace query to all-routers.mcast.net (224.0.0.2)
with a ttl of 1. Otherwise, the trace query is multicast to the
group address since the last hop router will be a
member of that group if the receiver is. Therefore it
is necessary to specify a group that the intended
receiver has joined. This multicast is sent with a
default ttl of 64, which may not be sufficient for all cases (changed with
the -t option). If the last hop router is known, it
may also be addressed directly using the -g option.
Alternatively, if it is desired to trace a group that the
receiver has not joined, but it is known that the
last-hop router is a member of another group, the -g
option may also be used to specify a different multicast address for the
trace query.
When tracing from a multihomed host or router, the default receiver address may not be the desired interface for the path from the source. In that case, the desired interface should be specified explicitly as the receiver.
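The choice of where to send the trace query can be summarized in a short Python sketch. It is an illustration under stated assumptions rather than the actual mtrace logic: the function name query_destination and its parameters are hypothetical, local_iface stands for the sending interface with its subnet mask, and 224.0.0.2 is the IANA-registered address of all-routers.mcast.net.

```python
import ipaddress

ALL_ROUTERS = "224.0.0.2"  # all-routers.mcast.net (IANA all-routers group)

def query_destination(receiver, local_iface, group, gateway=None, ttl=64):
    """Return (destination, ttl) for the trace query.

    receiver    -- receiver address as a string
    local_iface -- ipaddress.IPv4Interface of the sending host
    group       -- multicast group being traced
    gateway     -- explicitly named last-hop router or group, if any
    ttl         -- ttl used for non-local multicast queries
    """
    if gateway is not None:
        # The last-hop router (or another group it belongs to) was named.
        return gateway, ttl
    if ipaddress.ip_address(receiver) in local_iface.network:
        # Receiver on the local subnet: ask the routers on this wire only.
        return ALL_ROUTERS, 1
    # Otherwise rely on the last-hop router being a member of the group.
    return group, ttl
```

With a local interface of 192.0.2.10/24, a receiver at 192.0.2.7 yields ("224.0.0.2", 1), while a receiver at 198.51.100.7 yields the group address with ttl 64.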
Directing the Response
mtrace first attempts to trace
the full reverse path, unless the number of hops to trace is explicitly set
with the -m option. If there is no response within a
3 second timeout interval (changed with the waittime
option), a "*" is printed and the probing switches to hop-by-hop
mode. Trace queries are issued starting with a maximum hop count of one and
increasing by one until the full path is traced or no response is received.
At each hop, multiple probes are sent (default is three, changed with
the -q option). The first half of the attempts (default
is one) are made with the unicast address of the host running
mtrace as the destination for the response. Since
the unicast route may be blocked, the remainder of attempts request that the
response be multicast to mtrace.mcast.net (224.0.1.32) with the ttl set to
32 more than what's needed to pass the thresholds seen so far along the path
to the receiver. For the last quarter of the attempts
(default is one), the ttl is increased by another 32 each time up to a
maximum of 192. Alternatively, the ttl may be set explicitly with the
-t option and/or the initial unicast attempts can be
forced to use multicast instead (see the option list above).
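The sequence of response destinations tried at one hop can be sketched as below. This is a rough Python illustration, not the mtrace code: the function name response_schedule is hypothetical, and the rounding of "first half" (rounded down) and "last quarter" (rounded up) is an assumption chosen so that the default of three attempts gives one unicast attempt and one attempt with a raised ttl, as the text states.

```python
def response_schedule(nqueries=3, ttl_needed=1, max_ttl=192, step=32):
    """Return a list of (mode, response_ttl) pairs, one per attempt.

    ttl_needed is the ttl required to pass the thresholds seen so far.
    Unicast attempts carry no multicast ttl, shown here as None.
    """
    unicast_tries = nqueries // 2                   # assumed: half, rounded down
    raise_from = nqueries - (nqueries + 3) // 4     # assumed: quarter, rounded up
    ttl = min(ttl_needed + step, max_ttl)           # 32 more than needed
    plan = []
    for attempt in range(nqueries):
        if attempt < unicast_tries:
            plan.append(("unicast", None))
            continue
        if attempt >= raise_from:
            ttl = min(ttl + step, max_ttl)          # another 32, capped at 192
        plan.append(("multicast", ttl))
    return plan
```

With the defaults and a required ttl of 1, the sketch yields one unicast attempt, then multicast attempts with ttl 33 and 65. An explicitly set ttl or a forced multicast response is not modeled.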
For each attempt, if no response is received within the timeout, a
"*" is printed. After the specified number of attempts have failed,
mtrace will try to query the next hop router
with a DVMRP_ASK_NEIGHBORS2 request (as used by the
mrinfo program) to see what kind of router it is.
The output of
mtrace is in two sections.
The first section is a short listing of the hops in the order they are
queried, that is, in the reverse of the order from the
source to the receiver. For each
hop, a line is printed showing the hop number (counted negatively to
indicate that this is the reverse path); the multicast routing protocol
(DVMRP, MOSPF, PIM, etc.); the threshold required to forward data (to the
previous hop in the listing as indicated by the up-arrow character); and the
cumulative delay for the query to reach that hop (valid only if the clocks
are synchronized). This first section ends with a line showing the
round-trip time which measures the interval from when the query is issued
until the response is received, both derived from the local system clock. A
sample use and output might be:
    oak.isi.edu 80# mtrace -l caraway.lcs.mit.edu 126.96.36.199
    Mtrace from 188.8.131.52 to 184.108.40.206 via group 220.127.116.11
    Querying full reverse path...
      0  oak.isi.edu (18.104.22.168)
     -1  cub.isi.edu (22.214.171.124)  DVMRP  thresh^ 1  3 ms
     -2  la.dart.net (126.96.36.199)  DVMRP  thresh^ 1  14 ms
     -3  dc.dart.net (188.8.131.52)  DVMRP  thresh^ 1  50 ms
     -4  bbn.dart.net (184.108.40.206)  DVMRP  thresh^ 1  63 ms
     -5  mit.dart.net (220.127.116.11)  DVMRP  thresh^ 1  71 ms
     -6  caraway.lcs.mit.edu (18.104.22.168)
    Round trip time 124 ms
The second section provides a pictorial view of the path in the forward direction with data flow indicated by arrows pointing downward and the query path indicated by arrows pointing upward. For each hop, both the entry and exit addresses of the router are shown if different, along with the initial ttl required on the packet in order to be forwarded at this hop and the propagation delay across the hop assuming that the routers at both ends have synchronized clocks. The right half of this section is composed of several columns of statistics in two groups. Within each group, the columns are the number of packets lost, the number of packets sent, the percentage lost, and the average packet rate at each hop. These statistics are calculated from differences between traces and from hop to hop as explained above. The first group shows the statistics for all traffic flowing out the interface at one hop and in the interface at the next hop. The second group shows the statistics only for traffic forwarded from the specified source to the specified group.
These statistics are shown on one or two lines for each hop.
Without any options, this second section of the output is printed only once,
approximately 10 seconds after the initial trace. One line is shown for each
hop showing the statistics over that 10-second period. If the
-l option is given, the second section is repeated
every 10 seconds and two lines are shown for each hop. The first line shows
the statistics for the last 10 seconds, and the second line shows the
cumulative statistics over the period since the initial trace, which is 101
seconds in the example below. The second section of the output is omitted if
the -s option is set.
    Waiting to accumulate statistics... Results after 101 seconds:

      Source        Response Dest   Packet Statistics For  Only For Traffic
    22.214.171.124  126.96.36.199   All Multicast Traffic  From 188.8.131.52
         |       __/ rtt  125 ms    Lost/Sent = Pct  Rate    To 184.108.40.206
         v      /    hop   65 ms    ---------------------  ------------------
    220.127.116.11
    18.104.22.168   mit.dart.net
         |     ^     ttl    1        0/6    = --%   0 pps   0/2  = --%  0 pps
         v     |     hop    8 ms     1/52   =  2%   0 pps   0/18 =  0%  0 pps
    22.214.171.124
    126.96.36.199   bbn.dart.net
         |     ^     ttl    2        0/6    = --%   0 pps   0/2  = --%  0 pps
         v     |     hop   12 ms     1/52   =  2%   0 pps   0/18 =  0%  0 pps
    188.8.131.52
    184.108.40.206  dc.dart.net
         |     ^     ttl    3        0/271  =  0%  27 pps   0/2  = --%  0 pps
         v     |     hop   34 ms    -1/2652 =  0%  26 pps   0/18 =  0%  0 pps
    220.127.116.11
    18.104.22.168   la.dart.net
         |     ^     ttl    4       -2/831  =  0%  83 pps   0/2  = --%  0 pps
         v     |     hop   11 ms    -3/8072 =  0%  79 pps   0/18 =  0%  0 pps
    22.214.171.124
    126.96.36.199   cub.isi.edu
         |      \__  ttl    5          833         83 pps     2         0 pps
         v         \ hop   -8 ms      8075         79 pps    18         0 pps
    188.8.131.52    184.108.40.206
      Receiver      Query Source
Because the packet counts may be changing as the trace query is propagating, there may be small errors (off by 1 or 2) in these statistics. However, those errors should not accumulate, so the cumulative statistics line should increase in accuracy as a new trace is run every 10 seconds. There are two sources of larger errors, both of which show up as negative losses:
- If the input to a node is from a multi-access network with more than one other node attached, then the input count will be (close to) the sum of the output counts from all the attached nodes, but the output count from the previous hop on the traced path will be only part of that. Hence the output count minus the input count will be negative.
- In release 3.3 of the DVMRP multicast forwarding software for SunOS and other systems, a multicast packet generated on a router will be counted as having come in an interface even though it did not. This creates the negative loss that can be seen in the example above.
Note that these negative losses may mask positive losses.
In the example, there is also one negative hop time. This simply indicates a lack of synchronization between the system clocks across that hop. This example also illustrates how the percentage loss is shown as two dashes when the number of packets sent is less than 10 because the percentage would not be statistically valid.
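The way one statistics column is rendered, including the two dashes for small samples, might look like the following Python sketch. The function name format_stats and the rounding are assumptions for illustration; only the threshold of 10 packets and the "Lost/Sent = Pct Rate" layout come from the text and sample output above.

```python
def format_stats(lost, sent, rate_pps):
    """Render one 'Lost/Sent = Pct Rate' entry.

    The percentage is replaced by two dashes when fewer than 10 packets
    were sent, since it would not be statistically valid.
    """
    if sent < 10:
        pct = "--"
    else:
        pct = str(round(100 * lost / sent))   # rounding is an assumption
    return f"{lost}/{sent} = {pct}% {round(rate_pps)} pps"
```

Under these assumptions, format_stats(0, 6, 0) gives "0/6 = --% 0 pps", and a negative loss such as format_stats(-2, 831, 83.1) gives "-2/831 = 0% 83 pps".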
A second example shows a trace to a receiver
that is not local; the query is sent to the last-hop router with the
-g option. In this example, the trace of the full
reverse path resulted in no response because there was a node running an old
mrouted that did not implement the
multicast traceroute function, so
mtrace switched to
hop-by-hop mode. The "Route pruned" error code indicates that
traffic for group 220.127.116.11 would not be forwarded.
    oak.isi.edu 108# mtrace -g 18.104.22.168 22.214.171.124 \
                               butter.lcs.mit.edu 126.96.36.199
    Mtrace from 188.8.131.52 to 184.108.40.206 via group 220.127.116.11
    Querying full reverse path... * switching to hop-by-hop:
      0  butter.lcs.mit.edu (18.104.22.168)
     -1  jam.lcs.mit.edu (22.214.171.124)  DVMRP  thresh^ 1  33 ms  Route pruned
     -2  bbn.dart.net (126.96.36.199)  DVMRP  thresh^ 1  36 ms
     -3  dc.dart.net (188.8.131.52)  DVMRP  thresh^ 1  44 ms
     -4  darpa.dart.net (184.108.40.206)  DVMRP  thresh^ 16  47 ms
     -5  * * * noc.hpc.org (220.127.116.11) [mrouted 2.2] didn't respond
    Round trip time 95 ms
map-mbone(8), mrinfo(8), mrouted(8), traceroute(8)
Implemented by Steve Casner based on an
initial prototype written by Ajit Thyagarajan. The
multicast traceroute mechanism was designed by Van
Jacobson with help from Steve Casner,
Steve Deering, Dino
Farinacci, and Deb Agrawal; it was
implemented in mrouted by Ajit
Thyagarajan and Bill Fenner. The option
syntax and the output format of
mtrace are modeled
after the unicast
traceroute(8) program written by Van Jacobson.
Versions 3.3 and 3.5 of
mrouted will crash
if a trace query is received via a unicast packet and
mrouted has no route for the
source address. Therefore, do not use the
-g option unless the target
mrouted has been verified to be 3.4 or newer than 3.5.