Ping vs Traceroute

Published on January 2017
Ping vs. Traceroute vs Pathping
One of the biggest misconceptions of all time in networking is using a traceroute to conclude
that your communication with a server has high latency. On Windows, traceroute is the same
command as tracert.
Many people believe that high latency, such as 250 ms+, on a single hop of a traceroute means
that that device in the transit path is responsible for the degraded network performance, when in
fact that could not be further from the truth.
First, let's look at how ping works.
PING is an application based on the ICMP protocol. It sends echo request packets to a
destination, expects an echo reply for each one, and calculates the RTT (Round Trip Time) from
when the request was sent to when the reply was received. Generally, when using PING on a LAN
you can trust that what it reports is accurate, unless you have foreknowledge of network devices
in the transit path that deprioritize ICMP relative to mission-critical TCP/UDP traffic. That is
very common in networks that utilize unified communications, meaning voice and data on the same
network. This is because QoS policies are put in place to ensure voice traffic and other
mission-critical traffic is prioritized over ICMP, which indirectly inflates the RTT of an ICMP
ping test.
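To make the RTT mechanics concrete, here is a minimal Python sketch of echo-based measurement. It stands in a local UDP echo server for the ICMP echo responder (raw ICMP sockets require administrator rights), but the timing logic is the same one ping uses: timestamp the request, timestamp the reply, subtract.

```python
# Minimal sketch of echo-based RTT measurement. A local UDP echo
# server stands in for the ICMP echo responder; the RTT calculation
# itself is identical to what ping does.
import socket
import threading
import time

def echo_server(sock):
    data, addr = sock.recvfrom(1024)
    sock.sendto(data, addr)          # echo the payload straight back

server = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
server.bind(("127.0.0.1", 0))        # bind to any free local port
threading.Thread(target=echo_server, args=(server,), daemon=True).start()

client = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
client.settimeout(2)
start = time.monotonic()             # timestamp the "echo request"
client.sendto(b"ping", server.getsockname())
reply, _ = client.recvfrom(1024)     # wait for the "echo reply"
rtt_ms = (time.monotonic() - start) * 1000
print(f"reply={reply!r} rtt={rtt_ms:.2f} ms")
```

Note that the measured RTT includes any queueing or processing delay at the responder, which is exactly why a deprioritized ICMP reply shows up as inflated latency.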
Traceroute is another method commonly used by technicians and engineers to diagnose latency in
the transit path; however, any engineer who has studied how traceroute works knows that its
per-hop latency figures are frequently misleading.
Traceroute works in a manner similar to ping, but it uses the IP TTL field to make each
successive hop in the transit path respond with an ICMP TTL Expired message. This gives you the
ability to determine which network devices the packet is traversing.
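The TTL mechanic can be illustrated with a toy Python model (the router addresses here are made up): each device on the path decrements the probe's TTL, and whichever device decrements it to zero answers with TTL Expired, revealing its address.

```python
# Toy model of the TTL mechanism traceroute relies on. Addresses are
# illustrative. Each "router" decrements the probe's TTL; the router
# that sees it hit zero answers with a TTL-expired message.
path = ["10.0.0.1", "10.0.0.2", "10.0.0.3", "198.51.100.7", "203.0.113.9"]

def probe(ttl):
    """Send one probe with the given TTL; return the address of the
    router that decrements it to zero."""
    for router in path:
        ttl -= 1
        if ttl == 0:
            return router
    return None

# Probes with TTL 1, 2, 3, ... reveal the path one hop at a time.
discovered = [probe(ttl) for ttl in range(1, len(path) + 1)]
print(discovered)
```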
When you dig deeper into the operation of traceroute, you will see that it sends 3 probe
packets for each successive hop by default, unless you specify otherwise. Each probe
indirectly measures the latency between the source and the device where the TTL is declared
expired. This latency calculation is a byproduct of traceroute's true intended purpose. Keep in
mind that even if you send probes to a device five hops away, a random latency spike in any of
the four devices before the fifth hop can make the fifth hop look like it has high latency.
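That accumulation effect is easy to demonstrate with a toy model (all delays below are illustrative, in milliseconds): a probe aimed at hop N round-trips through every device before N, so a transient spike at hop 2 inflates the readings for hops 2 through 5 even though hop 5 itself is fast.

```python
# Toy model: the measured RTT to hop N accumulates the delay of every
# device before N, so a spike anywhere earlier in the path shows up in
# hop N's number. All delays are illustrative, in milliseconds.
link_delay = [1, 1, 1, 1, 1]     # per-hop forwarding delay
spike = {2: 200}                 # hop 2 momentarily queues for 200 ms

def probe_rtt(hop):
    # Round trip through hops 1..hop, plus any transient spike en route.
    base = 2 * sum(link_delay[:hop])
    return base + sum(ms for h, ms in spike.items() if h <= hop)

rtts = {hop: probe_rtt(hop) for hop in range(1, 6)}
# Hop 5 reads ~210 ms even though hop 5 itself adds only ~1 ms each way.
print(rtts)
```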
Also note that any Control Plane Policing (CoPP) policy enforced on a device in the transit path
can rate-limit the ICMP punted to that device's control plane. ICMP TTL-expired handling is
process-switched by most devices, whereas transit TCP/UDP traffic is forwarded in hardware.



Below is an example of a traceroute on Windows 7:
C:\>tracert www.google.com -d

Tracing route to www.google.com [74.125.225.113]
over a maximum of 30 hops:

1 1 ms <1 ms <1 ms 10.100.38.2
2 1 ms 1 ms <1 ms 209.51.231.145
3 5 ms 4 ms 3 ms 64.65.234.204
4 7 ms 7 ms 7 ms 64.69.98.140
5 29 ms 29 ms 29 ms 64.69.97.217
6 30 ms 29 ms 29 ms 64.69.97.219
7 31 ms 31 ms 32 ms 128.242.186.161
8 30 ms 30 ms 29 ms 129.250.197.146
9 30 ms 29 ms 30 ms 209.85.254.120
10 33 ms 30 ms 30 ms 209.85.240.150
11 29 ms 30 ms 29 ms 74.125.225.113

Trace complete.

C:\>
You can see from the traceroute shown above that there are 3 probes per hop between the source
and destination, and that latency does not appear until traffic traverses 64.69.97.217.
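If you want to work with such output programmatically, a small parser like the sketch below (fed a few lines copied from the trace above) turns each hop line into a hop number, the three probe RTTs, and the responding address; "<1 ms" is treated as 0.

```python
# Sketch of a parser for tracert hop lines. The sample lines are copied
# from the trace above; "<1 ms" (sub-millisecond) is treated as 0.
import re

sample = [
    "  1     1 ms    <1 ms    <1 ms  10.100.38.2",
    "  5    29 ms    29 ms    29 ms  64.69.97.217",
    " 11    29 ms    30 ms    29 ms  74.125.225.113",
]

def parse_hop(line):
    hop = int(line.split()[0])           # leading hop number
    addr = line.split()[-1]              # responding address
    rtts = [0 if t.startswith("<") else int(t)
            for t in re.findall(r"(<?\d+) ms", line)]
    return hop, rtts, addr

parsed = [parse_hop(line) for line in sample]
print(parsed)
```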
The whole point of this blog is to teach you how to interpret such data. Just because you see a
spike in latency on the 5th hop does not mean that the 5th hop is causing latency. It can easily
mean that the control plane of the device at the fifth hop is under marginal load and its
processor does not respond to the ICMP probe immediately because other processes have priority.
Just because you see potential latency with traceroute, you should never expect it to be an
accurate representation of latency for TCP/UDP traffic, because ICMP and TCP/UDP traffic are
treated completely differently by a router's control and forwarding planes.
Most ISPs use control-plane policing (CoPP) to prevent ICMP floods from overwhelming a device's
control plane. This flood-prevention mechanism can also skew the data in traceroutes.
Shown below is a simple CoPP policy which can result in skewed traceroute data.
!
class-map match-all Catch-All-IP
match access-group 124
class-map match-all Management
match access-group 121
class-map match-all Normal
match access-group 122
class-map match-all Undesirable
match access-group 123
class-map match-all Routing
match access-group 120
!
policy-map RTR_CoPP
class Undesirable
police 8000 1500 1500 conform-action drop exceed-action drop
class Routing
police 1000000 50000 50000 conform-action transmit exceed-action transmit
class Management
police 100000 20000 20000 conform-action transmit exceed-action drop
class Normal
police 50000 5000 5000 conform-action transmit exceed-action drop
class Catch-All-IP
police 50000 5000 5000 conform-action transmit exceed-action drop
class class-default
police 8000 1500 1500 conform-action transmit exceed-action transmit
!
access-list 120 permit tcp any gt 1024 10.0.1.0 0.0.0.255 eq bgp
access-list 120 permit tcp any eq bgp 10.0.1.0 0.0.0.255 gt 1024 established
access-list 120 permit tcp any gt 1024 10.0.1.0 0.0.0.255 eq 639
access-list 120 permit tcp any eq 639 10.0.1.0 0.0.0.255 gt 1024 established
access-list 120 permit tcp any 10.0.1.0 0.0.0.255 eq 646
access-list 120 permit udp any 10.0.1.0 0.0.0.255 eq 646
access-list 120 permit ospf any 10.0.1.0 0.0.0.255
access-list 120 permit ospf any host 224.0.0.5
access-list 120 permit ospf any host 224.0.0.6
access-list 120 permit eigrp any 10.0.1.0 0.0.0.255
access-list 120 permit eigrp any host 224.0.0.10
access-list 121 permit tcp 10.0.2.0 0.0.0.255 10.0.1.0 0.0.0.255 eq telnet
access-list 121 permit tcp 10.0.2.0 0.0.0.255 eq telnet 10.0.1.0 0.0.0.255 established
access-list 121 permit tcp 10.0.2.0 0.0.0.255 10.0.1.0 0.0.0.255 eq 22
access-list 121 permit tcp 10.0.2.0 0.0.0.255 eq 22 10.0.1.0 0.0.0.255 established
access-list 121 permit udp 10.0.2.0 0.0.0.255 10.0.1.0 0.0.0.255 eq snmp
access-list 121 permit tcp 10.0.2.0 0.0.0.255 10.0.1.0 0.0.0.255 eq www
access-list 121 permit udp 10.0.2.0 0.0.0.255 10.0.1.0 0.0.0.255 eq 443
access-list 121 permit tcp 10.0.2.0 0.0.0.255 10.0.1.0 0.0.0.255 eq ftp
access-list 121 permit tcp 10.0.2.0 0.0.0.255 10.0.1.0 0.0.0.255 eq ftp-data
access-list 121 permit udp 10.0.2.0 0.0.0.255 10.0.1.0 0.0.0.255 eq syslog
access-list 121 permit udp 10.0.3.0 0.0.0.255 eq domain 10.0.1.0 0.0.0.255
access-list 121 permit udp 10.0.4.0 0.0.0.255 10.0.1.0 0.0.0.255 eq ntp
access-list 122 permit icmp any 10.0.1.0 0.0.0.255 echo
access-list 122 permit icmp any 10.0.1.0 0.0.0.255 echo-reply
access-list 122 permit icmp any 10.0.1.0 0.0.0.255 ttl-exceeded
access-list 122 permit icmp any 10.0.1.0 0.0.0.255 packet-too-big
access-list 122 permit icmp any 10.0.1.0 0.0.0.255 port-unreachable
access-list 122 permit icmp any 10.0.1.0 0.0.0.255 unreachable
access-list 122 permit pim any any
access-list 122 permit udp any any eq pim-auto-rp
access-list 122 permit igmp any any
access-list 122 permit gre any any
access-list 123 permit icmp any any fragments
access-list 123 permit udp any any fragments
access-list 123 permit tcp any any fragments
access-list 123 permit ip any any fragments
access-list 123 permit udp any any eq 1434
access-list 123 permit tcp any any eq 639 rst
access-list 123 permit tcp any any eq bgp rst
access-list 124 permit tcp any any
access-list 124 permit udp any any
access-list 124 permit icmp any any
access-list 124 permit ip any any
!
control-plane
service-policy input RTR_CoPP
!
If you examine the CoPP policy in detail, you will notice that ICMP reaching the control plane is
limited to 50,000 bps, as shown below. It can burst up to 5,000 bytes; traffic that conforms to
the policy is transmitted, and traffic that exceeds it is dropped.
class Catch-All-IP
police 50000 5000 5000 conform-action transmit exceed-action drop
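The "police 50000 5000 5000" behavior is essentially a token bucket, which the following toy simulation mimics: the bucket holds the burst allowance in bytes and refills at the policed bit rate, and once a back-to-back run of ICMP packets drains it, the rest is dropped, which is exactly what shows up as lost probes in a traceroute.

```python
# Toy token-bucket policer mirroring "police 50000 5000 5000":
# 50,000 bits/sec sustained rate with a 5,000-byte burst allowance.
# Conforming packets are transmitted; once the bucket empties, drop.
RATE_BPS, BURST_BYTES = 50_000, 5_000

def police(packets, rate=RATE_BPS, burst=BURST_BYTES):
    tokens, last_t, actions = float(burst), 0.0, []
    for t, size in packets:              # (arrival time in s, size in bytes)
        tokens = min(burst, tokens + (t - last_t) * rate / 8)  # refill
        last_t = t
        if tokens >= size:
            tokens -= size
            actions.append("transmit")
        else:
            actions.append("drop")
    return actions

# Ten 1,000-byte ICMP packets arriving 1 ms apart: the burst allowance
# covers the first few, then the policer starts dropping.
result = police([(i * 0.001, 1000) for i in range(10)])
print(result)
```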
With this in mind, you should always use traceroute for its intended purpose, which is to
determine the route traffic takes through the transit path; the latency shown per hop, per probe,
should be taken with a grain of salt when public devices are in the path.
The intended purpose of the 3-probe count is to reveal whether the traffic traverses multiple
routed paths due to route engineering, not to measure the latency 3 times.
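A toy illustration of that multipath purpose (addresses and hashing made up): if a hop is load-shared across two equal-cost routers, the per-probe selection can land the 3 probes on different devices, so a single TTL value can reveal more than one address.

```python
# Toy illustration of why multiple probes per TTL can expose
# equal-cost multipath: probes are load-shared across parallel
# routers, so one hop may answer from more than one address.
# Addresses and the "hash" are illustrative.
ecmp_next_hops = {0: "10.1.1.1", 1: "10.1.1.2"}  # two parallel routers

def responder(probe_id):
    # Stand-in for the per-packet load-sharing decision.
    return ecmp_next_hops[probe_id % 2]

seen = {responder(p) for p in range(3)}  # the 3 probes for one TTL
print(seen)  # both parallel devices show up at the same hop
```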
I will conclude this blog with the pathping command. Found on Windows, it combines traceroute
with ping to give you a better understanding of latency in the transit path.
Pathping works by first doing a traceroute to the destination, then using ICMP to ping each hop
in the transit path 100 times. This verifies latency between the source and destination via ICMP
echo at each hop. But remember what I said earlier: you cannot rely on ICMP when public devices
are involved. So you can run into cases where ICMP pings destined to one hop in the transit path
drop 40% of the traffic, whereas the next hop has a 100% success rate. This is due to CoPP.
In general, pathping is a much better tool to diagnose latency from a specific source to a
destination with a relative degree of accuracy. Note that I said relative: latency is ALWAYS
relative to your location on the network.
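The per-hop statistic pathping prints is simple to reproduce. The sketch below uses illustrative reply counts (patterned on a hop where CoPP drops every probe) and computes loss as lost/sent per hop.

```python
# Sketch of the per-hop statistic pathping reports: send N probes to
# each hop, count the replies, and report loss as a percentage.
# Reply counts below are illustrative.
SENT = 100
replies = {1: 100, 5: 100, 9: 100, 10: 0}   # hop -> echo replies seen

def loss_pct(hop):
    lost = SENT - replies[hop]
    return 100 * lost // SENT

report = {hop: f"{SENT - replies[hop]}/{SENT} = {loss_pct(hop)}%"
          for hop in replies}
print(report)  # hop 10 drops everything, just like a CoPP'd hop
```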
Shown below is an example of pathping in action:
C:\>pathping www.google.com -n

Tracing route to www.google.com [74.125.225.116]
over a maximum of 30 hops:
0 10.100.38.162
1 10.100.38.2
2 209.51.231.145
3 64.65.234.204
4 64.69.98.171
5 64.69.99.238
6 165.121.238.178
7 64.214.141.253
8 67.16.132.174
9 72.14.218.13
10 72.14.238.232
11 72.14.236.206
12 216.239.46.215
13 72.14.237.132
14 209.85.240.150
15 74.125.225.116

Computing statistics for 375 seconds...
Source to Here This Node/Link
Hop RTT Lost/Sent = Pct Lost/Sent = Pct Address
0 10.100.38.162
0/ 100 = 0% |
1 1ms 0/ 100 = 0% 0/ 100 = 0% 10.100.38.2
0/ 100 = 0% |
2 0ms 0/ 100 = 0% 0/ 100 = 0% 209.51.231.145
0/ 100 = 0% |
3 4ms 0/ 100 = 0% 0/ 100 = 0% 64.65.234.204
0/ 100 = 0% |
4 6ms 0/ 100 = 0% 0/ 100 = 0% 64.69.98.171
0/ 100 = 0% |
5 22ms 0/ 100 = 0% 0/ 100 = 0% 64.69.99.238
0/ 100 = 0% |
6 10ms 0/ 100 = 0% 0/ 100 = 0% 165.121.238.178
0/ 100 = 0% |
7 34ms 0/ 100 = 0% 0/ 100 = 0% 64.214.141.253
0/ 100 = 0% |
8 37ms 0/ 100 = 0% 0/ 100 = 0% 67.16.132.174
0/ 100 = 0% |
9 35ms 0/ 100 = 0% 0/ 100 = 0% 72.14.218.13
0/ 100 = 0% |
10 --- 100/ 100 =100% 100/ 100 =100% 72.14.238.232
0/ 100 = 0% |
11 --- 100/ 100 =100% 100/ 100 =100% 72.14.236.206
0/ 100 = 0% |
12 --- 100/ 100 =100% 100/ 100 =100% 216.239.46.215
0/ 100 = 0% |
13 --- 100/ 100 =100% 100/ 100 =100% 72.14.237.132
0/ 100 = 0% |
14 --- 100/ 100 =100% 100/ 100 =100% 209.85.240.150
0/ 100 = 0% |
15 36ms 0/ 100 = 0% 0/ 100 = 0% 74.125.225.116

Trace complete.

C:\>
As you can see from the pathping shown above, some hops in the transit path completely drop
ICMP. You can also notice that the latency to hop 5 is higher than the latency to hop 6. This
suggests that either Control Plane Policing is in use on 64.69.99.238 or the processor
utilization on hop 5 is relatively higher.
You should know that there are other tools out there that are extremely useful when trying to
diagnose latency-related problems. Most of these tools rely on ICMP, and your decision to trust
them should be based on your understanding of the transit path. One such tool is PingPlotter.
There are also several useful tools included in the SolarWinds Engineer's Toolset; however, that
toolset is extremely expensive. You can download a trial of it and check it out.
The most accurate tools rely on TCP; however, since TCP is a connection-oriented protocol, both
the source and destination must be willing to participate in the testing. Some tools are
hardware-based, such as the Fluke Networks EtherScope, which costs several thousand dollars.
So, in conclusion, your decision to trust and use data from ICMP-based troubleshooting should be
based on your relative understanding of the transit path. You should never take a traceroute
that has high latency on it and call it a network issue just because hop 7 shows latency greater
than 250 ms. That is no different than a doctor telling you your spleen is the cause of your
headaches without any factual basis.
If you do not have clear factual data when diagnosing a problem and you blame the network
because of a traceroute, you may very well be completely missing the root cause of the problem.
Think of it as getting tunnel vision when sh!t hits the fan and management is expecting answers,
and the first thing you notice is high latency on a traceroute. Without completely understanding
traceroute, you may be fixating on an issue that is not an issue at all.
