2. Advanced Field Troubleshooting
Why is DOCSIS 3 Troubleshooting Different?
Multiple Bonded Channels
• Downstream
– Not that different.
– The channels are constant carrier
– Multiple downstream channels have been around forever
• Upstream
– Still most vulnerable portion of plant
– The modem is no longer limited to a single upstream transmit path
– In some ways this is actually easier with DOCSIS 3.0
3. Partial Service
One or more upstream or downstream channels become unusable
Partial service does not necessarily impact subscriber
Can be viewed as an advantage from a network operations POV
Hard to detect after CM registration
If the channel is down when the modem registers:
– Will not cause performance issues, unless US is at maximum utilization
– The errors for the “down” channel(s) are reported to the CMTS in the REG-ACK
– D3.0 test equipment will show fewer channels bonded than expected
If the channel is down after the modem registers:
– Data loss until CMTS realizes that Ch is unusable (~1 min – 15xT3 timeouts)
– T4 Timeout > CMTS stops granting transmit opportunities
– CMTS vendors handle re-acquisition differently (it is a SHOULD requirement)
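The "fewer channels bonded than expected" check amounts to a set comparison. The following is a minimal sketch, not a vendor tool: the function names and the example center frequencies are hypothetical, and in practice the bonded list would be read from a D3.0 meter or from the CMTS.

```python
def find_missing_channels(expected_mhz, bonded_mhz):
    """Return expected channel frequencies (MHz) that are not currently bonded."""
    return sorted(set(expected_mhz) - set(bonded_mhz))


def is_partial_service(expected_mhz, bonded_mhz):
    """Partial service: at least one expected channel is missing, but not all."""
    missing = find_missing_channels(expected_mhz, bonded_mhz)
    return 0 < len(missing) < len(set(expected_mhz))


# Hypothetical upstream bonding group (center frequencies in MHz).
expected = [20.0, 26.4, 32.8, 39.2]
bonded = [26.4, 32.8, 39.2]

print(find_missing_channels(expected, bonded))  # [20.0]
print(is_partial_service(expected, bonded))     # True
```

A modem with every expected channel bonded, or with none bonded at all, is not reported as partial service by this sketch; the second case is a full outage rather than a partial one.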
4. Impaired Service
One or more channels are experiencing RF channel impairments
– Presence of codeword errors is one way to identify an impaired channel
A channel is impaired when ranging succeeds but data is still corrupted
What would we say about impaired service:
– Impaired service impacts subscriber performance and quality of experience (QoE)
– It is easy to detect impaired service
– Throughput can be a good test for this, but not in all cases
– High volume Packetloss is the most effective test to detect impaired service
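One way to turn codeword counts into a comparable number is to take the difference between two samples of the cumulative counters and express the errors as a ratio. This is a sketch under stated assumptions: the dictionary keys and sample values are hypothetical, and how the counters are collected (meter, CMTS, SNMP) is left open. It also shows why traffic volume matters: with no codewords sent between samples, nothing can be concluded.

```python
def codeword_error_ratios(prev, curr):
    """Compute corrected/uncorrectable ratios from two counter samples.

    Each sample is a dict with cumulative 'unerrored', 'corrected' and
    'uncorrectable' codeword counts for one channel.
    Returns None when the interval carried no codewords or a counter
    went backwards (reset or wrap, which this sketch does not handle).
    """
    keys = ("unerrored", "corrected", "uncorrectable")
    delta = {k: curr[k] - prev[k] for k in keys}
    if any(v < 0 for v in delta.values()):
        return None
    total = sum(delta.values())
    if total == 0:
        return None
    return {
        "corrected": delta["corrected"] / total,
        "uncorrectable": delta["uncorrectable"] / total,
    }


# Hypothetical samples taken before and after a high-volume test.
before = {"unerrored": 1_000_000, "corrected": 500, "uncorrectable": 10}
after = {"unerrored": 1_099_000, "corrected": 1_400, "uncorrectable": 110}

print(codeword_error_ratios(before, after))
# {'corrected': 0.009, 'uncorrectable': 0.001}
```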
Customer complaints likely to generate a ticket:
– VoIP issues (robo-voice, dropped words, dialing issues, dropped calls, etc.)
– VOD – cannot retrieve movie, cannot interact with Guide
– Gaming issues – latency, interactivity issues
– VPN issues – calling, two-way video, desktop sharing, VPN dropping, etc.
– File Sharing/Transfer/FTP Upstream – very slow upstream transfers
5. Advanced Field Troubleshooting
Evaluating the Downstream Quality
Use traditional PHY attributes
• MER
• BER
• DQI
Use traditional Service Layer attributes
• PacketLoss
• Throughput
8. Impaired Service Troubleshooting
An impaired service may or may not exhibit codeword
errors and packetloss
When troubleshooting impaired service, it is critical to
view the performance of the individual upstream
channels.
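The reason for looking at each upstream channel individually can be shown with a small calculation: errors on one channel are diluted when the bonding group is viewed as a whole. The sketch below uses hypothetical codeword counts and hypothetical channel frequencies; the same per-channel approach would apply to PHY measurements, which matter because an impaired channel does not always show codeword errors.

```python
def ratio(total, uncorrectable):
    """Uncorrectable codeword ratio; 0.0 when no codewords were seen."""
    return uncorrectable / total if total else 0.0


def aggregate_ratio(channels):
    """Ratio computed over the whole bonding group."""
    total = sum(t for t, _ in channels.values())
    bad = sum(u for _, u in channels.values())
    return ratio(total, bad)


def per_channel_ratios(channels):
    """Ratio computed separately for each upstream channel."""
    return {freq: ratio(t, u) for freq, (t, u) in channels.items()}


# Hypothetical data: frequency (MHz) -> (total codewords, uncorrectable).
channels = {
    19.0: (50_000, 2_500),
    25.4: (50_000, 0),
    31.8: (50_000, 0),
    38.2: (50_000, 0),
}

print(aggregate_ratio(channels))  # 0.0125
ratios = per_channel_ratios(channels)
worst = max(ratios, key=ratios.get)
print(worst, ratios[worst])       # 19.0 0.05
```

In this made-up case the group-wide figure is a quarter of what the one bad channel is actually experiencing.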
13. Impaired Service Troubleshooting
Obviously there is an issue with the channel at 19 MHz
Utilize this method to traverse the network and find the
impairment causing this issue
14. Summary
Partial Service
• Tremendous Operational Advantage
• Can go unnoticed
Impaired Service
• Can be difficult to troubleshoot without the right tools
Many Impairments not detectable with
• Packetloss
• Throughput
Best Practice
• Utilize tools that allow the simultaneous testing of bonded
upstream channels at a PHY layer
For the downstream in DOCSIS 3.0, most legacy testing techniques can be applied.
• Primary-capable downstream channels can be fully tested with DOCSIS 2.0 or DOCSIS 3.0 meters, identifying and troubleshooting all of the usual impairments.
• Non-primary-capable downstream channels are currently not widely used due to the relatively low population of D3.0 modems; however, this will eventually change. Non-primary channels will ultimately require D3.0 test equipment, especially to identify and resolve impaired or partial service situations.
The upstream remains the Achilles heel of the DOCSIS network, with the highest density of RF impairments.
• DOCSIS 3.0 capable cable modems transmit data across up to four upstream channels.
• Legacy cable modems (1.x and 2.0) often have multiple upstreams to which they can register, but they use only one at a time. This sometimes makes troubleshooting a legacy modem more challenging, because the technician must determine which upstream channel the modem is registered on.
• Troubleshooting a DOCSIS 3.0 modem effectively does require D3.0 test equipment in order to exercise all bonded upstream channels simultaneously.
• Testing a DOCSIS 3.0 network with DOCSIS 2.0 test equipment will make it very difficult, if not impossible, to identify advanced impairments such as partial and impaired service, as will be discussed further in this presentation.
Two scenarios for partial service:
• The channel was down when the modem came online.
– Test equipment will show fewer channels bonded than expected.
– Will likely not cause any noticeable performance issues, unless the service tier is very close to the maximum performance with a full channel set.
– The errors for the channels that could not be acquired are reported to the CMTS in the REG-ACK, so the cable operator should have visibility in the back office.
• The channel becomes unusable following TCC.
– The channel is already in the bonding group for this modem, and has become unusable. The CMTS will stop granting transmit opportunities on this channel until communication is reestablished via the ranging process.
– CMTSs may handle re-acquisition differently (it is a SHOULD requirement), so it is recommended that cable operators test their vendor equipment under this scenario in the lab. Cases have been observed where, once a channel becomes unusable, some CMTSs/IOS releases fail to offer a new ranging window for the channel until the cable modem is rebooted.
– Will cause temporary performance issues between the time that the channel becomes unusable and the time that the CMTS realizes that it is unusable (range interval + range timeout: ~1 minute).
– Following that time, the CMTS will refrain from granting timeslots on this channel, and performance degradation will stop.
We will consider an impaired service to be one where the channel is not bad enough to fail ranging, but impaired enough to cause data corruption.
Codeword errors are errors in the blocks of data contained in each packet sent by a cable modem; they are either “correctable” or “uncorrectable” by Forward Error Correction (FEC).
Possible causes for impaired service are typical PHY impairments:
• Impulse noise
• Group delay
• In-channel ingress
• Ron Hranac’s “gremlins”
What would we say about impaired service?
• Impaired service definitely affects end user performance
• It is easy to detect impaired service
• Throughput can be a good test for this, but not necessarily
• High volume Packetloss is the most effective test to detect impaired service
Is it better that maintenance ranging is performed at a lower modulation, which keeps the channel in an impaired state, or at a higher modulation, which would cause the channel to transition into a partial service state? This is something that you should discuss with your CMTS vendor.
CATV operators have been testing multiple digital downstream carriers for years.
DOCSIS 3 downstream carriers have the same characteristics as video carriers.
Traditional digital testing is recommended for downstream troubleshooting.
It seems ironic, but field technicians need visibility into the headend to see how the signals arrive.
You need to be in two places at the same time!
You need to be generating traffic in the field, and reading at the headend!
When you look at the signal from the headend, you look for these things:
Evaluating the Upstream Quality <Just mention them here, Brady has already touched on them>
Use Carrier-Based PHY attributes
• MER (Equalized and Unequalized)
• In Channel Response
• Micro Reflections/Equalizer Stress
• Group Delay
• Ingress Under the Carrier
Use traditional Service Layer attributes
• Codeword Errors
• Packetloss
• Throughput
When the technician determines that a partial service situation exists, the goal is to determine what is causing it. To accomplish this, the technician should begin traversing the network toward the CMTS, testing at each test point (Ground Block, Tap, Amps, Node) until the point where the missing channel(s) return to service. When a missing channel is operational again, it should be thoroughly tested, as it could be marginal, which would push you into an impaired service troubleshooting scenario.
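The traversal described above can be sketched as a walk over an ordered list of test points that stops at the first point where the channel is usable. This is only an illustration: `locate_fault` and the `channel_ok` callable are hypothetical stand-ins for a real measurement at each test point, and the results table is made up.

```python
# Test points ordered from the subscriber toward the CMTS.
TEST_POINTS = ["Ground Block", "Tap", "Amp", "Node"]


def locate_fault(test_points, channel_ok):
    """Walk toward the CMTS and bracket the fault.

    channel_ok(point) returns True if the channel is usable when tested
    at that point. Returns (last_bad, first_good); the problem lies
    between those two points. first_good is None if no point was good.
    """
    last_bad = None
    for point in test_points:
        if channel_ok(point):
            return last_bad, point
        last_bad = point
    return last_bad, None


# Hypothetical measurement results for the missing channel.
results = {"Ground Block": False, "Tap": False, "Amp": True, "Node": True}

print(locate_fault(TEST_POINTS, results.get))  # ('Tap', 'Amp')
```

Here the channel returns at the amplifier, so the section between the tap and the amplifier is where to look. As noted above, a channel that has come back should still be tested thoroughly before it is considered clean.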
Level of impairment is important here. Many impairments can be masked by EQ technology. You MUST find out how close you are to the “cliff”.
This seems more difficult than D2. It seems like it would be much easier if there were a single upstream. BUT, you really need to do this on all the upstream channels, not just the one that a D2 modem would be using.
Impaired service can cause packetloss. Or not (time-dependent impairments (temperature,…), proactive maintenance).
Ideally, you want to get the PHY layer parameters for each upstream frequency. You could do this by:
• Moving the DOCSIS upstream channel to another frequency and inserting a constant carrier from a field meter
• Using a laptop and SNMP to get the data that is available from the CMTS on a per-modem basis
• Using pre-equalization, which provides many clues to what is happening on the plant; it is available as described in the PNM
• Using burst demodulation in the headend (it gets tricky to get the exact MAC address)
The illustrations in this presentation are taken from equipment that retrieves and demodulates the packets from the specific instrument and returns the results to that instrument for display.
When in a bonded environment:
• Start with one of the channels in the upstream bonding group.
• Perform PHY analysis
• Compare against DOCSIS limits where possible
– Micro-Reflections
– Group Delay
– …
Step through the channels 1 by 1.
This channel is showing CWE, BUT impaired upstream channels will show up in this manner even when no CWE are present.
When the technician determines which channel is the source of the impaired service, the goal is to determine what is causing the impairment. To accomplish this, the technician should begin traversing the network toward the CMTS, testing at each test point (Ground Block, Tap, Amps, Node) until the point where the affected channel becomes clean. When the channel becomes clean, the technician has passed the impacting impairment. Return to the original failure point and confirm the fix.
This is what I was referring to when we discussed partial service: just because the channel came back does not mean it is clean.