Discussion:
[etherlab-users] UNMATCHED and SKIPPED datagrams
Erwin Burgstaller
2009-01-14 10:10:24 UTC
Permalink
Hi!

I've some problems with a slave my company builds, running the Beckhoff
ET1100-ASIC.

My part is the programming on the master side.

Everything works pretty good, as long as there's no interruption on the
cabling or reboot of the slave happening. I'm pretty sure the problem
is on the slave side, but I have to prove. :(

Scenario: Two slaves are connected. EtherCAT-master module is running
and my rtai kernel module is using it. The slave is able to do EoE btw., but
the interfaces aren't activated on the masters side.

eoe0s0 Link encap:Ethernet HWaddr 00:11:22:33:44:12
BROADCAST MULTICAST MTU:1500 Metric:1
RX packets:0 errors:0 dropped:0 overruns:0 frame:0
TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:0 (0.0 B) TX bytes:0 (0.0 B)

eoe0s1 Link encap:Ethernet HWaddr 00:11:22:33:44:13
BROADCAST MULTICAST MTU:1500 Metric:1
RX packets:0 errors:0 dropped:0 overruns:0 frame:0
TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:0 (0.0 B) TX bytes:0 (0.0 B)

What I liked to know, how do I interpret log entries like these:

This continues for hours. Sometimes there are 3 data grams unmatched, but mostly 4.

[57061.210985] EtherCAT: Domain 0: Working counter changed to 0/6.
[57061.210990] EtherCAT DEBUG: skipping datagram f4d85640.
[57061.211830] EtherCAT DEBUG: UNMATCHED datagram:
[57061.211835] EtherCAT DEBUG: 0C D3 00 00 00 00 60 80 00 00 00 00 00 00 00 00
[57061.211862] EtherCAT DEBUG: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[57061.211877] EtherCAT DEBUG: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[57061.211891] EtherCAT DEBUG: 00 00 00 00 00 00 00 00 00 00 02 16 02 00 0C 90
[57061.211905] EtherCAT DEBUG: 01 01 BF 00 00 00 00 00 00 00 00 AE 9B FF C1 03
[57061.211919] EtherCAT DEBUG: 02 00 02 00 00 00 00 00 00 00 00 00 00 00 00 00
[57061.211932] EtherCAT DEBUG: 00 00 00 00 00 00 00 00 00 00 05 00
[57061.211944] EtherCAT WARNING: Datagram f4d85640 (domain0-0) was SKIPPED 1 time.
[57061.310890] EtherCAT DEBUG: skipping datagram f4d85640.
[57061.311739] EtherCAT DEBUG: UNMATCHED datagram:
[57061.311744] EtherCAT DEBUG: 0C 50 00 00 00 00 60 80 00 00 00 00 00 00 00 00
[57061.311772] EtherCAT DEBUG: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[57061.311786] EtherCAT DEBUG: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[57061.311800] EtherCAT DEBUG: 00 00 00 00 00 00 00 00 00 00 02 16 02 00 0C 90
[57061.311814] EtherCAT DEBUG: 01 01 C6 00 00 00 00 00 00 00 00 F2 9B 8A 4F 03
[57061.311828] EtherCAT DEBUG: 02 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[57061.311842] EtherCAT DEBUG: 00 00 00 00 00 00 00 00 00 00 05 00
[57061.311854] EtherCAT WARNING: Datagram f4d85640 (domain0-0) was SKIPPED 1 time.
[57061.610622] EtherCAT DEBUG: skipping datagram f4d85640.
[57061.611473] EtherCAT DEBUG: UNMATCHED datagram:
[57061.611478] EtherCAT DEBUG: 0C C7 00 00 00 00 60 80 00 00 00 00 00 00 00 00
[57061.611506] EtherCAT DEBUG: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[57061.611520] EtherCAT DEBUG: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[57061.611534] EtherCAT DEBUG: 00 00 00 00 00 00 00 00 00 00 02 16 02 00 0C 90
[57061.611548] EtherCAT DEBUG: 01 01 DB 00 00 00 00 00 00 00 00 AF 14 2F 69 03
[57061.611562] EtherCAT DEBUG: 02 00 02 00 00 00 00 00 00 00 00 00 00 00 00 00
[57061.611576] EtherCAT DEBUG: 00 00 00 00 00 00 00 00 00 00 05 00
[57061.611588] EtherCAT WARNING: Datagram f4d85640 (domain0-0) was SKIPPED 1 time.
[57061.710532] EtherCAT DEBUG: skipping datagram f4d85640.
[57061.711383] EtherCAT DEBUG: UNMATCHED datagram:
[57061.711388] EtherCAT DEBUG: 0C 44 00 00 00 00 60 80 00 00 00 00 00 00 00 00
[57061.711415] EtherCAT DEBUG: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[57061.711429] EtherCAT DEBUG: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[57061.711443] EtherCAT DEBUG: 00 00 00 00 00 00 00 00 00 00 00 12 0C 90 01 02
[57061.711457] EtherCAT DEBUG: F7 00 00 00 00 00 00 00 00 20 E6 E5 DD 03 E0 03
[57061.711471] EtherCAT DEBUG: 02 00 02 00 00 00 00 00 00 00 00 00 00 00 00 00
[57061.711486] EtherCAT DEBUG: 00 00 00 00 00 00 00 00 00 00 05 00
[57061.711497] EtherCAT WARNING: Datagram f4d85640 (domain0-0) was SKIPPED 1 time.
[57062.057072] EtherCAT WARNING: 4 datagrams UNMATCHED!
[57062.213909] EtherCAT: Domain 0: 7 working counter changes - now 5/6.
[57069.203855] EtherCAT: Domain 0: Working counter changed to 0/6.
[57069.203861] EtherCAT DEBUG: skipping datagram f4d85640.
[57069.204714] EtherCAT DEBUG: UNMATCHED datagram:

It's always the same datagram: f4d85640, whatever that means.

Pulling the Ethernet cable and putting it in again, didn't change the
UNMATCHED datagrams.

After I've did reset the slaves and reloaded my module, I first got 212
datagrams UNMATCHED, but then it worked without any problems, when
leaving untouched, but full operating and it seems, with no loss of data
too. But that's probably we've a transfer protocol similar to that the
Beckhoff EL6001 uses.

I'll attach this last part of the syslog compressed with gzip, as it's
the whole run up of that two slaves with debug=1.

There's another strange warning in there, but maybe harmless:
EtherCAT WARNING: Unknown category type 0x003C in slave 0.
--
Erwin Burgstaller
Florian Pose
2009-01-14 11:01:46 UTC
Permalink
Hi,
Post by Erwin Burgstaller
[57061.710532] EtherCAT DEBUG: skipping datagram f4d85640.
[57061.711388] EtherCAT DEBUG: 0C 44 00 00 00 00 60 80 00 00 00 00 00 00 00 00
[57061.711415] EtherCAT DEBUG: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[57061.711429] EtherCAT DEBUG: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[57061.711443] EtherCAT DEBUG: 00 00 00 00 00 00 00 00 00 00 00 12 0C 90 01 02
[57061.711457] EtherCAT DEBUG: F7 00 00 00 00 00 00 00 00 20 E6 E5 DD 03 E0 03
[57061.711471] EtherCAT DEBUG: 02 00 02 00 00 00 00 00 00 00 00 00 00 00 00 00
[57061.711486] EtherCAT DEBUG: 00 00 00 00 00 00 00 00 00 00 05 00
[57061.711497] EtherCAT WARNING: Datagram f4d85640 (domain0-0) was SKIPPED 1 time.
skipped datagrams mean, that the datagram is sent again, although the
answer to the last sending hasn't been received yet. This usually means
that the realtime cycle was executed too short after the last one (maybe
because the last one was delayed for some reason).

The unmatched datagram you see after one was skipped is the one that was
sent out in the last cycle. This can't be matched because the
corresponding 'sent' datagram was skipped.
Post by Erwin Burgstaller
EtherCAT WARNING: Unknown category type 0x003C in slave 0.
Category type 0x003C is not part of the specification. If this is your
own slave, please check, if there are valid data in the SII.
--
Best regards,
Florian Pose

http://etherlab.org
Erwin Burgstaller
2009-01-14 13:15:08 UTC
Permalink
This usually means that the realtime cycle was executed too short
after the last one (maybe because the last one was delayed for some
reason).
Currently the realtime cycle is running at 1000 Hz. I'll check out, if
it's getting worse by going up or better by going down with speed.

As there's no switch in between, and as far as I understood, there's no
buffering on the slave side, the delay must be caused by the master,
right?
Category type 0x003C is not part of the specification. If this is your
own slave, please check, if there are valid data in the SII.
It is, so we'll check.


Thanks,

Erwin
--
Erwin Burgstaller
Erwin Burgstaller
2009-01-15 16:39:05 UTC
Permalink
Post by Florian Pose
Post by Erwin Burgstaller
[57061.710532] EtherCAT DEBUG: skipping datagram f4d85640.
^^^^^^^^
Post by Florian Pose
skipped datagrams mean, that the datagram is sent again, although the
answer to the last sending hasn't been received yet. This usually means
that the realtime cycle was executed too short after the last one (maybe
because the last one was delayed for some reason).
It seems the problem is there are other rtai tasks running. As they have
the same priority, it might happen that my module is delayed too long,
so I'll try to lower the other tasks priority.

But, what's strange, if that happens once, the system never gets right
anymore, even if the other tasks are removed.

Is it correct, that the same datagram (f4d85640 in that very case) is
skipped for several hours? Or could there be a problem in the masters
datagram queue?


Thanks,
Erwin
--
Erwin Burgstaller
Florian Pose
2009-01-22 09:22:48 UTC
Permalink
Post by Erwin Burgstaller
But, what's strange, if that happens once, the system never gets right
anymore, even if the other tasks are removed.
What do you see in that case? Are all datagrams skipped? Does the same
number of datagrams appear as unmatched?
Post by Erwin Burgstaller
Is it correct, that the same datagram (f4d85640 in that very case) is
skipped for several hours? Or could there be a problem in the masters
datagram queue?
The only case that I could explain this with, is if the cycle time of
your application is in the same size as the frame round-trip time (which
is usually 10-20 usec. But your cycle time isn't that fast, is it?
--
Best regards,
Florian Pose

http://etherlab.org
Erwin Burgstaller
2009-01-22 14:35:55 UTC
Permalink
Post by Florian Pose
What do you see in that case? Are all datagrams skipped? Does the same
number of datagrams appear as unmatched?
No, not all of them. There were usually four or five per second and yes,
the number of unmatched was the same as of skipped.

But the problem has completely gone now. Setting the other rtai tasks
priority to 2, while the kernel module has 1, did help.

Sorry, I thought I did already post about that success, but I've seen it
has only been the plan of doing any changes to priorities.
Post by Florian Pose
The only case that I could explain this with, is if the cycle time of
your application is in the same size as the frame round-trip time (which
is usually 10-20 usec. But your cycle time isn't that fast, is it?
No it isn't. It does 1000 loops per second and ecrt_domain_process() is
called every loop of course.

Thanks for your help,

Erwin
--
Erwin Burgstaller
Loading...