ACI: re-engineering access policies after encapsulation VLAN mismatch errors, with checks for fault codes F3274 and F1425 and aci-preupgrade-validation
22.02 2024 | by massimiliano
This use case involves an excessive number of VLAN pools, physical domains and AAEPs, enough to cause a considerable number of encapsulation VLAN mismatch errors.
An incorrect architecture is shown below (the VL/EPG combinations in red appear in several different VL-Pool + PHY-Domain combinations).

In practice this produces the following error (for example on Leaf 101):

The distribution of VLAN 101, for example, is shown in the following diagram:

An analysis of VLAN 101 found the following:
The ACI fabric design follows a “Network Centric” approach where:
1 VLAN = 1 EPG = 1 BD
In this way each VLAN represents a single broadcast domain and generally does not require contracts (no ACLs).
There are no multi-tenant environments; everything is deployed in the default tenant “common”.
The main cause of a vlan-encap-mismatch is that multiple domains associated with an EPG contain overlapping VLAN blocks, which can lead to intermittent packet drops.
The scenarios most affected by this problem are:
- EPGs deployed on vPC links with two domains associated with overlapping VLAN pools
- EPGs deployed on individual links with two domains associated with overlapping VLAN pools
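The overlap condition described above can be checked offline once the pool and domain inventory has been exported. The following is a minimal sketch, assuming a hand-built inventory: the domain names, VLAN ranges and EPG names are hypothetical and are not taken from the fabric discussed here.

```python
from itertools import combinations

# Hypothetical inventory: domain name -> VLAN blocks (inclusive ranges) of its pool.
# Names and ranges are illustrative only.
DOMAIN_POOLS = {
    "PHY-DOM-A": [(100, 199)],
    "PHY-DOM-B": [(150, 160), (900, 910)],
    "PHY-DOM-C": [(300, 399)],
}

# Hypothetical EPG -> domains associated with it.
EPG_DOMAINS = {
    "EPG-VL101": ["PHY-DOM-A", "PHY-DOM-C"],
    "EPG-VL155": ["PHY-DOM-A", "PHY-DOM-B"],
}


def block_overlap(a, b):
    """Return the overlapping (start, end) of two inclusive ranges, or None."""
    start, end = max(a[0], b[0]), min(a[1], b[1])
    return (start, end) if start <= end else None


def overlapping_domains(epg_domains, domain_pools):
    """Yield (epg, domain1, domain2, overlap) for every overlapping block pair."""
    for epg, domains in sorted(epg_domains.items()):
        for d1, d2 in combinations(sorted(domains), 2):
            for b1 in domain_pools[d1]:
                for b2 in domain_pools[d2]:
                    overlap = block_overlap(b1, b2)
                    if overlap:
                        yield epg, d1, d2, overlap


findings = list(overlapping_domains(EPG_DOMAINS, DOMAIN_POOLS))
for epg, d1, d2, (lo, hi) in findings:
    print(f"{epg}: {d1} and {d2} overlap on VLANs {lo}-{hi}")
```

Only EPGs bound to two or more domains are compared, since a single domain cannot conflict with itself in this model.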
Troubleshooting showed the following:
LEAF-101
leaf101# show system internal epm vlan all | grep 101
195 FD vlan 802.1Q 101 12592 205 194 15
leaf101# show system internal epm vlan 195
+----------+---------+-----------------+----------+------+----------+-----------
   VLAN ID    Type      Access Encap     Fabric    H/W id  BD VLAN    Endpoint
                        (Type Value)     Encap                        Count
+----------+---------+-----------------+----------+------+----------+-----------
 195          FD vlan 802.1Q        101 12592      205    194        15
leaf101# show system internal epm vlan 195 detail
VLAN 195
VLAN type : FD vlan
hw id : 205 ::: sclass : 5481
access enc : (802.1Q, 101)
fabric enc : (VXLAN, 12592)
Object store EP db version : 74697132
BD vlan id : 194 ::: BD vnid : 15204288 ::: VRF vnid : 3047424
Valid : Yes ::: Incomplete : No ::: Learn Enable : Yes
pol_ctrl_flags: ::: dom_ctrl :
Endpoint count : 15 ::: Local Endpoint count : 15 On Peer Endpoint count 0
::::
LEAF-102
leaf102# show system internal epm vlan all | grep 101
199 FD vlan 802.1Q 101 16592 143 198 15
leaf102# show system internal epm vlan 199
+----------+---------+-----------------+----------+------+----------+-----------
   VLAN ID    Type      Access Encap     Fabric    H/W id  BD VLAN    Endpoint
                        (Type Value)     Encap                        Count
+----------+---------+-----------------+----------+------+----------+-----------
 199          FD vlan 802.1Q        101 16592      143    198        13
leaf102# show system internal epm vlan 199 detail
VLAN 199
VLAN type : FD vlan
hw id : 143 ::: sclass : 5481
access enc : (802.1Q, 101)
fabric enc : (VXLAN, 16592)
Object store EP db version : 73611837
BD vlan id : 198 ::: BD vnid : 15204288 ::: VRF vnid : 3047424
Valid : Yes ::: Incomplete : No ::: Learn Enable : Yes
pol_ctrl_flags: ::: dom_ctrl :
Endpoint count : 17 ::: Local Endpoint count : 17 On Peer Endpoint count 0
::::
LEAF-103
leaf103# show system internal epm vlan all | grep 101
101 Tenant BD NONE 0 15073232 101 101 48
102 FD vlan 802.1Q 430 15892 125 101 6
144 FD vlan 802.1Q 101 16592 132 143 23
leaf103# show system internal epm vlan 144
+----------+---------+-----------------+----------+------+----------+-----------
   VLAN ID    Type      Access Encap     Fabric    H/W id  BD VLAN    Endpoint
                        (Type Value)     Encap                        Count
+----------+---------+-----------------+----------+------+----------+-----------
 144          FD vlan 802.1Q        101 16592      132    143        23
leaf103# show system internal epm vlan 144 detail
VLAN 144
VLAN type : FD vlan
hw id : 132 ::: sclass : 5481
access enc : (802.1Q, 101)
fabric enc : (VXLAN, 16592)
Object store EP db version : 11204
BD vlan id : 143 ::: BD vnid : 15204288 ::: VRF vnid : 3047424
Valid : Yes ::: Incomplete : No ::: Learn Enable : Yes
pol_ctrl_flags: ::: dom_ctrl :
Endpoint count : 23 ::: Local Endpoint count : 20 On Peer Endpoint count 3
::::
LEAF-104
leaf104# show system internal epm vlan all | grep 101
81 Tenant BD NONE 0 15335346 101 81 0
145 FD vlan 802.1Q 101 16592 125 144 23
leaf104# show system internal epm vlan 145
+----------+---------+-----------------+----------+------+----------+-----------
   VLAN ID    Type      Access Encap     Fabric    H/W id  BD VLAN    Endpoint
                        (Type Value)     Encap                        Count
+----------+---------+-----------------+----------+------+----------+-----------
 145          FD vlan 802.1Q        101 16592      125    144        23
leaf104# show system internal epm vlan 145 detail
VLAN 145
VLAN type : FD vlan
hw id : 125 ::: sclass : 5481
access enc : (802.1Q, 101)
fabric enc : (VXLAN, 16592)
Object store EP db version : 13094
BD vlan id : 144 ::: BD vnid : 15204288 ::: VRF vnid : 3047424
Valid : Yes ::: Incomplete : No ::: Learn Enable : Yes
pol_ctrl_flags: ::: dom_ctrl :
Endpoint count : 23 ::: Local Endpoint count : 18 On Peer Endpoint count 5
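When the same check has to be repeated for many VLANs and leaves, the relevant fields of the `show system internal epm vlan <id> detail` output can be extracted with a few regular expressions. This is only a sketch: the sample text is copied from the leaf101 output above, the function name `parse_epm_vlan_detail` is hypothetical, and the exact spacing of the real CLI output may differ from what is shown in this post.

```python
import re

# Sample copied from the leaf101 output above.
SAMPLE = """\
VLAN 195
VLAN type : FD vlan
hw id : 205 ::: sclass : 5481
access enc : (802.1Q, 101)
fabric enc : (VXLAN, 12592)
Object store EP db version : 74697132
BD vlan id : 194 ::: BD vnid : 15204288 ::: VRF vnid : 3047424
"""

# One pattern per field of interest; \s* tolerates variable spacing.
PATTERNS = {
    "access_encap": re.compile(r"access enc\s*:\s*\(802\.1Q,\s*(\d+)\)"),
    "fabric_encap": re.compile(r"fabric enc\s*:\s*\(VXLAN,\s*(\d+)\)"),
    "sclass": re.compile(r"sclass\s*:\s*(\d+)"),
    "bd_vnid": re.compile(r"BD vnid\s*:\s*(\d+)"),
    "vrf_vnid": re.compile(r"VRF vnid\s*:\s*(\d+)"),
}


def parse_epm_vlan_detail(text):
    """Return a dict with the integer fields found in the detail output."""
    result = {}
    for key, pattern in PATTERNS.items():
        match = pattern.search(text)
        if match:
            result[key] = int(match.group(1))
    return result


info = parse_epm_vlan_detail(SAMPLE)
print(info)
```

Fields that are not found are simply left out of the result, so a missing key signals an output format that the patterns do not cover.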
VXLAN mismatch situation derived from the outputs above:

LEAF | Vlan-ID (PI internal) | Vlan-Access-Encapsulation | SClass | Fabric Encap (VXLAN-ID) | BD VxLAN ID | VRF VxLAN ID |
101 | 195 | 101 | 5481 | 12592 | 15204288 | 3047424 |
102 | 199 | 101 | 5481 | 16592 | 15204288 | 3047424 |
103 | 144 | 101 | 5481 | 16592 | 15204288 | 3047424 |
104 | 145 | 101 | 5481 | 16592 | 15204288 | 3047424 |
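The comparison in the table above can be expressed as a small check that flags leaf pairs whose fabric encap differs for the same access VLAN. This is a sketch under one explicit assumption: leaves 101/102 and 103/104 are taken to be the two vPC pairs, which the post implies but does not state for this table.

```python
# Fabric encap (VXLAN ID) seen for access VLAN 101 on each leaf, from the table above.
FABRIC_ENCAP = {101: 12592, 102: 16592, 103: 16592, 104: 16592}

# Assumption: leaves 101/102 and 103/104 form the two vPC pairs.
VPC_PAIRS = [(101, 102), (103, 104)]


def mismatched_pairs(fabric_encap, vpc_pairs):
    """Return (leaf_a, leaf_b, encap_a, encap_b) for pairs whose encaps differ."""
    return [
        (a, b, fabric_encap[a], fabric_encap[b])
        for a, b in vpc_pairs
        if fabric_encap[a] != fabric_encap[b]
    ]


mismatches = mismatched_pairs(FABRIC_ENCAP, VPC_PAIRS)
for a, b, encap_a, encap_b in mismatches:
    print(f"leaf{a}/leaf{b}: fabric encap {encap_a} != {encap_b}")
```

Only the two members of a pair are compared with each other; different pairs are allowed to use different fabric encaps.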
The following output highlights the endpoint (EP) flapping status:
spine201# show coop internal info repo ep dampening | grep 15204288
------------------------------------------
EP bd vnid : 15204288
EP mac : 00:50:56:A9:0A:A8
num of ipv4 addresses : 0
num of ipv6 addresses : 0
Damp penalty : 6343
Damp status : FREEZE
------------------------------------------
EP bd vnid : 15204288
EP mac : 00:50:56:A9:25:07
num of ipv4 addresses : 0
num of ipv6 addresses : 0
Damp penalty : 6328
Damp status : FREEZE
------------------------------------------
EP bd vnid : 15204288
EP mac : 00:50:56:94:20:F8
num of ipv4 addresses : 0
num of ipv6 addresses : 0
Damp penalty : 6402
Damp status : FREEZE
------------------------------------------
EP bd vnid : 15204288
EP mac : 00:50:56:A9:E2:22
num of ipv4 addresses : 0
num of ipv6 addresses : 0
Damp penalty : 3527
Damp status : FREEZE
------------------------------------------
EP bd vnid : 15204288
EP mac : 00:1C:7F:6E:5E:58
num of ipv4 addresses : 0
num of ipv6 addresses : 0
Damp penalty : 10000
Damp status : FREEZE
------------------------------------------
EP bd vnid : 15204288
EP mac : 00:50:56:94:42:99
num of ipv4 addresses : 0
num of ipv6 addresses : 0
Damp penalty : 5446
Damp status : FREEZE
------------------------------------------
Total no of dampened EPs = 52 (including other BD VNIDs)
The FREEZE status is the effect of an endpoint flapping, whatever the underlying reason. FREEZE dampening makes all leaf switches ignore any update coming from endpoints in the frozen state; as a result no COOP update is sent to the spines, which prevents possible problems in the COOP control plane.
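To see which endpoints are hit hardest, the dampening records can be parsed and sorted by penalty. This is a minimal sketch: the sample contains two of the records shown above, the function names are hypothetical, and the record layout is assumed to be "key : value" lines separated by a line without a colon.

```python
# Two records copied from the spine201 output above.
SAMPLE = """\
EP bd vnid : 15204288
EP mac : 00:50:56:A9:0A:A8
num of ipv4 addresses : 0
num of ipv6 addresses : 0
Damp penalty : 6343
Damp status : FREEZE
------------------------------------------
EP bd vnid : 15204288
EP mac : 00:1C:7F:6E:5E:58
num of ipv4 addresses : 0
num of ipv6 addresses : 0
Damp penalty : 10000
Damp status : FREEZE
"""


def parse_dampening(text):
    """Split the output into one dict per endpoint record."""
    records, current = [], {}
    for line in text.splitlines():
        if ":" not in line:
            # A separator (or blank) line closes the current record.
            if current:
                records.append(current)
                current = {}
            continue
        # Split on the first colon only, so MAC addresses stay intact.
        key, _, value = line.partition(":")
        current[key.strip()] = value.strip()
    if current:
        records.append(current)
    return records


def frozen_by_penalty(records):
    """Return the FREEZE records, highest penalty first."""
    frozen = [r for r in records if r.get("Damp status") == "FREEZE"]
    return sorted(frozen, key=lambda r: int(r["Damp penalty"]), reverse=True)


records = parse_dampening(SAMPLE)
for rec in frozen_by_penalty(records):
    print(rec["EP mac"], rec["Damp penalty"])
```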
Conversely, a correct configuration must have symmetric VXLAN IDs within each pair of leaf switches, as shown below:

LEAF | Vlan-ID (PI internal) | Vlan-Access-Encapsulation | SClass | Fabric Encap (VXLAN-ID) | BD VxLAN ID | VRF VxLAN ID |
101 | 194 | 101 | 16387 | 8892 | 15040468 | 2850816 |
102 | 190 | 101 | 16387 | 8892 | 15040468 | 2850816 |
103 | 128 | 101 | 16387 | 19892 | 15040468 | 2850816 |
104 | 18 | 101 | 16387 | 19892 | 15040468 | 2850816 |
The solution, which re-optimizes the access policies described at the beginning of this document, results in this new architecture:

Verify fabric-encap-mismatch code F3274
apic1# moquery -c faultInst -f 'fault.Inst.code=="F3274"'
Total Objects shown: 20
# fault.Inst
code : F3274
ack : no
annotation :
cause : fabric-encap-mismatch
changeSet : fabEncMismatchVlans (New: 106), fabEncMismatchVlansSet (New: failed)
childAction :
created : 2022-01-22T00:04:54.270+00:00
delegated : yes
descr : VNID mismatch between peers detected for encap vlans (106).
dn : topology/pod-1/node-103/sys/vpc/inst/dom-103/if-346/fault-F3274
domain : infra
extMngdBy : undefined
highestSeverity : critical
lastTransition : 2022-01-22T00:07:03.668+00:00
lc : raised
modTs : never
occur : 1
origSeverity : critical
prevSeverity : critical
rn : fault-F3274
rule : vpc-if-if-fabric-encap-mismatch
severity : critical
status :
subject : if-fabric-encap-mismatch
type : config
uid :
# fault.Inst
code : F3274
ack : no
annotation :
cause : fabric-encap-mismatch
changeSet : fabEncMismatchVlans (New: 105-106)
childAction :
created : 2022-01-21T21:50:56.614+00:00
delegated : yes
descr : VNID mismatch between peers detected for encap vlans (105-106).
dn : topology/pod-1/node-103/sys/vpc/inst/dom-103/if-685/fault-F3274
domain : infra
extMngdBy : undefined
highestSeverity : critical
lastTransition : 2022-01-22T00:07:03.668+00:00
lc : raised
modTs : never
occur : 2
origSeverity : critical
prevSeverity : cleared
rn : fault-F3274
rule : vpc-if-if-fabric-encap-mismatch
severity : critical
status :
subject : if-fabric-encap-mismatch
type : config
uid :
# fault.Inst
code : F3274
ack : no
annotation :
cause : fabric-encap-mismatch
changeSet : fabEncMismatchVlans (New: 101,901)
childAction :
created : 2022-01-22T00:04:43.082+00:00
delegated : yes
descr : VNID mismatch between peers detected for encap vlans (101,901).
dn : topology/pod-1/node-102/sys/vpc/inst/dom-101/if-2/fault-F3274
domain : infra
extMngdBy : undefined
highestSeverity : critical
lastTransition : 2022-01-22T00:07:03.557+00:00
lc : raised
modTs : never
occur : 1
origSeverity : critical
prevSeverity : critical
rn : fault-F3274
rule : vpc-if-if-fabric-encap-mismatch
severity : critical
status :
subject : if-fabric-encap-mismatch
type : config
uid :
……..
………
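With 20 faults of this kind, it helps to reduce them to the set of affected VLANs per node. The sketch below works on the `dn` and `descr` fields of the three faults shown above; the helper names are hypothetical, and the VLAN list format (single values, comma lists and dash ranges) is inferred from these three samples only.

```python
import re

# (dn, descr) pairs copied from the three faults shown above.
FAULTS = [
    ("topology/pod-1/node-103/sys/vpc/inst/dom-103/if-346/fault-F3274",
     "VNID mismatch between peers detected for encap vlans (106)."),
    ("topology/pod-1/node-103/sys/vpc/inst/dom-103/if-685/fault-F3274",
     "VNID mismatch between peers detected for encap vlans (105-106)."),
    ("topology/pod-1/node-102/sys/vpc/inst/dom-101/if-2/fault-F3274",
     "VNID mismatch between peers detected for encap vlans (101,901)."),
]


def expand_vlans(spec):
    """Expand a spec such as '101,901' or '105-106' into a set of VLAN IDs."""
    vlans = set()
    for part in spec.split(","):
        part = part.strip()
        if "-" in part:
            first, last = part.split("-")
            vlans.update(range(int(first), int(last) + 1))
        else:
            vlans.add(int(part))
    return vlans


def vlans_by_node(faults):
    """Map node ID -> set of VLANs reported with a VNID mismatch."""
    result = {}
    for dn, descr in faults:
        node = re.search(r"/node-(\d+)/", dn)
        spec = re.search(r"encap vlans \(([^)]*)\)", descr)
        if node and spec:
            vlans = expand_vlans(spec.group(1))
            result.setdefault(int(node.group(1)), set()).update(vlans)
    return result


summary = vlans_by_node(FAULTS)
for node, vlans in sorted(summary.items()):
    print(f"node {node}: VLANs {sorted(vlans)}")
```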
Verify subnet overlap code F1425
apic1# moquery -c faultInst -f 'fault.Inst.code=="F1425"'
Total Objects shown: 5
# fault.Inst
code : F1425
ack : no
annotation :
cause : ip-provisioning-failed
changeSet : operStQual (New: if-down)
childAction :
created : 2021-12-08T13:10:39.860+00:00
delegated : no
descr : IPv4 address(10.1.0.0/31) is operationally down, reason:Interface down on node 201 fabric hostname spine201
dn : topology/pod-1/node-201/sys/ipv4/inst/dom-overlay-1/if-[eth1/33.33]/addr-[10.1.0.0/31]/fault-F1425
domain : access
extMngdBy : undefined
highestSeverity : major
lastTransition : 2021-12-08T13:12:44.055+00:00
lc : raised
modTs : never
occur : 1
origSeverity : major
prevSeverity : major
rn : fault-F1425
rule : ipv4-addr-oper-st-down
severity : major
status :
subject : oper-state-err
type : operational
uid :
# fault.Inst
code : F1425
ack : yes
annotation :
cause : ip-provisioning-failed
changeSet : ipv4CfgFailedBmp (New: ipv4:Addraddr_failed_flag,ipv4:Addrctrl_failed_flag,ipv4:AddrlcOwn_failed_flag,ipv4:AddrmodTs_failed_flag,ipv4:AddrmonPolDn_failed_flag,ipv4:Addrpref_failed_flag,ipv4:Addrtag_failed_flag,ipv4:Addrtype_failed_flag,ipv4:AddrvpcPeer_failed_flag), ipv4CfgFailedTs (New: 00:00:00:00.000), ipv4CfgState (New: 1), operSt (New: down), operStQual (New: subnet-overlap)
childAction :
created : 2021-07-21T20:05:15.333+00:00
delegated : no
descr : IPv4 address(10.79.239.59/32) is operationally down, reason:Subnet overlap on node 102 fabric hostname leaf102
dn : topology/pod-1/node-102/sys/ipv4/inst/dom-common:AD/if-[lo2]/addr-[10.79.239.59/32]/fault-F1425
domain : access
extMngdBy : undefined
highestSeverity : major
lastTransition : 2021-07-21T20:07:38.442+00:00
lc : raised
modTs : never
occur : 1
origSeverity : major
prevSeverity : major
rn : fault-F1425
rule : ipv4-addr-oper-st-down
severity : major
status :
subject : oper-state-err
type : operational
uid :
# fault.Inst
code : F1425
ack : yes
annotation :
cause : ip-provisioning-failed
changeSet : ipv4CfgFailedBmp (New: ipv4:Addraddr_failed_flag,ipv4:Addrctrl_failed_flag,ipv4:AddrlcOwn_failed_flag,ipv4:AddrmodTs_failed_flag,ipv4:AddrmonPolDn_failed_flag,ipv4:Addrpref_failed_flag,ipv4:Addrtag_failed_flag,ipv4:Addrtype_failed_flag,ipv4:AddrvpcPeer_failed_flag), ipv4CfgState (New: 1), operStQual (New: static-rt-nh)
childAction :
created : 2021-07-09T12:19:01.919+00:00
delegated : no
descr : IPv4 address(10.171.0.1/29) is operationally down, reason:Configured as static-rt nh on node 101 fabric hostname leaf101
dn : topology/pod-1/node-101/sys/ipv4/inst/dom-common:M8/if-[vlan251]/addr-[10.171.0.1/29]/fault-F1425
domain : access
extMngdBy : undefined
highestSeverity : major
lastTransition : 2021-07-09T12:42:55.770+00:00
lc : raised
modTs : never
occur : 1
origSeverity : major
prevSeverity : major
rn : fault-F1425
rule : ipv4-addr-oper-st-down
severity : major
status :
subject : oper-state-err
type : operational
uid :
# fault.Inst
code : F1425
ack : yes
annotation :
cause : ip-provisioning-failed
changeSet : ipv4CfgFailedBmp (New: ipv4:Addraddr_failed_flag,ipv4:Addrctrl_failed_flag,ipv4:AddrlcOwn_failed_flag,ipv4:AddrmodTs_failed_flag,ipv4:AddrmonPolDn_failed_flag,ipv4:Addrpref_failed_flag,ipv4:Addrtag_failed_flag,ipv4:Addrtype_failed_flag,ipv4:AddrvpcPeer_failed_flag), ipv4CfgState (New: 1), operStQual (New: no-primary)
childAction :
created : 2021-07-09T12:19:01.922+00:00
delegated : no
descr : IPv4 address(10.171.0.6/29) is operationally down, reason:No primary address on node 101 fabric hostname leaf101
dn : topology/pod-1/node-101/sys/ipv4/inst/dom-common:M8/if-[vlan251]/addr-[10.171.0.6/29]/fault-F1425
domain : access
extMngdBy : undefined
highestSeverity : major
lastTransition : 2021-07-09T12:42:55.770+00:00
lc : raised
modTs : never
occur : 1
origSeverity : major
prevSeverity : major
rn : fault-F1425
rule : ipv4-addr-oper-st-down
severity : major
status :
subject : oper-state-err
type : operational
uid :
# fault.Inst
code : F1425
ack : yes
annotation :
cause : ip-provisioning-failed
changeSet : ipv4CfgFailedBmp (New: ipv4:Addraddr_failed_flag,ipv4:Addrctrl_failed_flag,ipv4:AddrlcOwn_failed_flag,ipv4:AddrmodTs_failed_flag,ipv4:AddrmonPolDn_failed_flag,ipv4:Addrpref_failed_flag,ipv4:Addrtag_failed_flag,ipv4:Addrtype_failed_flag,ipv4:AddrvpcPeer_failed_flag), ipv4CfgFailedTs (New: 00:00:00:00.000), ipv4CfgState (New: 1), operSt (New: down), operStQual (New: subnet-overlap)
childAction :
created : 2021-07-21T20:01:04.759+00:00
delegated : no
descr : IPv4 address(10.79.239.58/32) is operationally down, reason:Subnet overlap on node 101 fabric hostname leaf101
dn : topology/pod-1/node-101/sys/ipv4/inst/dom-common:AD/if-[lo2]/addr-[10.79.239.58/32]/fault-F1425
domain : access
extMngdBy : undefined
highestSeverity : major
lastTransition : 2021-07-21T20:03:28.560+00:00
lc : raised
modTs : never
occur : 1
origSeverity : major
prevSeverity : major
rn : fault-F1425
rule : ipv4-addr-oper-st-down
severity : major
status :
subject : oper-state-err
type : operational
uid :
apic1#
…………
…………..
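Not every F1425 fault above is a subnet overlap: the `descr` field carries a different reason for each. A short sketch that groups the faults by reason follows; the descriptions are copied from the output above, while the regular expression and function name are assumptions based on these five samples only.

```python
import re
from collections import defaultdict

# descr fields copied from the five F1425 faults shown above.
DESCRIPTIONS = [
    "IPv4 address(10.1.0.0/31) is operationally down, reason:Interface down on node 201 fabric hostname spine201",
    "IPv4 address(10.79.239.59/32) is operationally down, reason:Subnet overlap on node 102 fabric hostname leaf102",
    "IPv4 address(10.171.0.1/29) is operationally down, reason:Configured as static-rt nh on node 101 fabric hostname leaf101",
    "IPv4 address(10.171.0.6/29) is operationally down, reason:No primary address on node 101 fabric hostname leaf101",
    "IPv4 address(10.79.239.58/32) is operationally down, reason:Subnet overlap on node 101 fabric hostname leaf101",
]

# Captures: address, reason text, node ID.
DESCR_RE = re.compile(
    r"IPv4 address\(([^)]+)\) is operationally down, reason:(.+?) on node (\d+)"
)


def group_by_reason(descriptions):
    """Map reason -> list of (node, address) in the order they were seen."""
    groups = defaultdict(list)
    for descr in descriptions:
        match = DESCR_RE.search(descr)
        if match:
            address, reason, node = match.groups()
            groups[reason].append((int(node), address))
    return dict(groups)


groups = group_by_reason(DESCRIPTIONS)
for reason, entries in groups.items():
    print(f"{reason}: {entries}")
```

In this sample only the two `lo2` addresses in `common:AD` fall under "Subnet overlap".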
Verify script aci-preupgrade-validation
apic1# bash
admin@apic1:~> cd /data/techsupport/
admin@apic1:techsupport> ls
aci-preupgrade-validation-script.py
admin@apic1:techsupport> python aci-preupgrade-validation-script.py
==== 2022-01-28T09-54-32+0000 ====
Enter username for APIC login : admin
Enter password for corresponding User :
Checking current APIC version (switch nodes are assumed to be on the same version)…4.2(4i)
Gathering APIC Versions from Firmware Repository…
[1]: aci-apic-dk9.4.2.4i.bin
What is the Target Version? : 1
You have chosen version “aci-apic-dk9.4.2.4i.bin”
[Check 1/37] APIC Target version image and MD5 hash…
Checking apic1…… DONE
Checking apic3…… DONE
Checking apic2…… DONE
PASS
[Check 2/37] Target version compatibility… PASS
[Check 3/37] Gen 1 switch compatibility… PASS
[Check 4/37] Remote Leaf Compatibility… No Remote Leaf Found N/A
[Check 5/37] APIC CIMC Compatibility… PASS
[Check 6/37] APIC Cluster is Fully-Fit… PASS
[Check 7/37] Switches are all in Active state… PASS
[Check 8/37] NTP Status… PASS
[Check 9/37] Firmware/Maintenance Groups when crossing 4.0 Release… Versions not applicable N/A
[Check 10/37] Features that need to be Disabled prior to Upgrade… FAIL – OUTAGE WARNING!!
  Feature     Name            Status  Recommended Action
  -------     ----            ------  ------------------
  App Center  ELAM Assistant  active  Disable the app
[Check 11/37] Switch Upgrade Group Guidelines… PASS
[Check 12/37] APIC Disk Space Usage (F1527, F1528, F1529 equipment-full)… PASS
[Check 13/37] Switch Node /bootflash usage… all below 50% PASS
[Check 14/37] Standby APIC Disk Space Usage… No standby APIC found N/A
[Check 15/37] APIC SSD Health (F2731 equipment-wearout)… PASS
[Check 16/37] Switch SSD Health (F3073, F3074 equipment-flash-warning)… PASS
[Check 17/37] Config On APIC Connected Port (F0467 port-configured-for-apic)… PASS
[Check 18/37] L3 Port Config (F0467 port-configured-as-l2)… PASS
[Check 19/37] L2 Port Config (F0467 port-configured-as-l3)… PASS
[Check 20/37] L3Out Subnets (F0467 prefix-entry-already-in-use)… PASS
[Check 21/37] BD Subnets (F1425 subnet-overlap)… FAIL – OUTAGE WARNING!!
  Fault  Pod  Node  VRF        Interface  Address          Recommended Action
  -----  ---  ----  ---        ---------  -------          ------------------
  F1425  1    101   common:AD  [lo2]      10.79.239.58/32  Resolve the conflict by removing BD subnets causing the overlap
  F1425  1    102   common:AD  [lo2]      10.79.239.59/32  Resolve the conflict by removing BD subnets causing the overlap
[Check 22/37] BD Subnets (F0469 duplicate-subnets-within-ctx)… PASS
[Check 23/37] VMM Domain Controller Status… No VMM Domains Found N/A
[Check 24/37] VMM Domain LLDP/CDP Adjacency Status… No LLDP/CDP Adjacency Failed Faults Found PASS
[Check 25/37] Different infra VLAN via LLDP (F0454 infra-vlan-mismatch)… PASS
[Check 26/37] HW Programming Failure (F3544 L3Out Prefixes, F3545 Contracts, actrl-resource-unavailable)… PASS
[Check 27/37] Scalability (faults related to Capacity Dashboard)… PASS
[Check 28/37] VPC-paired Leaf switches… PASS
[Check 29/37] Overlapping VLAN Pools… FAIL – OUTAGE WARNING!!
  Tenant  AP  EPG  VLAN Pool (Domain) 1       VLAN Pool (Domain) 2     Recommended Action
  ------  --  ---  --------------------       --------------------     ------------------
  common  AD  104  BRIDGE-N5K (BRIDGE-N5K)    UCS (UCS)                Resolve overlapping VLANs between these two VLAN pools
  common  AD  104  CHECK-POINT (CHECK-POINT)  BRIDGE-N5K (BRIDGE-N5K)  Resolve overlapping VLANs between these two VLAN pools
  common  AD  104  CHECK-POINT (CHECK-POINT)  MIRZEUNITY (MIRZEUNITY)  Resolve overlapping VLANs between these two VLAN pools
  common  AD  104  CHECK-POINT (CHECK-POINT)  UCS (UCS)                Resolve overlapping VLANs between these two VLAN pools
  common  AD  104  CHECK-POINT (CHECK-POINT)  VNXE-NAS (VNXE-NAS)      Resolve overlapping VLANs between these two VLAN pools
  common  AD  104  MIRZEUNITY (MIRZEUNITY)    BRIDGE-N5K (BRIDGE-N5K)  Resolve overlapping VLANs between these two VLAN pools
  common  AD  104  MIRZEUNITY (MIRZEUNITY)    UCS (UCS)                Resolve overlapping VLANs between these two VLAN pools
  common  AD  104  VNXE-NAS (VNXE-NAS)        BRIDGE-N5K (BRIDGE-N5K)  Resolve overlapping VLANs
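The pairwise rows reported by check 29 can be collapsed into the list of distinct domains involved per EPG, which is usually the more useful view when planning the pool consolidation. This is a sketch only: the rows are transcribed by hand from the output above (which is cut off after the last row shown), and the function name is hypothetical.

```python
from collections import defaultdict

# (tenant, ap, epg, domain1, domain2) rows transcribed from the check output above.
ROWS = [
    ("common", "AD", "104", "BRIDGE-N5K", "UCS"),
    ("common", "AD", "104", "CHECK-POINT", "BRIDGE-N5K"),
    ("common", "AD", "104", "CHECK-POINT", "MIRZEUNITY"),
    ("common", "AD", "104", "CHECK-POINT", "UCS"),
    ("common", "AD", "104", "CHECK-POINT", "VNXE-NAS"),
    ("common", "AD", "104", "MIRZEUNITY", "BRIDGE-N5K"),
    ("common", "AD", "104", "MIRZEUNITY", "UCS"),
    ("common", "AD", "104", "VNXE-NAS", "BRIDGE-N5K"),
]


def domains_per_epg(rows):
    """Map (tenant, ap, epg) -> sorted list of domains with overlapping pools."""
    result = defaultdict(set)
    for tenant, ap, epg, dom1, dom2 in rows:
        result[(tenant, ap, epg)].update((dom1, dom2))
    return {key: sorted(value) for key, value in result.items()}


per_epg = domains_per_epg(ROWS)
for (tenant, ap, epg), domains in per_epg.items():
    print(f"{tenant}/{ap}/{epg}: {len(domains)} domains -> {', '.join(domains)}")
```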