fix nat table by getting the fitting device for an address #9552

Open
DaanHoogland wants to merge 2 commits into base: 4.19

DaanHoogland
Contributor

Description

This PR...

Fixes: #9473

Types of changes

  • Breaking change (fix or feature that would cause existing functionality to change)
  • New feature (non-breaking change which adds functionality)
  • Bug fix (non-breaking change which fixes an issue)
  • Enhancement (improves an existing feature and functionality)
  • Cleanup (Code refactoring and cleanup, that may add test cases)
  • build/CI
  • test (unit or integration test code)

Feature/Enhancement Scale or Bug Severity

Feature/Enhancement Scale

  • Major
  • Minor

Bug Severity

  • BLOCKER
  • Critical
  • Major
  • Minor
  • Trivial

Screenshots (if appropriate):

How Has This Been Tested?

How did you try to break this feature and the system with this change?

codecov bot commented Aug 20, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 15.08%. Comparing base (6e6a276) to head (20c4e4b).
Report is 5 commits behind head on 4.19.

Additional details and impacted files
@@             Coverage Diff              @@
##               4.19    #9552      +/-   ##
============================================
- Coverage     15.08%   15.08%   -0.01%     
- Complexity    11184    11185       +1     
============================================
  Files          5406     5406              
  Lines        472889   472915      +26     
  Branches      57738    57661      -77     
============================================
+ Hits          71352    71354       +2     
- Misses       393593   393617      +24     
  Partials       7944     7944              
Flag        Coverage Δ
uitests     4.30% <ø> (ø)
unittests   15.80% <ø> (-0.01%) ⬇️

@weizhouapache
Member

@DaanHoogland
I had a look at issue #8562 which has been fixed by #8599

Assume there are two public IPs in the VPC VR (and isolated network VR):

  • xx.xx.64.x (source nat, default public IP), on eth1
  • xx.xx.96.x (additional public IP range), on ethX

I think the expected behaviour should be

  • all VMs (without Static NAT) have the source IP xx.xx.64.x (this is the current behaviour)
  • the VR should be able to connect to the xx.xx.96.x network with source IP xx.xx.96.x (otherwise the gateway check may fail, see #9473, "Failed VR health check gateways_check.py on additional public IP range")
  • the VMs should be able to connect to the xx.xx.96.x network with source IP xx.xx.64.x or xx.xx.96.x (to be discussed)

currently the rules are

-A POSTROUTING -j SNAT -o eth1 --to-source xx.xx.64.x
-A POSTROUTING -j SNAT -o ethX --to-source xx.xx.64.x

seems better to change to

-A POSTROUTING -j SNAT -o eth1 --to-source xx.xx.64.x
-A POSTROUTING -j SNAT -o ethX --to-source xx.xx.96.x

or

-A POSTROUTING -j SNAT -o eth1 --to-source xx.xx.64.x
-A POSTROUTING -j SNAT -o ethX -d xx.xx.96.1 --to-source xx.xx.96.x  (96.1 is gateway)
-A POSTROUTING -j SNAT -o ethX ! -d xx.xx.96.1 --to-source xx.xx.64.x  (96.1 is gateway)

to be discussed
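To compare the three rule sets above side by side, here is a minimal Python sketch that renders them from a list of public NICs. The `PublicNic` tuple, the `snat_rules` function and the mode names are hypothetical helpers for illustration only and are not part of the VR scripts; the addresses are documentation placeholders, and the first entry is assumed to be the source NAT NIC.

```python
from typing import List, NamedTuple


class PublicNic(NamedTuple):
    """Hypothetical description of one public NIC on the VR."""
    device: str    # e.g. "eth1"
    first_ip: str  # first public IP configured on that device
    gateway: str   # gateway of the IP range on that device


# Template: out-interface, optional destination match, SNAT source address.
SNAT = "-A POSTROUTING -j SNAT -o %s %s--to-source %s"


def snat_rules(nics: List[PublicNic], mode: str) -> List[str]:
    """Render POSTROUTING SNAT rules; nics[0] is assumed to be the source NAT NIC."""
    if mode not in ("current", "per_device", "gateway_only"):
        raise ValueError("unknown mode: %s" % mode)
    source_nat_ip = nics[0].first_ip
    rules = []
    for index, nic in enumerate(nics):
        if index == 0 or mode == "current":
            # Source NAT NIC, or current behaviour: always the source NAT IP.
            rules.append(SNAT % (nic.device, "", source_nat_ip))
        elif mode == "per_device":
            # First proposal: each additional NIC uses its own IP.
            rules.append(SNAT % (nic.device, "", nic.first_ip))
        else:
            # Second proposal: own IP only towards the gateway, source NAT IP otherwise.
            rules.append(SNAT % (nic.device, "-d %s " % nic.gateway, nic.first_ip))
            rules.append(SNAT % (nic.device, "! -d %s " % nic.gateway, source_nat_ip))
    return rules


if __name__ == "__main__":
    example = [
        PublicNic("eth1", "192.0.2.10", "192.0.2.1"),
        PublicNic("eth2", "198.51.100.10", "198.51.100.1"),
    ]
    for mode in ("current", "per_device", "gateway_only"):
        print(mode)
        print("\n".join(snat_rules(example, mode)))
```

The sketch only formats strings; which of the three variants is right for VM traffic is the open question in this comment.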

@@ -554,7 +554,7 @@ def fw_vpcrouter(self):
             if self.address["source_nat"]:
                 self.fw.append(["nat", "front",
                                 "-A POSTROUTING -o %s -j SNAT --to-source %s" %
-                                (self.dev, self.address['public_ip'])])
+                                (self.address['device'], self.address['public_ip'])])
Member

this line applies only when the private gateway is source NAT.

it seems we need to change lines 698-700

https://github.com/apache/cloudstack/pull/9552/files#diff-3c470eee70094a82ad3ed790deed16a991e75ed18901cfb82d82a80cd71228a7L698-R700

Contributor Author

ok, I'll try and find if we have data on the second IF at that point.

Member

@DaanHoogland
have you checked the new iptables rules? do they look good?

Member

try changes on line 698-700
@DaanHoogland

Contributor Author

have a look at those, @weizhouapache; they work in my test.

Member

@weizhouapache Aug 26, 2024

it looks like this code snippet (line 554 to 557) can be removed.
It has been covered by line 696-697 (new code)

Member

please ignore my previous comment

Member

need to check: for a private gateway, is self.dev the same as self.address['device']?

@DaanHoogland DaanHoogland changed the base branch from main to 4.19 August 21, 2024 06:51
elif cmdline.get_source_nat_ip() and not self.is_private_gateway():
self.fw.append(
["nat", "", "-A POSTROUTING -j SNAT -o %s --to-source %s" % (self.dev, cmdline.get_source_nat_ip())])
self.fw.append(
Member

if there are multiple public IPs (in multiple ranges), will there be the same number of rules?

Contributor Author

I am not sure I understand the question. I checked this in a lab env and the resulting nat table was exactly as described in the issue, with only the last line being different. Are you considering another configuration here, @weizhouapache?

Member

for each public IP (and private gateway), there will be a rule like the one below, right?

-A POSTROUTING -j SNAT -o ethX --to-source xx.yy.zz.xx

Member

@DaanHoogland
to be clear, we need a rule for each public NIC, for example

-A POSTROUTING -j SNAT -o eth1 --to-source <source nat IP>    # this is for source nat NIC
-A POSTROUTING -j SNAT -o eth5 --to-source <first public IP on eth5>    # this is for additional public NIC

If I understand correctly, with the current changes the rules are, for example,

-A POSTROUTING -j SNAT -o eth1 --to-source <source nat IP>    # this is for source nat NIC
-A POSTROUTING -j SNAT -o eth1 --to-source <second IP on source nat NIC>    # this is for source nat NIC
-A POSTROUTING -j SNAT -o eth1 --to-source <third IP on source nat NIC>    # this is for source nat NIC

-A POSTROUTING -j SNAT -o eth5 --to-source <first public IP on eth5>    # this is for additional public NIC
-A POSTROUTING -j SNAT -o eth5 --to-source <second public IP on eth5>    # this is for additional public NIC
-A POSTROUTING -j SNAT -o eth5 --to-source <third public IP on eth5>    # this is for additional public NIC
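The difference between the two listings above can be sketched in Python: keep only the first IP seen per device and emit one rule for it. This matters because SNAT is a terminating target, so for a given out-interface only the first matching rule in the chain takes effect and the later ones are never reached. The helper names `first_ip_per_device` and `rules_per_nic` are hypothetical, and the sketch assumes the input order already puts the intended IP first for each NIC, which the VR data may or may not guarantee.

```python
from typing import Dict, Iterable, List, Tuple


def first_ip_per_device(addresses: Iterable[Tuple[str, str]]) -> Dict[str, str]:
    """Keep only the first IP seen for each device (input order is assumed meaningful)."""
    chosen: Dict[str, str] = {}
    for device, public_ip in addresses:
        # setdefault leaves an existing entry untouched, so later IPs are ignored.
        chosen.setdefault(device, public_ip)
    return chosen


def rules_per_nic(addresses: Iterable[Tuple[str, str]]) -> List[str]:
    """Render exactly one POSTROUTING SNAT rule per public NIC."""
    return [
        "-A POSTROUTING -j SNAT -o %s --to-source %s" % (device, public_ip)
        for device, public_ip in first_ip_per_device(addresses).items()
    ]


if __name__ == "__main__":
    # Placeholder addresses: three IPs on the source NAT NIC, three on an additional NIC.
    addresses = [
        ("eth1", "192.0.2.10"), ("eth1", "192.0.2.11"), ("eth1", "192.0.2.12"),
        ("eth5", "198.51.100.10"), ("eth5", "198.51.100.11"), ("eth5", "198.51.100.12"),
    ]
    print("\n".join(rules_per_nic(addresses)))
```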

Contributor Author

I'll verify that. Do you happen to know what condition to test for? I don't think the self.address object contains information on whether it is the first IP, does it?

Member

I think the original issue does not exist in our lab (I can verify with infra).

we can only verify the iptables rules in the VR

  • create 2 public ip ranges with different vlan
  • acquire 3 public IPs on each public IP range and use them for static NAT/PF/LB
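For the check in the VR, one option is to dump the nat table (for example with `iptables-save -t nat`) and group the SNAT sources by out-interface. The following is a minimal sketch of that grouping, not an existing health-check script; the function name `snat_sources_by_device` and the sample dump are hypothetical, and the parser only handles rules that carry both `-o` and `--to-source`.

```python
import shlex
from collections import defaultdict
from typing import Dict, List


def snat_sources_by_device(nat_dump: str) -> Dict[str, List[str]]:
    """Group the --to-source addresses of POSTROUTING SNAT rules by out-interface."""
    found: Dict[str, List[str]] = defaultdict(list)
    for line in nat_dump.splitlines():
        line = line.strip()
        if not line.startswith("-A POSTROUTING"):
            continue  # skip comments, chain declarations and other chains
        tokens = shlex.split(line)
        if "SNAT" not in tokens or "-o" not in tokens or "--to-source" not in tokens:
            continue
        device = tokens[tokens.index("-o") + 1]
        source = tokens[tokens.index("--to-source") + 1]
        found[device].append(source)
    return dict(found)


if __name__ == "__main__":
    # Hypothetical dump with placeholder addresses.
    sample = """\
*nat
:POSTROUTING ACCEPT [0:0]
-A POSTROUTING -o eth1 -j SNAT --to-source 192.0.2.10
-A POSTROUTING -o eth5 -j SNAT --to-source 198.51.100.10
-A POSTROUTING -o eth5 -j SNAT --to-source 198.51.100.11
COMMIT
"""
    for device, sources in snat_sources_by_device(sample).items():
        status = "ok" if len(sources) == 1 else "more than one SNAT rule"
        print(device, sources, status)
```

A device that lists more than one source here would correspond to the per-IP rule set discussed earlier in this thread.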

Member

> I don't think the self.address object contains information on whether it is the first IP, does it?

it does, but I am not sure if it is 100% correct.

image

Contributor Author

ok, I'll give it a try

@apache apache deleted a comment from blueorangutan Oct 31, 2024
@apache apache deleted a comment from blueorangutan Nov 1, 2024
@DaanHoogland
Contributor Author

@blueorangutan package

@blueorangutan
@DaanHoogland a [SL] Jenkins job has been kicked to build packages. It will be bundled with KVM, XenServer and VMware SystemVM templates. I'll keep you posted as I make progress.

@blueorangutan
Packaging result [SF]: ✔️ el8 ✔️ el9 ✔️ debian ✔️ suse15. SL-JID 11482

Development

Successfully merging this pull request may close these issues.

Failed VR health check gateways_check.py on additional public IP range
3 participants