CloudStack fails to start more VMs #10184
-
Hi, I'm struggling to get CloudStack 4.20.0.0 to start KVM VMs properly on Ubuntu 22. We have an isolated network over VLAN. I have 5 KVM servers connected, each able to handle 30 VMs on its own (in KVM without CloudStack). VMs use local server storage. I do not see any resource problem. To recover, I need to restart the management server, delete the virtual router, and clean up stuck resources, sometimes directly in the MySQL DB. An agent restart is also sometimes needed. Thanks,
Replies: 11 comments 61 replies
-
libvirt version - 8.0.0-1ubuntu7.10
I also see repeating CloudStack agent errors when the problem starts (at least on some of the hosts)
After a libvirt restart, the CloudStack agent managed to connect to libvirt, and the logs started looking normal.
-
After restarting libvirtd on all hosts, CloudStack managed to get rid of the VMs in Starting state. However, many VMs in Expunging state are still there and are not being removed. Refreshing the Instances page is a lot slower than before: it takes some 12-15 seconds and shows just 49 VMs in Expunging state.
-
If anybody runs CloudStack successfully on Ubuntu, please let me know your versions (Ubuntu, kernel, CloudStack, libvirt, etc.)
-
@akrasnov-drv
-
One more piece of info
Then most of the VMs were expunged.
-
A similar issue after upgrading to 4.20 (Ubuntu 22), but I couldn't run any VM (unfortunately, we had to roll back the upgrade, so I cannot provide more detailed information). agent.log: `2025-01-10 12:28:30,179 DEBUG [kvm.resource.LibvirtComputingResource] (agentRequest-Handler-1:null) (logid:fb2d5891) Exit value of process [3708958] for command [/usr/share/cloudstack-common/scripts/network/domr/router_proxy.sh update_config.py 169.254.93.185 vm_dhcp_entry.json.82a5c90f-4744-4bba-abe0-df28dfaa3c0c ] is [1].` and management.log:
-
I provided logs and additional info above in the corresponding threads.
-
It turned out that when I start VMs with static NAT, VR CPU usage is quite high. I hardly understand what it does on 2.1 GHz to get
but I'll try to increase the VR CPU
-
@akrasnov-drv
To solve the issues, we have
-
I can now confirm that the fix improves scale a lot. I still have the same issues, though not after 3 but after several dozen nodes
@akrasnov-drv
just for your information, we have faced some issues which seem to be the same as yours. `/var/www/html/latest/.htaccess` is very large, which caused applying metadata for a user VM in the VR to take more than 10 seconds (even more than 2 minutes).
To solve the issues, we have increased `workers` (Number of worker threads handling remote agent connections.) to 50 and restarted cloudstack-management; it can be increased to a larger value. `/var/www/html/latest/.htaccess` and ke…