Skip to content

(PUP-1353) Start Posix based services with a 0 priority #7520

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 3 commits into
base: 6.x
Choose a base branch
from

Conversation

lollipopman
Copy link
Contributor

Puppet's base service provider starts services as a child process of the
Puppet process. On Posix systems the child process inherits the priority
or nice value of the parent process. The result is that services which
are started as child processes of Puppet have their priority set to
Puppet's priority. This behavior is undesirable as services expect to
be started with a 0 or normal priority. This commit ensures services are
started with a normal priority by changing the priority of the child
process when starting a service.

Example of old & new behavior on Debian Wheezy:

; git checkout master
Switched to branch 'master'
Your branch is up-to-date with 'origin/master'.

; sudo /etc/init.d/ntp stop
Stopping NTP server: ntpd.

; nice -n 13 sudo bundle exec puppet apply -e 'service { "ntp": ensure => "running" }'
Notice: Compiled catalog for foo in environment production in 0.12 seconds
Notice: /Stage[main]/Main/Service[ntp]/ensure: ensure changed 'stopped' to 'running'
Notice: Applied catalog in 0.05 seconds

; ps -C ntpd -o pid,tid,class,rtprio,ni,pri,psr,pcpu,stat,wchan:14,comm
PID TID CLS RTPRIO NI PRI PSR %CPU STAT WCHAN COMMAND
30269 30269 TS - 13 6 6 0.0 SNs - ntpd

; git checkout service_priority
Switched to branch 'service_priority'
Your branch is up-to-date with 'origin/service_priority'.

; sudo /etc/init.d/ntp stop
Stopping NTP server: ntpd.

; nice -n 13 sudo bundle exec puppet apply -e 'service { "ntp": ensure => "running" }'
Notice: Compiled catalog for foo in environment production in 0.11 seconds
Notice: /Stage[main]/Main/Service[ntp]/ensure: ensure changed 'stopped' to 'running'
Notice: Applied catalog in 0.04 seconds

; ps -C ntpd -o pid,tid,class,rtprio,ni,pri,psr,pcpu,stat,wchan:14,comm
PID TID CLS RTPRIO NI PRI PSR %CPU STAT WCHAN COMMAND
30536 30536 TS - 0 19 10 0.0 Ss - ntpd

@lollipopman lollipopman requested review from a team May 10, 2019 20:30
@puppetlabs-jenkins
Copy link
Collaborator

Can one of the admins verify this patch?

@puppetcla
Copy link

CLA signed by all contributors.

@lollipopman
Copy link
Contributor Author

updated to fix the tests, apologies for missing that with my first commit.

@lollipopman lollipopman force-pushed the service_priority branch 2 times, most recently from e4d6767 to c5c5ac1 Compare May 14, 2019 16:20
@lollipopman
Copy link
Contributor Author

Would anyone be able to help me debug the windows build failure via AppVeyor, I don't have access to a Windows test box at present. All the tests passed but rspec exited 1 and the build failed.

priority = Process::NORMAL_PRIORITY_CLASS
else
priority = 0
end
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could use Puppet::Settings::PrioritySetting::PRIORITY_MAP[:normal] to get the platform-specific priority value.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That code relies on Puppet::Util::Platform.windows? which relies on !!File::ALT_SEPARATOR how would that work for the rspec tests? Since even on windows the specs for other init providers are run, e.g. systemd & openrc.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Historically we've tried to run all of the tests on every platform, as it wasn't always clear which code executed on the agent vs server. For example, if a Windows provider requires a win32-specific gem without using puppet's feature system, them it can cause compilation to fail for all nodes, not just Windows. As a result, you'll see rspec tests for the init, gentoo, etc service providers running on Windows, and they often stub:

allow(Puppet::Util::Platform).to receive(:windows?).and_return(false)

But now we generally exclude provider tests on platforms where it doesn't make sense, like

describe 'Puppet::Type::Service::Provider::Openbsd',
unless: Puppet::Util::Platform.windows? || Puppet::Util::Platform.jruby? do

To your point about PRIORITY_MAP, yeah... I think it's fine to branch based on Puppet::Util::Platform.windows? and keep it how you have it, eg

priority = Puppet::Util::Platform.windows? ? Process::NORMAL_PRIORITY_CLASS : 0

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@joshcooper I tried the approach you suggested, but that caused spec failures for Appveyor on windows. For instance:

Failures:
  1) Puppet::Type::Service::Provider::Gentoo #start should use the supplied start command if specified
     Failure/Error: execute(command, opts)
       Service[sshd](provider=gentoo) received :execute with unexpected arguments
         expected: (["/bin/foo"], {:combine=>true, :failonfail=>true, :override_locale=>false, :priority=>0, :squelch=>false})
              got: (["/bin/foo"], {:combine=>true, :failonfail=>true, :override_locale=>false, :priority=>32, :squelch=>false})
       Diff:
       @@ -2,6 +2,6 @@
         {:combine=>true,
          :failonfail=>true,
          :override_locale=>false,
       -  :priority=>0,
       +  :priority=>32,
          :squelch=>false}]
     # ./lib/puppet/provider/service/service.rb:36:in `texecute'
     # ./lib/puppet/provider/service/init.rb:177:in `texecute'
     # ./lib/puppet/provider/service/service.rb:50:in `ucommand'
     # ./lib/puppet/provider/service/base.rb:93:in `start'
     # ./spec/unit/provider/service/gentoo_spec.rb:83:in `block (3 levels) in <top (required)>'
     # util/rspec_runner:44:in `run'
     # util/rspec_runner:59:in `<main>

This happens because priority = Puppet::Util::Platform.windows? ? Process::NORMAL_PRIORITY_CLASS : 0 returns 32 when running on windows even though we really want 0, as this is the Gentoo spec. Thoughts?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd skip *nix service providers (gentoo, src, openrc, init, upstart, systemd) when running on Windows by adding the following to their respective top-level describe:

describe ..., unless: Puppet::Util::Platform.windows? do

Looks like you also need to remove the Facter osfamily expectation from some of the tests.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I submitted #7534 to cleanup the specs

awesome thanks, I'll rebase this work ontop of that PR, once it is merged in

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@joshcooper given that #7534 has been closed, any thoughts on the direction of this PR?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To @branan's point, I think the execution API should present a platform agnostic way to specify the priority level (eg :normal), and have the different posix vs windows execution implementations map that to the appropriate platform specific value (e.g. 0 on posix or Process::NORMAL_PRIORITY_CLASS on windows).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@joshcooper I pushed the platform specific values into the execution implementations, does that look better?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@joshcooper any chance you have time for another review?

Puppet's base service provider starts services as a child process of the
Puppet process. On Posix systems the child process inherits the priority
or nice value of the parent process. The result is that services which
are started as child processes of Puppet have their priority set to
Puppet's priority. This behavior is undesirable as services expect to
be started with a 0 or normal priority. This commit ensures services are
started with a normal priority by changing the priority of the child
process when starting a service.

Example of old & new behavior on Debian Wheezy:

  ; git checkout master
  Switched to branch 'master'
  Your branch is up-to-date with 'origin/master'.

  ; sudo /etc/init.d/ntp stop
  Stopping NTP server: ntpd.

  ; nice -n 13 sudo bundle exec puppet apply -e 'service { "ntp": ensure => "running" }'
  Notice: Compiled catalog for foo in environment production in 0.12 seconds
  Notice: /Stage[main]/Main/Service[ntp]/ensure: ensure changed 'stopped' to 'running'
  Notice: Applied catalog in 0.05 seconds

  ; ps -C ntpd -o pid,tid,class,rtprio,ni,pri,psr,pcpu,stat,wchan:14,comm
    PID   TID CLS RTPRIO  NI PRI PSR %CPU STAT WCHAN          COMMAND
  30269 30269 TS       -  13   6   6  0.0 SNs  -              ntpd

  ; git checkout service_priority
  Switched to branch 'service_priority'
  Your branch is up-to-date with 'origin/service_priority'.

  ; sudo /etc/init.d/ntp stop
  Stopping NTP server: ntpd.

  ; nice -n 13 sudo bundle exec puppet apply -e 'service { "ntp": ensure => "running" }'
  Notice: Compiled catalog for foo in environment production in 0.11 seconds
  Notice: /Stage[main]/Main/Service[ntp]/ensure: ensure changed 'stopped' to 'running'
  Notice: Applied catalog in 0.04 seconds

  ; ps -C ntpd -o pid,tid,class,rtprio,ni,pri,psr,pcpu,stat,wchan:14,comm
    PID   TID CLS RTPRIO  NI PRI PSR %CPU STAT WCHAN          COMMAND
  30536 30536 TS       -   0  19  10  0.0 Ss   -              ntpd
@ciprianbadescu ciprianbadescu requested a review from a team October 14, 2019 07:29
@@ -336,6 +339,11 @@ def self.execute_posix(command, options, stdin, stdout, stderr)
# turning "/tmp/foo;\r\n /bin/echo" into ["/tmp/foo;\r\n", " /bin/echo"]
command = [command].flatten
Process.setsid
if options[:priority]
if Puppet.features.root?
setpriority(Puppet::Settings::PrioritySetting::PRIORITY_MAP[options[:priority]])
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like the priority map belongs in puppet/util/limits, not the settings

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jtappa I happy to rework parts of this commit, but I was waiting for someone from puppet to thubs the general approach, prior to spending more time on it.

@joshcooper
Copy link
Contributor

Thanks @lollipopman! I'm 👍 with the general PR, and had a few minor comments:

  • The PR would cause a behavior change by default, as services would be executed at normal priority instead of the priority of the puppet (parent) process. Since puppet has worked this way for a long time, someone may be relying on the current behavior. Insert favorite xkcd "you broke my workflow" quote here.
  • I'm thinking we should define a priority parameter on the service type and have it default to nil to mean the current behavior (run at the same priority as puppet). But allow users to specify a priority, so in lib/puppet/type/service.rb, add
    newparam(:priority) do
      newvalues(:high, :normal, etc)
    end
  • In the service provider, lookup the current priority from @resource[:priority], and only call setpriority if the value is non-nil.
  • I agree with @jtappa about moving the priority map from settings to Puppet::Util::Limits, and having settings and service type call that.

@jtappa
Copy link
Contributor

jtappa commented Nov 25, 2019

Hi @lollipopman! Just checking in on this PR to make sure we keep community PRs moving along. Did you still want to work on this based on @joshcooper's and my feedback?

Thanks!

@jtappa
Copy link
Contributor

jtappa commented Dec 2, 2019

@lollipopman we recently received a ticket where execute blows up when Puppet Server doesn't know about options or how to handle them. We need to make sure Puppet Server doesn't break if someone calls execute with a priority. This PR may need some updates related to that.

@lollipopman
Copy link
Contributor Author

@jtappa thanks, I'll see if I can find some time to work on this in the next couple of weeks

@jtappa
Copy link
Contributor

jtappa commented Dec 16, 2019

Because of the nature of our APIs, any changes we make here need to also happen in Puppet Server and right now we don't have the bandwidth between teams to coordinate the changes. We do agree this change is reasonable and something we'd like to include eventually but unfortunately right now we can't move forward with it.

@jtappa jtappa added the blocked PRs blocked on work external to the PR itself label Dec 16, 2019
@joshcooper
Copy link
Contributor

Kicking Travis

@joshcooper joshcooper closed this Jan 17, 2020
@joshcooper joshcooper reopened this Jan 17, 2020
@Dorin-Pleava Dorin-Pleava changed the base branch from master to 6.x December 14, 2020 12:17
@Dorin-Pleava Dorin-Pleava requested a review from a team December 14, 2020 12:17
@CLAassistant
Copy link

CLAassistant commented Sep 1, 2021

CLA assistant check
All committers have signed the CLA.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
blocked PRs blocked on work external to the PR itself
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants