The documentation of the Spark TensorFlow Distributor says:
"in order to use many features of this package, you must set up Spark custom resource scheduling for GPUs on your cluster. See the Spark docs for this."
Question 1: Which "many" features? When would I need custom resource scheduling, and when could I do without it?
Question 2: "See the Spark docs for this." The Spark docs are extremely tight-lipped about custom resource scheduling. For example, here: https://spark.apache.org/docs/latest/configuration.html. "spark.driver.resource.{resourceName}.amount" is supposedly an Amount of a particular resource type to use on the driver. That doesn't tell anything as to what the values may be; is it a percentage? It also wants a discovery script. What should be in it?
Can someone provide a fully working example of how to do this? Clearly, the developers of this library have gotten it to work. Thanks.
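In case it helps anyone answering, here is roughly the configuration I think the docs intend, as a sketch rather than a verified setup: the `.amount` values appear to be plain device counts rather than percentages, and the discovery script path refers to the hypothetical script above. The standalone master URL and the one-GPU-per-task split are my assumptions:

```python
from pyspark.sql import SparkSession

# Sketch of GPU resource scheduling configuration (standalone cluster assumed).
# The *.amount values are counts of devices, not percentages.
spark = (
    SparkSession.builder
    .master("spark://my-master:7077")                    # assumption: standalone cluster
    .config("spark.executor.resource.gpu.amount", "1")   # GPUs requested per executor
    .config("spark.task.resource.gpu.amount", "1")       # GPUs assigned to each task
    .config("spark.executor.resource.gpu.discoveryScript",
            "/opt/spark/scripts/get_gpus.py")            # hypothetical script from above
    .getOrCreate()
)

# The driver-side setting quoted above (spark.driver.resource.gpu.amount) seems to
# work the same way, as a count, but as a driver property it presumably has to be
# set at launch time (spark-submit --conf or spark-defaults.conf). On a standalone
# cluster the workers themselves also appear to need spark.worker.resource.gpu.*
# settings so they know what GPUs they can offer.
```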