Dynamic Resources
Dynamic vs Static Resource Allocation
Two applications asking for resources at the same time

App 2 will not have enough resources to run in this case.
To avoid this, we use dynamic allocation, where executor resources are scaled up or down depending on the workload. A minimal sketch of switching it on follows (assuming PySpark and a cluster manager that supports scaling; the app name and the choice of shuffle tracking over an external shuffle service are illustrative):
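```python
from pyspark.sql import SparkSession

# Sketch: enable dynamic allocation so the executor count follows the
# workload instead of being fixed up front.
spark = (
    SparkSession.builder
    .appName("dynamic-allocation-demo")  # hypothetical app name
    .config("spark.dynamicAllocation.enabled", "true")
    # Dynamic allocation needs a way to preserve shuffle data when
    # executors are removed: either an external shuffle service or
    # shuffle tracking (covered below).
    .config("spark.dynamicAllocation.shuffleTracking.enabled", "true")
    .getOrCreate()
)
```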
Spark Dynamic Allocation Properties
spark.dynamicAllocation.enabled
Whether to use dynamic resource allocation, which scales the number of executors registered with this application up and down based on the workload.
spark.dynamicAllocation.executorIdleTimeout
If dynamic allocation is enabled and an executor has been idle for more than this duration, the executor will be removed.
Default is 60s
Note that, under most circumstances, this condition is mutually exclusive with the request condition, in that an executor should not be idle if there are still pending tasks to be scheduled.
spark.dynamicAllocation.cachedExecutorIdleTimeout
If dynamic allocation is enabled and an executor which has cached data blocks has been idle for more than this duration, the executor will be removed.
Default is infinity
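For illustration, the two idle timeouts can be set together at session creation; the values below are arbitrary examples, not recommendations:

```python
from pyspark.sql import SparkSession

# Illustrative values: remove plain idle executors after 2 minutes,
# but keep executors holding cached blocks alive for up to 30 minutes.
spark = (
    SparkSession.builder
    .config("spark.dynamicAllocation.enabled", "true")
    .config("spark.dynamicAllocation.executorIdleTimeout", "120s")
    .config("spark.dynamicAllocation.cachedExecutorIdleTimeout", "30m")
    .getOrCreate()
)
```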
spark.dynamicAllocation.initialExecutors
Initial number of executors to run if dynamic allocation is enabled.
If --num-executors (or spark.executor.instances) is set and larger than this value, it will be used as the initial number of executors.
Default is spark.dynamicAllocation.minExecutors
spark.dynamicAllocation.maxExecutors
Upper bound for the number of executors if dynamic allocation is enabled.
Default is infinity
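A quick sketch of setting the scaling bounds together (the specific numbers are made-up examples, not recommendations):

```python
from pyspark.sql import SparkSession

# Illustrative bounds: start with 2 executors, never scale below 1
# or above 20.
spark = (
    SparkSession.builder
    .config("spark.dynamicAllocation.enabled", "true")
    .config("spark.dynamicAllocation.minExecutors", "1")
    .config("spark.dynamicAllocation.initialExecutors", "2")
    .config("spark.dynamicAllocation.maxExecutors", "20")
    .getOrCreate()
)
```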
spark.dynamicAllocation.executorAllocationRatio
By default, dynamic allocation requests enough executors to maximize parallelism according to the number of tasks to process. While this minimizes job latency, with small tasks it can waste a lot of resources due to executor allocation overhead, as some executors might not do any work at all.
This setting lets you set a ratio that reduces the number of executors with respect to full parallelism. It defaults to 1.0 for maximum parallelism; for example, 0.5 will divide the target number of executors by 2. The target number of executors computed by dynamic allocation can still be overridden by the spark.dynamicAllocation.minExecutors and spark.dynamicAllocation.maxExecutors settings.
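A back-of-the-envelope sketch of how the ratio scales the target; the helper below is an approximation for illustration, not Spark's internal code:

```python
import math

def target_executors(pending_tasks, cores_per_executor, ratio,
                     min_executors, max_executors):
    """Approximate the executor target: enough executors for full
    parallelism, scaled down by the allocation ratio, then clamped
    to the configured min/max bounds. Illustrative only."""
    full_parallelism = math.ceil(pending_tasks / cores_per_executor)
    target = math.ceil(full_parallelism * ratio)
    return max(min_executors, min(target, max_executors))

# 1000 pending tasks, 4 cores per executor:
print(target_executors(1000, 4, 1.0, 1, 500))  # 250 (full parallelism)
print(target_executors(1000, 4, 0.5, 1, 500))  # 125 (ratio halves the target)
print(target_executors(1000, 4, 0.5, 1, 100))  # 100 (capped by maxExecutors)
```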
spark.dynamicAllocation.schedulerBacklogTimeout
If dynamic allocation is enabled and there have been pending tasks backlogged for more than this duration, new executors will be requested.
spark.dynamicAllocation.sustainedSchedulerBacklogTimeout
Same as spark.dynamicAllocation.schedulerBacklogTimeout, but used only for subsequent executor requests.
Spark requests executors in rounds. The actual request is triggered when there have been pending tasks for spark.dynamicAllocation.schedulerBacklogTimeout seconds, and then triggered again every spark.dynamicAllocation.sustainedSchedulerBacklogTimeout seconds thereafter if the queue of pending tasks persists. Additionally, the number of executors requested in each round increases exponentially from the previous round. For instance, an application will add 1 executor in the first round, and then 2, 4, 8 and so on executors in the subsequent rounds.
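A small simulation of the exponential ramp-up described above (the five-round cutoff is arbitrary; in practice rounds continue while tasks stay backlogged):

```python
# Each round requests twice as many executors as the previous one:
# 1, 2, 4, 8, ... as long as the pending-task queue persists.
requested_total = 0
per_round = 1
for round_number in range(1, 6):
    requested_total += per_round
    print(f"round {round_number}: request {per_round}, total {requested_total}")
    per_round *= 2
# round 1: request 1, total 1
# round 2: request 2, total 3
# round 3: request 4, total 7
# ...
```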
spark.dynamicAllocation.shuffleTracking.enabled
Enables shuffle file tracking for executors, which allows dynamic allocation without the need for an external shuffle service. This option will try to keep alive executors that are storing shuffle data for active jobs.
spark.dynamicAllocation.shuffleTracking.timeout
When shuffle tracking is enabled, controls the timeout for executors that are holding shuffle data. The default value means that Spark will rely on the shuffles being garbage collected to be able to release executors. If for some reason garbage collection is not cleaning up shuffles quickly enough, this option can be used to control when to time out executors even when they are storing shuffle data.
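Putting the two shuffle-tracking properties together, a sketch of dynamic allocation without an external shuffle service (the 1-hour timeout is an illustrative value):

```python
from pyspark.sql import SparkSession

# Sketch: shuffle tracking keeps executors holding live shuffle data
# alive; the timeout forces them out after 1 hour even if their
# shuffle files have not been garbage collected yet.
spark = (
    SparkSession.builder
    .config("spark.dynamicAllocation.enabled", "true")
    .config("spark.dynamicAllocation.shuffleTracking.enabled", "true")
    .config("spark.dynamicAllocation.shuffleTracking.timeout", "1h")
    .getOrCreate()
)
```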

After 60 seconds of idleness (the default executorIdleTimeout), the executors are killed.
Sometimes the executors take time to be killed due to the garbage-collection process.
