A resilient scheduler for dataflow execution
2017 IEEE International Symposium on Defect and Fault Tolerance in …, 2017•ieeexplore.ieee.org
As processor manufacturers shifted to chips with an ever-increasing number of cores, giving average programmers a practical way to exploit parallelism became imperative. The scientific community is on a quest to create programming models that make it easier to describe tasks and the interactions between them. At the same time, as the number of cores increases, so does the chance of a fault occurring in a core, so it is also important to make these programming models resilient. DFER was shown to be a good fit for taking advantage of dataflow programming while introducing resiliency to transient faults inside dataflow task execution. However, although most of the computing time of a dataflow system is spent in task execution, it is also desirable to provide fault tolerance in scheduling operations. This paper introduces novel techniques that add a level of resiliency to the dataflow task scheduler in DFER. Experiments with two different approaches to achieving resiliency in the scheduler show promising results that take DFER one step further towards reliability.