diff --git a/README.md b/README.md
index c50cfdf..a3c83a3 100644
--- a/README.md
+++ b/README.md
@@ -1,5 +1,5 @@
- Document Number: N4707
- Date: 2017-11-22
+ Document Number: N4726
+ Date: 2018-02-12
 Revises:
 Project: Programming Language C++
 Project Number: TS 19570
@@ -7,18 +7,13 @@
 NVIDIA Corporation
 jhoberock@nvidia.com
-# Parallelism TS Editor's Report, post-Albuquerque mailing
+# Parallelism TS Editor's Report, pre-Jacksonville mailing
-N4706 is the proposed working draft of Parallelism TS Version 2. It contains changes to the Parallelism TS as directed by the committee at the Albuquerque meeting.
+N4725 is the proposed working draft of Parallelism TS Version 2. It contains editorial changes to the Parallelism TS.
-N4706 updates the previous draft, N4696, published in the pre-Toronto mailing.
-
-# Technical Changes
-
-* Apply P0776R1 - Rebase the Parallelism TS onto the C++17 Standard
-* Apply P0075R2 - Template Library for Parallel For Loops
+N4725 updates the previous draft, N4706, published in the post-Toronto mailing.
 # Acknowledgements
-Thanks to Alisdair Meredith and Pablo Halpern for reviewing these changes.
+Thanks to Pablo Halpern and Matthias Kretz for suggesting editorial changes.
diff --git a/algorithms.html b/algorithms.html
index 554032f..ab1cd37 100644
--- a/algorithms.html
+++ b/algorithms.html
@@ -1,511 +1,103 @@

Parallel algorithms

- -

In general

- - - This clause describes components that C++ programs may use to perform operations on containers - and other sequences in parallel. - - - -

Requirements on user-provided function objects

- - -

- Function objects passed into parallel algorithms as objects of type BinaryPredicate, - Compare, and BinaryOperation shall not directly or indirectly modify - objects via their arguments. -

-
-
- - -

Effect of execution policies on algorithm execution

- - -

- Parallel algorithms have template parameters named ExecutionPolicy which describe - the manner in which the execution of these algorithms may be parallelized and the manner in - which they apply the element access functions. -

-
- - -

- The invocations of element access functions in parallel algorithms invoked with an execution - policy object of type sequential_execution_policy execute in sequential order in - the calling thread. -

-
- - -

- The invocations of element access functions in parallel algorithms invoked with an execution - policy object of type parallel_execution_policy are permitted to execute in an - unordered fashion in either the invoking thread or in a thread implicitly created by the library - to support parallel algorithm execution. Any such invocations executing in the same thread are - indeterminately sequenced with respect to each other. - - - It is the caller's responsibility to ensure correctness, for example that the invocation does - not introduce data races or deadlocks. - -

-
- - -
using namespace std::experimental::parallel;
-int a[] = {0,1};
-std::vector<int> v;
-for_each(par, std::begin(a), std::end(a), [&](int i) {
-  v.push_back(i*2+1);
-});
-
- - The program above has a data race because of the unsynchronized access to the container - v. -
-
- - -
-using namespace std::experimental::parallel;
-std::atomic<int> x = 0;
-int a[] = {1,2};
-for_each(par, std::begin(a), std::end(a), [&](int n) {
-  x.fetch_add(1, std::memory_order_relaxed);
-  // spin wait for another iteration to change the value of x
-  while (x.load(std::memory_order_relaxed) == 1) { }
-});
- - The above example depends on the order of execution of the iterations, and is therefore - undefined (may deadlock). -
-
- - -
-using namespace std::experimental::parallel;
-int x=0;
-std::mutex m;
-int a[] = {1,2};
-for_each(par, std::begin(a), std::end(a), [&](int) {
-  m.lock();
-  ++x;
-  m.unlock();
-});
- - The above example synchronizes access to object x ensuring that it is - incremented correctly. -
- -

- The invocations of element access functions in parallel algorithms invoked with an - execution policy of type unsequenced_policy are permitted to execute - in an unordered fashion in the calling thread, unsequenced with respect to one another - within the calling thread. - - - This means that multiple function object invocations may be interleaved on a single thread. - -

-
- - - This overrides the usual guarantee from the C++ standard, Section 1.9 [intro.execution] that - function executions do not interleave with one another. - -

- -

- The invocations of element access functions in parallel algorithms invoked with an - executino policy of type vector_policy are permitted to execute - in an unordered fashion in the calling thread, unsequenced with respect to one another - within the calling thread, subject to the sequencing constraints of wavefront application - () for the last argument to - for_loop or for_loop_strided. -

- -

- The invocations of element access functions in parallel algorithms invoked with an execution - policy of type parallel_vector_execution_policy - are permitted to execute in an unordered fashion in unspecified threads, and unsequenced - with respect to one another within each thread. - - This means that multiple function object invocations may be interleaved on a single thread. - -

-
- - - This overrides the usual guarantee from the C++ standard, Section 1.9 [intro.execution] that - function executions do not interleave with one another. - -
-
- - Since parallel_vector_execution_policy allows the execution of element access functions to be - interleaved on a single thread, synchronization, including the use of mutexes, risks deadlock. Thus the - synchronization with parallel_vector_execution_policy is restricted as follows:
-
- - A standard library function is vectorization-unsafe if it is specified to synchronize with - another function invocation, or another function invocation is specified to synchronize with it, and if - it is not a memory allocation or deallocation function. Vectorization-unsafe standard library functions - may not be invoked by user code called from parallel_vector_execution_policy algorithms.
-
- - - Implementations must ensure that internal synchronization inside standard library routines does not - induce deadlock. - -

- -
-using namespace std::experimental::parallel;
-int x=0;
-std::mutex m;
-int a[] = {1,2};
-for_each(par_vec, std::begin(a), std::end(a), [&](int) {
-  m.lock();
-  ++x;
-  m.unlock();
-});
- - The above program is invalid because the applications of the function object are not - guaranteed to run on different threads. -
-
- - - The application of the function object may result in two consecutive calls to - m.lock on the same thread, which may deadlock. -
-
- - - The semantics of the parallel_execution_policy or the - parallel_vector_execution_policy invocation allow the implementation to fall back to - sequential execution if the system cannot parallelize an algorithm invocation due to lack of - resources. - - -

- Algorithms invoked with an execution policy object of type execution_policy - execute internally as if invoked with the contained execution policy object. -

- -

- The semantics of parallel algorithms invoked with an execution policy object of - implementation-defined type are implementation-defined. -

-
- - -

Wavefront Application

-

- For the purposes of this section, an evaluation is a value computation or side effect of - an expression, or an execution of a statement. Initialization of a temporary object is considered a - subexpression of the expression that necessitates the temporary object. -

- -

- An evaluation A contains an evaluation B if: - -

    -
  • A and B are not potentially concurrent ([intro.races]); and
  • -
  • the start of A is the start of B or the start of A is sequenced before the start of B; and
  • -
  • the completion of B is the completion of A or the completion of B is sequenced before the completion of A.
  • -
- - This includes evaluations occurring in function invocations. -

- -

- An evaluation A is ordered before an evaluation B if A is deterministically - sequenced before B. If A is indeterminately sequenced with respect to B - or A and B are unsequenced, then A is not ordered before B and B is not ordered - before A. The ordered before relationship is transitive. -

+ +

Wavefront Application

+

+ For the purposes of this section, an evaluation is a value computation or side effect of + an expression, or an execution of a statement. Initialization of a temporary object is considered a + subexpression of the expression that necessitates the temporary object. +

-

- For an evaluation A ordered before an evaluation B, both contained in the same - invocation of an element access function, A is a vertical antecedent of B if: +

+ An evaluation A contains an evaluation B if: -

    -
  • there exists an evaluation S such that: -
      -
    • S contains A, and
    • -
    • S contains all evaluations C (if any) such that A is ordered before C and C is ordered before B,
    • -
    • but S does not contain B, and
    • -
    -
  • -
  • - control reached B from A without executing any of the following: -
      -
    • a goto statement or asm declaration that jumps to a statement outside of S, or
    • -
    • a switch statement executed within S that transfers control into a substatement of a nested selection or iteration statement, or
    • -
    • a throw even if caught, or
    • -
    • a longjmp. -
    -
  • -
+
    +
  • A and B are not potentially concurrent ([intro.races]); and
  • +
  • the start of A is the start of B or the start of A is sequenced before the start of B; and
  • +
  • the completion of B is the completion of A or the completion of B is sequenced before the completion of A.
  • +
- - Vertical antecedent is an irreflexive, antisymmetric, nontransitive relationship between two evaluations. - Informally, A is a vertical antecedent of B if A is sequenced immediately before B or A is nested zero or - more levels within a statement S that immediately precedes B. - -

+ This includes evaluations occurring in function invocations. +

-

- In the following, Xi and Xj refer to evaluations of the same expression - or statement contained in the application of an element access function corresponding to the ith and - jth elements of the input sequence. There might be several evaluations Xk, - Yk, etc. of a single expression or statement in application k, for example, if the - expression or statement appears in a loop within the element access function. -

+

+ An evaluation A is ordered before an evaluation B if A is deterministically + sequenced before B. If A is indeterminately sequenced with respect to B + or A and B are unsequenced, then A is not ordered before B and B is not ordered + before A. The ordered before relationship is transitive. +

-

- Horizontally matched is an equivalence relationship between two evaluations of the same expression. An - evaluation Bi is horizontally matched with an evaluation Bj if: +

+ For an evaluation A ordered before an evaluation B, both contained in the same + invocation of an element access function, A is a vertical antecedent of B if: +

    +
  • there exists an evaluation S such that:
      -
    • both are the first evaluations in their respective applications of the element access function, or
    • -
    • there exist horizontally matched evaluations Ai and Aj that are vertical antecedents of evaluations Bi and Bj, respectively. +
    • S contains A, and
    • +
    • S contains all evaluations C (if any) such that A is ordered before C and C is ordered before B,
    • +
    • but S does not contain B, and
    - - - Horizontally matched establishes a theoretical lock-step relationship between evaluations in different applications of an element access function. - -

    - -

    - Let f be a function called for each argument list in a sequence of argument lists. - Wavefront application of f requires that evaluation Ai be sequenced - before evaluation Bi if i < j and and: - +

  • +
  • + control reached B from A without executing any of the following:
      -
    • Ai is sequenced before some evaluation Bi and Bi is horizontally matched with Bj, or
    • -
    • Ai is horizontally matched with some evaluation Aj and Aj is sequenced before Bj.
    • +
    • a goto statement or asm declaration that jumps to a statement outside of S, or
    • +
    • a switch statement executed within S that transfers control into a substatement of a nested selection or iteration statement, or
    • +
    • a throw even if caught, or
    • +
    • a longjmp.
    +
  • +
- - Wavefront application guarantees that parallel applications i and j execute such that progress on application j never gets ahead of application i. - - - - The relationships between Ai and Bi and between Aj and Bj are sequenced before, not vertical antecedent. - -

-
- - -

ExecutionPolicy algorithm overloads

- - -

- The Parallel Algorithms Library provides overloads for each of the algorithms named in - Table 1, corresponding to the algorithms with the same name in the C++ Standard Algorithms Library. - - For each algorithm in , if there are overloads for - corresponding algorithms with the same name - in the C++ Standard Algorithms Library, - the overloads shall have an additional template type parameter named - ExecutionPolicy, which shall be the first template parameter. - - In addition, each such overload shall have the new function parameter as the - first function parameter of type ExecutionPolicy&&. -

-
- - -

- Unless otherwise specified, the semantics of ExecutionPolicy algorithm overloads - are identical to their overloads without. -

-
- - -

- Parallel algorithms shall not participate in overload resolution unless - is_execution_policy<decay_t<ExecutionPolicy>>::value is true. -

-
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Table of parallel algorithms
adjacent_difference       adjacent_find             all_of                adjacent_find: see row; any_of
-
- - - Not all algorithms in the Standard Library have counterparts in . + Vertical antecedent is an irreflexive, antisymmetric, nontransitive relationship between two evaluations. + Informally, A is a vertical antecedent of B if A is sequenced immediately before B or A is nested zero or + more levels within a statement S that immediately precedes B. - -
-
+

- -

Definitions

+

+ In the following, Xi and Xj refer to evaluations of the same expression + or statement contained in the application of an element access function corresponding to the ith and + jth elements of the input sequence. There might be several evaluations Xk, + Yk, etc. of a single expression or statement in application k, for example, if the + expression or statement appears in a loop within the element access function. +

-

- Define GENERALIZED_SUM(op, a1, ..., aN) as follows: + Horizontally matched is an equivalence relationship between two evaluations of the same expression. An + evaluation Bi is horizontally matched with an evaluation Bj if:

    -
  • a1 when N is 1
  • - -
  • - op(GENERALIZED_SUM(op, b1, ..., bK), GENERALIZED_SUM(op, bM, ..., bN)) where - -
      -
    • b1, ..., bN may be any permutation of a1, ..., aN and
    • - -
    • 1 < K+1 = M ≤ N.
    • -
    -
  • +
  • both are the first evaluations in their respective applications of the element access function, or
  • +
  • there exist horizontally matched evaluations Ai and Aj that are vertical antecedents of evaluations Bi and Bj, respectively.
+ + + Horizontally matched establishes a theoretical lock-step relationship between evaluations in different applications of an element access function. +

-
-

- Define GENERALIZED_NONCOMMUTATIVE_SUM(op, a1, ..., aN) as follows: + Let f be a function called for each argument list in a sequence of argument lists. + Wavefront application of f requires that evaluation Ai be sequenced + before evaluation Bj if i < j and and:

    -
  • a1 when N is 1
  • - -
  • - op(GENERALIZED_NONCOMMUTATIVE_SUM(op, a1, ..., aK), GENERALIZED_NONCOMMUTATIVE_SUM(op, aM,
    - ..., aN) where 1 < K+1 = M ≤ N. -
  • +
  • Ai is sequenced before some evaluation Bi and Bi is horizontally matched with Bj, or
  • +
  • Ai is horizontally matched with some evaluation Aj and Aj is sequenced before Bj.
+ + + Wavefront application guarantees that parallel applications i and j execute such that progress on application j never gets ahead of application i. + + + + The relationships between Ai and Bi and between Aj and Bj are sequenced before, not vertical antecedent. +

-
@@ -515,25 +107,10 @@

Non-Numeric Parallel Algorithms

Header <experimental/algorithm> synopsis

-#include <algorithm>
-
-namespace std {
-namespace std::experimental {
-inline namespace parallelism_v2 {
-inline namespace v2 {
-  template<class ExecutionPolicy,
-           class InputIterator, class Function>
-    void for_each(ExecutionPolicy&& exec,
-                  InputIterator first, InputIterator last,
-                  Function f);
-  template<class InputIterator, class Size, class Function>
-    InputIterator for_each_n(InputIterator first, Size n,
-                             Function f);
-  template<class ExecutionPolicy,
-           class InputIterator, class Size, class Function>
-    InputIterator for_each_n(ExecutionPolicy&& exec,
-                             InputIterator first, Size n,
-                             Function f);
+#include <algorithm>
+
+namespace std::experimental {
+inline namespace parallelism_v2 {
 
 namespace execution {
   
@@ -548,11 +125,10 @@ 

Header <experimental/algorithm> synopsis

template<class T> ordered_update_t<T> ordered_update(T& ref) noexcept; } - // Exposition only: Suppress template argument deduction. template<class T> struct no_deduce { using type = T; }; -template<class T> struct no_dedude_t = typename no_deduce<T>::type; +template<class T> struct no_deduce_t = typename no_deduce<T>::type; Support for reductions template<class T, class BinaryOperation> @@ -605,18 +181,14 @@

Header <experimental/algorithm> synopsis

class I, class Size, class S, class... Rest> void for_loop_n_strided(ExecutionPolicy&& exec, I start, Size n, S stride, Rest&&... rest); -
-} } } -}
-

Reductions

+

Reductions

-

Each of the function templates in this subclause ([parallel.alg.reductions]) returns a reduction object of unspecified type having a reduction value type and encapsulating a reduction identity value for the reduction, a @@ -640,93 +212,83 @@

Reductions

to commutative operations closely related to the combiner operation. For example if the combiner is plus<T>, incrementing the accumulator would be consistent with the combiner but doubling it or assigning to it would not.

-
- template<class T, class BinaryOperation> -unspecified reduction(T& var, const T& identity, BinaryOperation combiner); + template<class T, class BinaryOperation> +unspecified reduction(T& var, const T& identity, BinaryOperation combiner); - - T shall meet the requirements of CopyConstructible and MoveAssignable. The expression var = combiner(var, var) shall be well-formed. - + T shall meet the requirements of CopyConstructible and MoveAssignable. The expression var = combiner(var, var) shall be well-formed. - - a reduction object of unspecified type having reduction value type T, reduction identity identity, combiner function object combiner, and using the object referenced by var as its live-out object. - + a reduction object of unspecified type having reduction value type T, reduction identity identity, combiner function object combiner, and using the object referenced by var as its live-out object. - template<class T> -unspecified reduction_plus(T& var); - template<class T> -unspecified reduction_multiplies(T& var); - template<class T> -unspecified reduction_bit_and(T& var); - template<class T> -unspecified reduction_bit_or(T& var); - template<class T> -unspecified reduction_bit_xor(T& var); - template<class T> -unspecified reduction_min(T& var); - template<class T> -unspecified reduction_max(T& var); - - - T shall meet the requirements of CopyConstructible and MoveAssignable. - - - - a reduction object of unspecified type having reduction value type T, reduction identity and combiner operation as specified in table and using the object referenced by var as its live-out object. 
- + template<class T> +unspecified reduction_plus(T& var); + template<class T> +unspecified reduction_multiplies(T& var); + template<class T> +unspecified reduction_bit_and(T& var); + template<class T> +unspecified reduction_bit_or(T& var); + template<class T> +unspecified reduction_bit_xor(T& var); + template<class T> +unspecified reduction_min(T& var); + template<class T> +unspecified reduction_max(T& var); + + T shall meet the requirements of CopyConstructible and MoveAssignable.> + + a reduction object of unspecified type having reduction value type T, reduction identity and combiner operation as specified in table and using the object referenced by var as its live-out object. - + - - - + + + - - - + + + - - - + + + - - - + + + - - - + + + - - - + + + - - - + + + - - - + + +
Reduction identities and combiner operations
Function                Reduction Identity    Combiner Operation
reduction_plus          T()                   x + y
reduction_multiplies    T(1)                  x * y
reduction_bit_and       (~T())                x & y
reduction_bit_or        T()                   x | y
reduction_bit_xor       T()                   x ^ y
reduction_min           var                   min(x, y)
reduction_max           var                   max(x, y)
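The role of the identity and combiner columns can be illustrated with a sequential emulation. This is only a sketch and not the TS interface: `emulate_reduction` and its `chunks` parameter are hypothetical names introduced here, and the code assumes `chunks` is greater than zero. Each chunk gets a private accumulator initialised to the identity, and the accumulators are then folded into the live-out object with the combiner.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical sequential emulation, NOT the TS API.
// Assumption: chunks > 0. Each chunk stands in for one thread of execution.
template<class T, class BinaryOperation>
void emulate_reduction(T& live_out, const T& identity, BinaryOperation combiner,
                       const std::vector<T>& input, std::size_t chunks) {
    // One private accumulator per chunk, each starting at the identity.
    std::vector<T> accumulators(chunks, identity);
    for (std::size_t i = 0; i < input.size(); ++i) {
        T& acc = accumulators[i % chunks];
        acc = combiner(acc, input[i]);
    }
    // Fold every accumulator into the live-out object.
    for (const T& acc : accumulators)
        live_out = combiner(live_out, acc);
}

// Small usage example: the live-out object keeps its initial value (10)
// and gains the sum of the input.
inline void emulate_reduction_example() {
    int sum = 10;
    const std::vector<int> input{1, 2, 3, 4};
    emulate_reduction(sum, 0, std::plus<int>{}, input, 3);
    assert(sum == 20);
}
```

Because an accumulator that receives no elements still holds the identity, combining it into the live-out object leaves the result unchanged, which is why `T()` suits `x + y` and `T(1)` suits `x * y`.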
- - The following code updates each element of y and sets s ot the sum of the squares. + The following code updates each element of y and sets s to the sum of the squares.
 extern int n;
 extern float x[], y[], a;
@@ -739,15 +301,14 @@ 

Reductions

} );
-
+
-

Inductions

+

Inductions

-

Each of the function templates in this section return an induction object of unspecified type having an induction value type and encapsulating an initial value i of that type and, optionally, a stride. @@ -763,75 +324,68 @@

Inductions

An induction object may refer to a live-out object to hold the final value of the induction sequence. When the algorithm using the induction object completes, the live-out object is assigned the value i + n * stride, where n is the number of elements in the input range.

-
- template<class T> -unspecified induction(T&& var); - - template<class T, class S> -unspecified induction(T&& var, S stride); - - - - - an induction object with induction value type remove_cv_t>remove_reference_t>T<<, - initial value var, and (if specified) stride stride. If T is an lvalue reference - to non-const type, then the object referenced by var becomes the live-out object for the - induction object; otherwise there is no live-out object. - - - + template<class T> +unspecified induction(T&& var); + + template<class T, class S> +unspecified induction(T&& var, S stride); + + + an induction object with induction value type remove_cv_t<remove_reference_t<T>>, + initial value var, and (if specified) stride stride. If T is an lvalue reference + to non-const type, then the object referenced by var becomes the live-out object for the + induction object; otherwise there is no live-out object. +
-

For loop

+

For loop

- template<class I, class... Rest> -void for_loop(no_deduce_t<I> start, I finish, Rest&&... rest); + template<class I, class... Rest> +void for_loop(no_deduce_t<I> start, I finish, Rest&&... rest); - template<class ExecutionPolicy, + template<class ExecutionPolicy, class I, class... Rest> void for_loop(ExecutionPolicy&& exec, no_deduce_t<I> start, I finish, Rest&&... rest); - + - template<class I, class S, class... Rest> + template<class I, class S, class... Rest> void for_loop_strided(no_deduce_t<I> start, I finish, - S stride, Rest&&... rest); + S stride, Rest&&... rest); - template<class ExecutionPolicy, + template<class ExecutionPolicy, class I, class S, class... Rest> void for_loop_strided(ExecutionPolicy&& exec, no_deduce_t<I> start, I finish, S stride, Rest&&... rest); - + - template<class I, class Size, class... Rest> -void for_loop_n(I start, Size n, Rest&&... rest); + template<class I, class Size, class... Rest> +void for_loop_n(I start, Size n, Rest&&... rest); - template<class ExecutionPolicy, + template<class ExecutionPolicy, class I, class Size, class... Rest> void for_loop_n(ExecutionPolicy&& exec, I start, Size n, Rest&&... rest); - + - template<class I, class Size, class S, class... Rest> -void for_loop_n_strided(I start, Size n, S stride, Rest&&... rest); + template<class I, class Size, class S, class... Rest> +void for_loop_n_strided(I start, Size n, S stride, Rest&&... rest); - template<class ExecutionPolicy, + template<class ExecutionPolicy, class I, class Size, class S, class... Rest> void for_loop_n_strided(ExecutionPolicy&& exec, - I start, Size n, S stride, Rest&&... rest); + I start, Size n, S stride, Rest&&... rest); - - For the overloads with an ExecutionPolicy, I shall be an integral type or meet the requirements of a forward iterator type; otherwise, I shall be an integral type or meet the requirements of an input iterator type. Size shall be an integral type @@ -843,222 +397,65 @@

For loop

followed by exactly one invocable element-access function, f. For the overloads with an ExecutionPolicy, f shall meet the requirements of CopyConstructible; otherwise, f shall meet the requirements of MoveConstructible. -
-
- - - Applies f to each element in the input sequence, as described below, with additional - arguments corresponding to the reductions and inductions in the rest parameter pack. The - length of the input sequence is: - -
    -
  • - n, if specified, -
  • - -
  • - otherwise finish - start if neither n nor stride is specified, -
  • - -
  • - otherwise 1 + (finish-start-1)/stride if stride is positive, -
  • - -
  • - otherwise 1 + (start-finish-1)/-stride. -
  • -
- - The first element in the input sequence is start. Each subsequent element is generated by adding - stride to the previous element, if stride is specified, otherwise by incrementing - the previous element. As described in the C++ standard, section [algorithms.general], arithmetic - on non-random-access iterators is performed using advance and distance. The order of the - elements of the input sequence is important for determining ordinal position of an application of f, - even though the applications themselves may be unordered.

- - The first argument to f is an element from the input sequence. if I is an - iterator type, the iterators in the input sequence are not dereferenced before - being passed to f. For each member of the rest parameter pack - excluding f, an additional argument is passed to each application of f as follows: - -
    -
  • - If the pack member is an object returned by a call to a reduction function listed in section - [parallel.alg.reductions], then the additional argument is a reference to an accumulator of that reduction - object. -
  • - -
  • - If the pack member is an object returned by a call to induction, then the additional argument is the - induction value for that induction object corresponding to the position of the application of f in the input - sequence. -
  • -
-
-
-
- - - - Applies f exactly once for each element of the input sequence. - - - - - - If f returns a result, the result is ignored. - - -
-
- - -

For each

- - - template<class ExecutionPolicy, - class InputIterator, class Function> -void for_each(ExecutionPolicy&& exec, - InputIterator first, InputIterator last, - Function f); - - - - - Applies f to the result of dereferencing every iterator in the range [first,last). - - - If the type of first satisfies the requirements of a mutable iterator, f may - apply nonconstant functions through the dereferenced iterator. - - - - - - - - Applies f exactly last - first times. - - - - - - If f returns a result, the result is ignored. - - - - - - - Unlike its sequential form, the parallel overload of for_each does not return a copy of - its Function parameter, since parallelization may not permit efficient state - accumulation. - - - - - - - - Unlike its sequential form, the parallel overload of for_each requires - Function to meet the requirements of CopyConstructible. - - - - + Applies f to each element in the input sequence, as described below, with additional + arguments corresponding to the reductions and inductions in the rest parameter pack. The + length of the input sequence is: - - template<class InputIterator, class Size, class Function> -InputIterator for_each_n(InputIterator first, Size n, -Function f); +
    +
  • + n, if specified, +
  • - - - - Function shall meet the requirements of MoveConstructible - - - Function need not meet the requirements of CopyConstructible. - - - - +
  • + otherwise finish - start if neither n nor stride is specified, +
  • - - - - Applies f to the result of dereferencing every iterator in the range - [first,first + n), starting from first and proceeding to first + n - 1. - - - If the type of first satisfies the requirements of a mutable iterator, - f may apply nonconstant functions through the dereferenced iterator. - - - - +
  • + otherwise 1 + (finish-start-1)/stride if stride is positive, +
  • - - - - first + n for non-negative values of n and first for negative values. - - - +
  • + otherwise 1 + (start-finish-1)/-stride. +
  • +
- - - - If f returns a result, the result is ignored. - - - -
+ The first element in the input sequence is start. Each subsequent element is generated by adding + stride to the previous element, if stride is specified, otherwise by incrementing + the previous element. As described in the C++ standard, section [algorithms.general], arithmetic + on non-random-access iterators is performed using advance and distance. The order of the + elements of the input sequence is important for determining ordinal position of an application of f, + even though the applications themselves may be unordered.

- - template<class ExecutionPolicy, - class InputIterator, class Size, class Function> -InputIterator for_each_n(ExecutionPolicy && exec, - InputIterator first, Size n, - Function f); + The first argument to f is an element from the input sequence. if I is an + iterator type, the iterators in the input sequence are not dereferenced before + being passed to f. For each member of the rest parameter pack + excluding f, an additional argument is passed to each application of f as follows: - - - - Applies f to the result of dereferencing every iterator in the range - [first,first + n), starting from first and proceeding to first + n - 1. - - - If the type of first satisfies the requirements of a mutable iterator, - f may apply nonconstant functions through the dereferenced iterator. - - +
    +
  • + If the pack member is an object returned by a call to a reduction function listed in section + [parallel.alg.reductions], then the additional argument is a reference to an accumulator of that reduction + object. +
  • + +
  • + If the pack member is an object returned by a call to induction, then the additional argument is the + induction value for that induction object corresponding to the position of the application of f in the input + sequence. +
  • +
-
- - - first + n for non-negative values of n and first for negative values. - - + + Applies f exactly once for each element of the input sequence. + - - If f returns a result, the result is ignored. + If f returns a result, the result is ignored. - - - - - - Unlike its sequential form, the parallel overload of for_each_n requires - Function to meet the requirements of CopyConstructible. - - -
@@ -1082,12 +479,6 @@

No vec

the result of f. - - - If f returns a result, the result is ignored. - - - If f exits via an exception, then terminate will be called, consistent with all other potentially-throwing operations invoked with vector_policy execution. @@ -1167,7 +558,7 @@

Ordered update class

- An object of type ordered_update_t><T<> is a proxy for an object of type T + An object of type ordered_update_t<T> is a proxy for an object of type T intended to be used within a parallel application of an element access function using a policy object of type vector_policy. Simple increments, assignments, and compound assignments to the object are forwarded to the proxied object, but are sequenced as though @@ -1195,536 +586,4 @@

Ordered update function template

- - -

Numeric Parallel Algorithms

- - -

Header <experimental/numeric> synopsis

- - -
-namespace std {
-namespace experimental {
-namespace parallel {
-inline namespace v2 {
-  template<class InputIterator>
-    typename iterator_traits<InputIterator>::value_type
-      reduce(InputIterator first, InputIterator last);
-  template<class ExecutionPolicy,
-           class InputIterator>
-    typename iterator_traits<InputIterator>::value_type
-      reduce(ExecutionPolicy&& exec,
-             InputIterator first, InputIterator last);
-  template<class InputIterator, class T>
-    T reduce(InputIterator first, InputIterator last, T init);
-  template<class ExecutionPolicy,
-           class InputIterator, class T>
-    T reduce(ExecutionPolicy&& exec,
-             InputIterator first, InputIterator last, T init);
-  template<class InputIterator, class T, class BinaryOperation>
-    T reduce(InputIterator first, InputIterator last, T init,
-             BinaryOperation binary_op);
-  template<class ExecutionPolicy, class InputIterator, class T, class BinaryOperation>
-    T reduce(ExecutionPolicy&& exec,
-             InputIterator first, InputIterator last, T init,
-             BinaryOperation binary_op);
-
-  template<class InputIterator, class OutputIterator,
-           class T>
-    OutputIterator
-      exclusive_scan(InputIterator first, InputIterator last,
-                     OutputIterator result,
-                     T init);
-  template<class ExecutionPolicy,
-           class InputIterator, class OutputIterator,
-           class T>
-    OutputIterator
-      exclusive_scan(ExecutionPolicy&& exec,
-                     InputIterator first, InputIterator last,
-                     OutputIterator result,
-                     T init);
-  template<class InputIterator, class OutputIterator,
-           class T, class BinaryOperation>
-    OutputIterator
-      exclusive_scan(InputIterator first, InputIterator last,
-                     OutputIterator result,
-                     T init, BinaryOperation binary_op);
-  template<class ExecutionPolicy,
-           class InputIterator, class OutputIterator,
-           class T, class BinaryOperation>
-    OutputIterator
-      exclusive_scan(ExecutionPolicy&& exec,
-                     InputIterator first, InputIterator last,
-                     OutputIterator result,
-                     T init, BinaryOperation binary_op);
-
-  template<class InputIterator, class OutputIterator>
-    OutputIterator
-      inclusive_scan(InputIterator first, InputIterator last,
-                     OutputIterator result);
-  template<class ExecutionPolicy,
-           class InputIterator, class OutputIterator>
-    OutputIterator
-      inclusive_scan(ExecutionPolicy&& exec,
-                     InputIterator first, InputIterator last,
-                     OutputIterator result);
-  template<class InputIterator, class OutputIterator,
-           class BinaryOperation>
-    OutputIterator
-      inclusive_scan(InputIterator first, InputIterator last,
-                     OutputIterator result,
-                     BinaryOperation binary_op);
-  template<class ExecutionPolicy,
-           class InputIterator, class OutputIterator,
-           class BinaryOperation>
-    OutputIterator
-      inclusive_scan(ExecutionPolicy&& exec,
-                     InputIterator first, InputIterator last,
-                     OutputIterator result,
-                     BinaryOperation binary_op);
-  template<class InputIterator, class OutputIterator,
-           class BinaryOperation, class T>
-    OutputIterator
-      inclusive_scan(InputIterator first, InputIterator last,
-                     OutputIterator result,
-                     BinaryOperation binary_op, T init);
-  template<class ExecutionPolicy,
-           class InputIterator, class OutputIterator,
-           class BinaryOperation, class T>
-    OutputIterator
-      inclusive_scan(ExecutionPolicy&& exec,
-                     InputIterator first, InputIterator last,
-                     OutputIterator result,
-                     BinaryOperation binary_op, T init);
-
-  template<class InputIterator, class UnaryOperation,
-           class T, class BinaryOperation>
-    T transform_reduce(InputIterator first, InputIterator last,
-                       UnaryOperation unary_op,
-                       T init, BinaryOperation binary_op);
-  template<class ExecutionPolicy,
-           class InputIterator, class UnaryOperation,
-           class T, class BinaryOperation>
-    T transform_reduce(ExecutionPolicy&& exec,
-                       InputIterator first, InputIterator last,
-                       UnaryOperation unary_op,
-                       T init, BinaryOperation binary_op);
-
-  template<class InputIterator, class OutputIterator,
-           class UnaryOperation, class T, class BinaryOperation>
-    OutputIterator
-      transform_exclusive_scan(InputIterator first, InputIterator last,
-                               OutputIterator result,
-                               UnaryOperation unary_op,
-                               T init, BinaryOperation binary_op);
-  template<class ExecutionPolicy,
-           class InputIterator, class OutputIterator,
-           class UnaryOperation, class T, class BinaryOperation>
-    OutputIterator
-      transform_exclusive_scan(ExecutionPolicy&& exec,
-                               InputIterator first, InputIterator last,
-                               OutputIterator result,
-                               UnaryOperation unary_op,
-                               T init, BinaryOperation binary_op);
-
-  template<class InputIterator, class OutputIterator,
-           class UnaryOperation, class BinaryOperation>
-    OutputIterator
-      transform_inclusive_scan(InputIterator first, InputIterator last,
-                               OutputIterator result,
-                               UnaryOperation unary_op,
-                               BinaryOperation binary_op);
-  template<class ExecutionPolicy,
-           class InputIterator, class OutputIterator,
-           class UnaryOperation, class BinaryOperation>
-    OutputIterator
-      transform_inclusive_scan(ExecutionPolicy&& exec,
-                               InputIterator first, InputIterator last,
-                               OutputIterator result,
-                               UnaryOperation unary_op,
-                               BinaryOperation binary_op);
-
-  template<class InputIterator, class OutputIterator,
-           class UnaryOperation, class BinaryOperation, class T>
-    OutputIterator
-      transform_inclusive_scan(InputIterator first, InputIterator last,
-                               OutputIterator result,
-                               UnaryOperation unary_op,
-                               BinaryOperation binary_op, T init);
-  template<class ExecutionPolicy,
-           class InputIterator, class OutputIterator,
-           class UnaryOperation, class BinaryOperation, class T>
-    OutputIterator
-      transform_inclusive_scan(ExecutionPolicy&& exec,
-                               InputIterator first, InputIterator last,
-                               OutputIterator result,
-                               UnaryOperation unary_op,
-                               BinaryOperation binary_op, T init);
-}
-}
-}
-}
-
-
-
- - -

Reduce

- - - template<class InputIterator> -typename iterator_traits<InputIterator>::value_type - reduce(InputIterator first, InputIterator last); - - - - Same as reduce(first, last, typename iterator_traits<InputIterator>::value_type{}). - - - - - - template<class InputIterator, class T> -T reduce(InputIterator first, InputIterator last, T init); - - - - Same as reduce(first, last, init, plus<>()). - - - - - - template<class InputIterator, class T, class BinaryOperation> -T reduce(InputIterator first, InputIterator last, T init, - BinaryOperation binary_op); - - - - GENERALIZED_SUM(binary_op, init, *first, ..., *(first + (last - first) - 1)). - - - - - - binary_op shall not invalidate iterators or subranges, nor modify elements in the - range [first,last). - - - - - - O(last - first) applications of binary_op. - - - - - - - The primary difference between reduce and accumulate is that the behavior - of reduce may be non-deterministic for non-associative or non-commutative binary_op. - - - - -
- - -

Exclusive scan

- - - template<class InputIterator, class OutputIterator, class T> -OutputIterator exclusive_scan(InputIterator first, InputIterator last, - OutputIterator result, - T init); - - - - Same as exclusive_scan(first, last, result, init, plus<>()). - - - - - - template<class InputIterator, class OutputIterator, class T, class BinaryOperation> -OutputIterator exclusive_scan(InputIterator first, InputIterator last, - OutputIterator result, - T init, BinaryOperation binary_op); - - - - - Assigns through each iterator i in [result,result + (last - first)) the - value of GENERALIZED_NONCOMMUTATIVE_SUM(binary_op, init, *first, ..., *(first + (i - result) - 1)). - - - - - - - The end of the resulting range beginning at result. - - - - - - - binary_op shall not invalidate iterators or subranges, nor modify elements in the - ranges [first,last) or [result,result + (last - first)). - - - - - - - O(last - first) applications of binary_op. - - - - - - - The difference between exclusive_scan and inclusive_scan is that - exclusive_scan excludes the ith input element from the ith - sum. If binary_op is not mathematically associative, the behavior of - exclusive_scan may be non-deterministic. - - - - -
- - -

Inclusive scan

- - - template<class InputIterator, class OutputIterator> -OutputIterator inclusive_scan(InputIterator first, InputIterator last, - OutputIterator result); - - - - - Same as inclusive_scan(first, last, result, plus<>()). - - - - - - - template<class InputIterator, class OutputIterator, class BinaryOperation> -OutputIterator inclusive_scan(InputIterator first, InputIterator last, - OutputIterator result, - BinaryOperation binary_op); - template<class InputIterator, class OutputIterator, class BinaryOperation, class T> -OutputIterator inclusive_scan(InputIterator first, InputIterator last, - OutputIterator result, - BinaryOperation binary_op, T init); - - - - - Assigns through each iterator i in [result,result + (last - first)) the value of - GENERALIZED_NONCOMMUTATIVE_SUM(binary_op, *first, ..., *(first + (i - result))) or - GENERALIZED_NONCOMMUTATIVE_SUM(binary_op, init, *first, ..., *(first + (i - result))) - if init is provided. - - - - - - - - The end of the resulting range beginning at result. - - - - - - - - binary_op shall not invalidate iterators or subranges, nor modify elements in the - ranges [first,last) or [result,result + (last - first)). - - - - - - - O(last - first) applications of binary_op. - - - - - - - The difference between exclusive_scan and inclusive_scan is that - inclusive_scan includes the ith input element in the ith sum. - If binary_op is not mathematically associative, the behavior of - inclusive_scan may be non-deterministic. - - - - -
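The difference between the two scans described in the removed wording can be seen with the C++17 equivalents. The sketch below assumes a C++17 standard library, whose `<numeric>` header provides `std::exclusive_scan` and `std::inclusive_scan`; the wrapper names `exclusive_sums` and `inclusive_sums` are hypothetical and exist only for illustration.

```cpp
#include <cassert>
#include <numeric>
#include <vector>

// Exclusive scan: the ith output excludes the ith input element,
// so the first output is just init.
inline std::vector<int> exclusive_sums(const std::vector<int>& in, int init) {
    std::vector<int> out(in.size());
    std::exclusive_scan(in.begin(), in.end(), out.begin(), init);
    return out;
}

// Inclusive scan: the ith output includes the ith input element.
inline std::vector<int> inclusive_sums(const std::vector<int>& in) {
    std::vector<int> out(in.size());
    std::inclusive_scan(in.begin(), in.end(), out.begin());
    return out;
}
```

For the input {1, 2, 3, 4} and init 0, the exclusive scan yields {0, 1, 3, 6} and the inclusive scan yields {1, 3, 6, 10}. Integer addition is associative, so the result here does not depend on how the sums are grouped.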
- - -

Transform reduce

- - - template<class InputIterator, class UnaryFunction, class T, class BinaryOperation> -T transform_reduce(InputIterator first, InputIterator last, - UnaryOperation unary_op, T init, BinaryOperation binary_op); - - - - - GENERALIZED_SUM(binary_op, init, unary_op(*first), ..., unary_op(*(first + (last - first) -
- 1))). -
-
-
- - - - Neither unary_op nor binary_op shall invalidate subranges, or modify elements in the range [first,last) - - - - - - O(last - first) applications each of unary_op and binary_op. - - - - - - transform_reduce does not apply unary_op to init. - - -
-
- - -

Transform exclusive scan

- - - template<class InputIterator, class OutputIterator, - class UnaryOperation, - class T, class BinaryOperation> -OutputIterator transform_exclusive_scan(InputIterator first, InputIterator last, - OutputIterator result, - UnaryOperation unary_op, - T init, BinaryOperation binary_op); - - - - - Assigns through each iterator i in [result,result + (last - first)) the value of - GENERALIZED_NONCOMMUTATIVE_SUM(binary_op, init, unary_op(*first), ..., unary_op(*(first + (i
- - result) - 1))). -
-
-
- - - - The end of the resulting range beginning at result. - - - - - - - Neither unary_op nor binary_op shall invalidate iterators or subranges, or modify elements in the - ranges [first,last) or [result,result + (last - first)). - - - - - - - O(last - first) applications each of unary_op and binary_op. - - - - - - - The difference between transform_exclusive_scan and transform_inclusive_scan is that transform_exclusive_scan - excludes the ith input element from the ith sum. If binary_op is not mathematically associative, the behavior of - transform_exclusive_scan may be non-deterministic. transform_exclusive_scan does not apply unary_op to init. - - - -
-
- - -

Transform inclusive scan

- - - template<class InputIterator, class OutputIterator, - class UnaryOperation, - class BinaryOperation> -OutputIterator transform_inclusive_scan(InputIterator first, InputIterator last, - OutputIterator result, - UnaryOperation unary_op, - BinaryOperation binary_op); - - template<class InputIterator, class OutputIterator, - class UnaryOperation, - class BinaryOperation, class T> -OutputIterator transform_inclusive_scan(InputIterator first, InputIterator last, - OutputIterator result, - UnaryOperation unary_op, - BinaryOperation binary_op, T init); - - - - - Assigns through each iterator i in [result,result + (last - first)) the value of - GENERALIZED_NONCOMMUTATIVE_SUM(binary_op, unary_op(*first), ..., unary_op(*(first + (i -
- result)))) or - GENERALIZED_NONCOMMUTATIVE_SUM(binary_op, init, unary_op(*first), ..., unary_op(*(first + (i
- - result)))) - if init is provided. -
-
-
- - - - The end of the resulting range beginning at result. - - - - - - - Neither unary_op nor binary_op shall invalidate iterators or subranges, or modify elements in the ranges [first,last) - or [result,result + (last - first)). - - - - - - - O(last - first) applications each of unary_op and binary_op. - - - - - - - The difference between transform_exclusive_scan and transform_inclusive_scan is that transform_inclusive_scan - includes the ith input element from the ith sum. If binary_op is not mathematically associative, the behavior of - transform_inclusive_scan may be non-deterministic. transform_inclusive_scan does not apply unary_op to init. - - - -
-
-
diff --git a/exceptions.html b/exceptions.html index a3e19aa..d3b217f 100644 --- a/exceptions.html +++ b/exceptions.html @@ -1,71 +1,16 @@

Parallel exceptions

- -

Exception reporting behavior

- -

- During the execution of a standard parallel algorithm, - if temporary memory resources are required and none are available, - the algorithm throws a std::bad_alloc exception. -

-

- During the execution of a standard parallel algorithm, if the invocation of an element access function - exits via an uncaught exception, the behavior of the program is determined by the type of - execution policy used to invoke the algorithm: - -

    -
  • - If the execution policy object is of type parallel_vector_execution_policy, unsequenced_policy, or vector_policy, - std::terminate shall be called. -
  • -
  • - If the execution policy object is of type sequential_execution_policy or - parallel_execution_policy, the execution of the algorithm exits via an - exception. The exception shall be an exception_list containing all uncaught exceptions thrown during - the invocations of element access functions, or optionally the uncaught exception if there was only one.
    -
    - - - For example, when for_each is executed sequentially, - if an invocation of the user-provided function object throws an exception, for_each can exit via the uncaught exception, or throw an exception_list containing the original exception. -
    -
    - - - These guarantees imply that, unless the algorithm has failed to allocate memory and - exits via std::bad_alloc, all exceptions thrown during the execution of - the algorithm are communicated to the caller. It is unspecified whether an algorithm implementation will "forge ahead" after - encountering and capturing a user exception. -
    -
    - - The algorithm may exit via the std::bad_alloc exception even if one or more - user-provided function objects have exited via an exception. For example, this can happen when an algorithm fails to allocate memory while - creating or adding elements to the exception_list object. - -
  • - -
  • - If the execution policy object is of any other type, the behavior is implementation-defined. -
  • -
-

-
-

Header <experimental/exception_list> synopsis

 
-namespace std {
-namespace std::experimental {
-inline namespace parallelism_v2 {
-inline namespace v2 {
+namespace std::experimental {
+inline namespace parallelism_v2 {
 
   class exception_list : public exception
   {
     public:
-      typedef unspecified iterator;
-      using iterator = unspecified;
+      using iterator = unspecified;
   
       size_t size() const noexcept;
       iterator begin() const noexcept;
@@ -73,20 +18,16 @@ 

Header <experimental/exception_list> synopsis

const char* what() const noexcept override; }; -} } } -}

- The class exception_list owns a sequence of exception_ptr objects. The parallel - algorithms may use the exception_list to communicate uncaught exceptions encountered during parallel execution to the - caller of the algorithm. + The class exception_list owns a sequence of exception_ptr objects.

- The type exception_list::iterator shall fulfill the requirements of + The type exception_list::iterator fulfills the requirements of ForwardIterator.
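As a rough illustration of a class that owns a sequence of exception_ptr objects and exposes them through a forward iterator, consider the sketch below. It is not the TS class: `demo_exception_list`, its constructor, and the `messages` helper are hypothetical names introduced here, and the TS leaves the iterator type unspecified.

```cpp
#include <cassert>
#include <cstddef>
#include <exception>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in modeled on the synopsis above. The class name and the
// constructor are illustration-only assumptions; the synopsis shows no public
// way to populate an exception_list.
class demo_exception_list : public std::exception {
public:
    // A vector const_iterator satisfies the ForwardIterator requirements.
    using iterator = std::vector<std::exception_ptr>::const_iterator;

    explicit demo_exception_list(std::vector<std::exception_ptr> errors)
        : errors_(std::move(errors)) {}

    std::size_t size() const noexcept { return errors_.size(); }
    iterator begin() const noexcept { return errors_.begin(); }
    iterator end() const noexcept { return errors_.end(); }
    const char* what() const noexcept override { return "demo_exception_list"; }

private:
    std::vector<std::exception_ptr> errors_;  // the owned sequence
};

// Rethrows each stored exception in turn and collects its message.
inline std::vector<std::string> messages(const demo_exception_list& list) {
    std::vector<std::string> out;
    for (const std::exception_ptr& p : list) {
        try {
            std::rethrow_exception(p);
        } catch (const std::exception& e) {
            out.push_back(e.what());
        } catch (...) {
            out.push_back("unknown exception");
        }
    }
    return out;
}
```

A caller that catches such an object can walk it with a range-based for loop, as `messages` does, and handle each stored exception individually.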

diff --git a/execution_policies.html b/execution_policies.html index 0ee67d7..e7e8a6e 100644 --- a/execution_policies.html +++ b/execution_policies.html @@ -1,78 +1,14 @@

Execution policies

- - -

In general

-

- This clause describes classes that are execution policy types. An object - of an execution policy type indicates the kinds of parallelism allowed in the execution - of an algorithm and expresses the consequent requirements on the element - access functions. -

- -
std::vector<int> v = ...
-
-// standard sequential sort
-std::sort(v.begin(), v.end());
-
-using namespace std::experimental::parallel;
-
-// explicitly sequential sort
-sort(seq, v.begin(), v.end());
-
-// permitting parallel execution
-sort(par, v.begin(), v.end());
-
-// permitting vectorization as well
-sort(par_vec, v.begin(), v.end());
-
-// sort with dynamically-selected execution
-size_t threshold = ...
-execution_policy exec = seq;
-if (v.size() > threshold)
-{
-  exec = par;
-}
-
-sort(exec, v.begin(), v.end());
-
-
-
- - Because different parallel architectures may require idiosyncratic - parameters for efficient execution, implementations of the Standard Library - may provide additional execution policies to those described in this - Technical Specification as extensions. - -
-
-

Header <experimental/execution_policy> synopsis

+

Header <experimental/execution> synopsis

-#include <execution>
-
-namespace std {
-namespace std::experimental {
-inline namespace parallelism_v2 {
-inline namespace v2 {
-  
-  template<class T> struct is_execution_policy;
-  template<class T> constexpr bool is_execution_policy_v = is_execution_policy<T>::value;
-
-  
-  class sequential_execution_policy;
-
-  
-  class parallel_execution_policy;
+#include <execution>
 
-  
-  class parallel_vector_execution_policy;
-
-  
-  class execution_policy;
-
+namespace std::experimental {
+inline namespace parallelism_v2 {
 namespace execution {
   
   class unsequenced_policy;
@@ -80,75 +16,14 @@ 

Header <experimental/execution_policy> synopsi class vector_policy; - + inline constexpr unsequenced_policy unseq{ unspecified }; - inline constexpr parallel_policy par{ unspecified }; + inline constexpr vector_policy vec{ unspecified }; } -} } } -}

- - -

Execution policy type trait

- -
-template<class T> struct is_execution_policy { see below };
-
- -

is_execution_policy can be used to detect parallel execution policies for the purpose of excluding function signatures from otherwise ambiguous overload resolution participation.

- -

is_execution_policy<T> shall be a UnaryTypeTrait with a BaseCharacteristic of true_type if T is the type of a standard or implementation-defined execution policy, otherwise false_type. - -

-
- - - This provision reserves the privilege of creating non-standard execution policies to the library implementation. - - -

The behavior of a program that adds specializations for is_execution_policy is undefined.

-
- - - - -

Sequential execution policy

- -
-class sequential_execution_policy{ unspecified };
-
- -

The class sequential_execution_policy is an execution policy type used as a unique type to disambiguate parallel algorithm overloading and require that a parallel algorithm's execution may not be parallelized.

- -
- - - -

Parallel execution policy

- -
-class parallel_execution_policy{ unspecified };
-
- -

The class parallel_execution_policy is an execution policy type used as a unique type to disambiguate parallel algorithm overloading and indicate that a parallel algorithm's execution may be parallelized.

- -
- - - -

Parallel+Vector execution policy

- -
-class parallel_vector_execution_policy{ unspecified };
-
- -

The class parallel_vector_execution_policy is an execution policy type used as a unique type to disambiguate parallel algorithm overloading and indicate that a parallel algorithm's execution may be vectorized and parallelized.

- -
-

Unsequenced execution policy

@@ -159,12 +34,12 @@

Unsequenced execution policy

The class unsequenced_policy is an execution policy type used as a unique type to disambiguate parallel algorithm overloading and indicate that a parallel algorithm's execution may be vectorized, e.g., executed on a single thread using instructions that operate on multiple data items.

-

The invocations of element access functions in parallel algorithms invoked with an execution policy of type unsequenced_policy are permitted to execute in an unordered fashion in the calling thread, unsequenced with respect to one another within the calling thread. - This means that multiple function object invocations may be interleaved on a single thread.

+

The invocations of element access functions in parallel algorithms invoked with an execution policy of type unsequenced_policy are permitted to execute in an unordered fashion in the calling thread, unsequenced with respect to one another within the calling thread. + This means that multiple function object invocations may be interleaved on a single thread.

-

This overrides the usual guarantee from the C++ Standard, [intro.execution] that function executions do not overlap with one another.

+

This overrides the usual guarantee from the C++ Standard, [intro.execution] that function executions do not overlap with one another.

-

During the execution of a parallel algorithm with the experimental::execution::unsequenced_policy policy, if the invocation of an element access function exits via an uncaught exception, terminate() shall be called.

+

During the execution of a parallel algorithm with the experimental::execution::unsequenced_policy policy, if the invocation of an element access function exits via an uncaught exception, terminate() will be called.

@@ -177,117 +52,20 @@

Vector execution policy

The class vector_policy is an execution policy type used as a unique type to disambiguate parallel algorithm overloading and indicate that a parallel algorithm's execution may be vectorized. Additionally, such vectorization will result in an execution that respects the sequencing constraints of wavefront application ([parallel.alg.general.wavefront]). The implementation thus makes stronger guarantees than for unsequenced_policy, for example.

-

The invocations of element access functions in parallel algorithms invoked with an execution policy of type vector_policy are permitted to execute in unordered fashion in the calling thread, unsequenced with respect to one another within the calling thread, subject to the sequencing constraints of wavefront application () for the last argument to for_loop or for_loop_strided.

- -

During the execution of a parallel algorithm with the experimental::execution::vector_policy policy, if the invocation of an element access function exits via an uncaught exception, terminate() shall be called.

- - - - - -

Dynamic execution policy

- -
-class execution_policy
-{
-  public:
-    
-    template<class T> execution_policy(const T& exec);
-    template<class T> execution_policy& operator=(const T& exec);
-
-    
-    const type_info& type() const noexcept;
-    template<class T> T* get() noexcept;
-    template<class T> const T* get() const noexcept;
-};
-
- -

The class execution_policy is a container for execution policy objects. - execution_policy allows dynamic control over standard algorithm execution.

- - -
std::vector<float> sort_me = ...
-        
-using namespace std::experimental::parallel;
-execution_policy exec = seq;
-
-if(sort_me.size() > threshold)
-{
-  exec = std::par;
-}
- 
-std::sort(exec, std::begin(sort_me), std::end(sort_me));
-
- -

Objects of type execution_policy shall be constructible and assignable from objects of - type T for which is_execution_policy<T>::value is true.

- -
- -

execution_policy construct/assign

- - - template<class T> execution_policy(const T& exec); - - Constructs an execution_policy object with a copy of exec's state. - - - - - This constructor shall not participate in overload resolution unless - is_execution_policy<T>::value is true. - - - - - - - - template<class T> execution_policy& operator=(const T& exec); - - Assigns a copy of exec's state to *this. - - *this. - - -
- - -

execution_policy object access

- - - const type_info& type() const noexcept; - - typeid(T), such that T is the type of the execution policy object contained by *this. - - - - template<class T> T* get() noexcept; - template<class T> const T* get() const noexcept; - - If target_type() == typeid(T), a pointer to the stored execution policy object; otherwise a null pointer. +

The invocations of element access functions in parallel algorithms invoked with an execution policy of type vector_policy are permitted to execute in unordered fashion in the calling thread, unsequenced with respect to one another within the calling thread, subject to the sequencing constraints of wavefront application ([parallel.alg.general.wavefront]) for the last argument to for_loop, for_loop_n, for_loop_strided, or for_loop_strided_n.

- - - is_execution_policy<T>::value is true. - - -
+

During the execution of a parallel algorithm with the experimental::execution::vector_policy policy, if the invocation of an element access function exits via an uncaught exception, terminate() will be called.

-

Execution policy objects

-constexpr sequential_execution_policy      seq{};
-constexpr parallel_execution_policy        par{};
-constexpr parallel_vector_execution_policy par_vec{};
-constexpr execution::unsequenced_policy unseq{};
-constexpr execution::vector_policy vec{};
+inline constexpr execution::unsequenced_policy unseq{};
+inline constexpr execution::vector_policy vec{};
 
-

The header <experimental/execution_policy> declares a global object associated with each type of execution policy defined by this Technical Specification.

+

The header <experimental/execution> declares a global object associated with each type of execution policy defined by this Technical Specification.

diff --git a/front_matter.html b/front_matter.html index aecf884..b8032ea 100644 --- a/front_matter.html +++ b/front_matter.html @@ -1,8 +1,8 @@ -N4706 +N4725 19570 - - N4698 + + N4706 Jared Hoberock
NVIDIA Corporation
diff --git a/general.html b/general.html index bda89ac..d012f25 100644 --- a/general.html +++ b/general.html @@ -7,7 +7,7 @@

Namespaces and headers

experimental and not part of the C++ Standard Library, they should not be declared directly within namespace std. Unless otherwise specified, all components described in this Technical Specification are declared in namespace - std::experimental::parallelism_v2parallel::v2.

+ std::experimental::parallelism_v2.

Once standardized, the components described by this Technical Specification are expected to be promoted to namespace std. @@ -15,7 +15,7 @@

Namespaces and headers

Unless otherwise specified, references to such entities described in this Technical Specification are assumed to be qualified with - std::experimental::parallelism_v2parallel::v2, and references to entities described in the C++ + std::experimental::parallelism_v2, and references to entities described in the C++ Standard Library are assumed to be qualified with std::.
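The effect of declaring components in an inline namespace can be sketched with a stand-in namespace. The namespace `demo` and the function `version_tag` below are hypothetical; user code should not add declarations to `std::experimental` itself.

```cpp
#include <cassert>
#include <string>

// Hypothetical namespace mirroring the layout std::experimental::parallelism_v2.
namespace demo::experimental {
inline namespace parallelism_v2 {
    // Declared in the inline namespace, so it is also reachable as
    // demo::experimental::version_tag without naming parallelism_v2.
    inline std::string version_tag() { return "parallelism_v2"; }
}
}
```

Both `demo::experimental::version_tag()` and `demo::experimental::parallelism_v2::version_tag()` name the same function, which is why references in this Technical Specification can omit the version namespace.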

Extensions that are expected to eventually be added to an existing header @@ -43,29 +43,14 @@

Feature-testing recommendations

Value Header - - N4505 - Working Draft, Technical Specification for C++ Extensions for Parallelism - - __cpp_lib_experimental_parallel_algorithm - 201505 - - - <experimental/algorithm>
- <experimental/exception_list>
- <experimental/execution_policy>
- <experimental/numeric> -
- - P0155R0 Task Block R5 __cpp_lib_experimental_parallel_task_block - 201711201510 + 201711 - <experimental/exception_list>
+ <experimental/exception_list>
<experimental/task_block>
@@ -74,19 +59,19 @@

Feature-testing recommendations

Vector and Wavefront Policies , __cpp_lib_experimental_execution_vector_policy - 201711201707 + 201711 <experimental/algorithm>
<experimental/execution>
- P0075R2 - Template Library for Parallel For Loops + P0075R2 + Template Library for Parallel For Loops , , - __cpp_lib_experimental_parallel_for_loop - 201711 - <experimental/algorithm> + __cpp_lib_experimental_parallel_for_loop + 201711 + <experimental/algorithm> diff --git a/normative_references.html b/normative_references.html index a082cbf..50ce1a8 100644 --- a/normative_references.html +++ b/normative_references.html @@ -7,19 +7,19 @@

Normative references

of the referenced document (including any amendments) applies.

    -
  • ISO/IEC 14882:2017To be published. Section references are relative to N3937., +
  • ISO/IEC 14882:2017, Programming Languages — C++
-

ISO/IEC 14882:2017 is herein called the C++ Standard. - The library described in ISO/IEC 14882:2017 clauses 20-3317-30 is herein called +

ISO/IEC 14882:2017 is herein called the C++ Standard. + The library described in ISO/IEC 14882:2017 clauses 20-33 is herein called the C++ Standard Library. The C++ Standard Library components described in - ISO/IEC 14882:2017 clauses 28, 29.8 and 23.10.1025, 26.7 and 20.7.2 are herein called the C++ Standard + ISO/IEC 14882:2017 clauses 28, 29.8 and 23.10.10 are herein called the C++ Standard Algorithms Library.

Unless otherwise specified, the whole of the C++ Standard's Library - introduction (C++14 §20) is included into this + introduction (C++14 §20) is included into this Technical Specification by reference.

diff --git a/task_block.html b/task_block.html index 4855e55..fbbb0db 100644 --- a/task_block.html +++ b/task_block.html @@ -5,23 +5,19 @@

Task Block

Header <experimental/task_block> synopsis

-namespace std {
-namespace std::experimental {
-inline namespace parallelism_v2 {
-inline namespace v2 {
+namespace std::experimental {
+inline namespace parallelism_v2 {
   class task_cancelled_exception;
 
   class task_block;
 
   template<class F>
-    void define_task_block(F&& f);
+    void define_task_block(F&& f);
 
   template<class F>
-    void define_task_block_restore_thread(F&& f);
-}
+    void define_task_block_restore_thread(F&& f);
 }
 }
-}
      
@@ -29,21 +25,17 @@

Header <experimental/task_block> synopsis

Class task_cancelled_exception

 
-namespace std {
-namespace std::experimental {
-inline namespace parallelism_v2 {
-inline namespace v2 {
+namespace std::experimental {
+inline namespace parallelism_v2 {
 
   class task_cancelled_exception : public exception
   {
     public:
       task_cancelled_exception() noexcept;
-      virtual const char* what() const noexcept override;
+      virtual const char* what() const noexcept override;
   };
-}
 }
 }
-}
      

@@ -69,10 +61,8 @@

task_cancelled_exception member function what

Class task_block

 
-namespace std {
-namespace std::experimental {
-inline namespace parallelism_v2 {
-inline namespace v2 {
+namespace std::experimental {
+inline namespace parallelism_v2 {
 
   class task_block
   {
@@ -80,19 +70,17 @@ 

Class task_block

~task_block(); public: - task_block(const task_block&) = delete; - task_block& operator=(const task_block&) = delete; - void operator&() const = delete; + task_block(const task_block&) = delete; + task_block& operator=(const task_block&) = delete; + void operator&() const = delete; template<class F> - void run(F&& f); + void run(F&& f); void wait(); }; -} } } -}
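The deleted members in the `task_block` synopsis prevent copying the object and taking its address with the built-in `&` syntax. The sketch below illustrates only that pattern on a hypothetical class, `scoped_region`, which runs its callables sequentially. It is not an implementation of `task_block` and spawns no tasks; it assumes C++17.

```cpp
#include <memory>
#include <type_traits>
#include <utility>

// Hypothetical class mirroring only the deleted operations of task_block.
class scoped_region {
public:
  scoped_region() = default;
  scoped_region(const scoped_region&) = delete;             // no copy construction
  scoped_region& operator=(const scoped_region&) = delete;  // no copy assignment
  void operator&() const = delete;                          // no &region expressions

  // Assumption for illustration: invokes f immediately on the calling thread.
  template<class F>
  void run(F&& f) {
    std::forward<F>(f)();
    ++runs;
  }

  int runs = 0;  // number of callables invoked so far
};

static_assert(!std::is_copy_constructible_v<scoped_region>);
static_assert(!std::is_copy_assignable_v<scoped_region>);
```

With `operator&` deleted, an expression such as `&region` does not compile, while `std::addressof(region)` still yields the address because it bypasses the overloaded operator.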

diff --git a/terms_and_definitions.html b/terms_and_definitions.html index 86a659d..a725262 100644 --- a/terms_and_definitions.html +++ b/terms_and_definitions.html @@ -1,7 +1,6 @@

Terms and definitions

-
  • No terms and definitions are listed in this document.
  • ISO and IEC maintain terminological databases for use in standardization at the following addresses:
  • @@ -10,58 +9,5 @@

    Terms and definitions

  • ISO Online browsing platform: available at http://www.iso.org/obp
-
- - -

For the purposes of this document, the terms and definitions given in the C++ Standard and the following apply.

- -

A parallel algorithm is a function template described by this Technical Specification declared in namespace std::experimental::parallel::v2 with a formal template parameter named ExecutionPolicy.

- -

- Parallel algorithms access objects indirectly accessible via their arguments by invoking the following functions: - -

    -
  • - All operations of the categories of the iterators that the algorithm is instantiated with. -
  • - -
  • - Functions on those sequence elements that are required by its specification. -
  • - -
  • - User-provided function objects to be applied during the execution of the algorithm, if required by the specification. -
  • - -
  • - Operations on those function objects required by the specification. - - - See clause 25.1 of C++ Standard Algorithms Library. - -
  • -
- - These functions are herein called element access functions. - - - The sort function may invoke the following element access functions: - -
    -
  • - Methods of the random-access iterator of the actual template argument, as per 24.2.7, as implied by the name of the - template parameters RandomAccessIterator. -
  • - -
  • - The swap function on the elements of the sequence (as per 25.4.1.1 [sort]/2). -
  • - -
  • - The user-provided Compare function object. -
  • -
-
-