comparison libgomp/libgomp.texi @ 145:1830386684a0

gcc-9.2.0
author anatofuz
date Thu, 13 Feb 2020 11:34:05 +0900
parents 84e7813d76e9
children
131:84e7813d76e9 145:1830386684a0
5 @settitle GNU libgomp 5 @settitle GNU libgomp
6 @c %**end of header 6 @c %**end of header
7 7
8 8
9 @copying 9 @copying
10 Copyright @copyright{} 2006-2018 Free Software Foundation, Inc. 10 Copyright @copyright{} 2006-2020 Free Software Foundation, Inc.
11 11
12 Permission is granted to copy, distribute and/or modify this document 12 Permission is granted to copy, distribute and/or modify this document
13 under the terms of the GNU Free Documentation License, Version 1.3 or 13 under the terms of the GNU Free Documentation License, Version 1.3 or
14 any later version published by the Free Software Foundation; with the 14 any later version published by the Free Software Foundation; with the
15 Invariant Sections being ``Funding Free Software'', the Front-Cover 15 Invariant Sections being ``Funding Free Software'', the Front-Cover
93 @comment aligned to the same column. Do not use tabs. This provides 93 @comment aligned to the same column. Do not use tabs. This provides
94 @comment better formatting. 94 @comment better formatting.
95 @comment 95 @comment
96 @menu 96 @menu
97 * Enabling OpenMP:: How to enable OpenMP for your applications. 97 * Enabling OpenMP:: How to enable OpenMP for your applications.
98 * Runtime Library Routines:: The OpenMP runtime application programming 98 * OpenMP Runtime Library Routines: Runtime Library Routines.
99 The OpenMP runtime application programming
99 interface. 100 interface.
100 * Environment Variables:: Influencing runtime behavior with environment 101 * OpenMP Environment Variables: Environment Variables.
101 variables. 102 Influencing OpenMP runtime behavior with
103 environment variables.
102 * Enabling OpenACC:: How to enable OpenACC for your 104 * Enabling OpenACC:: How to enable OpenACC for your
103 applications. 105 applications.
104 * OpenACC Runtime Library Routines:: The OpenACC runtime application 106 * OpenACC Runtime Library Routines:: The OpenACC runtime application
105 programming interface. 107 programming interface.
106 * OpenACC Environment Variables:: Influencing OpenACC runtime behavior with 108 * OpenACC Environment Variables:: Influencing OpenACC runtime behavior with
107 environment variables. 109 environment variables.
108 * CUDA Streams Usage:: Notes on the implementation of 110 * CUDA Streams Usage:: Notes on the implementation of
109 asynchronous operations. 111 asynchronous operations.
110 * OpenACC Library Interoperability:: OpenACC library interoperability with the 112 * OpenACC Library Interoperability:: OpenACC library interoperability with the
111 NVIDIA CUBLAS library. 113 NVIDIA CUBLAS library.
114 * OpenACC Profiling Interface::
112 * The libgomp ABI:: Notes on the external ABI presented by libgomp. 115 * The libgomp ABI:: Notes on the external ABI presented by libgomp.
113 * Reporting Bugs:: How to report bugs in the GNU Offloading and 116 * Reporting Bugs:: How to report bugs in the GNU Offloading and
114 Multi Processing Runtime Library. 117 Multi Processing Runtime Library.
115 * Copying:: GNU general public license says 118 * Copying:: GNU general public license says
116 how you can copy and share libgomp. 119 how you can copy and share libgomp.
142 the @uref{https://www.openmp.org, OpenMP Application Program Interface} manual, 145 the @uref{https://www.openmp.org, OpenMP Application Program Interface} manual,
143 version 4.5. 146 version 4.5.
144 147
145 148
146 @c --------------------------------------------------------------------- 149 @c ---------------------------------------------------------------------
147 @c Runtime Library Routines 150 @c OpenMP Runtime Library Routines
148 @c --------------------------------------------------------------------- 151 @c ---------------------------------------------------------------------
149 152
150 @node Runtime Library Routines 153 @node Runtime Library Routines
151 @chapter Runtime Library Routines 154 @chapter OpenMP Runtime Library Routines
152 155
153 The runtime routines described here are defined by Section 3 of the OpenMP 156 The runtime routines described here are defined by Section 3 of the OpenMP
154 specification in version 4.5. The routines are structured in following 157 specification in version 4.5. The routines are structured in following
155 three parts: 158 three parts:
156 159
1325 @end table 1328 @end table
1326 1329
1327 1330
1328 1331
1329 @c --------------------------------------------------------------------- 1332 @c ---------------------------------------------------------------------
1330 @c Environment Variables 1333 @c OpenMP Environment Variables
1331 @c --------------------------------------------------------------------- 1334 @c ---------------------------------------------------------------------
1332 1335
1333 @node Environment Variables 1336 @node Environment Variables
1334 @chapter Environment Variables 1337 @chapter OpenMP Environment Variables
1335 1338
1336 The environment variables which beginning with @env{OMP_} are defined by 1339 The environment variables which beginning with @env{OMP_} are defined by
1337 section 4 of the OpenMP specification in version 4.5, while those 1340 section 4 of the OpenMP specification in version 4.5, while those
1338 beginning with @env{GOMP_} are GNU extensions. 1341 beginning with @env{GOMP_} are GNU extensions.
1339 1342
1722 1725
1723 @item @emph{See also}: 1726 @item @emph{See also}:
1724 @ref{OMP_STACKSIZE} 1727 @ref{OMP_STACKSIZE}
1725 1728
1726 @item @emph{Reference}: 1729 @item @emph{Reference}:
1727 @uref{http://gcc.gnu.org/ml/gcc-patches/2006-06/msg00493.html, 1730 @uref{https://gcc.gnu.org/ml/gcc-patches/2006-06/msg00493.html,
1728 GCC Patches Mailinglist}, 1731 GCC Patches Mailinglist},
1729 @uref{http://gcc.gnu.org/ml/gcc-patches/2006-06/msg00496.html, 1732 @uref{https://gcc.gnu.org/ml/gcc-patches/2006-06/msg00496.html,
1730 GCC Patches Mailinglist} 1733 GCC Patches Mailinglist}
1731 @end table 1734 @end table
1732 1735
1733 1736
1734 1737
1806 @node Enabling OpenACC 1809 @node Enabling OpenACC
1807 @chapter Enabling OpenACC 1810 @chapter Enabling OpenACC
1808 1811
1809 To activate the OpenACC extensions for C/C++ and Fortran, the compile-time 1812 To activate the OpenACC extensions for C/C++ and Fortran, the compile-time
1810 flag @option{-fopenacc} must be specified. This enables the OpenACC directive 1813 flag @option{-fopenacc} must be specified. This enables the OpenACC directive
1811 @code{#pragma acc} in C/C++ and @code{!$accp} directives in free form, 1814 @code{#pragma acc} in C/C++ and @code{!$acc} directives in free form,
1812 @code{c$acc}, @code{*$acc} and @code{!$acc} directives in fixed form, 1815 @code{c$acc}, @code{*$acc} and @code{!$acc} directives in fixed form,
1813 @code{!$} conditional compilation sentinels in free form and @code{c$}, 1816 @code{!$} conditional compilation sentinels in free form and @code{c$},
1814 @code{*$} and @code{!$} sentinels in fixed form, for Fortran. The flag also 1817 @code{*$} and @code{!$} sentinels in fixed form, for Fortran. The flag also
1815 arranges for automatic linking of the OpenACC runtime library 1818 arranges for automatic linking of the OpenACC runtime library
1816 (@ref{OpenACC Runtime Library Routines}). 1819 (@ref{OpenACC Runtime Library Routines}).
1817 1820
1818 A complete description of all OpenACC directives accepted may be found in 1821 A complete description of all OpenACC directives accepted may be found in
1819 the @uref{https://www.openacc.org, OpenACC} Application Programming 1822 the @uref{https://www.openacc.org, OpenACC} Application Programming
1820 Interface manual, version 2.0. 1823 Interface manual, version 2.6.
1821 1824
1822 Note that this is an experimental feature and subject to 1825 Note that this is an experimental feature and subject to
1823 change in future versions of GCC. See 1826 change in future versions of GCC. See
1824 @uref{https://gcc.gnu.org/wiki/OpenACC} for more information. 1827 @uref{https://gcc.gnu.org/wiki/OpenACC} for more information.
1825 1828
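As a rough illustration of what the flag changes from the program's point of view, the sketch below relies on the `_OPENACC` macro, which the OpenACC specification says is defined when OpenACC is enabled. When the file is compiled without `-fopenacc`, the runtime call is skipped and a host-only answer is returned. The helper name `count_accelerators` is made up for this example.

```c
#include <assert.h>
#ifdef _OPENACC
#include <openacc.h>
#endif

/* Hypothetical helper: number of non-host devices the runtime reports.
   Without -fopenacc the OpenACC runtime is not linked in, so the
   answer is simply "none". */
int count_accelerators(void)
{
#ifdef _OPENACC
    int n = acc_get_num_devices(acc_device_not_host);
#else
    int n = 0;
#endif
    assert(n >= 0);
    return n;
}
```

Built with `gcc -fopenacc`, the function asks the runtime; built without the flag, the same source still compiles and reports zero accelerators.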
1831 1834
1832 @node OpenACC Runtime Library Routines 1835 @node OpenACC Runtime Library Routines
1833 @chapter OpenACC Runtime Library Routines 1836 @chapter OpenACC Runtime Library Routines
1834 1837
1835 The runtime routines described here are defined by section 3 of the OpenACC 1838 The runtime routines described here are defined by section 3 of the OpenACC
1836 specifications in version 2.0. 1839 specifications in version 2.6.
1837 They have C linkage, and do not throw exceptions. 1840 They have C linkage, and do not throw exceptions.
1838 Generally, they are available only for the host, with the exception of 1841 Generally, they are available only for the host, with the exception of
1839 @code{acc_on_device}, which is available for both the host and the 1842 @code{acc_on_device}, which is available for both the host and the
1840 acceleration device. 1843 acceleration device.
1841 1844
1844 type. 1847 type.
1845 * acc_set_device_type:: Set type of device accelerator to use. 1848 * acc_set_device_type:: Set type of device accelerator to use.
1846 * acc_get_device_type:: Get type of device accelerator to be used. 1849 * acc_get_device_type:: Get type of device accelerator to be used.
1847 * acc_set_device_num:: Set device number to use. 1850 * acc_set_device_num:: Set device number to use.
1848 * acc_get_device_num:: Get device number to be used. 1851 * acc_get_device_num:: Get device number to be used.
1852 * acc_get_property:: Get device property.
1849 * acc_async_test:: Tests for completion of a specific asynchronous 1853 * acc_async_test:: Tests for completion of a specific asynchronous
1850 operation. 1854 operation.
1851 * acc_async_test_all:: Tests for completion of all asychronous 1855 * acc_async_test_all:: Tests for completion of all asynchronous
1852 operations. 1856 operations.
1853 * acc_wait:: Wait for completion of a specific asynchronous 1857 * acc_wait:: Wait for completion of a specific asynchronous
1854 operation. 1858 operation.
1855 * acc_wait_all:: Waits for completion of all asyncrhonous 1859 * acc_wait_all:: Waits for completion of all asynchronous
1856 operations. 1860 operations.
1857 * acc_wait_all_async:: Wait for completion of all asynchronous 1861 * acc_wait_all_async:: Wait for completion of all asynchronous
1858 operations. 1862 operations.
1859 * acc_wait_async:: Wait for completion of asynchronous operations. 1863 * acc_wait_async:: Wait for completion of asynchronous operations.
1860 * acc_init:: Initialize runtime for a specific device type. 1864 * acc_init:: Initialize runtime for a specific device type.
1882 * acc_unmap_data:: Unmap device memory from host memory. 1886 * acc_unmap_data:: Unmap device memory from host memory.
1883 * acc_deviceptr:: Get device pointer associated with specific 1887 * acc_deviceptr:: Get device pointer associated with specific
1884 host address. 1888 host address.
1885 * acc_hostptr:: Get host pointer associated with specific 1889 * acc_hostptr:: Get host pointer associated with specific
1886 device address. 1890 device address.
1887 * acc_is_present:: Indiciate whether host variable / array is 1891 * acc_is_present:: Indicate whether host variable / array is
1888 present on device. 1892 present on device.
1889 * acc_memcpy_to_device:: Copy host memory to device memory. 1893 * acc_memcpy_to_device:: Copy host memory to device memory.
1890 * acc_memcpy_from_device:: Copy device memory to host memory. 1894 * acc_memcpy_from_device:: Copy device memory to host memory.
1895 * acc_attach:: Let device pointer point to device-pointer target.
1896 * acc_detach:: Let device pointer point to host-pointer target.
1891 1897
1892 API routines for target platforms. 1898 API routines for target platforms.
1893 1899
1894 * acc_get_current_cuda_device:: Get CUDA device handle. 1900 * acc_get_current_cuda_device:: Get CUDA device handle.
1895 * acc_get_current_cuda_context::Get CUDA context handle. 1901 * acc_get_current_cuda_context::Get CUDA context handle.
1896 * acc_get_cuda_stream:: Get CUDA stream handle. 1902 * acc_get_cuda_stream:: Get CUDA stream handle.
1897 * acc_set_cuda_stream:: Set CUDA stream handle. 1903 * acc_set_cuda_stream:: Set CUDA stream handle.
1904
1905 API routines for the OpenACC Profiling Interface.
1906
1907 * acc_prof_register:: Register callbacks.
1908 * acc_prof_unregister:: Unregister callbacks.
1909 * acc_prof_lookup:: Obtain inquiry functions.
1910 * acc_register_library:: Library registration.
1898 @end menu 1911 @end menu
1899 1912
1900 1913
1901 1914
1902 @node acc_get_num_devices 1915 @node acc_get_num_devices
1916 @item @emph{Interface}: @tab @code{integer function acc_get_num_devices(devicetype)} 1929 @item @emph{Interface}: @tab @code{integer function acc_get_num_devices(devicetype)}
1917 @item @tab @code{integer(kind=acc_device_kind) devicetype} 1930 @item @tab @code{integer(kind=acc_device_kind) devicetype}
1918 @end multitable 1931 @end multitable
1919 1932
1920 @item @emph{Reference}: 1933 @item @emph{Reference}:
1921 @uref{https://www.openacc.org, OpenACC specification v2.0}, section 1934 @uref{https://www.openacc.org, OpenACC specification v2.6}, section
1922 3.2.1. 1935 3.2.1.
1923 @end table 1936 @end table
1924 1937
1925 1938
1926 1939
1927 @node acc_set_device_type 1940 @node acc_set_device_type
1928 @section @code{acc_set_device_type} -- Set type of device accelerator to use. 1941 @section @code{acc_set_device_type} -- Set type of device accelerator to use.
1929 @table @asis 1942 @table @asis
1930 @item @emph{Description} 1943 @item @emph{Description}
1931 This function indicates to the runtime library which device typr, specified 1944 This function indicates to the runtime library which device type, specified
1932 in @var{devicetype}, to use when executing a parallel or kernels region. 1945 in @var{devicetype}, to use when executing a parallel or kernels region.
1933 1946
1934 @item @emph{C/C++}: 1947 @item @emph{C/C++}:
1935 @multitable @columnfractions .20 .80 1948 @multitable @columnfractions .20 .80
1936 @item @emph{Prototype}: @tab @code{acc_set_device_type(acc_device_t devicetype);} 1949 @item @emph{Prototype}: @tab @code{acc_set_device_type(acc_device_t devicetype);}
1941 @item @emph{Interface}: @tab @code{subroutine acc_set_device_type(devicetype)} 1954 @item @emph{Interface}: @tab @code{subroutine acc_set_device_type(devicetype)}
1942 @item @tab @code{integer(kind=acc_device_kind) devicetype} 1955 @item @tab @code{integer(kind=acc_device_kind) devicetype}
1943 @end multitable 1956 @end multitable
1944 1957
1945 @item @emph{Reference}: 1958 @item @emph{Reference}:
1946 @uref{https://www.openacc.org, OpenACC specification v2.0}, section 1959 @uref{https://www.openacc.org, OpenACC specification v2.6}, section
1947 3.2.2. 1960 3.2.2.
1948 @end table 1961 @end table
1949 1962
1950 1963
1951 1964
1966 @item @emph{Interface}: @tab @code{function acc_get_device_type(void)} 1979 @item @emph{Interface}: @tab @code{function acc_get_device_type(void)}
1967 @item @tab @code{integer(kind=acc_device_kind) acc_get_device_type} 1980 @item @tab @code{integer(kind=acc_device_kind) acc_get_device_type}
1968 @end multitable 1981 @end multitable
1969 1982
1970 @item @emph{Reference}: 1983 @item @emph{Reference}:
1971 @uref{https://www.openacc.org, OpenACC specification v2.0}, section 1984 @uref{https://www.openacc.org, OpenACC specification v2.6}, section
1972 3.2.3. 1985 3.2.3.
1973 @end table 1986 @end table
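A minimal sketch of how the device-selection routines above might be combined, assuming the host device type (`acc_device_host`) is always selectable. The helper name `select_host_device` and its out-parameter are hypothetical; the `_OPENACC` guard only exists so the file also builds without `-fopenacc`.

```c
#include <assert.h>
#include <stddef.h>
#ifdef _OPENACC
#include <openacc.h>
#endif

/* Hypothetical helper: request the host device type, store the device
   number the runtime reports for it, and return 1 if the runtime
   confirms that the host type is now selected, 0 otherwise. */
int select_host_device(int *devnum_out)
{
    assert(devnum_out != NULL);
#ifdef _OPENACC
    acc_set_device_type(acc_device_host);
    *devnum_out = acc_get_device_num(acc_device_host);
    return acc_get_device_type() == acc_device_host;
#else
    /* No OpenACC runtime: everything runs on the host anyway. */
    *devnum_out = 0;
    return 1;
#endif
}
```

A caller would pass the address of an `int` and check the return value before launching compute regions.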
1974 1987
1975 1988
1976 1989
1977 @node acc_set_device_num 1990 @node acc_set_device_num
1978 @section @code{acc_set_device_num} -- Set device number to use. 1991 @section @code{acc_set_device_num} -- Set device number to use.
1979 @table @asis 1992 @table @asis
1980 @item @emph{Description} 1993 @item @emph{Description}
1981 This function will indicate to the runtime which device number, 1994 This function will indicate to the runtime which device number,
1982 specified by @var{num}, associated with the specifed device 1995 specified by @var{num}, associated with the specified device
1983 type @var{devicetype}. 1996 type @var{devicetype}.
1984 1997
1985 @item @emph{C/C++}: 1998 @item @emph{C/C++}:
1986 @multitable @columnfractions .20 .80 1999 @multitable @columnfractions .20 .80
1987 @item @emph{Prototype}: @tab @code{acc_set_device_num(int num, acc_device_t devicetype);} 2000 @item @emph{Prototype}: @tab @code{acc_set_device_num(int num, acc_device_t devicetype);}
1993 @item @tab @code{integer devicenum} 2006 @item @tab @code{integer devicenum}
1994 @item @tab @code{integer(kind=acc_device_kind) devicetype} 2007 @item @tab @code{integer(kind=acc_device_kind) devicetype}
1995 @end multitable 2008 @end multitable
1996 2009
1997 @item @emph{Reference}: 2010 @item @emph{Reference}:
1998 @uref{https://www.openacc.org, OpenACC specification v2.0}, section 2011 @uref{https://www.openacc.org, OpenACC specification v2.6}, section
1999 3.2.4. 2012 3.2.4.
2000 @end table 2013 @end table
2001 2014
2002 2015
2003 2016
2020 @item @tab @code{integer(kind=acc_device_kind) devicetype} 2033 @item @tab @code{integer(kind=acc_device_kind) devicetype}
2021 @item @tab @code{integer acc_get_device_num} 2034 @item @tab @code{integer acc_get_device_num}
2022 @end multitable 2035 @end multitable
2023 2036
2024 @item @emph{Reference}: 2037 @item @emph{Reference}:
2025 @uref{https://www.openacc.org, OpenACC specification v2.0}, section 2038 @uref{https://www.openacc.org, OpenACC specification v2.6}, section
2026 3.2.5. 2039 3.2.5.
2040 @end table
2041
2042
2043
2044 @node acc_get_property
2045 @section @code{acc_get_property} -- Get device property.
2046 @cindex acc_get_property
2047 @cindex acc_get_property_string
2048 @table @asis
2049 @item @emph{Description}
2050 These routines return the value of the specified @var{property} for the
2051 device being queried according to @var{devicenum} and @var{devicetype}.
2052 Integer-valued and string-valued properties are returned by
2053 @code{acc_get_property} and @code{acc_get_property_string} respectively.
2054 The Fortran @code{acc_get_property_string} subroutine returns the string
2055 retrieved in its fourth argument while the remaining entry points are
2056 functions, which pass the return value as their result.
2057
2058 @item @emph{C/C++}:
2059 @multitable @columnfractions .20 .80
2060 @item @emph{Prototype}: @tab @code{size_t acc_get_property(int devicenum, acc_device_t devicetype, acc_device_property_t property);}
2061 @item @emph{Prototype}: @tab @code{const char *acc_get_property_string(int devicenum, acc_device_t devicetype, acc_device_property_t property);}
2062 @end multitable
2063
2064 @item @emph{Fortran}:
2065 @multitable @columnfractions .20 .80
2066 @item @emph{Interface}: @tab @code{function acc_get_property(devicenum, devicetype, property)}
2067 @item @emph{Interface}: @tab @code{subroutine acc_get_property_string(devicenum, devicetype, property, string)}
2068 @item @tab @code{integer devicenum}
2069 @item @tab @code{integer(kind=acc_device_kind) devicetype}
2070 @item @tab @code{integer(kind=acc_device_property) property}
2071 @item @tab @code{integer(kind=acc_device_property) acc_get_property}
2072 @item @tab @code{character(*) string}
2073 @end multitable
2074
2075 @item @emph{Reference}:
2076 @uref{https://www.openacc.org, OpenACC specification v2.6}, section
2077 3.2.6.
2027 @end table 2078 @end table
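The string-valued variant could be used roughly as follows. This is a sketch under assumptions: the property enumerator `acc_property_name` is taken from the OpenACC 2.6 specification, the `_OPENACC >= 201711` check is used as a stand-in for "the toolchain implements 2.6", and the helper name `current_device_name` is invented for illustration.

```c
#include <assert.h>
#include <stddef.h>
#if defined(_OPENACC) && _OPENACC >= 201711
#include <openacc.h>
#define HAVE_ACC_GET_PROPERTY 1
#endif

/* Hypothetical helper: name of the currently selected device, or the
   literal "unknown" when the property is unavailable (no OpenACC,
   an older OpenACC level, or a device that reports no name). */
const char *current_device_name(void)
{
    const char *name = NULL;
#ifdef HAVE_ACC_GET_PROPERTY
    acc_device_t type = acc_get_device_type();
    int num = acc_get_device_num(type);
    assert(num >= 0);
    name = acc_get_property_string(num, type, acc_property_name);
#endif
    return name != NULL ? name : "unknown";
}
```

Falling back to a fixed string keeps callers from having to handle a null pointer, since a device is not required to report every property.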
2028 2079
2029 2080
2030 2081
2031 @node acc_async_test 2082 @node acc_async_test
2032 @section @code{acc_async_test} -- Test for completion of a specific asynchronous operation. 2083 @section @code{acc_async_test} -- Test for completion of a specific asynchronous operation.
2033 @table @asis 2084 @table @asis
2034 @item @emph{Description} 2085 @item @emph{Description}
2035 This function tests for completion of the asynchrounous operation specified 2086 This function tests for completion of the asynchronous operation specified
2036 in @var{arg}. In C/C++, a non-zero value will be returned to indicate 2087 in @var{arg}. In C/C++, a non-zero value will be returned to indicate
2037 the specified asynchronous operation has completed. While Fortran will return 2088 the specified asynchronous operation has completed. While Fortran will return
2038 a @code{true}. If the asynchrounous operation has not completed, C/C++ returns 2089 a @code{true}. If the asynchronous operation has not completed, C/C++ returns
2039 a zero and Fortran returns a @code{false}. 2090 a zero and Fortran returns a @code{false}.
2040 2091
2041 @item @emph{C/C++}: 2092 @item @emph{C/C++}:
2042 @multitable @columnfractions .20 .80 2093 @multitable @columnfractions .20 .80
2043 @item @emph{Prototype}: @tab @code{int acc_async_test(int arg);} 2094 @item @emph{Prototype}: @tab @code{int acc_async_test(int arg);}
2049 @item @tab @code{integer(kind=acc_handle_kind) arg} 2100 @item @tab @code{integer(kind=acc_handle_kind) arg}
2050 @item @tab @code{logical acc_async_test} 2101 @item @tab @code{logical acc_async_test}
2051 @end multitable 2102 @end multitable
2052 2103
2053 @item @emph{Reference}: 2104 @item @emph{Reference}:
2054 @uref{https://www.openacc.org, OpenACC specification v2.0}, section 2105 @uref{https://www.openacc.org, OpenACC specification v2.6}, section
2055 3.2.6. 2106 3.2.9.
2056 @end table 2107 @end table
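One way the polling behaviour described above might look in C is sketched here. The queue number, the function name `double_all`, and the choice to spin in a loop are all assumptions made for the example; a real program would normally do useful host work between polls. The `_OPENACC` guards let the same source build and run sequentially without `-fopenacc`.

```c
#include <assert.h>
#include <stddef.h>
#ifdef _OPENACC
#include <openacc.h>
#endif

/* Arbitrary async queue number chosen for this sketch. */
enum { QUEUE = 1 };

/* Doubles every element of a[0..n-1] in an asynchronous compute region
   and polls until it has completed. Returns the number of polls that
   found the operation still running. */
long double_all(double *a, int n)
{
    long polls = 0;
    assert(a != NULL && n >= 0);
#ifdef _OPENACC
#pragma acc parallel loop copy(a[0:n]) async(QUEUE)
#endif
    for (int i = 0; i < n; i++)
        a[i] *= 2.0;
#ifdef _OPENACC
    /* acc_async_test returns non-zero once the queue has drained. */
    while (!acc_async_test(QUEUE))
        polls++;
#endif
    return polls;
}
```

After the function returns, the results are back in host memory, because the `copy` clause's transfer is part of the same asynchronous queue that the loop waits on.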
2057 2108
2058 2109
2059 2110
2060 @node acc_async_test_all 2111 @node acc_async_test_all
2061 @section @code{acc_async_test_all} -- Tests for completion of all asynchronous operations. 2112 @section @code{acc_async_test_all} -- Tests for completion of all asynchronous operations.
2062 @table @asis 2113 @table @asis
2063 @item @emph{Description} 2114 @item @emph{Description}
2064 This function tests for completion of all asynchrounous operations. 2115 This function tests for completion of all asynchronous operations.
2065 In C/C++, a non-zero value will be returned to indicate all asynchronous 2116 In C/C++, a non-zero value will be returned to indicate all asynchronous
2066 operations have completed. While Fortran will return a @code{true}. If 2117 operations have completed. While Fortran will return a @code{true}. If
2067 any asynchronous operation has not completed, C/C++ returns a zero and 2118 any asynchronous operation has not completed, C/C++ returns a zero and
2068 Fortran returns a @code{false}. 2119 Fortran returns a @code{false}.
2069 2120
2077 @item @emph{Interface}: @tab @code{function acc_async_test()} 2128 @item @emph{Interface}: @tab @code{function acc_async_test()}
2078 @item @tab @code{logical acc_get_device_num} 2129 @item @tab @code{logical acc_get_device_num}
2079 @end multitable 2130 @end multitable
2080 2131
2081 @item @emph{Reference}: 2132 @item @emph{Reference}:
2082 @uref{https://www.openacc.org, OpenACC specification v2.0}, section 2133 @uref{https://www.openacc.org, OpenACC specification v2.6}, section
2083 3.2.7. 2134 3.2.10.
2084 @end table 2135 @end table
2085 2136
2086 2137
2087 2138
2088 @node acc_wait 2139 @node acc_wait
2105 @item @emph{Interface (OpenACC 1.0 compatibility)}: @tab @code{subroutine acc_async_wait(arg)} 2156 @item @emph{Interface (OpenACC 1.0 compatibility)}: @tab @code{subroutine acc_async_wait(arg)}
2106 @item @tab @code{integer(acc_handle_kind) arg} 2157 @item @tab @code{integer(acc_handle_kind) arg}
2107 @end multitable 2158 @end multitable
2108 2159
2109 @item @emph{Reference}: 2160 @item @emph{Reference}:
2110 @uref{https://www.openacc.org, OpenACC specification v2.0}, section 2161 @uref{https://www.openacc.org, OpenACC specification v2.6}, section
2111 3.2.8. 2162 3.2.11.
2112 @end table 2163 @end table
2113 2164
2114 2165
2115 2166
2116 @node acc_wait_all 2167 @node acc_wait_all
2130 @item @emph{Interface}: @tab @code{subroutine acc_wait_all()} 2181 @item @emph{Interface}: @tab @code{subroutine acc_wait_all()}
2131 @item @emph{Interface (OpenACC 1.0 compatibility)}: @tab @code{subroutine acc_async_wait_all()} 2182 @item @emph{Interface (OpenACC 1.0 compatibility)}: @tab @code{subroutine acc_async_wait_all()}
2132 @end multitable 2183 @end multitable
2133 2184
2134 @item @emph{Reference}: 2185 @item @emph{Reference}:
2135 @uref{https://www.openacc.org, OpenACC specification v2.0}, section 2186 @uref{https://www.openacc.org, OpenACC specification v2.6}, section
2136 3.2.10. 2187 3.2.13.
2137 @end table 2188 @end table
2138 2189
2139 2190
2140 2191
2141 @node acc_wait_all_async 2192 @node acc_wait_all_async
2156 @item @emph{Interface}: @tab @code{subroutine acc_wait_all_async(async)} 2207 @item @emph{Interface}: @tab @code{subroutine acc_wait_all_async(async)}
2157 @item @tab @code{integer(acc_handle_kind) async} 2208 @item @tab @code{integer(acc_handle_kind) async}
2158 @end multitable 2209 @end multitable
2159 2210
2160 @item @emph{Reference}: 2211 @item @emph{Reference}:
2161 @uref{https://www.openacc.org, OpenACC specification v2.0}, section 2212 @uref{https://www.openacc.org, OpenACC specification v2.6}, section
2162 3.2.11. 2213 3.2.14.
2163 @end table 2214 @end table
2164 2215
2165 2216
2166 2217
2167 @node acc_wait_async 2218 @node acc_wait_async
2181 @item @emph{Interface}: @tab @code{subroutine acc_wait_async(arg, async)} 2232 @item @emph{Interface}: @tab @code{subroutine acc_wait_async(arg, async)}
2182 @item @tab @code{integer(acc_handle_kind) arg, async} 2233 @item @tab @code{integer(acc_handle_kind) arg, async}
2183 @end multitable 2234 @end multitable
2184 2235
2185 @item @emph{Reference}: 2236 @item @emph{Reference}:
2186 @uref{https://www.openacc.org, OpenACC specification v2.0}, section 2237 @uref{https://www.openacc.org, OpenACC specification v2.6}, section
2187 3.2.9. 2238 3.2.12.
2188 @end table 2239 @end table
2189 2240
2190 2241
2191 2242
2192 @node acc_init 2243 @node acc_init
2206 @item @emph{Interface}: @tab @code{subroutine acc_init(devicetype)} 2257 @item @emph{Interface}: @tab @code{subroutine acc_init(devicetype)}
2207 @item @tab @code{integer(acc_device_kind) devicetype} 2258 @item @tab @code{integer(acc_device_kind) devicetype}
2208 @end multitable 2259 @end multitable
2209 2260
2210 @item @emph{Reference}: 2261 @item @emph{Reference}:
2211 @uref{https://www.openacc.org, OpenACC specification v2.0}, section 2262 @uref{https://www.openacc.org, OpenACC specification v2.6}, section
2212 3.2.12. 2263 3.2.7.
2213 @end table 2264 @end table
2214 2265
2215 2266
2216 2267
2217 @node acc_shutdown 2268 @node acc_shutdown
2231 @item @emph{Interface}: @tab @code{subroutine acc_shutdown(devicetype)} 2282 @item @emph{Interface}: @tab @code{subroutine acc_shutdown(devicetype)}
2232 @item @tab @code{integer(acc_device_kind) devicetype} 2283 @item @tab @code{integer(acc_device_kind) devicetype}
2233 @end multitable 2284 @end multitable
2234 2285
2235 @item @emph{Reference}: 2286 @item @emph{Reference}:
2236 @uref{https://www.openacc.org, OpenACC specification v2.0}, section 2287 @uref{https://www.openacc.org, OpenACC specification v2.6}, section
2237 3.2.13. 2288 3.2.8.
2238 @end table 2289 @end table
2239 2290
2240 2291
2241 2292
2242 @node acc_on_device 2293 @node acc_on_device
2243 @section @code{acc_on_device} -- Whether executing on a particular device 2294 @section @code{acc_on_device} -- Whether executing on a particular device
2244 @table @asis 2295 @table @asis
2245 @item @emph{Description}: 2296 @item @emph{Description}:
2246 This function returns whether the program is executing on a particular 2297 This function returns whether the program is executing on a particular
2247 device specified in @var{devicetype}. In C/C++ a non-zero value is 2298 device specified in @var{devicetype}. In C/C++ a non-zero value is
2248 returned to indicate the device is execiting on the specified device type. 2299 returned to indicate the device is executing on the specified device type.
2249 In Fortran, @code{true} will be returned. If the program is not executing 2300 In Fortran, @code{true} will be returned. If the program is not executing
2250 on the specified device type C/C++ will return a zero, while Fortran will 2301 on the specified device type C/C++ will return a zero, while Fortran will
2251 return @code{false}. 2302 return @code{false}.
2252 2303
2253 @item @emph{C/C++}: 2304 @item @emph{C/C++}:
2262 @item @tab @code{logical acc_on_device} 2313 @item @tab @code{logical acc_on_device}
2263 @end multitable 2314 @end multitable
2264 2315
2265 2316
2266 @item @emph{Reference}: 2317 @item @emph{Reference}:
2267 @uref{https://www.openacc.org, OpenACC specification v2.0}, section 2318 @uref{https://www.openacc.org, OpenACC specification v2.6}, section
2268 3.2.14. 2319 3.2.17.
2269 @end table 2320 @end table
2270 2321
2271 2322
2272 2323
2273 @node acc_malloc 2324 @node acc_malloc
2281 @multitable @columnfractions .20 .80 2332 @multitable @columnfractions .20 .80
2282 @item @emph{Prototype}: @tab @code{d_void* acc_malloc(size_t len);} 2333 @item @emph{Prototype}: @tab @code{d_void* acc_malloc(size_t len);}
2283 @end multitable 2334 @end multitable
2284 2335
2285 @item @emph{Reference}: 2336 @item @emph{Reference}:
2286 @uref{https://www.openacc.org, OpenACC specification v2.0}, section 2337 @uref{https://www.openacc.org, OpenACC specification v2.6}, section
2287 3.2.15. 2338 3.2.18.
2288 @end table 2339 @end table
2289 2340
2290 2341
2291 2342
2292 @node acc_free 2343 @node acc_free
2299 @multitable @columnfractions .20 .80 2350 @multitable @columnfractions .20 .80
2300 @item @emph{Prototype}: @tab @code{acc_free(d_void *a);} 2351 @item @emph{Prototype}: @tab @code{acc_free(d_void *a);}
2301 @end multitable 2352 @end multitable
2302 2353
2303 @item @emph{Reference}: 2354 @item @emph{Reference}:
2304 @uref{https://www.openacc.org, OpenACC specification v2.0}, section 2355 @uref{https://www.openacc.org, OpenACC specification v2.6}, section
2305 3.2.16. 2356 3.2.19.
2306 @end table 2357 @end table
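A short sketch of pairing `acc_malloc` with `acc_free`, using `acc_memcpy_to_device` and `acc_memcpy_from_device` from the routine list above to move data through the allocation. The helper name `roundtrip` is hypothetical, and the non-OpenACC branch simply substitutes `malloc`, `memcpy` and `free` so the example still runs without `-fopenacc`.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#ifdef _OPENACC
#include <openacc.h>
#endif

/* Hypothetical helper: copy len bytes from src into freshly allocated
   device memory and back into dst. Returns 1 on success, 0 if the
   allocation failed. */
int roundtrip(const char *src, char *dst, size_t len)
{
    assert(src != NULL && dst != NULL && len > 0);
#ifdef _OPENACC
    void *dev = acc_malloc(len);
    if (dev == NULL)
        return 0;
    acc_memcpy_to_device(dev, (void *) src, len);
    acc_memcpy_from_device(dst, dev, len);
    acc_free(dev);
#else
    /* Host-only stand-in for the device allocation. */
    void *dev = malloc(len);
    if (dev == NULL)
        return 0;
    memcpy(dev, src, len);
    memcpy(dst, dev, len);
    free(dev);
#endif
    return 1;
}
```

Every successful `acc_malloc` is matched by exactly one `acc_free` on the same pointer, mirroring the usual `malloc`/`free` discipline.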
2307 2358
2308 2359
2309 2360
2310 @node acc_copyin 2361 @node acc_copyin
2320 variable or array element and @var{len} specifies the length in bytes. 2371 variable or array element and @var{len} specifies the length in bytes.
2321 2372
2322 @item @emph{C/C++}: 2373 @item @emph{C/C++}:
2323 @multitable @columnfractions .20 .80 2374 @multitable @columnfractions .20 .80
2324 @item @emph{Prototype}: @tab @code{void *acc_copyin(h_void *a, size_t len);} 2375 @item @emph{Prototype}: @tab @code{void *acc_copyin(h_void *a, size_t len);}
2376 @item @emph{Prototype}: @tab @code{void *acc_copyin_async(h_void *a, size_t len, int async);}
2325 @end multitable 2377 @end multitable
2326 2378
2327 @item @emph{Fortran}: 2379 @item @emph{Fortran}:
2328 @multitable @columnfractions .20 .80 2380 @multitable @columnfractions .20 .80
2329 @item @emph{Interface}: @tab @code{subroutine acc_copyin(a)} 2381 @item @emph{Interface}: @tab @code{subroutine acc_copyin(a)}
2330 @item @tab @code{type, dimension(:[,:]...) :: a} 2382 @item @tab @code{type, dimension(:[,:]...) :: a}
2331 @item @emph{Interface}: @tab @code{subroutine acc_copyin(a, len)} 2383 @item @emph{Interface}: @tab @code{subroutine acc_copyin(a, len)}
2332 @item @tab @code{type, dimension(:[,:]...) :: a} 2384 @item @tab @code{type, dimension(:[,:]...) :: a}
2333 @item @tab @code{integer len} 2385 @item @tab @code{integer len}
2334 @end multitable 2386 @item @emph{Interface}: @tab @code{subroutine acc_copyin_async(a, async)}
2335 2387 @item @tab @code{type, dimension(:[,:]...) :: a}
2336 @item @emph{Reference}: 2388 @item @tab @code{integer(acc_handle_kind) :: async}
2337 @uref{https://www.openacc.org, OpenACC specification v2.0}, section 2389 @item @emph{Interface}: @tab @code{subroutine acc_copyin_async(a, len, async)}
2338 3.2.17. 2390 @item @tab @code{type, dimension(:[,:]...) :: a}
2391 @item @tab @code{integer len}
2392 @item @tab @code{integer(acc_handle_kind) :: async}
2393 @end multitable
2394
2395 @item @emph{Reference}:
2396 @uref{https://www.openacc.org, OpenACC specification v2.6}, section
2397 3.2.20.
2339 @end table 2398 @end table
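
As a rough illustration of the C prototypes above, the following sketch
pairs @code{acc_copyin} with @code{acc_copyout} around a compute region.
It is not taken from libgomp. The helper name @code{scale_on_device} is
hypothetical, and the @code{#else} branch holds host-only stand-ins
(an assumption of this sketch) so that the file still builds when
@option{-fopenacc} is not given.

```c
#include <assert.h>
#include <stddef.h>

#ifdef _OPENACC
#include <openacc.h>
#else
/* Assumption: host-only stand-ins so the sketch builds without -fopenacc.
   They are not part of libgomp; host memory doubles as "device" memory.  */
static void *acc_copyin(void *a, size_t len) { (void) len; return a; }
static void acc_copyout(void *a, size_t len) { (void) a; (void) len; }
#endif

/* Hypothetical helper: double each of the N floats in V on the device.  */
static void scale_on_device(float *v, size_t n)
{
  assert(v != NULL && n > 0);
  size_t bytes = n * sizeof *v;

  /* Allocate device memory and copy the host data into it.  */
  acc_copyin(v, bytes);

#ifdef _OPENACC
#pragma acc parallel loop present(v[0:n])
#endif
  for (size_t i = 0; i < n; i++)
    v[i] *= 2.0f;

  /* Copy the result back to the host and release the device memory.  */
  acc_copyout(v, bytes);
}
```

The @code{present} clause relies on the mapping set up by
@code{acc_copyin}, so the compute region itself does not transfer data.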
2340 2399
2341 2400
2342 2401
2343 @node acc_present_or_copyin 2402 @node acc_present_or_copyin
2344 @section @code{acc_present_or_copyin} -- If the data is not present on the device, allocate device memory and copy from host memory. 2403 @section @code{acc_present_or_copyin} -- If the data is not present on the device, allocate device memory and copy from host memory.
2345 @table @asis 2404 @table @asis
2346 @item @emph{Description} 2405 @item @emph{Description}
2347 This function tests if the host data specifed by @var{a} and of length 2406 This function tests if the host data specified by @var{a} and of length
2348 @var{len} is present or not. If it is not present, then device memory 2407 @var{len} is present or not. If it is not present, then device memory
2349 will be allocated and the host memory copied. The device address of 2408 will be allocated and the host memory copied. The device address of
2350 the newly allocated device memory is returned. 2409 the newly allocated device memory is returned.
2351 2410
2352 In Fortran, two (2) forms are supported. In the first form, @var{a} specifies 2411 In Fortran, two (2) forms are supported. In the first form, @var{a} specifies
2353 a contiguous array section. In the second form, @var{a} specifies a variable or 2412 a contiguous array section. In the second form, @var{a} specifies a variable or
2354 array element and @var{len} specifies the length in bytes. 2413 array element and @var{len} specifies the length in bytes.
2414
2415 Note that @code{acc_present_or_copyin} and @code{acc_pcopyin} exist for
2416 backward compatibility with OpenACC 2.0; use @ref{acc_copyin} instead.
2355 2417
2356 @item @emph{C/C++}: 2418 @item @emph{C/C++}:
2357 @multitable @columnfractions .20 .80 2419 @multitable @columnfractions .20 .80
2358 @item @emph{Prototype}: @tab @code{void *acc_present_or_copyin(h_void *a, size_t len);} 2420 @item @emph{Prototype}: @tab @code{void *acc_present_or_copyin(h_void *a, size_t len);}
2359 @item @emph{Prototype}: @tab @code{void *acc_pcopyin(h_void *a, size_t len);} 2421 @item @emph{Prototype}: @tab @code{void *acc_pcopyin(h_void *a, size_t len);}
2372 @item @tab @code{type, dimension(:[,:]...) :: a} 2434 @item @tab @code{type, dimension(:[,:]...) :: a}
2373 @item @tab @code{integer len} 2435 @item @tab @code{integer len}
2374 @end multitable 2436 @end multitable
2375 2437
2376 @item @emph{Reference}: 2438 @item @emph{Reference}:
2377 @uref{https://www.openacc.org, OpenACC specification v2.0}, section 2439 @uref{https://www.openacc.org, OpenACC specification v2.6}, section
2378 3.2.18. 2440 3.2.20.
2379 @end table 2441 @end table
2380 2442
2381 2443
2382 2444
2383 @node acc_create 2445 @node acc_create
2393 array element and @var{len} specifies the length in bytes. 2455 array element and @var{len} specifies the length in bytes.
2394 2456
2395 @item @emph{C/C++}: 2457 @item @emph{C/C++}:
2396 @multitable @columnfractions .20 .80 2458 @multitable @columnfractions .20 .80
2397 @item @emph{Prototype}: @tab @code{void *acc_create(h_void *a, size_t len);} 2459 @item @emph{Prototype}: @tab @code{void *acc_create(h_void *a, size_t len);}
2460 @item @emph{Prototype}: @tab @code{void acc_create_async(h_void *a, size_t len, int async);}
2398 @end multitable 2461 @end multitable
2399 2462
2400 @item @emph{Fortran}: 2463 @item @emph{Fortran}:
2401 @multitable @columnfractions .20 .80 2464 @multitable @columnfractions .20 .80
2402 @item @emph{Interface}: @tab @code{subroutine acc_create(a)} 2465 @item @emph{Interface}: @tab @code{subroutine acc_create(a)}
2403 @item @tab @code{type, dimension(:[,:]...) :: a} 2466 @item @tab @code{type, dimension(:[,:]...) :: a}
2404 @item @emph{Interface}: @tab @code{subroutine acc_create(a, len)} 2467 @item @emph{Interface}: @tab @code{subroutine acc_create(a, len)}
2405 @item @tab @code{type, dimension(:[,:]...) :: a} 2468 @item @tab @code{type, dimension(:[,:]...) :: a}
2406 @item @tab @code{integer len} 2469 @item @tab @code{integer len}
2407 @end multitable 2470 @item @emph{Interface}: @tab @code{subroutine acc_create_async(a, async)}
2408 2471 @item @tab @code{type, dimension(:[,:]...) :: a}
2409 @item @emph{Reference}: 2472 @item @tab @code{integer(acc_handle_kind) :: async}
2410 @uref{https://www.openacc.org, OpenACC specification v2.0}, section 2473 @item @emph{Interface}: @tab @code{subroutine acc_create_async(a, len, async)}
2411 3.2.19. 2474 @item @tab @code{type, dimension(:[,:]...) :: a}
2475 @item @tab @code{integer len}
2476 @item @tab @code{integer(acc_handle_kind) :: async}
2477 @end multitable
2478
2479 @item @emph{Reference}:
2480 @uref{https://www.openacc.org, OpenACC specification v2.6}, section
2481 3.2.21.
2412 @end table 2482 @end table
2413 2483
2414 2484
2415 2485
2416 @node acc_present_or_create 2486 @node acc_present_or_create
2417 @section @code{acc_present_or_create} -- If the data is not present on the device, allocate device memory and map it to host memory. 2487 @section @code{acc_present_or_create} -- If the data is not present on the device, allocate device memory and map it to host memory.
2418 @table @asis 2488 @table @asis
2419 @item @emph{Description} 2489 @item @emph{Description}
2420 This function tests if the host data specifed by @var{a} and of length 2490 This function tests if the host data specified by @var{a} and of length
2421 @var{len} is present or not. If it is not present, then device memory 2491 @var{len} is present or not. If it is not present, then device memory
2422 will be allocated and mapped to host memory. In C/C++, the device address 2492 will be allocated and mapped to host memory. In C/C++, the device address
2423 of the newly allocated device memory is returned. 2493 of the newly allocated device memory is returned.
2424 2494
2425 In Fortran, two (2) forms are supported. In the first form, @var{a} specifies 2495 In Fortran, two (2) forms are supported. In the first form, @var{a} specifies
2426 a contiguous array section. In the second form, @var{a} specifies a variable or 2496 a contiguous array section. In the second form, @var{a} specifies a variable or
2427 array element and @var{len} specifies the length in bytes. 2497 array element and @var{len} specifies the length in bytes.
2428 2498
2499 Note that @code{acc_present_or_create} and @code{acc_pcreate} exist for
2500 backward compatibility with OpenACC 2.0; use @ref{acc_create} instead.
2429 2501
2430 @item @emph{C/C++}: 2502 @item @emph{C/C++}:
2431 @multitable @columnfractions .20 .80 2503 @multitable @columnfractions .20 .80
2432 @item @emph{Prototype}: @tab @code{void *acc_present_or_create(h_void *a, size_t len)} 2504 @item @emph{Prototype}: @tab @code{void *acc_present_or_create(h_void *a, size_t len)}
2433 @item @emph{Prototype}: @tab @code{void *acc_pcreate(h_void *a, size_t len)} 2505 @item @emph{Prototype}: @tab @code{void *acc_pcreate(h_void *a, size_t len)}
2446 @item @tab @code{type, dimension(:[,:]...) :: a} 2518 @item @tab @code{type, dimension(:[,:]...) :: a}
2447 @item @tab @code{integer len} 2519 @item @tab @code{integer len}
2448 @end multitable 2520 @end multitable
2449 2521
2450 @item @emph{Reference}: 2522 @item @emph{Reference}:
2451 @uref{https://www.openacc.org, OpenACC specification v2.0}, section 2523 @uref{https://www.openacc.org, OpenACC specification v2.6}, section
2452 3.2.20. 2524 3.2.21.
2453 @end table 2525 @end table
2454 2526
2455 2527
2456 2528
2457 @node acc_copyout 2529 @node acc_copyout
2466 array element and @var{len} specifies the length in bytes. 2538 array element and @var{len} specifies the length in bytes.
2467 2539
2468 @item @emph{C/C++}: 2540 @item @emph{C/C++}:
2469 @multitable @columnfractions .20 .80 2541 @multitable @columnfractions .20 .80
2470 @item @emph{Prototype}: @tab @code{acc_copyout(h_void *a, size_t len);} 2542 @item @emph{Prototype}: @tab @code{acc_copyout(h_void *a, size_t len);}
2543 @item @emph{Prototype}: @tab @code{acc_copyout_async(h_void *a, size_t len, int async);}
2544 @item @emph{Prototype}: @tab @code{acc_copyout_finalize(h_void *a, size_t len);}
2545 @item @emph{Prototype}: @tab @code{acc_copyout_finalize_async(h_void *a, size_t len, int async);}
2471 @end multitable 2546 @end multitable
2472 2547
2473 @item @emph{Fortran}: 2548 @item @emph{Fortran}:
2474 @multitable @columnfractions .20 .80 2549 @multitable @columnfractions .20 .80
2475 @item @emph{Interface}: @tab @code{subroutine acc_copyout(a)} 2550 @item @emph{Interface}: @tab @code{subroutine acc_copyout(a)}
2476 @item @tab @code{type, dimension(:[,:]...) :: a} 2551 @item @tab @code{type, dimension(:[,:]...) :: a}
2477 @item @emph{Interface}: @tab @code{subroutine acc_copyout(a, len)} 2552 @item @emph{Interface}: @tab @code{subroutine acc_copyout(a, len)}
2478 @item @tab @code{type, dimension(:[,:]...) :: a} 2553 @item @tab @code{type, dimension(:[,:]...) :: a}
2479 @item @tab @code{integer len} 2554 @item @tab @code{integer len}
2480 @end multitable 2555 @item @emph{Interface}: @tab @code{subroutine acc_copyout_async(a, async)}
2481 2556 @item @tab @code{type, dimension(:[,:]...) :: a}
2482 @item @emph{Reference}: 2557 @item @tab @code{integer(acc_handle_kind) :: async}
2483 @uref{https://www.openacc.org, OpenACC specification v2.0}, section 2558 @item @emph{Interface}: @tab @code{subroutine acc_copyout_async(a, len, async)}
2484 3.2.21. 2559 @item @tab @code{type, dimension(:[,:]...) :: a}
2560 @item @tab @code{integer len}
2561 @item @tab @code{integer(acc_handle_kind) :: async}
2562 @item @emph{Interface}: @tab @code{subroutine acc_copyout_finalize(a)}
2563 @item @tab @code{type, dimension(:[,:]...) :: a}
2564 @item @emph{Interface}: @tab @code{subroutine acc_copyout_finalize(a, len)}
2565 @item @tab @code{type, dimension(:[,:]...) :: a}
2566 @item @tab @code{integer len}
2567 @item @emph{Interface}: @tab @code{subroutine acc_copyout_finalize_async(a, async)}
2568 @item @tab @code{type, dimension(:[,:]...) :: a}
2569 @item @tab @code{integer(acc_handle_kind) :: async}
2570 @item @emph{Interface}: @tab @code{subroutine acc_copyout_finalize_async(a, len, async)}
2571 @item @tab @code{type, dimension(:[,:]...) :: a}
2572 @item @tab @code{integer len}
2573 @item @tab @code{integer(acc_handle_kind) :: async}
2574 @end multitable
2575
2576 @item @emph{Reference}:
2577 @uref{https://www.openacc.org, OpenACC specification v2.6}, section
2578 3.2.22.
2485 @end table 2579 @end table
2486 2580
2487 2581
2488 2582
2489 @node acc_delete 2583 @node acc_delete
2498 array element and @var{len} specifies the length in bytes. 2592 array element and @var{len} specifies the length in bytes.
2499 2593
2500 @item @emph{C/C++}: 2594 @item @emph{C/C++}:
2501 @multitable @columnfractions .20 .80 2595 @multitable @columnfractions .20 .80
2502 @item @emph{Prototype}: @tab @code{acc_delete(h_void *a, size_t len);} 2596 @item @emph{Prototype}: @tab @code{acc_delete(h_void *a, size_t len);}
2597 @item @emph{Prototype}: @tab @code{acc_delete_async(h_void *a, size_t len, int async);}
2598 @item @emph{Prototype}: @tab @code{acc_delete_finalize(h_void *a, size_t len);}
2599 @item @emph{Prototype}: @tab @code{acc_delete_finalize_async(h_void *a, size_t len, int async);}
2503 @end multitable 2600 @end multitable
2504 2601
2505 @item @emph{Fortran}: 2602 @item @emph{Fortran}:
2506 @multitable @columnfractions .20 .80 2603 @multitable @columnfractions .20 .80
2507 @item @emph{Interface}: @tab @code{subroutine acc_delete(a)} 2604 @item @emph{Interface}: @tab @code{subroutine acc_delete(a)}
2508 @item @tab @code{type, dimension(:[,:]...) :: a} 2605 @item @tab @code{type, dimension(:[,:]...) :: a}
2509 @item @emph{Interface}: @tab @code{subroutine acc_delete(a, len)} 2606 @item @emph{Interface}: @tab @code{subroutine acc_delete(a, len)}
2510 @item @tab @code{type, dimension(:[,:]...) :: a} 2607 @item @tab @code{type, dimension(:[,:]...) :: a}
2511 @item @tab @code{integer len} 2608 @item @tab @code{integer len}
2512 @end multitable 2609 @item @emph{Interface}: @tab @code{subroutine acc_delete_async(a, async)}
2513 2610 @item @tab @code{type, dimension(:[,:]...) :: a}
2514 @item @emph{Reference}: 2611 @item @tab @code{integer(acc_handle_kind) :: async}
2515 @uref{https://www.openacc.org, OpenACC specification v2.0}, section 2612 @item @emph{Interface}: @tab @code{subroutine acc_delete_async(a, len, async)}
2516 3.2.22. 2613 @item @tab @code{type, dimension(:[,:]...) :: a}
2614 @item @tab @code{integer len}
2615 @item @tab @code{integer(acc_handle_kind) :: async}
2616 @item @emph{Interface}: @tab @code{subroutine acc_delete_finalize(a)}
2617 @item @tab @code{type, dimension(:[,:]...) :: a}
2618 @item @emph{Interface}: @tab @code{subroutine acc_delete_finalize(a, len)}
2619 @item @tab @code{type, dimension(:[,:]...) :: a}
2620 @item @tab @code{integer len}
2621 @item @emph{Interface}: @tab @code{subroutine acc_delete_finalize_async(a, async)}
2622 @item @tab @code{type, dimension(:[,:]...) :: a}
2623 @item @tab @code{integer(acc_handle_kind) :: async}
2624 @item @emph{Interface}: @tab @code{subroutine acc_delete_finalize_async(a, len, async)}
2625 @item @tab @code{type, dimension(:[,:]...) :: a}
2626 @item @tab @code{integer len}
2627 @item @tab @code{integer(acc_handle_kind) :: async}
2628 @end multitable
2629
2630 @item @emph{Reference}:
2631 @uref{https://www.openacc.org, OpenACC specification v2.6}, section
2632 3.2.23.
2517 @end table 2633 @end table
2518 2634
2519 2635
2520 2636
2521 @node acc_update_device 2637 @node acc_update_device
2531 array element and @var{len} specifies the length in bytes. 2647 array element and @var{len} specifies the length in bytes.
2532 2648
2533 @item @emph{C/C++}: 2649 @item @emph{C/C++}:
2534 @multitable @columnfractions .20 .80 2650 @multitable @columnfractions .20 .80
2535 @item @emph{Prototype}: @tab @code{acc_update_device(h_void *a, size_t len);} 2651 @item @emph{Prototype}: @tab @code{acc_update_device(h_void *a, size_t len);}
2652 @item @emph{Prototype}: @tab @code{acc_update_device_async(h_void *a, size_t len, int async);}
2536 @end multitable 2653 @end multitable
2537 2654
2538 @item @emph{Fortran}: 2655 @item @emph{Fortran}:
2539 @multitable @columnfractions .20 .80 2656 @multitable @columnfractions .20 .80
2540 @item @emph{Interface}: @tab @code{subroutine acc_update_device(a)} 2657 @item @emph{Interface}: @tab @code{subroutine acc_update_device(a)}
2541 @item @tab @code{type, dimension(:[,:]...) :: a} 2658 @item @tab @code{type, dimension(:[,:]...) :: a}
2542 @item @emph{Interface}: @tab @code{subroutine acc_update_device(a, len)} 2659 @item @emph{Interface}: @tab @code{subroutine acc_update_device(a, len)}
2543 @item @tab @code{type, dimension(:[,:]...) :: a} 2660 @item @tab @code{type, dimension(:[,:]...) :: a}
2544 @item @tab @code{integer len} 2661 @item @tab @code{integer len}
2545 @end multitable 2662 @item @emph{Interface}: @tab @code{subroutine acc_update_device_async(a, async)}
2546 2663 @item @tab @code{type, dimension(:[,:]...) :: a}
2547 @item @emph{Reference}: 2664 @item @tab @code{integer(acc_handle_kind) :: async}
2548 @uref{https://www.openacc.org, OpenACC specification v2.0}, section 2665 @item @emph{Interface}: @tab @code{subroutine acc_update_device_async(a, len, async)}
2549 3.2.23. 2666 @item @tab @code{type, dimension(:[,:]...) :: a}
2667 @item @tab @code{integer len}
2668 @item @tab @code{integer(acc_handle_kind) :: async}
2669 @end multitable
2670
2671 @item @emph{Reference}:
2672 @uref{https://www.openacc.org, OpenACC specification v2.6}, section
2673 3.2.24.
2550 @end table 2674 @end table
2551 2675
2552 2676
2553 2677
2554 @node acc_update_self 2678 @node acc_update_self
2564 array element and @var{len} specifies the length in bytes. 2688 array element and @var{len} specifies the length in bytes.
2565 2689
2566 @item @emph{C/C++}: 2690 @item @emph{C/C++}:
2567 @multitable @columnfractions .20 .80 2691 @multitable @columnfractions .20 .80
2568 @item @emph{Prototype}: @tab @code{acc_update_self(h_void *a, size_t len);} 2692 @item @emph{Prototype}: @tab @code{acc_update_self(h_void *a, size_t len);}
2693 @item @emph{Prototype}: @tab @code{acc_update_self_async(h_void *a, size_t len, int async);}
2569 @end multitable 2694 @end multitable
2570 2695
2571 @item @emph{Fortran}: 2696 @item @emph{Fortran}:
2572 @multitable @columnfractions .20 .80 2697 @multitable @columnfractions .20 .80
2573 @item @emph{Interface}: @tab @code{subroutine acc_update_self(a)} 2698 @item @emph{Interface}: @tab @code{subroutine acc_update_self(a)}
2574 @item @tab @code{type, dimension(:[,:]...) :: a} 2699 @item @tab @code{type, dimension(:[,:]...) :: a}
2575 @item @emph{Interface}: @tab @code{subroutine acc_update_self(a, len)} 2700 @item @emph{Interface}: @tab @code{subroutine acc_update_self(a, len)}
2576 @item @tab @code{type, dimension(:[,:]...) :: a} 2701 @item @tab @code{type, dimension(:[,:]...) :: a}
2577 @item @tab @code{integer len} 2702 @item @tab @code{integer len}
2578 @end multitable 2703 @item @emph{Interface}: @tab @code{subroutine acc_update_self_async(a, async)}
2579 2704 @item @tab @code{type, dimension(:[,:]...) :: a}
2580 @item @emph{Reference}: 2705 @item @tab @code{integer(acc_handle_kind) :: async}
2581 @uref{https://www.openacc.org, OpenACC specification v2.0}, section 2706 @item @emph{Interface}: @tab @code{subroutine acc_update_self_async(a, len, async)}
2582 3.2.24. 2707 @item @tab @code{type, dimension(:[,:]...) :: a}
2708 @item @tab @code{integer len}
2709 @item @tab @code{integer(acc_handle_kind) :: async}
2710 @end multitable
2711
2712 @item @emph{Reference}:
2713 @uref{https://www.openacc.org, OpenACC specification v2.6}, section
2714 3.2.25.
2583 @end table 2715 @end table
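
The @code{_async} variants of the update routines can be sketched as
follows: host changes are pushed to the device, a compute region runs,
and the result is pulled back, all on one queue, followed by a single
wait. The helper name @code{increment_async} and the queue number are
hypothetical, and the @code{#else} branch holds host-only stand-ins
(an assumption of this sketch) for builds without @option{-fopenacc}.

```c
#include <assert.h>
#include <stddef.h>

#ifdef _OPENACC
#include <openacc.h>
#else
/* Assumption: host-only stand-ins so the sketch builds without -fopenacc.
   They are not part of libgomp; every operation completes immediately.  */
static void acc_update_device_async(void *a, size_t len, int async)
{ (void) a; (void) len; (void) async; }
static void acc_update_self_async(void *a, size_t len, int async)
{ (void) a; (void) len; (void) async; }
static void acc_wait(int async) { (void) async; }
#endif

/* Hypothetical helper: V must already be present on the device, for
   example after acc_copyin.  Adds one to each element on queue QUEUE.  */
static void increment_async(int *v, size_t n, int queue)
{
  assert(v != NULL && queue >= 0);
  size_t bytes = n * sizeof *v;

  /* Push the current host values to the device copy.  */
  acc_update_device_async(v, bytes, queue);

#ifdef _OPENACC
#pragma acc parallel loop present(v[0:n]) async(queue)
#endif
  for (size_t i = 0; i < n; i++)
    v[i] += 1;

  /* Pull the result back, then block until the queue has drained.  */
  acc_update_self_async(v, bytes, queue);
  acc_wait(queue);
}
```

Operations placed on the same queue run in order, so one
@code{acc_wait} at the end is enough in this sketch.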
2584 2716
2585 2717
2586 2718
2587 @node acc_map_data 2719 @node acc_map_data
2596 @multitable @columnfractions .20 .80 2728 @multitable @columnfractions .20 .80
2597 @item @emph{Prototype}: @tab @code{acc_map_data(h_void *h, d_void *d, size_t len);} 2729 @item @emph{Prototype}: @tab @code{acc_map_data(h_void *h, d_void *d, size_t len);}
2598 @end multitable 2730 @end multitable
2599 2731
2600 @item @emph{Reference}: 2732 @item @emph{Reference}:
2601 @uref{https://www.openacc.org, OpenACC specification v2.0}, section 2733 @uref{https://www.openacc.org, OpenACC specification v2.6}, section
2602 3.2.25. 2734 3.2.26.
2603 @end table 2735 @end table
2604 2736
2605 2737
2606 2738
2607 @node acc_unmap_data 2739 @node acc_unmap_data
2615 @multitable @columnfractions .20 .80 2747 @multitable @columnfractions .20 .80
2616 @item @emph{Prototype}: @tab @code{acc_unmap_data(h_void *h);} 2748 @item @emph{Prototype}: @tab @code{acc_unmap_data(h_void *h);}
2617 @end multitable 2749 @end multitable
2618 2750
2619 @item @emph{Reference}: 2751 @item @emph{Reference}:
2620 @uref{https://www.openacc.org, OpenACC specification v2.0}, section 2752 @uref{https://www.openacc.org, OpenACC specification v2.6}, section
2621 3.2.26. 2753 3.2.27.
2622 @end table 2754 @end table
2623 2755
2624 2756
2625 2757
2626 @node acc_deviceptr 2758 @node acc_deviceptr
2634 @multitable @columnfractions .20 .80 2766 @multitable @columnfractions .20 .80
2635 @item @emph{Prototype}: @tab @code{void *acc_deviceptr(h_void *h);} 2767 @item @emph{Prototype}: @tab @code{void *acc_deviceptr(h_void *h);}
2636 @end multitable 2768 @end multitable
2637 2769
2638 @item @emph{Reference}: 2770 @item @emph{Reference}:
2639 @uref{https://www.openacc.org, OpenACC specification v2.0}, section 2771 @uref{https://www.openacc.org, OpenACC specification v2.6}, section
2640 3.2.27. 2772 3.2.28.
2641 @end table 2773 @end table
2642 2774
2643 2775
2644 2776
2645 @node acc_hostptr 2777 @node acc_hostptr
2653 @multitable @columnfractions .20 .80 2785 @multitable @columnfractions .20 .80
2654 @item @emph{Prototype}: @tab @code{void *acc_hostptr(d_void *d);} 2786 @item @emph{Prototype}: @tab @code{void *acc_hostptr(d_void *d);}
2655 @end multitable 2787 @end multitable
2656 2788
2657 @item @emph{Reference}: 2789 @item @emph{Reference}:
2658 @uref{https://www.openacc.org, OpenACC specification v2.0}, section 2790 @uref{https://www.openacc.org, OpenACC specification v2.6}, section
2659 3.2.28. 2791 3.2.29.
2660 @end table 2792 @end table
2661 2793
2662 2794
2663 2795
2664 @node acc_is_present 2796 @node acc_is_present
2692 @item @tab @code{integer len} 2824 @item @tab @code{integer len}
2693 @item @tab @code{logical acc_is_present} 2825 @item @tab @code{logical acc_is_present}
2694 @end multitable 2826 @end multitable
2695 2827
2696 @item @emph{Reference}: 2828 @item @emph{Reference}:
2697 @uref{https://www.openacc.org, OpenACC specification v2.0}, section 2829 @uref{https://www.openacc.org, OpenACC specification v2.6}, section
2698 3.2.29. 2830 3.2.30.
2699 @end table 2831 @end table
2700 2832
2701 2833
2702 2834
2703 @node acc_memcpy_to_device 2835 @node acc_memcpy_to_device
2712 @multitable @columnfractions .20 .80 2844 @multitable @columnfractions .20 .80
2713 @item @emph{Prototype}: @tab @code{acc_memcpy_to_device(d_void *dest, h_void *src, size_t bytes);} 2845 @item @emph{Prototype}: @tab @code{acc_memcpy_to_device(d_void *dest, h_void *src, size_t bytes);}
2714 @end multitable 2846 @end multitable
2715 2847
2716 @item @emph{Reference}: 2848 @item @emph{Reference}:
2717 @uref{https://www.openacc.org, OpenACC specification v2.0}, section 2849 @uref{https://www.openacc.org, OpenACC specification v2.6}, section
2718 3.2.30. 2850 3.2.31.
2719 @end table 2851 @end table
2720 2852
2721 2853
2722 2854
2723 @node acc_memcpy_from_device 2855 @node acc_memcpy_from_device
2732 @multitable @columnfractions .20 .80 2864 @multitable @columnfractions .20 .80
2733 @item @emph{Prototype}: @tab @code{acc_memcpy_from_device(h_void *dest, d_void *src, size_t bytes);} 2865 @item @emph{Prototype}: @tab @code{acc_memcpy_from_device(h_void *dest, d_void *src, size_t bytes);}
2734 @end multitable 2866 @end multitable
2735 2867
2736 @item @emph{Reference}: 2868 @item @emph{Reference}:
2737 @uref{https://www.openacc.org, OpenACC specification v2.0}, section 2869 @uref{https://www.openacc.org, OpenACC specification v2.6}, section
2738 3.2.31. 2870 3.2.32.
2871 @end table
2872
2873
2874
2875 @node acc_attach
2876 @section @code{acc_attach} -- Let device pointer point to device-pointer target.
2877 @table @asis
2878 @item @emph{Description}
2879 This function updates a pointer on the device from pointing to a host-pointer
2880 address to pointing to the corresponding device data.
2881
2882 @item @emph{C/C++}:
2883 @multitable @columnfractions .20 .80
2884 @item @emph{Prototype}: @tab @code{acc_attach(h_void **ptr);}
2885 @item @emph{Prototype}: @tab @code{acc_attach_async(h_void **ptr, int async);}
2886 @end multitable
2887
2888 @item @emph{Reference}:
2889 @uref{https://www.openacc.org, OpenACC specification v2.6}, section
2890 3.2.34.
2891 @end table
2892
2893
2894
2895 @node acc_detach
2896 @section @code{acc_detach} -- Let device pointer point to host-pointer target.
2897 @table @asis
2898 @item @emph{Description}
2899 This function updates a pointer on the device from pointing to a device-pointer
2900 address to pointing to the corresponding host data.
2901
2902 @item @emph{C/C++}:
2903 @multitable @columnfractions .20 .80
2904 @item @emph{Prototype}: @tab @code{acc_detach(h_void **ptr);}
2905 @item @emph{Prototype}: @tab @code{acc_detach_async(h_void **ptr, int async);}
2906 @item @emph{Prototype}: @tab @code{acc_detach_finalize(h_void **ptr);}
2907 @item @emph{Prototype}: @tab @code{acc_detach_finalize_async(h_void **ptr, int async);}
2908 @end multitable
2909
2910 @item @emph{Reference}:
2911 @uref{https://www.openacc.org, OpenACC specification v2.6}, section
2912 3.2.35.
2739 @end table 2913 @end table
2740 2914
2741 2915
2742 2916
2743 @node acc_get_current_cuda_device 2917 @node acc_get_current_cuda_device
2751 @multitable @columnfractions .20 .80 2925 @multitable @columnfractions .20 .80
2752 @item @emph{Prototype}: @tab @code{void *acc_get_current_cuda_device(void);} 2926 @item @emph{Prototype}: @tab @code{void *acc_get_current_cuda_device(void);}
2753 @end multitable 2927 @end multitable
2754 2928
2755 @item @emph{Reference}: 2929 @item @emph{Reference}:
2756 @uref{https://www.openacc.org, OpenACC specification v2.0}, section 2930 @uref{https://www.openacc.org, OpenACC specification v2.6}, section
2757 A.2.1.1. 2931 A.2.1.1.
2758 @end table 2932 @end table
2759 2933
2760 2934
2761 2935
2766 This function returns the CUDA context handle. This handle is the same 2940 This function returns the CUDA context handle. This handle is the same
2767 as used by the CUDA Runtime or Driver APIs. 2941 as used by the CUDA Runtime or Driver APIs.
2768 2942
2769 @item @emph{C/C++}: 2943 @item @emph{C/C++}:
2770 @multitable @columnfractions .20 .80 2944 @multitable @columnfractions .20 .80
2771 @item @emph{Prototype}: @tab @code{acc_get_current_cuda_context(void);} 2945 @item @emph{Prototype}: @tab @code{void *acc_get_current_cuda_context(void);}
2772 @end multitable 2946 @end multitable
2773 2947
2774 @item @emph{Reference}: 2948 @item @emph{Reference}:
2775 @uref{https://www.openacc.org, OpenACC specification v2.0}, section 2949 @uref{https://www.openacc.org, OpenACC specification v2.6}, section
2776 A.2.1.2. 2950 A.2.1.2.
2777 @end table 2951 @end table
2778 2952
2779 2953
2780 2954
2781 @node acc_get_cuda_stream 2955 @node acc_get_cuda_stream
2782 @section @code{acc_get_cuda_stream} -- Get CUDA stream handle. 2956 @section @code{acc_get_cuda_stream} -- Get CUDA stream handle.
2783 @table @asis 2957 @table @asis
2784 @item @emph{Description} 2958 @item @emph{Description}
2785 This function returns the CUDA stream handle. This handle is the same 2959 This function returns the CUDA stream handle for the queue @var{async}.
2786 as used by the CUDA Runtime or Driver API's. 2960 This handle is the same as used by the CUDA Runtime or Driver APIs.
2787 2961
2788 @item @emph{C/C++}: 2962 @item @emph{C/C++}:
2789 @multitable @columnfractions .20 .80 2963 @multitable @columnfractions .20 .80
2790 @item @emph{Prototype}: @tab @code{acc_get_cuda_stream(void);} 2964 @item @emph{Prototype}: @tab @code{void *acc_get_cuda_stream(int async);}
2791 @end multitable 2965 @end multitable
2792 2966
2793 @item @emph{Reference}: 2967 @item @emph{Reference}:
2794 @uref{https://www.openacc.org, OpenACC specification v2.0}, section 2968 @uref{https://www.openacc.org, OpenACC specification v2.6}, section
2795 A.2.1.3. 2969 A.2.1.3.
2796 @end table 2970 @end table
2797 2971
2798 2972
2799 2973
2800 @node acc_set_cuda_stream 2974 @node acc_set_cuda_stream
2801 @section @code{acc_set_cuda_stream} -- Set CUDA stream handle. 2975 @section @code{acc_set_cuda_stream} -- Set CUDA stream handle.
2802 @table @asis 2976 @table @asis
2803 @item @emph{Description} 2977 @item @emph{Description}
2804 This function associates the stream handle specified by @var{stream} with 2978 This function associates the stream handle specified by @var{stream} with
2805 the asynchronous value specified by @var{async}. 2979 the queue @var{async}.
2806 2980
2807 @item @emph{C/C++}: 2981 This cannot be used to change the stream handle associated with
2808 @multitable @columnfractions .20 .80 2982 @code{acc_async_sync}.
2809 @item @emph{Prototype}: @tab @code{acc_set_cuda_stream(int async void *stream);} 2983
2810 @end multitable 2984 The return value is not specified.
2811 2985
2812 @item @emph{Reference}: 2986 @item @emph{C/C++}:
2813 @uref{https://www.openacc.org, OpenACC specification v2.0}, section 2987 @multitable @columnfractions .20 .80
2988 @item @emph{Prototype}: @tab @code{int acc_set_cuda_stream(int async, void *stream);}
2989 @end multitable
2990
2991 @item @emph{Reference}:
2992 @uref{https://www.openacc.org, OpenACC specification v2.6}, section
2814 A.2.1.4. 2993 A.2.1.4.
2994 @end table
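
The two restrictions above (no rebinding of @code{acc_async_sync}, and
an unspecified return value) suggest wrapping the call. The sketch
below shows one possible wrapper. The name @code{bind_stream} is
hypothetical, and the @code{#else} branch, including the value chosen
for @code{acc_async_sync}, is an assumption made so that the file builds
without @option{-fopenacc}.

```c
#include <assert.h>
#include <stddef.h>

#ifdef _OPENACC
#include <openacc.h>
#else
/* Assumption: stand-ins so the sketch builds without -fopenacc.  They are
   not part of libgomp, and the enumerator value is only a placeholder.  */
enum { acc_async_sync = -2 };
static int acc_set_cuda_stream(int async, void *stream)
{ (void) async; (void) stream; return 0; }
#endif

/* Hypothetical helper: associate STREAM with queue ASYNC.  Returns 0 if
   the request was forwarded to the runtime, -1 if it was refused.  */
static int bind_stream(int async, void *stream)
{
  assert(stream != NULL);

  /* The stream of the synchronous queue cannot be changed.  */
  if (async == acc_async_sync)
    return -1;

  /* The return value is not specified, so it is deliberately ignored.  */
  (void) acc_set_cuda_stream(async, stream);
  return 0;
}
```

A caller would typically pass a stream obtained from the CUDA Runtime or
Driver API, cast to @code{void *}.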
2995
2996
2997
2998 @node acc_prof_register
2999 @section @code{acc_prof_register} -- Register callbacks.
3000 @table @asis
3001 @item @emph{Description}:
3002 This function registers callbacks.
3003
3004 @item @emph{C/C++}:
3005 @multitable @columnfractions .20 .80
3006 @item @emph{Prototype}: @tab @code{void acc_prof_register (acc_event_t, acc_prof_callback, acc_register_t);}
3007 @end multitable
3008
3009 @item @emph{See also}:
3010 @ref{OpenACC Profiling Interface}
3011
3012 @item @emph{Reference}:
3013 @uref{https://www.openacc.org, OpenACC specification v2.6}, section
3014 5.3.
3015 @end table
3016
3017
3018
3019 @node acc_prof_unregister
3020 @section @code{acc_prof_unregister} -- Unregister callbacks.
3021 @table @asis
3022 @item @emph{Description}:
3023 This function unregisters callbacks.
3024
3025 @item @emph{C/C++}:
3026 @multitable @columnfractions .20 .80
3027 @item @emph{Prototype}: @tab @code{void acc_prof_unregister (acc_event_t, acc_prof_callback, acc_register_t);}
3028 @end multitable
3029
3030 @item @emph{See also}:
3031 @ref{OpenACC Profiling Interface}
3032
3033 @item @emph{Reference}:
3034 @uref{https://www.openacc.org, OpenACC specification v2.6}, section
3035 5.3.
3036 @end table
3037
3038
3039
3040 @node acc_prof_lookup
3041 @section @code{acc_prof_lookup} -- Obtain inquiry functions.
3042 @table @asis
3043 @item @emph{Description}:
3044 Function to obtain inquiry functions.
3045
3046 @item @emph{C/C++}:
3047 @multitable @columnfractions .20 .80
3048 @item @emph{Prototype}: @tab @code{acc_query_fn acc_prof_lookup (const char *);}
3049 @end multitable
3050
3051 @item @emph{See also}:
3052 @ref{OpenACC Profiling Interface}
3053
3054 @item @emph{Reference}:
3055 @uref{https://www.openacc.org, OpenACC specification v2.6}, section
3056 5.3.
3057 @end table
3058
3059
3060
3061 @node acc_register_library
3062 @section @code{acc_register_library} -- Library registration.
3063 @table @asis
3064 @item @emph{Description}:
3065 Function for library registration.
3066
3067 @item @emph{C/C++}:
3068 @multitable @columnfractions .20 .80
3069 @item @emph{Prototype}: @tab @code{void acc_register_library (acc_prof_reg, acc_prof_reg, acc_prof_lookup_func);}
3070 @end multitable
3071
3072 @item @emph{See also}:
3073 @ref{OpenACC Profiling Interface}, @ref{ACC_PROFLIB}
3074
3075 @item @emph{Reference}:
3076 @uref{https://www.openacc.org, OpenACC specification v2.6}, section
3077 5.3.
2815 @end table 3078 @end table
2816 3079
2817 3080
2818 3081
2819 @c --------------------------------------------------------------------- 3082 @c ---------------------------------------------------------------------
2823 @node OpenACC Environment Variables 3086 @node OpenACC Environment Variables
2824 @chapter OpenACC Environment Variables 3087 @chapter OpenACC Environment Variables
2825 3088
2826 The variables @env{ACC_DEVICE_TYPE} and @env{ACC_DEVICE_NUM} 3089 The variables @env{ACC_DEVICE_TYPE} and @env{ACC_DEVICE_NUM}
2827 are defined by section 4 of the OpenACC specification in version 2.0. 3090 are defined by section 4 of the OpenACC specification in version 2.0.
3091 The variable @env{ACC_PROFLIB}
3092 is defined by section 4 of the OpenACC specification in version 2.6.
2828 The variable @env{GCC_ACC_NOTIFY} is used for diagnostic purposes. 3093 The variable @env{GCC_ACC_NOTIFY} is used for diagnostic purposes.
2829 3094
2830 @menu 3095 @menu
2831 * ACC_DEVICE_TYPE:: 3096 * ACC_DEVICE_TYPE::
2832 * ACC_DEVICE_NUM:: 3097 * ACC_DEVICE_NUM::
3098 * ACC_PROFLIB::
2833 * GCC_ACC_NOTIFY:: 3099 * GCC_ACC_NOTIFY::
2834 @end menu 3100 @end menu
2835 3101
2836 3102
2837 3103
2838 @node ACC_DEVICE_TYPE 3104 @node ACC_DEVICE_TYPE
2839 @section @code{ACC_DEVICE_TYPE} 3105 @section @code{ACC_DEVICE_TYPE}
2840 @table @asis 3106 @table @asis
2841 @item @emph{Reference}: 3107 @item @emph{Reference}:
2842 @uref{https://www.openacc.org, OpenACC specification v2.0}, section 3108 @uref{https://www.openacc.org, OpenACC specification v2.6}, section
2843 4.1. 3109 4.1.
2844 @end table 3110 @end table
2845 3111
2846 3112
2847 3113
2848 @node ACC_DEVICE_NUM 3114 @node ACC_DEVICE_NUM
2849 @section @code{ACC_DEVICE_NUM} 3115 @section @code{ACC_DEVICE_NUM}
2850 @table @asis 3116 @table @asis
2851 @item @emph{Reference}: 3117 @item @emph{Reference}:
2852 @uref{https://www.openacc.org, OpenACC specification v2.0}, section 3118 @uref{https://www.openacc.org, OpenACC specification v2.6}, section
2853 4.2. 3119 4.2.
3120 @end table
3121
3122
3123
3124 @node ACC_PROFLIB
3125 @section @code{ACC_PROFLIB}
3126 @table @asis
3127 @item @emph{See also}:
3128 @ref{acc_register_library}, @ref{OpenACC Profiling Interface}
3129
3130 @item @emph{Reference}:
3131 @uref{https://www.openacc.org, OpenACC specification v2.6}, section
3132 4.3.
2854 @end table 3133 @end table
2855 3134
2856 3135
2857 3136
2858 @node GCC_ACC_NOTIFY 3137 @node GCC_ACC_NOTIFY
2877 data and asynchronous operation of computing constructs. This 3156 data and asynchronous operation of computing constructs. This
2878 asynchronous functionality is implemented by making use of CUDA 3157 asynchronous functionality is implemented by making use of CUDA
2879 streams@footnote{See "Stream Management" in "CUDA Driver API", 3158 streams@footnote{See "Stream Management" in "CUDA Driver API",
2880 TRM-06703-001, Version 5.5, for additional information}. 3159 TRM-06703-001, Version 5.5, for additional information}.
2881 3160
2882 The primary means by that the asychronous functionality is accessed 3161 The primary means by which the asynchronous functionality is accessed
2883 is through the use of those OpenACC directives which make use of the 3162 is through the use of those OpenACC directives which make use of the
2884 @code{async} and @code{wait} clauses. When the @code{async} clause is 3163 @code{async} and @code{wait} clauses. When the @code{async} clause is
2885 first used with a directive, it creates a CUDA stream. If an 3164 first used with a directive, it creates a CUDA stream. If an
2886 @code{async-argument} is used with the @code{async} clause, then the 3165 @code{async-argument} is used with the @code{async} clause, then the
2887 stream is associated with the specified @code{async-argument}. 3166 stream is associated with the specified @code{async-argument}.
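
For illustration, the following sketch shows a compute construct that uses the @code{async} and @code{wait} clauses described above. The function name @code{scale_async} and the @code{async-argument} value @code{1} are arbitrary choices made for this example. When compiled without @option{-fopenacc}, the pragmas are ignored and the loop simply runs on the host.

```c
#include <assert.h>

/* Multiply each of the N elements of X by A, enqueued with
   async-argument 1.  */
void
scale_async (float *x, int n, float a)
{
  assert (x != 0 && n >= 0);

  /* The first use of async(1) is where, per the text above, the
     stream associated with the value 1 would be created.  */
#pragma acc parallel loop copy(x[0:n]) async(1)
  for (int i = 0; i < n; i++)
    x[i] *= a;

  /* Block until all work enqueued with async-argument 1 is done.  */
#pragma acc wait(1)
}
```

The explicit @code{wait(1)} is what makes it safe for the caller to read @code{x} after the function returns.
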
3064 is called prior to a call to @code{cudaCreate()}. If @code{cudaCreate()} 3343 is called prior to a call to @code{cudaCreate()}. If @code{cudaCreate()}
3065 is called prior to a call to an OpenACC function, then you must call 3344 is called prior to a call to an OpenACC function, then you must call
3066 @code{acc_set_device_num()}@footnote{More complete information 3345 @code{acc_set_device_num()}@footnote{More complete information
3067 about @env{ACC_DEVICE_TYPE} and @env{ACC_DEVICE_NUM} can be found in 3346 about @env{ACC_DEVICE_TYPE} and @env{ACC_DEVICE_NUM} can be found in
3068 sections 4.1 and 4.2 of the @uref{https://www.openacc.org, OpenACC} 3347 sections 4.1 and 4.2 of the @uref{https://www.openacc.org, OpenACC}
3069 Application Programming Interface, Version 2.0.} 3348 Application Programming Interface, Version 2.6.}
3349
3350
3351
3352 @c ---------------------------------------------------------------------
3353 @c OpenACC Profiling Interface
3354 @c ---------------------------------------------------------------------
3355
3356 @node OpenACC Profiling Interface
3357 @chapter OpenACC Profiling Interface
3358
3359 @section Implementation Status and Implementation-Defined Behavior
3360
3361 We're implementing the OpenACC Profiling Interface as defined by the
3362 OpenACC 2.6 specification. We're clarifying some aspects here as
3363 @emph{implementation-defined behavior}, while they're still under
3364 discussion within the OpenACC Technical Committee.
3365
3366 This implementation is tuned to keep the performance impact as low as
3367 possible for the (very common) case that the Profiling Interface is
3368 not enabled. This is relevant, as the Profiling Interface affects all
3369 the @emph{hot} code paths (in the target code, not in the offloaded
3370 code). Users of the OpenACC Profiling Interface can be expected to
3371 understand that performance will be impacted to some degree once the
3372 Profiling Interface has been enabled: for example, because of the
3373 @emph{runtime} (libgomp) calling into a third-party @emph{library} for
3374 every event that has been registered.
3375
3376 We're not yet accounting for the fact that @cite{OpenACC events may
3377 occur during event processing}.
3378
3379 We're not yet implementing initialization via a
3380 @code{acc_register_library} function that is either statically linked
3381 in, or dynamically loaded via @env{LD_PRELOAD}.
3382 Initialization via @code{acc_register_library} functions dynamically
3383 loaded via the @env{ACC_PROFLIB} environment variable does work, as
3384 does directly calling @code{acc_prof_register},
3385 @code{acc_prof_unregister}, @code{acc_prof_lookup}.
3386
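As a rough sketch of the @env{ACC_PROFLIB} path described above, a profiling library might define @code{acc_register_library} along the following lines. To keep the example self-contained, the type declarations are simplified stand-ins for those in @file{acc_prof.h} (real code would include that header instead), and their enumerator values are placeholders. The names @code{on_event}, @code{events_seen}, @code{stub_register} and @code{demo_run} are assumptions made for illustration; @code{stub_register} merely imitates the runtime so that the flow can be exercised without libgomp.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for declarations from acc_prof.h.  Real code
   would include that header; the enumerator values are placeholders.  */
typedef enum { acc_ev_none = 0, acc_ev_compute_construct_start,
               acc_ev_compute_construct_end } acc_event_t;
typedef enum { acc_reg = 0 } acc_register_t;
typedef struct acc_prof_info acc_prof_info;    /* opaque in this sketch */
typedef union acc_event_info acc_event_info;   /* opaque in this sketch */
typedef struct acc_api_info acc_api_info;      /* opaque in this sketch */
typedef void (*acc_prof_callback) (acc_prof_info *, acc_event_info *,
                                   acc_api_info *);
typedef void (*acc_prof_reg) (acc_event_t, acc_prof_callback,
                              acc_register_t);
typedef void (*acc_query_fn) (void);
typedef acc_query_fn (*acc_prof_lookup_func) (const char *);

static unsigned long events_seen;   /* hypothetical, not thread-safe */

/* Callback invoked for every event it has been registered for.  */
static void
on_event (acc_prof_info *pi, acc_event_info *ei, acc_api_info *ai)
{
  (void) pi; (void) ei; (void) ai;
  ++events_seen;
}

/* Entry point that is called once the library has been loaded.  */
void
acc_register_library (acc_prof_reg reg, acc_prof_reg unreg,
                      acc_prof_lookup_func lookup)
{
  (void) unreg;
  (void) lookup;   /* no inquiry functions are defined yet */
  reg (acc_ev_compute_construct_start, on_event, acc_reg);
  reg (acc_ev_compute_construct_end, on_event, acc_reg);
}

/* Stand-in for the runtime's registration function.  */
static acc_prof_callback registered[8];
static int n_registered;

static void
stub_register (acc_event_t ev, acc_prof_callback cb, acc_register_t info)
{
  (void) ev; (void) info;
  assert (n_registered < 8);
  registered[n_registered++] = cb;
}

/* Register, then dispatch each recorded callback once.  */
unsigned long
demo_run (void)
{
  n_registered = 0;
  events_seen = 0;
  acc_register_library (stub_register, stub_register, NULL);
  for (int i = 0; i < n_registered; i++)
    registered[i] (NULL, NULL, NULL);
  return events_seen;
}
```

In an actual profiling library only @code{on_event} and @code{acc_register_library} would remain, built as a shared object whose path is given in @env{ACC_PROFLIB}.
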
3387 As there are currently no inquiry functions defined, calls to
3388 @code{acc_prof_lookup} will always return @code{NULL}.
3389
3390 There aren't separate @emph{start}, @emph{stop} events defined for the
3391 event types @code{acc_ev_create}, @code{acc_ev_delete},
3392 @code{acc_ev_alloc}, @code{acc_ev_free}. It's not clear if these
3393 should be triggered before or after the actual device-specific call is
3394 made. We trigger them after.
3395
3396 Remarks about data provided to callbacks:
3397
3398 @table @asis
3399
3400 @item @code{acc_prof_info.event_type}
3401 It's not clear if for @emph{nested} event callbacks (for example,
3402 @code{acc_ev_enqueue_launch_start} as part of a parent compute
3403 construct), this should be set for the nested event
3404 (@code{acc_ev_enqueue_launch_start}), or if the value of the parent
3405 construct should remain (@code{acc_ev_compute_construct_start}). In
3406 this implementation, the value will generally correspond to the
3407 innermost nested event type.
3408
3409 @item @code{acc_prof_info.device_type}
3410 @itemize
3411
3412 @item
3413 For @code{acc_ev_compute_construct_start}, and in presence of an
3414 @code{if} clause with @emph{false} argument, this will still refer to
3415 the offloading device type.
3416 It's not clear if that's the expected behavior.
3417
3418 @item
3419 Complementary to the item before, for
3420 @code{acc_ev_compute_construct_end}, this is set to
3421 @code{acc_device_host} in presence of an @code{if} clause with
3422 @emph{false} argument.
3423 It's not clear if that's the expected behavior.
3424
3425 @end itemize
3426
3427 @item @code{acc_prof_info.thread_id}
3428 Always @code{-1}; not yet implemented.
3429
3430 @item @code{acc_prof_info.async}
3431 @itemize
3432
3433 @item
3434 Not yet implemented correctly for
3435 @code{acc_ev_compute_construct_start}.
3436
3437 @item
3438 In a compute construct, for host-fallback
3439 execution/@code{acc_device_host} it will always be
3440 @code{acc_async_sync}.
3441 It's not clear if that's the expected behavior.
3442
3443 @item
3444 For @code{acc_ev_device_init_start} and @code{acc_ev_device_init_end},
3445 it will always be @code{acc_async_sync}.
3446 It's not clear if that's the expected behavior.
3447
3448 @end itemize
3449
3450 @item @code{acc_prof_info.async_queue}
3451 There is no @cite{limited number of asynchronous queues} in libgomp.
3452 This will always have the same value as @code{acc_prof_info.async}.
3453
3454 @item @code{acc_prof_info.src_file}
3455 Always @code{NULL}; not yet implemented.
3456
3457 @item @code{acc_prof_info.func_name}
3458 Always @code{NULL}; not yet implemented.
3459
3460 @item @code{acc_prof_info.line_no}
3461 Always @code{-1}; not yet implemented.
3462
3463 @item @code{acc_prof_info.end_line_no}
3464 Always @code{-1}; not yet implemented.
3465
3466 @item @code{acc_prof_info.func_line_no}
3467 Always @code{-1}; not yet implemented.
3468
3469 @item @code{acc_prof_info.func_end_line_no}
3470 Always @code{-1}; not yet implemented.
3471
3472 @item @code{acc_event_info.event_type}, @code{acc_event_info.*.event_type}
3473 Relating to @code{acc_prof_info.event_type} discussed above, in this
3474 implementation, this will always be the same value as
3475 @code{acc_prof_info.event_type}.
3476
3477 @item @code{acc_event_info.*.parent_construct}
3478 @itemize
3479
3480 @item
3481 Will be @code{acc_construct_parallel} for all OpenACC compute
3482 constructs as well as many OpenACC Runtime API calls; should be the
3483 one matching the actual construct, or
3484 @code{acc_construct_runtime_api}, respectively.
3485
3486 @item
3487 Will be @code{acc_construct_enter_data} or
3488 @code{acc_construct_exit_data} when processing variable mappings
3489 specified in OpenACC @emph{declare} directives; should be
3490 @code{acc_construct_declare}.
3491
3492 @item
3493 For implicit @code{acc_ev_device_init_start},
3494 @code{acc_ev_device_init_end}, and explicit as well as implicit
3495 @code{acc_ev_alloc}, @code{acc_ev_free},
3496 @code{acc_ev_enqueue_upload_start}, @code{acc_ev_enqueue_upload_end},
3497 @code{acc_ev_enqueue_download_start}, and
3498 @code{acc_ev_enqueue_download_end}, will be
3499 @code{acc_construct_parallel}; should reflect the real parent
3500 construct.
3501
3502 @end itemize
3503
3504 @item @code{acc_event_info.*.implicit}
3505 For @code{acc_ev_alloc}, @code{acc_ev_free},
3506 @code{acc_ev_enqueue_upload_start}, @code{acc_ev_enqueue_upload_end},
3507 @code{acc_ev_enqueue_download_start}, and
3508 @code{acc_ev_enqueue_download_end}, this currently will be @code{1}
3509 also for explicit usage.
3510
3511 @item @code{acc_event_info.data_event.var_name}
3512 Always @code{NULL}; not yet implemented.
3513
3514 @item @code{acc_event_info.data_event.host_ptr}
3515 For @code{acc_ev_alloc}, and @code{acc_ev_free}, this is always
3516 @code{NULL}.
3517
3518 @item @code{typedef union acc_api_info}
3519 @dots{} as printed in @cite{5.2.3. Third Argument: API-Specific
3520 Information}. This should obviously be @code{typedef @emph{struct}
3521 acc_api_info}.
3522
3523 @item @code{acc_api_info.device_api}
3524 Possibly not yet implemented correctly for
3525 @code{acc_ev_compute_construct_start},
3526 @code{acc_ev_device_init_start}, @code{acc_ev_device_init_end}:
3527 will always be @code{acc_device_api_none} for these event types.
3528 For @code{acc_ev_enter_data_start}, it will be
3529 @code{acc_device_api_none} in some cases.
3530
3531 @item @code{acc_api_info.device_type}
3532 Always the same as @code{acc_prof_info.device_type}.
3533
3534 @item @code{acc_api_info.vendor}
3535 Always @code{-1}; not yet implemented.
3536
3537 @item @code{acc_api_info.device_handle}
3538 Always @code{NULL}; not yet implemented.
3539
3540 @item @code{acc_api_info.context_handle}
3541 Always @code{NULL}; not yet implemented.
3542
3543 @item @code{acc_api_info.async_handle}
3544 Always @code{NULL}; not yet implemented.
3545
3546 @end table
3547
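Because several of the fields listed above currently hold placeholder values, a callback may want to guard against them before printing. The following is a minimal sketch and not part of libgomp: @code{struct demo_location} is a hypothetical stand-in holding copies of @code{acc_prof_info.src_file}, @code{acc_prof_info.func_name} and @code{acc_prof_info.line_no}, and @code{format_location} is an illustrative helper name.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical copy of the source-location fields of acc_prof_info.  */
struct demo_location
{
  const char *src_file;    /* may be NULL */
  const char *func_name;   /* may be NULL */
  int line_no;             /* may be -1 */
};

/* Write "file:line (func)" into BUF, substituting "?" for every part
   that is not available.  Returns BUF.  */
static char *
format_location (const struct demo_location *loc, char *buf, size_t size)
{
  assert (loc != NULL && buf != NULL && size > 0);

  const char *file = loc->src_file ? loc->src_file : "?";
  const char *func = loc->func_name ? loc->func_name : "?";

  if (loc->line_no >= 0)
    snprintf (buf, size, "%s:%d (%s)", file, loc->line_no, func);
  else
    snprintf (buf, size, "%s:? (%s)", file, func);
  return buf;
}
```

Checking for @code{NULL} and @code{-1} explicitly lets the same callback keep working if these fields get implemented later.
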
3548 Remarks about certain event types:
3549
3550 @table @asis
3551
3552 @item @code{acc_ev_device_init_start}, @code{acc_ev_device_init_end}
3553 @itemize
3554
3555 @item
3556 @c See 'DEVICE_INIT_INSIDE_COMPUTE_CONSTRUCT' in
3557 @c 'libgomp.oacc-c-c++-common/acc_prof-kernels-1.c',
3558 @c 'libgomp.oacc-c-c++-common/acc_prof-parallel-1.c'.
3559 When a compute construct triggers implicit
3560 @code{acc_ev_device_init_start} and @code{acc_ev_device_init_end}
3561 events, they currently aren't @emph{nested within} the corresponding
3562 @code{acc_ev_compute_construct_start} and
3563 @code{acc_ev_compute_construct_end}, but they're currently observed
3564 @emph{before} @code{acc_ev_compute_construct_start}.
3565 It's not clear what to do: the standard asks us to provide a lot of
3566 details to the @code{acc_ev_compute_construct_start} callback, without
3567 (implicitly) initializing a device before?
3568
3569 @item
3570 Callbacks for these event types will not be invoked for calls to the
3571 @code{acc_set_device_type} and @code{acc_set_device_num} functions.
3572 It's not clear if they should be.
3573
3574 @end itemize
3575
3576 @item @code{acc_ev_enter_data_start}, @code{acc_ev_enter_data_end}, @code{acc_ev_exit_data_start}, @code{acc_ev_exit_data_end}
3577 @itemize
3578
3579 @item
3580 Callbacks for these event types will also be invoked for OpenACC
3581 @emph{host_data} constructs.
3582 It's not clear if they should be.
3583
3584 @item
3585 Callbacks for these event types will also be invoked when processing
3586 variable mappings specified in OpenACC @emph{declare} directives.
3587 It's not clear if they should be.
3588
3589 @end itemize
3590
3591 @end table
3592
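Given the ordering remarks above, a profiling library should not assume that device initialization events are nested within a compute construct. The following sketch shows one way to count both cases. It is an illustration only: @code{enum demo_event}, @code{struct demo_tracker} and @code{demo_track} are hypothetical names, and the enumerators stand in for the corresponding @code{acc_ev_*} values.

```c
#include <assert.h>

/* Hypothetical event tags standing in for the acc_ev_* values.  */
enum demo_event
{
  DEMO_DEVICE_INIT_START,
  DEMO_DEVICE_INIT_END,
  DEMO_COMPUTE_START,
  DEMO_COMPUTE_END
};

struct demo_tracker
{
  int compute_depth;   /* currently open compute constructs */
  int inits_inside;    /* device inits seen within a compute construct */
  int inits_outside;   /* device inits seen outside of any construct */
};

/* Update T for event EV without assuming any particular nesting.  */
static void
demo_track (struct demo_tracker *t, enum demo_event ev)
{
  assert (t != 0);
  switch (ev)
    {
    case DEMO_DEVICE_INIT_START:
      if (t->compute_depth > 0)
        t->inits_inside++;
      else
        t->inits_outside++;
      break;
    case DEMO_DEVICE_INIT_END:
      break;
    case DEMO_COMPUTE_START:
      t->compute_depth++;
      break;
    case DEMO_COMPUTE_END:
      if (t->compute_depth > 0)
        t->compute_depth--;
      break;
    }
}
```

With the behavior currently described, an implicit initialization triggered by a compute construct would be counted in @code{inits_outside}.
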
3593 Callbacks for the following event types will be invoked, but dispatch
3594 and information provided therein have not yet been thoroughly reviewed:
3595
3596 @itemize
3597 @item @code{acc_ev_alloc}
3598 @item @code{acc_ev_free}
3599 @item @code{acc_ev_update_start}, @code{acc_ev_update_end}
3600 @item @code{acc_ev_enqueue_upload_start}, @code{acc_ev_enqueue_upload_end}
3601 @item @code{acc_ev_enqueue_download_start}, @code{acc_ev_enqueue_download_end}
3602 @end itemize
3603
3604 During device initialization, and finalization, respectively,
3605 callbacks for the following event types will not yet be invoked:
3606
3607 @itemize
3608 @item @code{acc_ev_alloc}
3609 @item @code{acc_ev_free}
3610 @end itemize
3611
3612 Callbacks for the following event types have not yet been implemented,
3613 so currently won't be invoked:
3614
3615 @itemize
3616 @item @code{acc_ev_device_shutdown_start}, @code{acc_ev_device_shutdown_end}
3617 @item @code{acc_ev_runtime_shutdown}
3618 @item @code{acc_ev_create}, @code{acc_ev_delete}
3619 @item @code{acc_ev_wait_start}, @code{acc_ev_wait_end}
3620 @end itemize
3621
3622 For the following runtime library functions, not all expected
3623 callbacks will be invoked (mostly concerning implicit device
3624 initialization):
3625
3626 @itemize
3627 @item @code{acc_get_num_devices}
3628 @item @code{acc_set_device_type}
3629 @item @code{acc_get_device_type}
3630 @item @code{acc_set_device_num}
3631 @item @code{acc_get_device_num}
3632 @item @code{acc_init}
3633 @item @code{acc_shutdown}
3634 @end itemize
3635
3636 Aside from implicit device initialization, for the following runtime
3637 library functions, no callbacks will be invoked for shared-memory
3638 offloading devices (it's not clear if they should be):
3639
3640 @itemize
3641 @item @code{acc_malloc}
3642 @item @code{acc_free}
3643 @item @code{acc_copyin}, @code{acc_present_or_copyin}, @code{acc_copyin_async}
3644 @item @code{acc_create}, @code{acc_present_or_create}, @code{acc_create_async}
3645 @item @code{acc_copyout}, @code{acc_copyout_async}, @code{acc_copyout_finalize}, @code{acc_copyout_finalize_async}
3646 @item @code{acc_delete}, @code{acc_delete_async}, @code{acc_delete_finalize}, @code{acc_delete_finalize_async}
3647 @item @code{acc_update_device}, @code{acc_update_device_async}
3648 @item @code{acc_update_self}, @code{acc_update_self_async}
3649 @item @code{acc_map_data}, @code{acc_unmap_data}
3650 @item @code{acc_memcpy_to_device}, @code{acc_memcpy_to_device_async}
3651 @item @code{acc_memcpy_from_device}, @code{acc_memcpy_from_device_async}
3652 @end itemize
3070 3653
3071 3654
3072 3655
3073 @c --------------------------------------------------------------------- 3656 @c ---------------------------------------------------------------------
3074 @c The libgomp ABI 3657 @c The libgomp ABI
3475 4058
3476 @node Reporting Bugs 4059 @node Reporting Bugs
3477 @chapter Reporting Bugs 4060 @chapter Reporting Bugs
3478 4061
3479 Bugs in the GNU Offloading and Multi Processing Runtime Library should 4062 Bugs in the GNU Offloading and Multi Processing Runtime Library should
3480 be reported via @uref{http://gcc.gnu.org/bugzilla/, Bugzilla}. Please add 4063 be reported via @uref{https://gcc.gnu.org/bugzilla/, Bugzilla}. Please add
3481 "openacc", or "openmp", or both to the keywords field in the bug 4064 "openacc", or "openmp", or both to the keywords field in the bug
3482 report, as appropriate. 4065 report, as appropriate.
3483 4066
3484 4067
3485 4068