|
18 | 18 | PyProf - PyTorch Profiling tool |
19 | 19 | =============================== |
20 | 20 |
|
21 | | - **NOTE: You are currently on the r20.12 branch which tracks stabilization |
22 | | - towards the release. This branch is not usable during stabilization** |
23 | | - |
24 | 21 | .. overview-begin-marker-do-not-remove |
25 | 22 |
|
| 23 | +PyProf is a tool that profiles and analyzes the GPU performance of PyTorch |
| 24 | +models. PyProf aggregates kernel performance from `Nsight Systems |
| 25 | +<https://developer.nvidia.com/nsight-systems>`_ or `NvProf |
| 26 | +<https://developer.nvidia.com/nvidia-visual-profiler>`_ and provides the |
| 27 | +following additional features: |
| 28 | + |
| 29 | +What's New in 3.7.0 |
| 30 | +------------------- |
| 31 | + |
| 32 | +* Monkey patching support for APEX libraries. |
| 33 | + |
| 34 | +Features |
| 35 | +-------- |
| 36 | + |
| 37 | +* Identifies the layer that launched a kernel: e.g. the association of |
| 38 | + `ComputeOffsetsKernel` with a concrete PyTorch layer or API is not obvious. |
| 39 | + |
| 40 | +* Identifies the tensor dimensions and precision: without knowing the tensor |
| 41 | + dimensions and precision, it's impossible to reason about whether the actual |
| 42 | + (silicon) kernel time is close to maximum performance of such a kernel on |
| 43 | + the GPU. Knowing the tensor dimensions and precision, we can figure out the |
| 44 | + FLOPs and bandwidth required by a layer, and then determine how close to |
| 45 | + maximum performance the kernel is for that operation. |
| 46 | + |
| 47 | +* Forward-backward correlation: PyProf determines which forward pass step
| 48 | + resulted in a particular weight or data gradient (wgrad, dgrad), which
| 49 | + makes it possible to determine the tensor dimensions required by these
| 50 | + backprop steps and assess their performance.
| 51 | + |
| 52 | +* Determines Tensor Core usage: PyProf can highlight the kernels that use |
| 53 | + `Tensor Cores <https://developer.nvidia.com/tensor-cores>`_. |
| 54 | + |
| 55 | +* Correlates each kernel with the line in the user's code that launched it (program trace).
| 56 | + |
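The reasoning in the tensor dimensions and precision bullet can be sketched with a little arithmetic. The snippet below is an illustrative assumption, not part of PyProf: the helper names ``gemm_cost`` and ``achieved_rates``, the matrix sizes, and the 100 microsecond kernel time are all made up for the example. It estimates the FLOPs and bytes moved by a single matrix multiply, then converts a measured kernel time into achieved throughput that can be compared against a GPU's published peak.

```python
# Hypothetical helpers for illustration only; these are not PyProf APIs.
BYTES_PER_ELEMENT = {"fp16": 2, "fp32": 4, "fp64": 8}


def gemm_cost(m, n, k, precision):
    """Return (flops, bytes_moved) for C[m, n] = A[m, k] @ B[k, n]."""
    # One multiply and one add per term of each inner product.
    flops = 2 * m * n * k
    # Idealized traffic: read A and B once, write C once.
    elements = m * k + k * n + m * n
    return flops, elements * BYTES_PER_ELEMENT[precision]


def achieved_rates(flops, bytes_moved, kernel_time_s):
    """Convert a measured kernel time into (TFLOP/s, GB/s)."""
    return flops / kernel_time_s / 1e12, bytes_moved / kernel_time_s / 1e9


# Assumed example values: a 1024 x 1024 x 1024 FP16 GEMM that took 100 us.
flops, bytes_moved = gemm_cost(m=1024, n=1024, k=1024, precision="fp16")
tflops, gbps = achieved_rates(flops, bytes_moved, kernel_time_s=100e-6)
print(f"{tflops:.2f} TFLOP/s, {gbps:.2f} GB/s")
```

Without the dimensions and precision, neither number can be computed, which is why a raw kernel time alone says little about how close a kernel is to peak.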
26 | 57 | .. overview-end-marker-do-not-remove |
27 | 58 |
|
28 | 59 | Quick Installation Instructions |
@@ -75,5 +106,57 @@ Quick Start Instructions |
75 | 106 |
|
76 | 107 | .. quick-start-end-marker-do-not-remove |
77 | 108 |
|
| 109 | +Documentation |
| 110 | +------------- |
| 111 | + |
| 112 | +The User Guide can be found in the |
| 113 | +`documentation for current release |
| 114 | +<https://docs.nvidia.com/deeplearning/frameworks/pyprof-user-guide/index.html>`_, and |
| 115 | +provides instructions on how to install and profile with PyProf. |
| 116 | + |
| 117 | +A complete `Quick Start Guide <https://docs.nvidia.com/deeplearning/frameworks/pyprof-user-guide/quickstart.html>`_ |
| 118 | +provides step-by-step instructions to get you quickly started using PyProf. |
| 119 | + |
| 120 | +An `FAQ <https://docs.nvidia.com/deeplearning/frameworks/pyprof-user-guide/faqs.html>`_ provides |
| 121 | +answers for frequently asked questions. |
| 122 | + |
| 123 | +The `Release Notes |
| 124 | +<https://docs.nvidia.com/deeplearning/frameworks/pyprof-release-notes/index.html>`_ |
| 125 | +indicate the required versions of the NVIDIA Driver and CUDA, and also describe |
| 126 | +which GPUs are supported by PyProf.
| 127 | + |
| 128 | +Presentation and Papers |
| 129 | +^^^^^^^^^^^^^^^^^^^^^^^ |
| 130 | + |
| 131 | +* `Automating End-to-End PyTorch Profiling <https://developer.nvidia.com/gtc/2020/video/s21143>`_.
| 132 | + * `Presentation slides <https://developer.download.nvidia.com/video/gputechconf/gtc/2020/presentations/s21143-automating-end-to-end-pytorch-profiling.pdf>`_. |
| 133 | + |
| 134 | +Contributing |
| 135 | +------------ |
| 136 | + |
| 137 | +Contributions to PyProf are more than welcome. To
| 138 | +contribute, make a pull request and follow the guidelines outlined in
| 139 | +the `Contributing <CONTRIBUTING.md>`_ document.
| 140 | + |
| 141 | +Reporting problems, asking questions |
| 142 | +------------------------------------ |
| 143 | + |
| 144 | +We appreciate any feedback, questions, or bug reports regarding this
| 145 | +project. When help with code is needed, follow the process outlined in |
| 146 | +the Stack Overflow (https://stackoverflow.com/help/mcve) |
| 147 | +document. Ensure posted examples are: |
| 148 | + |
| 149 | +* minimal – use as little code as possible that still produces the |
| 150 | + same problem |
| 151 | + |
| 152 | +* complete – provide all parts needed to reproduce the problem. Check
| 153 | + whether you can strip external dependencies and still show the problem.
| 154 | + The less time we spend reproducing problems, the more time we have to
| 155 | + fix them.
| 156 | + |
| 157 | +* verifiable – test the code you're about to provide to make sure it |
| 158 | + reproduces the problem. Remove all other problems that are not |
| 159 | + related to your request/question. |
| 160 | + |
78 | 161 | .. |License| image:: https://img.shields.io/badge/License-Apache2-green.svg |
79 | 162 | :target: http://www.apache.org/licenses/LICENSE-2.0 |