Commit 53ba7e4

Update README and Docs for 20.12 release (#137)
1 parent 37afcf2 commit 53ba7e4

4 files changed

Lines changed: 91 additions & 8 deletions

Dockerfile

Lines changed: 1 addition & 1 deletion
@@ -12,7 +12,7 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
 
-ARG BASE_IMAGE=nvcr.io/nvidia/pytorch:20.09-py3
+ARG BASE_IMAGE=nvcr.io/nvidia/pytorch:20.12-py3
 
 ############################################################################
 ## Install PyProf

README.rst

Lines changed: 86 additions & 3 deletions
@@ -18,11 +18,42 @@
 PyProf - PyTorch Profiling tool
 ===============================
 
-**NOTE: You are currently on teh r20.12 branch which tracks stabilization
-towards the release. This branch is not usable during stabilization**
-
 .. overview-begin-marker-do-not-remove
 
+PyProf is a tool that profiles and analyzes the GPU performance of PyTorch
+models. PyProf aggregates kernel performance from `Nsight Systems
+<https://developer.nvidia.com/nsight-systems>`_ or `NvProf
+<https://developer.nvidia.com/nvidia-visual-profiler>`_ and provides the
+following additional features:
+
+What's New in 3.7.0
+-------------------
+
+* Monkey patching support for APEX libraries.
+
+Features
+--------
+
+* Identifies the layer that launched a kernel: e.g. the association of
+`ComputeOffsetsKernel` with a concrete PyTorch layer or API is not obvious.
+
+* Identifies the tensor dimensions and precision: without knowing the tensor
+dimensions and precision, it's impossible to reason about whether the actual
+(silicon) kernel time is close to maximum performance of such a kernel on
+the GPU. Knowing the tensor dimensions and precision, we can figure out the
+FLOPs and bandwidth required by a layer, and then determine how close to
+maximum performance the kernel is for that operation.
+
+* Forward-backward correlation: PyProf determines what the forward pass step
+is that resulted in the particular weight and data gradients (wgrad, dgrad),
+which makes it possible to determine the tensor dimensions required by these
+backprop steps to assess their performance.
+
+* Determines Tensor Core usage: PyProf can highlight the kernels that use
+`Tensor Cores <https://developer.nvidia.com/tensor-cores>`_.
+
+* Correlate the line in the user's code that launched a particular kernel (program trace).
+
 .. overview-end-marker-do-not-remove
 
 Quick Installation Instructions
@@ -75,5 +106,57 @@ Quick Start Instructions
 
 .. quick-start-end-marker-do-not-remove
 
+Documentation
+-------------
+
+The User Guide can be found in the
+`documentation for current release
+<https://docs.nvidia.com/deeplearning/frameworks/pyprof-user-guide/index.html>`_, and
+provides instructions on how to install and profile with PyProf.
+
+A complete `Quick Start Guide <https://docs.nvidia.com/deeplearning/frameworks/pyprof-user-guide/quickstart.html>`_
+provides step-by-step instructions to get you quickly started using PyProf.
+
+An `FAQ <https://docs.nvidia.com/deeplearning/frameworks/pyprof-user-guide/faqs.html>`_ provides
+answers for frequently asked questions.
+
+The `Release Notes
+<https://docs.nvidia.com/deeplearning/frameworks/pyprof-release-notes/index.html>`_
+indicate the required versions of the NVIDIA Driver and CUDA, and also describe
+which GPUs are supported by PyProf
+
+Presentation and Papers
+^^^^^^^^^^^^^^^^^^^^^^^
+
+* `Automating End-toEnd PyTorch Profiling <https://developer.nvidia.com/gtc/2020/video/s21143>`_.
+* `Presentation slides <https://developer.download.nvidia.com/video/gputechconf/gtc/2020/presentations/s21143-automating-end-to-end-pytorch-profiling.pdf>`_.
+
+Contributing
+------------
+
+Contributions to PyProf are more than welcome. To
+contribute make a pull request and follow the guidelines outlined in
+the `Contributing <CONTRIBUTING.md>`_ document.
+
+Reporting problems, asking questions
+------------------------------------
+
+We appreciate any feedback, questions or bug reporting regarding this
+project. When help with code is needed, follow the process outlined in
+the Stack Overflow (https://stackoverflow.com/help/mcve)
+document. Ensure posted examples are:
+
+* minimal – use as little code as possible that still produces the
+same problem
+
+* complete – provide all parts needed to reproduce the problem. Check
+if you can strip external dependency and still show the problem. The
+less time we spend on reproducing problems the more time we have to
+fix it
+
+* verifiable – test the code you're about to provide to make sure it
+reproduces the problem. Remove all other problems that are not
+related to your request/question.
+
 .. |License| image:: https://img.shields.io/badge/License-Apache2-green.svg
 :target: http://www.apache.org/licenses/LICENSE-2.0
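Review note on the Features text added to README.rst: it says that knowing tensor dimensions and precision lets you work out the FLOPs and bandwidth a layer needs and compare them with the GPU's maximum. A minimal sketch of that arithmetic for a bias-free linear layer follows. The function names, the simple roofline bound, and every number (layer shape, kernel time, peak throughput, peak bandwidth) are illustrative assumptions, not PyProf APIs or measured values.

```python
def linear_layer_cost(batch, in_features, out_features, bytes_per_element):
    """Return (flops, bytes_moved) for y = x @ W^T without bias (hypothetical helper)."""
    # Each output element needs in_features multiplies and in_features adds.
    flops = 2 * batch * in_features * out_features
    elements = (
        batch * in_features            # input x is read once
        + in_features * out_features   # weight W is read once
        + batch * out_features         # output y is written once
    )
    return flops, elements * bytes_per_element


def fraction_of_peak(flops, bytes_moved, kernel_seconds, peak_flops, peak_bandwidth):
    """Ratio of a simple roofline lower bound to the measured kernel time."""
    # The kernel cannot finish faster than its compute time or its memory time.
    ideal_seconds = max(flops / peak_flops, bytes_moved / peak_bandwidth)
    return ideal_seconds / kernel_seconds


# Assumed example: FP16 (2 bytes per element), made-up hardware peaks and timing.
flops, bytes_moved = linear_layer_cost(
    batch=64, in_features=1024, out_features=4096, bytes_per_element=2
)
efficiency = fraction_of_peak(
    flops,
    bytes_moved,
    kernel_seconds=50e-6,   # assumed measured kernel time
    peak_flops=100e12,      # assumed peak throughput in FLOP/s
    peak_bandwidth=1e12,    # assumed peak memory bandwidth in bytes/s
)
print(f"FLOPs: {flops}, bytes moved: {bytes_moved}, fraction of peak: {efficiency:.3f}")
```

With these assumed numbers the memory term dominates the bound, so the layer would be bandwidth-limited; changing the precision changes `bytes_per_element` and therefore the bound, which is why the README stresses that both dimensions and precision are needed.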

docs/install.rst

Lines changed: 2 additions & 2 deletions
@@ -48,6 +48,6 @@ the most recent version of CUDA, Docker, and nvidia-docker.
 After performing the above setup, you can pull the PyProf container
 using the following command::
 
-docker pull nvcr.io/nvidia/pytorch:20.10-py3
+docker pull nvcr.io/nvidia/pytorch:20.12-py3
 
-Replace *20.10* with the version of PyTorch container that you want to pull.
+Replace *20.12* with the version of PyTorch container that you want to pull.

docs/quickstart.rst

Lines changed: 2 additions & 2 deletions
@@ -39,7 +39,7 @@ Prerequisites
 drop down button. After cloning the repo be sure to select the r<xx.yy>
 release branch that corresponds to the version of PyProf want to use::
 
-$ git checkout r20.10
+$ git checkout r20.12
 
 * If you are starting with a pre-built NGC container, you will need to install
 Docker and nvidia-docker. For DGX users, see `Preparing to use NVIDIA Containers
@@ -75,7 +75,7 @@ the GitHub repo and checkout the release version of the branch that
 you want to build (or the master branch if you want to build the
 under-development version)::
 
-$ git checkout r20.10
+$ git checkout r20.12
 
 Then use docker to build::
 