diff --git a/docs/development.md b/docs/development.md
index 28d1aefae..d1a4abacd 100644
--- a/docs/development.md
+++ b/docs/development.md
@@ -11,41 +11,26 @@ please open an [issue]() or
 ## Quickstart
-- Make a fork of the msprime repo on [GitHub]()
-- Clone your fork into a local directory, making sure that the **submodules
- are correctly initialised**:
+- Fork the msprime repo on [GitHub]() and
+ clone your fork, making sure that the **submodules are correctly initialised**:
 ```
- $ git clone git@github.com:tskit-dev/msprime.git --recurse-submodules
+ $ git clone git@github.com:YOUR_GITHUB_USERNAME/msprime.git --recurse-submodules
 ```
- For an already checked out repo, the submodules can be initialised using:
+ For an already checked out repo, initialise submodules with:
 ```
 $ git submodule update --init --recursive
 ```
-- Install the {ref}`requirements `.
-- Build the low level module by running `make` in the project root.
-- Run the tests to ensure everything has worked: `python3 -m pytest`. These should
- all pass.
-- Install the pre-commit checks: `pre-commit install`
-- Make your changes in a local branch. On each commit a [pre-commit hook]() will run
- checks for code style and common problems.
- Sometimes these will report "files were modified by this hook" `git add`
- and `git commit --amend` will update the commit with the automatically modified
- version.
- The modifications made are for consistency, code readability and designed to
- minimise merge conflicts. They are guaranteed not to modify the functionality of the
- code. To run the checks without committing use `pre-commit run`. To bypass
- the checks (to save or get feedback on work-in-progress) use `git commit
- --no-verify`
-- If you have modified the C code then
- `clang-format -i lib/tests/* lib/!(avl).{c,h}` will format the code to
- satisfy CI checks.
-- When ready open a pull request on GitHub. Please make sure that the tests pass before
- you open the PR, unless you want to ask the community for help with a failing test.
-- See the [tskit documentation]()
- for more details on the recommended GitHub workflow.
+- Install dependencies and build the low-level module: `uv sync`
+- Install the prek pre-commit hook: `uv run prek install`
+- Run the tests to ensure everything is working: `uv run pytest`
+- Make your changes in a local branch and open a pull request when ready.
+
+See the [tskit developer documentation](https://tskit.dev/tskit/docs/stable/development.html)
+for details on the recommended git workflow, running lint checks, and building
+the documentation.
 (sec_development_requirements)=
 ## Requirements
@@ -64,17 +49,11 @@ on the documentation because it requires a locally built version of the
 ### Python requirements
-The packages needed for development are specified as optional dependencies
-in the ``pyproject.toml`` file. Install these using:
+The packages needed for development are specified as dependency groups in
+``python/pyproject.toml``. Install them along with the low-level C extension using:
 ```
-$ python -m pip install -e ".[dev]"
-```
-
-For conda users, you may need to install GSL first:
-```
-conda install gsl
-pip install -e ".[dev]"
+$ uv sync
 ```
 ## Overview
@@ -94,17 +73,9 @@ documented in the following sections.
 ## Continuous integration tests
-Two different continuous integration providers are used, which run different
-combinations of tests on different platforms:
-
-1. [GitHub Actions]() run a variety of code style and
- quality checks using [pre-commit]() along
- with Python tests on Linux, macOS and Windows. The docs are also built and a preview
- generated if changes are detected.
-2. [CircleCI]() Runs all Python tests using the apt-get
- infrastructure for system requirements. Additionally, the low-level tests
- are run, coverage statistics calculated using [CodeCov](),
- and the documentation built.
+CI uses shared GitHub Actions workflows from the tskit-dev ecosystem. See the
+[repo administration guide](https://github.com/tskit-dev/.github/blob/main/repo_administration.md)
+for details of the available workflows and how they are configured.
 (sec_development_documentation)=
@@ -118,8 +89,7 @@ is split into two main sections:
 a particular function or class. See the {ref}`sec_development_documentation_api` section for details.
 2. Thematically structured sections which discuss the functionality and
- explain features via minimal examples. See the
- {ref}`sec_development_documentation_examples` section for details.
+ explain features via minimal examples.
 Further documentation where features are combined to perform specific tasks is provided in the [tskit tutorials](https://tskit.dev/tutorials) site.
@@ -141,19 +111,17 @@ Once you have created and checked out a "topic branch", you are ready to start
 editing the documentation.
 :::{note}
-Please make sure you have built the low-level {ref}`C module `
-by running ``make`` in the project root directory before going any further.
-A lot of inscrutable errors are caused by a mismatch between the low-level C module
-installed in your system (or an older development version you previously compiled)
-and the local development version of msprime.
+The documentation requires a working build of the low-level C module. This is
+built automatically by `uv sync`, but if you see inscrutable import errors a
+mismatch between the installed and local versions of the module is a common cause.
+Rebuild by running `make` in the project root.
 :::
 (sec_development_documentation_building)=
 ### Building
 To build the documentation locally, go to the `docs` directory and run `make`
-(ensure that the {ref}`sec_development_requirements` have been installed
-and the low-level C module has been built---see the note in the previous section).
+(ensure that the {ref}`sec_development_requirements` have been installed).
 This will build the HTML documentation in `docs/_build/html/`. You can now view the local build of the HTML in your local browser (if you do not know how to do this, try double clicking the HTML file).
@@ -164,207 +132,43 @@ try running ``make clean`` which will delete all of the HTML and
 cached Jupyter notebook content.
 :::
-### JupyterBook
-
-Documentation for msprime is built using [Jupyter Book](https://jupyterbook.org),
-which allows us to mix API documentation generated automatically using
-[Sphinx](https://www.sphinx-doc.org) with code examples evaluated in a
-local [Jupyter](https://jupyter.org) kernel. This is a very powerful
-system that allows us to generate beautiful and useful documentation,
-but it is quite new and has some quirks and gotchas.
-In particular, because of the mixture of API documentation and notebook
-content we need to write documentation using **two different markup
-languages**.
-
-#### reStructuredText
-
-All of the documentation for previous versions of msprime was written
-using the [reStructuredText]() format
-(rST) which is the default for Python documentation. Because of this, all of
-the API docstrings (see the {ref}`sec_development_documentation_api` section)
-are written using rST. Converting these docstrings to Markdown
-would be a lot of work (and support from upstream tools for Markdown
-dosctrings is patchy), and so we need to use rST for this
-purpose for the forseeable future.
-
-Some of the directives we use are only available in rST, and so these
-must be enclosed in ``eval-rst`` blocks like so:
-
-````md
-```{eval-rst}
-.. autoclass:: msprime.StandardCoalescent
-```
-````
-
-#### Markdown
-
-Everything **besides** API docstrings is written using
-[MyST Markdown](https://jupyterbook.org/content/myst.html). This is a
-superset of [common Markdown](https://commonmark.org) which
-enables executable Jupyter content to be included in the documentation.
-In particular, JupyterBook and MyST are built on top of
-[Sphinx](https://www.sphinx-doc.org) which allows us to do lots
-of cross-referencing.
-
-Some useful links:
-
-- The [MyST cheat sheet](https://jupyterbook.org/reference/cheatsheet.html)
- is a great resource.
-- The "Write Book Content" part of the [Jupyter Book](https://jupyterbook.org/)
- documentation has lots of helpful examples and links.
-- The [MyST Syntax Guide](https://myst-parser.readthedocs.io/en/latest/using/syntax.html)
- is a good reference for the full syntax
-- Sphinx
- [directives](https://www.sphinx-doc.org/en/master/usage/restructuredtext/directives.html).
- Some of these will work with Jupyter Book, some won't. There's currently no
- comprehensive list of those that do. However, we tend to only use a small subset
- of the available directives, and you can usually get by following existing
- local examples.
-- The [types of source files](https://jupyterbook.org/file-types/index.html)
- section in the Jupyter Book documentation is useful reference for mixing
- and matching Markdown and rST like we're doing.
-
 (sec_development_documentation_api)=
 ### API Reference
-API reference documentation is provided by
-[docstrings](https://www.python.org/dev/peps/pep-0257/) in the source
-code. These docstrings are written using
-[reStructuredText]()
-and [Sphinx](https://www.sphinx-doc.org).
-
-Docstrings should be **concise** and **precise**. Examples should not
-be provided directly in the docstrings, but each significant
-parameter (e.g.) in a function should have links to the corresponding
-{ref}`examples ` section.
-
-```{eval-rst}
-.. todo:: Provide an example of a well-documented docstring with
- links to an examples section. We should use one of the simpler
- functions as an example of this.
-```
-
-(sec_development_documentation_examples)=
-### Examples
-
-The API reference documentation is gives precise formal information about
-how to use a particular function of class. The rest of the manual should
-provide the discussion and examples needed to contextualise this information
-and help users to orient themselves. The examples section for a given
-feature (e.g., function parameter) should:
-
-- Provide some background into what this feature is for, so that an
- unsure reader can quickly orient themselves (external links to explain
- concepts is good for this).
-- Give examples using inline Jupyter code to illustrate the various
- different ways that this feature can be used. These examples should
- be as small as possible, so that the overall document runs quickly.
-
-Juptyer notebook code is incluced by using blocks like this:
-
-````md
-```{code-cell}
-print("This is python code!)
-a = list(range(10)
-a
-```
-````
-
-These cells behave exactly like they would in a Jupyter notebook (the
-whole document is actually treated and executed like one notebook)
-
-:::{warning}
-For a document to be evaluated as a notebook you **must** have
-exactly the right [YAML Frontmatter](
-https://jupyterbook.org/reference/cheatsheet.html#executable-code)
-at the top of the file.
-:::
+API docstrings are written in rST. The msprime docstrings predate the switch to
+MyST Markdown and converting them would be a significant effort, so rST remains
+the format for docstrings for the foreseeable future.
-(sec_development_documentation_cross_referencing)=
-### Cross referencing
+Docstrings should be **concise** and **precise**. Examples should not be
+embedded directly in docstrings; instead, each significant parameter should
+link to the relevant section in the narrative documentation.
-Cross referencing is done by using the ``{ref}`` inline role
-(see Jupyter Book [documentation](https://jupyterbook.org/content/citations.html)
-for more details) to link
-to labelled sections within the manual or to API documentation.
-
-Sections within the manual should be labelled hierarchically, for example
-this section is labelled like this:
-
-````md
-(sec_development_documentation_cross_referencing)=
-### Cross referencing
-````
-
-Elsewhere in the Markdown documentation we can then refer to this
-section like:
-
-````md
-See the {ref}`sec_development_documentation_cross_referencing` section for details.
-````
-
-Cross references like this will automatically use the section name
-as the link text, which we can override if we like:
-
-````md
-See {ref}`another section` for more information.
-````
-
-To refer to a given section from an rST docstring, we'd do something like
-````rst
-See the :ref:`sec_development_documentation_cross_referencing` section for more details.
-````
-
-When we want to refer to the API documentation for a function or class, we
-use the appropriate inline text role to do so. For example,
-
-````md
-The {func}`.sim_ancestry` function lets us simulate ancestral histories.
-````
-
-It's a good idea to always use this form when referring to functions
-or classes so that the reader always has direct access to the API
-documentation for a given function when they might it.
+See the [tskit developer documentation](https://tskit.dev/tskit/docs/stable/development.html)
+for guidance on markup languages, code examples, and cross referencing.
 ## High-level Python
-Throughout this document, we assume that the `msprime` package is built and
-run locally *within* the project directory. That is, `msprime` is *not* installed
-into the Python installation using `pip install -e` or setuptools [development
-mode](). Please
-ensure that you build the low-level module using (e.g.) `make` and that
-the shared object file is in the `msprime` directory. This will have a name
-like `_msprime.cpython-38-x86_64-linux-gnu.so`, depending on your platform
-and Python version.
+The `msprime` package is installed in editable mode by `uv sync`. The low-level
+C extension is also built automatically at that point.
 ### Conventions
-All Python code follows the [PEP8]() style
-guide, and is checked using the [flake8]() tool as
-part of the continuous integration tests. [Black]() is
-used as part of the pre-commit hook for python code style and formatting.
-
-### Packaging
-
-`msprime` is packaged and distributed as Python module, and follows the current
-[best-practices]() advocated by the
-[Python Packaging Authority](). The primary means of
-distribution is though [PyPI](), which provides the
-canonical source for each release.
-
-A package for [conda]() is also available on
-[conda-forge]().
+Python code is formatted and linted using [ruff](https://docs.astral.sh/ruff/),
+run automatically via [prek](https://prek.j178.dev) on each commit.
 ### Tests
-The tests for the high-level code are in the `tests` directory, and run using
-[pytest](). A lot of the simulation and basic
-tests are contained in the `tests/test_highlevel.py` file, but more recently
-smaller test files with more focussed tests are preferred (e.g., `test_vcf.py`,
-`test_demography.py`).
+Tests are in the `python/tests` directory and run with
+[pytest](). Core simulation and basic tests
+are in `python/tests/test_highlevel.py`; more focused tests are in smaller files
+(e.g., `test_demography.py`). Run the suite with:
+
+```
+$ uv run pytest
+```
-All new code must have high test coverage, which will be checked as part of the
-continuous integration tests by [CodeCov]().
+All new code must have high test coverage, tracked by
+[CodeCov]().
 ### Interfacing with low-level module
@@ -374,18 +178,6 @@ are really just a shallow layer on top of the corresponding low-level object.
 The convention here is to keep a reference to the low-level object via a private instance variable such as `self._ll_recombination_map`.
-### Command line interfaces
-
-The command line interfaces for `msprime` are defined in the `msprime/cli.py` file.
-Each CLI has a single entry point (e.g. `msp_main`) which is invoked to run the
-program. These entry points are registered with `setuptools` using the
-`console_scripts` argument in `setup.py`, which allows them to be deployed as
-first-class executable programs in a cross-platform manner.
-
-There are simple scripts in the root of the project (currently: `msp_dev.py`,
-`mspms_dev.py`) which are used for development. For example, to run the
-development version of `mspms` use `python3 mspms_dev.py`.
-
 ## C Library
 The low-level code for `msprime` is written in C, and is structured as a
@@ -408,52 +200,15 @@ $ sudo apt-get install libcunit1-dev libconfig-dev ninja-build
 ```
-Meson is best installed via `pip`:
-
-```{code-block} bash
-
-$ python3 -m pip install meson --user
-
-```
-
-On macOS rather than use `apt-get` for installation of these requirements
-a combination of `homebrew` and `pip` can be used (working as of 2020-01-15).
-
-```{code-block} bash
-
-$ brew install cunit
-$ python3 -m pip install meson --user
-$ python3 -m pip install ninja --user
-
-```
-
-On macOS, conda builds are generally done using `clang` packages that are kept up to date:
-
-```{code-block} bash
-
-$ conda install clang_osx-64 clangxx_osx-64
-
-```
-
-In order to make sure that these compilers work correctly (*e.g.*, so that they can find
-other dependencies installed via `conda`), you need to compile `msprime` with this command
-on versions of macOS older than "Mojave":
+Meson can be installed with:
 ```{code-block} bash
-$ CONDA_BUILD_SYSROOT=/ python3 setup.py build_ext -i
+$ uv tool install meson
 ```
-On more recent macOS releases, you may omit the `CONDA_BUILD_SYSROOT` prefix.
-
-:::{note}
-
-The use of the C toolchain on macOS is a moving target. The above advice
-was written on 23 January 2020 and was validated by a few `msprime` contributors.
-Caveat emptor, etc..
-
-:::
+On macOS, `brew install cunit` can be used in place of `apt-get`.
 ### Compiling
@@ -578,22 +333,15 @@ automatically tracked using [CodeCov](
 C code is formatted using [clang-format]()
-with a custom configuration.
-To ensure that your code is correctly formatted, you can run
+with a custom configuration. This is checked automatically by prek. To format
+all files manually run:
 ```{code-block} bash
-make clang-format
+$ uv run prek --all-files
 ```
-in the project root before submitting a pull request. Alternatively,
-you can run `clang-format -i *.[c,h]` in the `lib` directory.
-
-Vim users may find the
-[vim-clang-format]()
-plugin useful for automatically formatting code.
-
 ### Coding conventions
 The code is written using the [C99]() standard. All
@@ -722,7 +470,7 @@ not intended to be used directly and may change arbitrarily over time.
 The conventions used within the low-level module here closely follow those in `tskit`; please see the
-[documentation]()
+[tskit developer documentation](https://tskit.dev/tskit/docs/stable/development.html#python-c-interface)
 for more information.
 ## Statistical tests
@@ -734,18 +482,12 @@ as a pre-release sanity check. They are also very useful to run when developing
 new simulation functionality, as subtle statistical bugs can easily slip in unnoticed.
-The statistical tests are all run via the `verification.py` script in the project root.
-The script has some extra dependencies, which can be installed using:
-
-```
-pip install -e ".[verification]"
-```
-
-Run this script using:
+The statistical tests are all run via the `verification.py` script in the project root,
+using extra dependencies declared in the `verification` dependency group. Run using:
 ```{code-block} bash
-$ python3 verification.py
+$ uv run --group verification python verification.py
 ```
@@ -800,7 +542,6 @@ Note the following tips:
 - If `make` is giving you strange errors, or if tests are failing for strange reasons, try running `make clean` in the project root and then rebuilding.
-- Beware of multiple versions of the python library installed by different
- programs (e.g., pip versus installing locally from source)! In python,
+- Beware of multiple versions of the python library being visible. In python,
 `msprime.__file__` will tell you the location of the package that is being used.