ENH: Set up initial reference framework and add reference tables for scipy special test cases #1

Open · wants to merge 77 commits into main

Conversation

steppi (Collaborator) commented Feb 14, 2025

This PR lays the groundwork for xsf testing.

It:

Reference implementations

Adds reference implementations for all scalar kernels in xsf that correspond to existing ufuncs in SciPy. The reference implementations use mpmath for arbitrary precision calculation. Many of them can use mpmath's special function implementations directly, but for some I had to write my own. I put a lot of care into these and think they are generally pretty good, but they are not themselves tested. My thought is that we continuously improve them, treating disagreement between xsf and the reference as a sign that one of the two is wrong and investigating on a case-by-case basis.

Care was taken with things like dynamically increasing precision to deal with catastrophic cancellation and getting behavior on branch cuts correct: branch cuts are placed on either the real or imaginary axis, and the sign of zero in the real or imaginary part determines the side of the branch cut. Since mpmath doesn't have signed zeros, I had to use a workaround, which I'll get to in a moment.

There is a reference_implementation decorator. One writes the functions as if they take arbitrary precision inputs and return arbitrary precision outputs. The decorator wraps these so that they take finite precision inputs and return finite precision outputs, roughly following the relevant ufunc casting rules. The allowed input types (real, complex, integer) are specified using type hints and type hint overloads, and the decorator makes use of these annotations; adding them is not optional. I tried to design the decorator so that the framework gets out of the way when writing reference implementations.
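As a rough illustration (the import path and exact decorator signature below are assumptions, not the actual API), a reference implementation might look something like this:

from typing import overload

import mpmath

# Assumed import path; the real location of the decorator may differ.
from xsref import reference_implementation


# Type hint overloads declare the allowed finite precision input/output types.
@overload
def sinpi(x: float) -> float: ...
@overload
def sinpi(x: complex) -> complex: ...


# The body is written purely in terms of arbitrary precision mpmath values;
# the decorator handles casting to and from finite precision types.
@reference_implementation
def sinpi(x):
    return mpmath.sinpi(x)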

For functions without a reference implementation, or parameter regimes where an existing reference implementation doesn't work, there is a fallback to SciPy. I've pinned SciPy and added an assert to make sure the fallback is to a SciPy version from before the split of the scalar kernels into the separate xsf library. The generated reference tables contain a boolean column recording whether such a fallback was used. We could use these entries to suggest where work still needs to be done to improve the reference implementations.

Scripts to generate reference tables

This PR also adds scripts for generating reference tables, and adds tables with entries for every special function ufunc evaluation that takes place in SciPy special's test suite. To do this, I wrote a wrapper callable class TracedUfunc which traces the arguments passed to ufuncs and writes them to a csv file. I wrapped all SciPy special ufuncs with it in scipy/special/__init__.py on a branch otherwise identical to main, and ran the full test suite.
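The sketch below gives the general idea of such a tracing wrapper; the names and details are illustrative, not the actual implementation.

import csv

import numpy as np


class TracedUfunc:
    """Callable wrapper that logs every scalar argument tuple passed to a ufunc."""

    def __init__(self, ufunc, csv_path):
        self._ufunc = ufunc
        self._csv_path = csv_path

    def __call__(self, *args, **kwargs):
        # Broadcast the arguments and record one csv row per scalar evaluation.
        with open(self._csv_path, "a", newline="") as f:
            writer = csv.writer(f)
            for row in np.broadcast(*(np.asarray(arg) for arg in args)):
                writer.writerow(row)
        return self._ufunc(*args, **kwargs)

    def __getattr__(self, name):
        # Forward ufunc attributes such as nin, nout, and types.
        return getattr(self._ufunc, name)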

Running the test suite this way generates a bunch of csv files, one per function, in a user specified folder, say ~/scipy_special_tests. Another script,

python -m xsref.scipy_case_generation ~/scipy_special_tests ~/xsref/tables

takes these csv files and generates parquet files with the inputs; for each function there is a parquet file for each possible type overload. Another script, which on my machine I ran with the arguments

python -m xsref.tables ~/xsref/tables/scipy_special_tests/ --logpath_root ~/scipy_special_tests_logs --nworkers 16

generates matching parquet files with the corresponding outputs (some ufuncs have multiple outputs). The logs under --logpath_root show cases where the reference implementation did not match SciPy; I created these so I could go through and investigate such cases.

Yet another script

python -m xsref.initial_reference_tolerances ~/xsref/tables/scipy_special_tests/

is used to generate matching parquet files containing extended relative error values between SciPy and the reference implementation. The idea is that when testing, we don't compare against some fixed tolerance; instead we track the current error values and test that things haven't gotten worse by more than some fixed factor. So we leave some wiggle room, and perhaps accept a result if its relative error is within twice the relative error recorded in the relevant parquet file. The parquet files themselves will keep the best relative error observed so far, so that tolerances won't creep upwards.
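In pseudocode, the check might look like the following; this is a sketch of the idea, not the actual test code.

def case_passes(observed_error, recorded_error, wiggle_factor=2.0):
    # Accept a result as long as it is not much worse than the error
    # currently recorded in the tolerance table for this case.
    return observed_error <= wiggle_factor * recorded_error


def updated_recorded_error(observed_error, recorded_error):
    # The tables only ever keep the best error observed so far,
    # so tolerances cannot creep upwards over time.
    return min(observed_error, recorded_error)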

Extended relative error

I mentioned extended relative error. This is an extension of relative error to all floating point numbers, including exceptional cases like NaNs, infinities, and zeros; it returns a non-NaN answer when comparing any two floats. I don't know if I've seen anything like this in the literature, but it works well in a situation where we track a current error for each case and test that we don't make things worse. It wouldn't work so well if we wanted to require that, say, all relative error scores be less than 1e-12.
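One plausible way to define such a function is sketched below; the precise definition used here may differ in how it scores the exceptional cases.

import math


def extended_relative_error(actual, desired):
    """Relative error extended so comparing any two floats gives a non-NaN result."""
    # Exceptional cases: NaNs, infinities, and zeros score 0 on exact
    # agreement and infinity on disagreement.
    if math.isnan(actual) or math.isnan(desired):
        return 0.0 if math.isnan(actual) and math.isnan(desired) else math.inf
    if math.isinf(actual) or math.isinf(desired):
        return 0.0 if actual == desired else math.inf
    if desired == 0.0:
        return 0.0 if actual == 0.0 else math.inf
    return abs((actual - desired) / desired)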

Test suite

I've added a test suite which checks that the parquet files are consistent. There are separate files for inputs, outputs, and errors, and there are checks that each has the correct column types and correct metadata. For instance, the output tables contain the checksum of the corresponding input table in their metadata, and the error tables contain the checksums of the corresponding input and output tables. These are compared to the actual checksums to ensure that the derived tables were generated from the current upstream tables.
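For illustration, a check of this kind could look roughly like the following; the metadata key, hash algorithm, and helper names are assumptions rather than the actual schema.

import hashlib

import pyarrow.parquet as pq


def table_checksum(path):
    # Checksum of the raw parquet file contents.
    with open(path, "rb") as f:
        return hashlib.md5(f.read()).hexdigest()


def check_output_table(input_path, output_path):
    # The output table records the checksum of the input table it was
    # generated from; verify that it matches the current input table.
    metadata = pq.read_schema(output_path).metadata or {}
    recorded = metadata.get(b"input_checksum", b"").decode()
    assert recorded == table_checksum(input_path)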

The README

I wrote a README with more information that may be useful to look at. In general though, it would be nice to document things further.

CC

@mdhaber, you may be interested in this. There may be some overlap with scikit-stats/scikit-stats#7, and it would be good to try to port these reference implementations over to https://github.com/mdhaber/mparray.

@inkydragon, you may be interested in trying out the reference tables here in the tests you are writing in scipy/xsf#7. I think what you have looks pretty good, though I'd prefer to use Catch2 over Google Test.

@fancidev, this just seems like something you might be interested in, though feel free to disregard if you are busy.

@izaid, I know you're very busy, but just want to make sure you're aware of this.

steppi closed this Feb 14, 2025
steppi reopened this Feb 14, 2025
@inkydragon

@inkydragon, you may be interested in trying out the reference tables here in the tests you are writing in scipy/xsf#7.

You mean to use those binary .parquet files, right?

Here's the problem: including binaries in a git project never seems to be the best option, even with git LFS.
These files always lead to an endless increase in the size of the project.

Maybe using a GitHub Action to automatically publish these files to a release would be a better way to go?

I think that looks pretty good, though I think I'd prefer to use Catch2 over google test.

I'll try Catch2 + Apache Arrow to see if it looks good.

steppi (Collaborator, Author) commented Feb 15, 2025

Here's the problem: including binaries in a git project never seems to be the best option, even with git LFS.
These files always lead to an endless increase in the size of the project.

Right. Storing the parquet files in Git is a short term measure until we figure out the funding situation for storing the files.

I'll try Catch2 + Apache Arrow to see if it looks good.

Awesome. Thank you. My plan was to write a (heavily templated) Catch2 custom generator to get the cases from parquet files.

steppi (Collaborator, Author) commented Mar 7, 2025

I've updated this to use two columns for complex numbers instead of a struct column because I found the C++ Arrow Parquet reader doesn't have the best support for struct columns.
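For illustration (column names here are made up, not the actual schema), a complex-valued column is simply written as two adjacent float64 columns:

import pyarrow as pa
import pyarrow.parquet as pq

z = [0.5 - 1.0j, 2.0 + 3.0j]
table = pa.table({
    # Real and imaginary parts stored as separate float64 columns, which the
    # C++ Parquet stream reader handles without any struct column support.
    "z_real": pa.array([v.real for v in z], type=pa.float64()),
    "z_imag": pa.array([v.imag for v in z], type=pa.float64()),
})
pq.write_table(table, "example_in_table.parquet")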

Below is an example of how I'm reading the Parquet files from C++.

#include <complex>
#include <iostream>
#include <tuple>
#include <type_traits>

#include <arrow/io/file.h>
#include <parquet/stream_reader.h>


// Map std::complex<T> to its underlying real type T; other types map to themselves.
template <typename T>
struct remove_complex {
    using type = T;
};

template <typename T>
struct remove_complex<std::complex<T>> {
    using type = T;
};

template <typename T>
using remove_complex_t = typename remove_complex<T>::type;

// Reads rows from an xsref Parquet table, one tuple of column values at a time.
// Complex columns are stored as two adjacent real columns (real part, imaginary part).
template <typename... ColumnTypes>
class XsrefTableReader {
public:
    XsrefTableReader(const std::string& file_path) {
        PARQUET_ASSIGN_OR_THROW(
            infile_,
            arrow::io::ReadableFile::Open(file_path)
        );
        stream_ = std::make_unique<parquet::StreamReader>(
            parquet::ParquetFileReader::Open(infile_));
    }

    std::tuple<ColumnTypes...> operator()() {
        std::tuple<ColumnTypes...> row;
        fill_row(row);
        return row;
    }

    bool eof() const {
        return stream_->eof();
    }

private:
    void fill_row(std::tuple<ColumnTypes...>& elements) {
        std::apply([this](auto& ...x) { (fill_element(x), ...); }, elements);
        stream_->EndRow();
    }

    template <typename T>
    void fill_element(T& element) {
        if constexpr (std::is_same_v<T, std::complex<remove_complex_t<T>>>) {
            // A complex value is assembled from its two underlying real columns.
            using V = remove_complex_t<T>;
            V real;
            V imag;
            *stream_ >> real >> imag;
            element = T(real, imag);
        } else {
            *stream_ >> element;
        }
    }

    std::shared_ptr<arrow::io::ReadableFile> infile_;
    std::unique_ptr<parquet::StreamReader> stream_;
};


int main() {
    auto reader = XsrefTableReader<double, double, double, std::complex<double>>(
        "/home/steppi/xsref/tables/scipy_special_tests/hyp2f1/In_d_d_d_cd-cd.parquet"
    );
    while (!reader.eof()) {
        auto [a, b, c, z] = reader();
        std::cout << "a: " << a << ", " << "b: " << b << ", " << "c: " << c << ", " << "z: " << z << '\n';
    }
}
