awesome-ml-testing (aka: SE for AI)

Awesome material (papers, tools, etc.) about testing machine learning systems, including deep learning systems.

This repo is updated continuously. Don't hesitate to open a new pull request or issue if you find anything missing! Please use the format described in the Format section below.

A separate web page for the paper list is here. The web page has a fancy search box. Thanks to @Troublor.

My personal webpage is here and here.

Format

This repo uses a specific format. When you open a new issue, you will find the template.

For tools:

Project Name:
- A short description (please add the related paper if the tool comes from an academic paper)

For papers:

Paper Name:
- Author(s) Name(s)
- Conference/Journal name
- A short description

Tools

EvalDNN: https://github.com/yqtianust/EvalDNN
- EvalDNN: A Toolbox for Evaluating Deep Neural Network Models.
- Benchmark: https://yqtianust.github.io/EvalDNN-benchmark/index.html
MMdnn: https://github.com/microsoft/MMdnn
- MMdnn is a set of tools that helps users interoperate among different deep learning frameworks, e.g. model conversion and visualization. It converts models between Caffe, Keras, MXNet, TensorFlow, CNTK, PyTorch, ONNX, and CoreML.
Netron: https://github.com/lutzroeder/netron
- A viewer for neural network, deep learning, and machine learning models.
AIF360: https://github.com/IBM/AIF360, http://aif360.mybluemix.net/
- A comprehensive set of fairness metrics for datasets and machine learning models, explanations for these metrics, and algorithms to mitigate bias in datasets and models.
sotabench: https://sotabench.com
- A free benchmarking service for all open source ML repositories. You can submit your model, and the site will run it against some benchmarks.
Papers with Code: https://paperswithcode.com/sota
- A website that collects evaluation results reported in DL/ML papers, along with their code (if any). Note that the code may be a third-party implementation.
Distiller: https://github.com/NervanaSystems/distiller
- An open-source Python package for neural network compression research (PyTorch).
NNCF: https://github.com/openvinotoolkit/nncf_pytorch
- Neural Network Compression Framework (NNCF) for PyTorch.
ML-Fairness: https://github.com/sumonbis/ML-Fairness
- Do the Machine Learning Models on a Crowd Sourced Platform Exhibit Bias? An Empirical Study on Model Fairness.
TextFlint: https://www.textflint.io/
- Unified Multilingual Robustness Evaluation Toolkit for Natural Language Processing.
PRODeep: https://iscasmc.ios.ac.cn/prodeep/doku.php
- A platform for robustness verification of deep neural networks.
Captum: https://github.com/pytorch/captum
- Model interpretability and understanding for PyTorch.
CompressAI: https://github.com/InterDigitalInc/CompressAI
- PyTorch library and evaluation platform for end-to-end compression research.

Dataset

APRICOT: https://arxiv.org/abs/1912.08166
- A Dataset of Physical Adversarial Attacks on Object Detection.
TextFlint: https://www.textflint.io/
- Unified Multilingual Robustness Evaluation Toolkit for Natural Language Processing.

Papers

See Paper List. Thanks to @Troublor.

Talks

Safety and Robustness for Deep Learning with Provable Guarantees by Marta Kwiatkowska: slides, video

Courses

CS 590 Program Analysis For Deep Learning at Purdue University by Xiangyu Zhang: http://xyz-wiki.cs.purdue.edu/cs590-19f/doku.php?id=home

Free GPU Resources

Google Colab
Paperspace Gradient: https://gradient.paperspace.com/