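Reposted from AI Studio. Original article: "2022 Language and Intelligence Technology Competition" - Sentiment Interpretability Evaluation - PaddlePaddle AI Studio

2022 Language and Intelligence Technology Competition: Sentiment Interpretability Evaluation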

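Project overview

Deep learning models have achieved great success on many NLP tasks, but they are usually treated as black boxes whose internal prediction mechanism is opaque to the user. As a result, their outputs are hard to trust and harder to deploy, especially in sensitive domains such as medicine and law. Moreover, when a model performs poorly or lacks robustness, it is difficult to improve without understanding how it works internally. Interpretability of deep learning models has therefore attracted growing attention, yet the evaluation of interpretability is still incomplete. This module provides evaluation data and metrics for 3 NLP tasks, with the goal of assessing model interpretability. It includes the following:

1. A more complete interpretability evaluation framework, with evaluation data and the corresponding metrics
2. Three typical evidence extraction methods, based on attention (attention-based), gradients (gradient-based), and a linear model (LIME), validated experimentally on common architectures such as LSTM and Transformer (RoBERTa-base and RoBERTa-large) to examine how structural complexity and parameter scale affect interpretability
3. A fairly comprehensive evaluation report, covering the model's own performance (such as accuracy) as well as its results on the 3 interpretability metrics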

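Using the platform

Environment setup

Running the code requires a Linux host, Python 3.8 (recommended; lower versions have not been tested), and PaddlePaddle 2.1 or later.

Recommended environment

  • Operating system: CentOS 7.5
  • Python 3.8.12
  • PaddlePaddle 2.1.0
  • PaddleNLP 2.2.4

In addition, a hardware environment with GPU support is required.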
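
Before installing anything, it can help to compare the current environment with the recommendations above. The following is a minimal sketch that uses only the Python standard library; the helper names (`parse_version`, `installed_version`, `check_environment`) are made up for this illustration, and the checks only mirror the versions listed in this section.

```python
import sys
from importlib import metadata  # available in Python 3.8+


def parse_version(text):
    """Turn '2.2.4' or '0.2.0.dev1' into a tuple of leading integers."""
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break  # stop at the first non-numeric component, e.g. 'dev1'
        parts.append(int(digits))
    return tuple(parts)


def installed_version(package):
    """Return the installed version string, or None if the package is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def check_environment():
    """Collect human-readable warnings; an empty list means nothing looked off."""
    warnings = []
    if sys.version_info[:2] != (3, 8):
        warnings.append(
            "Python %d.%d found; 3.8 is the recommended version" % sys.version_info[:2]
        )
    paddle = installed_version("paddlepaddle-gpu")
    if paddle is None:
        warnings.append("paddlepaddle-gpu is not installed")
    elif parse_version(paddle) < (2, 1):
        warnings.append("paddlepaddle-gpu %s is older than 2.1" % paddle)
    if installed_version("paddlenlp") is None:
        warnings.append("paddlenlp is not installed")
    return warnings


if __name__ == "__main__":
    for line in check_environment() or ["environment matches the recommendations"]:
        print(line)
```

The sketch only inspects installed package metadata; it does not verify that a GPU is actually visible to PaddlePaddle.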

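PaddlePaddle

The GPU build of PaddlePaddle must be installed, along with some of PaddlePaddle's default dependencies.

For more on installing and using PaddlePaddle, see the official documentation.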

In [1]

!pip3 install paddlepaddle-gpu
!pip3 install paddlenlp==2.2.4
!pip3 install paddle-ernie
Looking in indexes: https://pypi.tuna.tsinghua.edu.cn/simple
Requirement already satisfied: paddlepaddle-gpu in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (2.2.2)
Requirement already satisfied: Pillow in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from paddlepaddle-gpu) (7.1.2)
Requirement already satisfied: astor in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from paddlepaddle-gpu) (0.8.1)
Requirement already satisfied: requests>=2.20.0 in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from paddlepaddle-gpu) (2.22.0)
Requirement already satisfied: numpy>=1.13 in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from paddlepaddle-gpu) (1.16.4)
Requirement already satisfied: decorator in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from paddlepaddle-gpu) (4.4.2)
Requirement already satisfied: protobuf>=3.1.0 in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from paddlepaddle-gpu) (3.14.0)
Requirement already satisfied: six in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from paddlepaddle-gpu) (1.16.0)
Requirement already satisfied: chardet<3.1.0,>=3.0.2 in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from requests>=2.20.0->paddlepaddle-gpu) (3.0.4)
Requirement already satisfied: idna<2.9,>=2.5 in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from requests>=2.20.0->paddlepaddle-gpu) (2.8)
Requirement already satisfied: urllib3!=1.25.0,!=1.25.1,<1.26,>=1.21.1 in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from requests>=2.20.0->paddlepaddle-gpu) (1.25.6)
Requirement already satisfied: certifi>=2017.4.17 in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from requests>=2.20.0->paddlepaddle-gpu) (2019.9.11)
Looking in indexes: https://pypi.tuna.tsinghua.edu.cn/simple
Requirement already satisfied: paddlenlp==2.2.4 in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (2.2.4)
Requirement already satisfied: colorama in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from paddlenlp==2.2.4) (0.4.4)
Requirement already satisfied: jieba in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from paddlenlp==2.2.4) (0.42.1)
Requirement already satisfied: seqeval in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from paddlenlp==2.2.4) (1.2.2)
Requirement already satisfied: colorlog in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from paddlenlp==2.2.4) (4.1.0)
Requirement already satisfied: h5py in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from paddlenlp==2.2.4) (2.9.0)
Requirement already satisfied: multiprocess in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from paddlenlp==2.2.4) (0.70.11.1)
Requirement already satisfied: six in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from h5py->paddlenlp==2.2.4) (1.16.0)
Requirement already satisfied: numpy>=1.7 in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from h5py->paddlenlp==2.2.4) (1.16.4)
Requirement already satisfied: dill>=0.3.3 in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from multiprocess->paddlenlp==2.2.4) (0.3.3)
Requirement already satisfied: scikit-learn>=0.21.3 in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from seqeval->paddlenlp==2.2.4) (0.22.1)
Requirement already satisfied: joblib>=0.11 in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from scikit-learn>=0.21.3->seqeval->paddlenlp==2.2.4) (0.14.1)
Requirement already satisfied: scipy>=0.17.0 in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from scikit-learn>=0.21.3->seqeval->paddlenlp==2.2.4) (1.3.0)
Looking in indexes: https://pypi.tuna.tsinghua.edu.cn/simple
Requirement already satisfied: paddle-ernie in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (0.2.0.dev1)
Requirement already satisfied: requests in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from paddle-ernie) (2.22.0)
Requirement already satisfied: pathlib2 in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from paddle-ernie) (2.3.7.post1)
Requirement already satisfied: tqdm in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from paddle-ernie) (4.64.0)
Requirement already satisfied: six in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from pathlib2->paddle-ernie) (1.16.0)
Requirement already satisfied: urllib3!=1.25.0,!=1.25.1,<1.26,>=1.21.1 in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from requests->paddle-ernie) (1.25.6)
Requirement already satisfied: chardet<3.1.0,>=3.0.2 in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from requests->paddle-ernie) (3.0.4)
Requirement already satisfied: certifi>=2017.4.17 in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from requests->paddle-ernie) (2019.9.11)
Requirement already satisfied: idna<2.9,>=2.5 in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from requests->paddle-ernie) (2.8)

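Third-party Python libraries

Besides PaddlePaddle and its dependencies, the code relies on other third-party Python libraries, listed in the requirements.txt file in the code root directory.

They can be installed in one step with pip: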

In [2]

%cd ./PaddleNLP-develop/examples/model_interpretation
!pip3 install -r requirements.txt
/home/aistudio/PaddleNLP-develop/examples/model_interpretation
Looking in indexes: https://pypi.tuna.tsinghua.edu.cn/simple
Requirement already satisfied: nvgpu>=0.9.0 in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from -r requirements.txt (line 1)) (0.9.0)
Requirement already satisfied: regex>=2021.11.10 in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from -r requirements.txt (line 2)) (2022.4.24)
Requirement already satisfied: spacy>=2.3.7 in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from -r requirements.txt (line 3)) (3.2.4)
Requirement already satisfied: tqdm>=4.62.3 in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from -r requirements.txt (line 4)) (4.64.0)
Requirement already satisfied: visualdl>=2.2.2 in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from -r requirements.txt (line 5)) (2.2.3)
Requirement already satisfied: ansi2html in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from nvgpu>=0.9.0->-r requirements.txt (line 1)) (1.7.0)
Requirement already satisfied: psutil in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from nvgpu>=0.9.0->-r requirements.txt (line 1)) (5.7.2)
Requirement already satisfied: flask-restful in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from nvgpu>=0.9.0->-r requirements.txt (line 1)) (0.3.9)
Requirement already satisfied: requests in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from nvgpu>=0.9.0->-r requirements.txt (line 1)) (2.22.0)
Requirement already satisfied: pandas in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from nvgpu>=0.9.0->-r requirements.txt (line 1)) (1.1.5)
Requirement already satisfied: flask in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from nvgpu>=0.9.0->-r requirements.txt (line 1)) (1.1.1)
Requirement already satisfied: tabulate in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from nvgpu>=0.9.0->-r requirements.txt (line 1)) (0.8.3)
Requirement already satisfied: six in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from nvgpu>=0.9.0->-r requirements.txt (line 1)) (1.16.0)
Requirement already satisfied: pynvml in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from nvgpu>=0.9.0->-r requirements.txt (line 1)) (8.0.4)
Requirement already satisfied: arrow in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from nvgpu>=0.9.0->-r requirements.txt (line 1)) (1.2.2)
Requirement already satisfied: termcolor in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from nvgpu>=0.9.0->-r requirements.txt (line 1)) (1.1.0)
Requirement already satisfied: wasabi<1.1.0,>=0.8.1 in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from spacy>=2.3.7->-r requirements.txt (line 3)) (0.9.1)
Requirement already satisfied: typer<0.5.0,>=0.3.0 in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from spacy>=2.3.7->-r requirements.txt (line 3)) (0.4.1)
Requirement already satisfied: click<8.1.0 in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from spacy>=2.3.7->-r requirements.txt (line 3)) (8.0.4)
Requirement already satisfied: blis<0.8.0,>=0.4.0 in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from spacy>=2.3.7->-r requirements.txt (line 3)) (0.7.7)
Requirement already satisfied: cymem<2.1.0,>=2.0.2 in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from spacy>=2.3.7->-r requirements.txt (line 3)) (2.0.6)
Requirement already satisfied: spacy-loggers<2.0.0,>=1.0.0 in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from spacy>=2.3.7->-r requirements.txt (line 3)) (1.0.2)
Requirement already satisfied: pydantic!=1.8,!=1.8.1,<1.9.0,>=1.7.4 in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from spacy>=2.3.7->-r requirements.txt (line 3)) (1.8.2)
Requirement already satisfied: numpy>=1.15.0 in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from spacy>=2.3.7->-r requirements.txt (line 3)) (1.16.4)
Requirement already satisfied: murmurhash<1.1.0,>=0.28.0 in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from spacy>=2.3.7->-r requirements.txt (line 3)) (1.0.7)
Requirement already satisfied: srsly<3.0.0,>=2.4.1 in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from spacy>=2.3.7->-r requirements.txt (line 3)) (2.4.3)
Requirement already satisfied: setuptools in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from spacy>=2.3.7->-r requirements.txt (line 3)) (41.4.0)
Requirement already satisfied: jinja2 in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from spacy>=2.3.7->-r requirements.txt (line 3)) (3.0.0)
Requirement already satisfied: spacy-legacy<3.1.0,>=3.0.8 in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from spacy>=2.3.7->-r requirements.txt (line 3)) (3.0.9)
Requirement already satisfied: packaging>=20.0 in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from spacy>=2.3.7->-r requirements.txt (line 3)) (21.3)
Requirement already satisfied: langcodes<4.0.0,>=3.2.0 in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from spacy>=2.3.7->-r requirements.txt (line 3)) (3.3.0)
Requirement already satisfied: preshed<3.1.0,>=3.0.2 in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from spacy>=2.3.7->-r requirements.txt (line 3)) (3.0.6)
Requirement already satisfied: thinc<8.1.0,>=8.0.12 in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from spacy>=2.3.7->-r requirements.txt (line 3)) (8.0.15)
Requirement already satisfied: pathy>=0.3.5 in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from spacy>=2.3.7->-r requirements.txt (line 3)) (0.6.1)
Requirement already satisfied: typing-extensions<4.0.0.0,>=3.7.4 in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from spacy>=2.3.7->-r requirements.txt (line 3)) (3.10.0.2)
Requirement already satisfied: catalogue<2.1.0,>=2.0.6 in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from spacy>=2.3.7->-r requirements.txt (line 3)) (2.0.7)
Requirement already satisfied: Flask-Babel>=1.0.0 in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from visualdl>=2.2.2->-r requirements.txt (line 5)) (1.0.0)
Requirement already satisfied: Pillow>=7.0.0 in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from visualdl>=2.2.2->-r requirements.txt (line 5)) (7.1.2)
Requirement already satisfied: bce-python-sdk in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from visualdl>=2.2.2->-r requirements.txt (line 5)) (0.8.53)
Requirement already satisfied: protobuf>=3.11.0 in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from visualdl>=2.2.2->-r requirements.txt (line 5)) (3.14.0)
Requirement already satisfied: shellcheck-py in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from visualdl>=2.2.2->-r requirements.txt (line 5)) (0.7.1.1)
Requirement already satisfied: flake8>=3.7.9 in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from visualdl>=2.2.2->-r requirements.txt (line 5)) (4.0.1)
Requirement already satisfied: matplotlib in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from visualdl>=2.2.2->-r requirements.txt (line 5)) (2.2.3)
Requirement already satisfied: pre-commit in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from visualdl>=2.2.2->-r requirements.txt (line 5)) (1.21.0)
Requirement already satisfied: zipp>=0.5 in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from catalogue<2.1.0,>=2.0.6->spacy>=2.3.7->-r requirements.txt (line 3)) (3.8.0)
Requirement already satisfied: importlib-metadata in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from click<8.1.0->spacy>=2.3.7->-r requirements.txt (line 3)) (4.2.0)
Requirement already satisfied: pycodestyle<2.9.0,>=2.8.0 in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from flake8>=3.7.9->visualdl>=2.2.2->-r requirements.txt (line 5)) (2.8.0)
Requirement already satisfied: pyflakes<2.5.0,>=2.4.0 in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from flake8>=3.7.9->visualdl>=2.2.2->-r requirements.txt (line 5)) (2.4.0)
Requirement already satisfied: mccabe<0.7.0,>=0.6.0 in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from flake8>=3.7.9->visualdl>=2.2.2->-r requirements.txt (line 5)) (0.6.1)
Requirement already satisfied: Werkzeug>=0.15 in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from flask->nvgpu>=0.9.0->-r requirements.txt (line 1)) (0.16.0)
Requirement already satisfied: itsdangerous>=0.24 in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from flask->nvgpu>=0.9.0->-r requirements.txt (line 1)) (1.1.0)
Requirement already satisfied: Babel>=2.3 in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from Flask-Babel>=1.0.0->visualdl>=2.2.2->-r requirements.txt (line 5)) (2.8.0)
Requirement already satisfied: pytz in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from Flask-Babel>=1.0.0->visualdl>=2.2.2->-r requirements.txt (line 5)) (2019.3)
Requirement already satisfied: MarkupSafe>=2.0.0rc2 in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from jinja2->spacy>=2.3.7->-r requirements.txt (line 3)) (2.0.1)
Requirement already satisfied: pyparsing!=3.0.5,>=2.0.2 in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from packaging>=20.0->spacy>=2.3.7->-r requirements.txt (line 3)) (3.0.8)
Requirement already satisfied: smart-open<6.0.0,>=5.0.0 in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from pathy>=0.3.5->spacy>=2.3.7->-r requirements.txt (line 3)) (5.2.1)
Requirement already satisfied: certifi>=2017.4.17 in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from requests->nvgpu>=0.9.0->-r requirements.txt (line 1)) (2019.9.11)
Requirement already satisfied: urllib3!=1.25.0,!=1.25.1,<1.26,>=1.21.1 in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from requests->nvgpu>=0.9.0->-r requirements.txt (line 1)) (1.25.6)
Requirement already satisfied: chardet<3.1.0,>=3.0.2 in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from requests->nvgpu>=0.9.0->-r requirements.txt (line 1)) (3.0.4)
Requirement already satisfied: idna<2.9,>=2.5 in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from requests->nvgpu>=0.9.0->-r requirements.txt (line 1)) (2.8)
Requirement already satisfied: python-dateutil>=2.7.0 in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from arrow->nvgpu>=0.9.0->-r requirements.txt (line 1)) (2.8.2)
Requirement already satisfied: future>=0.6.0 in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from bce-python-sdk->visualdl>=2.2.2->-r requirements.txt (line 5)) (0.18.0)
Requirement already satisfied: pycryptodome>=3.8.0 in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from bce-python-sdk->visualdl>=2.2.2->-r requirements.txt (line 5)) (3.9.9)
Requirement already satisfied: aniso8601>=0.82 in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from flask-restful->nvgpu>=0.9.0->-r requirements.txt (line 1)) (9.0.1)
Requirement already satisfied: kiwisolver>=1.0.1 in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from matplotlib->visualdl>=2.2.2->-r requirements.txt (line 5)) (1.1.0)
Requirement already satisfied: cycler>=0.10 in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from matplotlib->visualdl>=2.2.2->-r requirements.txt (line 5)) (0.10.0)
Requirement already satisfied: aspy.yaml in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from pre-commit->visualdl>=2.2.2->-r requirements.txt (line 5)) (1.3.0)
Requirement already satisfied: cfgv>=2.0.0 in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from pre-commit->visualdl>=2.2.2->-r requirements.txt (line 5)) (2.0.1)
Requirement already satisfied: pyyaml in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from pre-commit->visualdl>=2.2.2->-r requirements.txt (line 5)) (5.1.2)
Requirement already satisfied: nodeenv>=0.11.1 in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from pre-commit->visualdl>=2.2.2->-r requirements.txt (line 5)) (1.3.4)
Requirement already satisfied: toml in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from pre-commit->visualdl>=2.2.2->-r requirements.txt (line 5)) (0.10.0)
Requirement already satisfied: virtualenv>=15.2 in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from pre-commit->visualdl>=2.2.2->-r requirements.txt (line 5)) (16.7.9)
Requirement already satisfied: identify>=1.0.0 in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from pre-commit->visualdl>=2.2.2->-r requirements.txt (line 5)) (1.4.10)

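Data preparation

Model training data

Sentiment analysis task:

ChnSentiCorp is recommended for Chinese and SST-2 for English. The Chinese and English sentiment analysis models provided by this module are based on these two datasets. To change the training dataset, modify /model_interpretation/task/senti/pretrained_models/train.py (RoBERTa) and /model_interpretation/task/senti/rnn/train.py (LSTM).

Downloading pretrained models

Model files are downloaded and cached automatically by the paddlenlp framework.

Downloading other data

Run download.sh to download it automatically: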

In [ ]

!chmod +x ./download.sh
!./download.sh

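Evaluation data

Sample evaluation data is located in the /model_interpretation/data/ directory. Each line is one record in JSON format.

Sentiment analysis data format:

id: the record number, used as the key that identifies the record;
context: the original text;
sent_token: the standard tokenization of the original text. Note: the golden evidence is based on this tokenization, so predicted evidence must correspond to it as well;
sample_type: the type of the record, either original (ori) or perturbed (disturb);
rel_ids: the list of ids of the perturbed records associated with an original record (present only on original records).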
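
To make the record layout concrete, here is a small sketch that parses JSON-lines records with the fields listed above and links each original record to its perturbed counterparts. The two sample records are invented for illustration and are not taken from the evaluation data; `parse_records` and `link_perturbations` are hypothetical helper names.

```python
import json

# Fields described above; rel_ids is optional because only original records carry it.
REQUIRED_FIELDS = ("id", "context", "sent_token", "sample_type")

# Invented sample lines, one JSON object per line.
SAMPLE_LINES = [
    json.dumps({"id": 1, "context": "the food is great",
                "sent_token": ["the", "food", "is", "great"],
                "sample_type": "ori", "rel_ids": [2]}),
    json.dumps({"id": 2, "context": "the food is really great",
                "sent_token": ["the", "food", "is", "really", "great"],
                "sample_type": "disturb"}),
]


def parse_records(lines):
    """Parse JSON-lines text and check that each record has the expected fields."""
    records = []
    for line_no, line in enumerate(lines, start=1):
        line = line.strip()
        if not line:
            continue  # tolerate blank lines
        record = json.loads(line)
        missing = [name for name in REQUIRED_FIELDS if name not in record]
        if missing:
            raise ValueError("line %d is missing fields: %s" % (line_no, missing))
        if record["sample_type"] not in ("ori", "disturb"):
            raise ValueError("line %d has an unknown sample_type" % line_no)
        records.append(record)
    return records


def link_perturbations(records):
    """Map each original record id to the perturbed records named in its rel_ids."""
    by_id = {record["id"]: record for record in records}
    linked = {}
    for record in records:
        if record["sample_type"] == "ori":
            linked[record["id"]] = [
                by_id[rel] for rel in record.get("rel_ids", []) if rel in by_id
            ]
    return linked


if __name__ == "__main__":
    for ori_id, perturbed in link_perturbations(parse_records(SAMPLE_LINES)).items():
        print(ori_id, [p["id"] for p in perturbed])
```

To read a real file, the same functions can be fed the lines of an open file handle instead of `SAMPLE_LINES`.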

Running the model (sentiment analysis)

Model training

First, go to the task/senti/pretrained_models directory:

In [3]

cd ./task/senti/pretrained_models
/home/aistudio/PaddleNLP-develop/examples/model_interpretation/task/senti/pretrained_models

Edit the necessary parameters in run_train.sh, then run it:

In [ ]

!chmod +x ./run_train.sh
!./run_train.sh

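Then wait for training to finish. Training progress can be followed in the logs under the pretrained_models/logs folder.

Obtaining importance scores

The next step is to obtain an importance score for every token. First, go to the task/senti/ directory: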

In [4]

cd ..
/home/aistudio/PaddleNLP-develop/examples/model_interpretation/task/senti

Edit the necessary parameters in run_inter.sh, then run it:

In [6]

!chmod +x ./run_inter.sh
!./run_inter.sh
++ python3 ./saliency_map/sentiment_interpretable.py --language ch --base_model roberta_base --data_dir ../../data/senti_ch --vocab_path test --from_pretrained roberta-wwm-ext --batch_size 1 --init_checkpoint pretrained_models/saved_model_ch/roberta_base/model_900/model_state.pdparams --inter_mode attention --output_dir ./output/senti_ch.roberta_base --n-samples 200 --start_id 0 --eval
/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/setuptools/depends.py:2: DeprecationWarning: the imp module is deprecated in favour of importlib; see the module's documentation for alternative uses
  import imp
/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/framework.py:312: UserWarning: You are using GPU version Paddle, but your CUDA device is not set properly. CPU device will be used by default.
  "You are using GPU version Paddle, but your CUDA device is not set properly. CPU device will be used by default."
/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddlenlp/transformers/funnel/modeling.py:30: DeprecationWarning: Using or importing the ABCs from 'collections' instead of from 'collections.abc' is deprecated, and in 3.8 it will stop working
  from collections import Iterable
[2022-04-28 14:25:35,448] [    INFO] - Downloading https://bj.bcebos.com/paddlenlp/models/transformers/roberta_base/vocab.txt and saved to /home/aistudio/.paddlenlp/models/roberta-wwm-ext
[2022-04-28 14:25:35,449] [    INFO] - Downloading vocab.txt from https://bj.bcebos.com/paddlenlp/models/transformers/roberta_base/vocab.txt
100%|████████████████████████████████████████| 107k/107k [00:00<00:00, 2.07MB/s]
[2022-04-28 14:25:35,607] [    INFO] - Downloading https://paddlenlp.bj.bcebos.com/models/transformers/roberta_base/roberta_chn_base.pdparams and saved to /home/aistudio/.paddlenlp/models/roberta-wwm-ext
[2022-04-28 14:25:35,607] [    INFO] - Downloading roberta_chn_base.pdparams from https://paddlenlp.bj.bcebos.com/models/transformers/roberta_base/roberta_chn_base.pdparams
100%|████████████████████████████████████████| 390M/390M [00:35<00:00, 11.4MB/s]
100it [00:17,  5.82it/s]
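
The run above used attention weights (`--inter_mode attention`). For the gradient-based family, one widely used method is integrated gradients (IG), which attributes a prediction to each input feature by averaging gradients along a straight path from a baseline to the input. The following is a toy sketch on a hand-written logistic model, not the code in sentiment_interpretable.py; the weights, the inputs, and the all-zero baseline are assumptions chosen for illustration.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def model(x, weights, bias):
    """Toy classifier: probability of the positive class."""
    return sigmoid(sum(w * v for w, v in zip(weights, x)) + bias)


def gradient(x, weights, bias):
    """Analytic gradient of the toy model's output with respect to each input."""
    p = model(x, weights, bias)
    return [p * (1.0 - p) * w for w in weights]


def integrated_gradients(x, baseline, weights, bias, steps=200):
    """Approximate IG with a midpoint Riemann sum over the straight-line path."""
    totals = [0.0] * len(x)
    for k in range(steps):
        alpha = (k + 0.5) / steps
        point = [b + alpha * (v - b) for v, b in zip(x, baseline)]
        for i, g in enumerate(gradient(point, weights, bias)):
            totals[i] += g
    # Scale the averaged gradient by the distance each feature moved.
    return [(v - b) * t / steps for v, b, t in zip(x, baseline, totals)]


if __name__ == "__main__":
    weights, bias = [2.0, -1.0, 0.5], 0.1
    x, baseline = [1.0, 0.5, -2.0], [0.0, 0.0, 0.0]
    scores = integrated_gradients(x, baseline, weights, bias)
    print("attributions:", scores)
    # Completeness: attributions sum to f(x) - f(baseline), up to numerical error.
    print("sum:", sum(scores))
    print("f(x) - f(baseline):", model(x, weights, bias) - model(baseline, weights, bias))
```

In a real text model the "features" would be token embeddings rather than three scalars, and per-token scores are typically obtained by aggregating the attributions over each embedding's dimensions.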

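Evidence extraction

With the importance scores in hand, evidence can be extracted according to each token's score. First, go to the evidence extraction module in the rationale_extraction directory: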

In [7]

cd ../../rationale_extraction/
/home/aistudio/PaddleNLP-develop/examples/model_interpretation/rationale_extraction

Edit the necessary parameters in generate.sh and run_2_pred_senti_per.sh, then run the extraction:

In [30]

!chmod +x ./generate.sh
!./generate.sh
roberta_base_attention_ch
num: 100
+++ python3 ./sentiment_pred.py --base_model roberta_base --data_dir ./rationale/senti/roberta_base_attention_ch/rationale_text/dev --output_dir ./prediction/senti/roberta_base_attention_ch/rationale_text/dev --vocab_path test --from_pretrained roberta-wwm-ext --batch_size 1 --init_checkpoint ../task/senti/pretrained_models/saved_model_ch/roberta_base/model_900/model_state.pdparams --inter_mode attention --n-samples 200 --language ch
/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/setuptools/depends.py:2: DeprecationWarning: the imp module is deprecated in favour of importlib; see the module's documentation for alternative uses
  import imp
/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/framework.py:312: UserWarning: You are using GPU version Paddle, but your CUDA device is not set properly. CPU device will be used by default.
  "You are using GPU version Paddle, but your CUDA device is not set properly. CPU device will be used by default."
/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddlenlp/transformers/funnel/modeling.py:30: DeprecationWarning: Using or importing the ABCs from 'collections' instead of from 'collections.abc' is deprecated, and in 3.8 it will stop working
  from collections import Iterable
[2022-04-28 13:01:39,498] [    INFO] - Already cached /home/aistudio/.paddlenlp/models/roberta-wwm-ext/vocab.txt
[2022-04-28 13:01:39,516] [    INFO] - Already cached /home/aistudio/.paddlenlp/models/roberta-wwm-ext/roberta_chn_base.pdparams
load data 100
load model from ../task/senti/pretrained_models/saved_model_ch/roberta_base/model_900/model_state.pdparams
100it [00:12,  8.05it/s]
+++ for RATIONAL_TYPE in '"rationale_text"' '"rationale_exclusive_text"'
+++ [[ ch == \e\n ]]
+++ [[ ch == \c\h ]]
+++ [[ roberta_base == \r\o\b\e\r\t\a\_\b\a\s\e ]]
+++ FROM_PRETRAIN=roberta-wwm-ext
+++ CKPT=../task/senti/pretrained_models/saved_model_ch/roberta_base/model_900/model_state.pdparams
+++ OUTPUT=./prediction/senti/roberta_base_attention_ch/rationale_exclusive_text/dev
+++ '[' -d ./prediction/senti/roberta_base_attention_ch/rationale_exclusive_text/dev ']'
+++ set -x
+++ python3 ./sentiment_pred.py --base_model roberta_base --data_dir ./rationale/senti/roberta_base_attention_ch/rationale_exclusive_text/dev --output_dir ./prediction/senti/roberta_base_attention_ch/rationale_exclusive_text/dev --vocab_path test --from_pretrained roberta-wwm-ext --batch_size 1 --init_checkpoint ../task/senti/pretrained_models/saved_model_ch/roberta_base/model_900/model_state.pdparams --inter_mode attention --n-samples 200 --language ch
/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/setuptools/depends.py:2: DeprecationWarning: the imp module is deprecated in favour of importlib; see the module's documentation for alternative uses
  import imp
/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/framework.py:312: UserWarning: You are using GPU version Paddle, but your CUDA device is not set properly. CPU device will be used by default.
  "You are using GPU version Paddle, but your CUDA device is not set properly. CPU device will be used by default."
/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddlenlp/transformers/funnel/modeling.py:30: DeprecationWarning: Using or importing the ABCs from 'collections' instead of from 'collections.abc' is deprecated, and in 3.8 it will stop working
  from collections import Iterable
[2022-04-28 13:02:11,937] [    INFO] - Already cached /home/aistudio/.paddlenlp/models/roberta-wwm-ext/vocab.txt
[2022-04-28 13:02:11,954] [    INFO] - Already cached /home/aistudio/.paddlenlp/models/roberta-wwm-ext/roberta_chn_base.pdparams
load data 100
load model from ../task/senti/pretrained_models/saved_model_ch/roberta_base/model_900/model_state.pdparams
100it [00:12,  7.70it/s]
roberta_base_attention_ch_finished
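
The log above shows two kinds of input being re-scored: `rationale_text` and `rationale_exclusive_text`. One plausible reading is that the first keeps only the selected evidence tokens and the second keeps everything else. The sketch below illustrates that idea with a simple top-k rule; the selection ratio and the function name `extract_rationale` are assumptions, and the actual rule implemented by generate.sh may differ.

```python
import math


def extract_rationale(tokens, scores, ratio=0.3):
    """Pick the top-scoring tokens as evidence and return both views of the text.

    Returns (chosen_indices, rationale_tokens, exclusive_tokens), where
    exclusive_tokens is the input with the rationale removed.
    """
    if len(tokens) != len(scores):
        raise ValueError("tokens and scores must have the same length")
    k = max(1, math.ceil(len(tokens) * ratio))
    ranked = sorted(range(len(tokens)), key=lambda i: scores[i], reverse=True)
    chosen = sorted(ranked[:k])  # restore the original token order
    chosen_set = set(chosen)
    rationale = [tokens[i] for i in chosen]
    exclusive = [tok for i, tok in enumerate(tokens) if i not in chosen_set]
    return chosen, rationale, exclusive


if __name__ == "__main__":
    tokens = ["food", "is", "really", "great"]
    scores = [0.20, 0.05, 0.30, 0.45]
    print(extract_rationale(tokens, scores, ratio=0.5))
```

Because the indices refer to positions in the token list, they should be computed against sent_token so that the predicted evidence lines up with the golden evidence.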

Evaluation

Finally, evaluate the results with the evaluation module in the evaluation directory:

In [8]

cd ../evaluation/
/home/aistudio/PaddleNLP-develop/examples/model_interpretation/evaluation

Taking plausibility evaluation as an example: go to the plausibility directory, edit the necessary parameters in run_f1.sh, and run it:

In [9]

cd ./plausibility/
/home/aistudio/PaddleNLP-develop/examples/model_interpretation/evaluation/plausibility

In [10]

!chmod +x ./run_f1.sh
!./run_f1.sh
roberta_base_attention_ch
num	4.00	macor_f1: 46.2
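
The script reports a macro F1 score. As a rough illustration of what such a score measures, the sketch below computes a token-level F1 between predicted and golden evidence for each sample and then averages over samples. This is an assumed, simplified definition; the exact computation in run_f1.sh (for example how it matches tokens or handles multiple golden evidence sets) is not shown in this article.

```python
def token_f1(predicted, golden):
    """F1 between two collections of token indices for a single sample."""
    predicted, golden = set(predicted), set(golden)
    if not predicted and not golden:
        return 1.0  # nothing to find and nothing predicted
    overlap = len(predicted & golden)
    if overlap == 0:
        return 0.0
    precision = overlap / len(predicted)
    recall = overlap / len(golden)
    return 2 * precision * recall / (precision + recall)


def macro_f1(pairs):
    """Average the per-sample F1 over (predicted, golden) pairs."""
    scores = [token_f1(predicted, golden) for predicted, golden in pairs]
    return sum(scores) / len(scores) if scores else 0.0


if __name__ == "__main__":
    pairs = [
        ([2, 3], [3]),  # one of two predicted tokens is correct
        ([0], [1]),     # no overlap
    ]
    print("macro F1: %.3f" % macro_f1(pairs))
```

Averaging per sample, rather than pooling all tokens, keeps long inputs from dominating the score.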

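Evaluation report

Sample evaluation report for Chinese sentiment analysis:

Model + evidence extraction method    Sentiment analysis
                                      Acc     Macro-F1    MAP     New_P
LSTM + IG                             56.8    36.8        59.8    91.4
RoBERTa-base + IG                     62.4    36.4        48.7    48.9
RoBERTa-large + IG                    65.3    38.3        41.9    37.8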