This article comes from a featured project in the AI Studio community.

Adaptive Decision Boundary

Title: Deep Open Intent Classification with Adaptive Decision Boundary
Authors: Hanlei Zhang, Hua Xu, Ting-En Lin
Affiliations: Tsinghua University, Siemens
Paper: accepted at AAAI 2021, https://arxiv.org/abs/2012.10209

Abstract: Open intent classification is a challenging task in dialogue systems. On the one hand, it should ensure the quality of known intent identification. On the other hand, it needs to detect the open (unknown) intent without prior knowledge. Current models are limited in finding the appropriate decision boundary to balance the performances of both known intents and the open intent. In this paper, we propose a post-processing method to learn the adaptive decision boundary (ADB) for open intent classification. We first utilize the labeled known intent samples to pre-train the model. Then, we automatically learn the adaptive spherical decision boundary for each known class with the aid of well-trained features. Specifically, we propose a new loss function to balance both the empirical risk and the open space risk. Our method does not need open intent samples and is free from modifying the model architecture. Moreover, our approach is surprisingly insensitive with less labeled data and fewer known intents. Extensive experiments on three benchmark datasets show that our method yields significant improvements compared with the state-of-the-art methods. The codes are released at this https URL.
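To make the idea of a spherical decision boundary more concrete, here is a minimal pure-Python sketch based on my reading of the paper, not on the code in run.py. It assumes each known class k has a centroid (the mean of its training features) and a radius obtained as softplus of a learnable parameter. The boundary loss grows the radius for samples that fall outside their class ball (empirical risk) and shrinks it for samples inside (open space risk). A sample is labeled open only if it lies outside every ball; otherwise it gets the nearest centroid's class. All function names, the `OPEN` marker, and the toy data are hypothetical.

```python
import math

OPEN = -1  # hypothetical marker for the open (unknown) intent


def softplus(x):
    """Map an unconstrained parameter to a positive radius."""
    return math.log1p(math.exp(x))


def centroids(features, labels, num_classes):
    """Mean feature vector of each known class."""
    dim = len(features[0])
    sums = [[0.0] * dim for _ in range(num_classes)]
    counts = [0] * num_classes
    for z, y in zip(features, labels):
        counts[y] += 1
        for j, v in enumerate(z):
            sums[y][j] += v
    return [[s / counts[k] for s in sums[k]] for k in range(num_classes)]


def boundary_loss(features, labels, cents, raw_radii):
    """Average gap between each sample's centroid distance and its class radius."""
    total = 0.0
    for z, y in zip(features, labels):
        d = math.dist(z, cents[y])
        radius = softplus(raw_radii[y])
        # Outside the ball: push the radius out. Inside: pull it in.
        total += (d - radius) if d > radius else (radius - d)
    return total / len(features)


def predict(z, cents, raw_radii):
    """Open if z is outside every class ball, else the nearest centroid's class."""
    dists = [math.dist(z, c) for c in cents]
    if all(d > softplus(r) for d, r in zip(dists, raw_radii)):
        return OPEN
    return min(range(len(cents)), key=dists.__getitem__)


# Toy 2-D features for two known classes.
feats = [[0.0, 0.0], [0.0, 2.0], [10.0, 0.0], [10.0, 2.0]]
labels = [0, 0, 1, 1]
cents = centroids(feats, labels, num_classes=2)
raw_radii = [0.0, 0.0]  # both radii equal softplus(0) = ln 2
loss = boundary_loss(feats, labels, cents, raw_radii)
near = predict([0.0, 1.2], cents, raw_radii)  # close to the class-0 centroid
far = predict([5.0, 1.0], cents, raw_radii)   # far from both centroids
```

In actual training the raw radii would be updated by gradient descent on this loss while the pre-trained features stay fixed; the sketch only evaluates the loss once.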

Model architecture: [figure]

Datasets: [figure]

Experimental results: [figure]

# Upgrade paddlenlp
!pip install -U paddlenlp
# Unzip the dataset
!unzip -d data data/data154307/ADB_data.zip
# # Step-by-step alternative, step 1: pre-training
# !python run.py \
#     --seed 0 \
#     --start_Pretrain \
#     --do_train \
#     --do_eval \
#     --device gpu \
#     --dataset "stackoverflow" \
#     --known_cls_ratio 0.25 \
#     --labeled_ratio 0.5 \
#     --output_dir "done_model" \
#     --model_name_or_path "ernie-2.0-base-en" \
#     --overwrite_output_dir \
#     --warmup_ratio 0.1 \
#     --learning_rate 2e-5 \
#     --num_train_epochs 100 \
#     --per_device_train_batch_size 128 \
#     --per_device_eval_batch_size 64 \
#     --metric_for_best_model "eval_macro_f1" \
#     --early_stopping \
#     --evaluation_strategy epoch \
#     --load_best_model_at_end \
#     --save_strategy epoch \
#     --save_total_limit 1 \
#     --disable_tqdm True 
# # Step-by-step alternative, step 2: adaptive decision boundary
# !python run.py \
#     --seed 0 \
#     --start_ADB \
#     --device gpu \
#     --dataset "stackoverflow" \
#     --known_cls_ratio 0.25 \
#     --labeled_ratio 0.5 \
#     --output_dir "done_model" \
#     --model_name_or_path "done_model" \
#     --lr_boundary 0.05 \
#     --num_train_epochs 100 \
#     --adb_device_train_batch_size 128 \
#     --adb_device_eval_batch_size 64 
# One-command training and testing
!python run.py \
    --seed 0 \
    --start_Pretrain \
    --do_train \
    --do_eval \
    --device gpu \
    --dataset "stackoverflow" \
    --known_cls_ratio 0.25 \
    --labeled_ratio 1.0 \
    --output_dir "done_model" \
    --model_name_or_path "ernie-2.0-base-en" \
    --overwrite_output_dir \
    --warmup_ratio 0.1 \
    --learning_rate 2e-5 \
    --num_train_epochs 100 \
    --per_device_train_batch_size 128 \
    --per_device_eval_batch_size 64 \
    --metric_for_best_model "eval_macro_f1" \
    --early_stopping \
    --evaluation_strategy epoch \
    --load_best_model_at_end \
    --save_strategy epoch \
    --save_total_limit 1 \
    --disable_tqdm True \
    --start_ADB \
    --lr_boundary 0.05 \
    --adb_device_train_batch_size 128 \
    --adb_device_eval_batch_size 64 

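Reproducing the results of Table 2 and Table 3

Open classification results on the BANKING, OOS, and StackOverflow datasets with different known-class ratios (25%, 50%, and 75%).
"Accuracy" and "F1-score" denote the accuracy and the macro F1-score over all classes.
"Open" and "Known" denote the macro F1-score of the open class and of the known classes, respectively.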
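As an illustration of how these four numbers relate, the sketch below computes them from lists of true and predicted labels in plain Python. It is not the evaluation code from run.py: the function names and the `"open"` label string are assumptions, and the scores are scaled to percentages to match the tables in this article.

```python
def macro_f1(y_true, y_pred, classes):
    """Unweighted mean of the per-class F1 over the given classes."""
    scores = []
    for c in classes:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def open_intent_metrics(y_true, y_pred, open_label="open"):
    """Accuracy, overall macro F1, open-class F1, and known-class macro F1 (in %)."""
    classes = sorted(set(y_true))
    known = [c for c in classes if c != open_label]
    accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
    return {
        "Accuracy": 100 * accuracy,
        "F1-score": 100 * macro_f1(y_true, y_pred, classes),
        "Open": 100 * macro_f1(y_true, y_pred, [open_label]),
        "Known": 100 * macro_f1(y_true, y_pred, known),
    }


# Toy example: two known intents ("a", "b") plus the open class.
y_true = ["a", "a", "b", "b", "open", "open"]
y_pred = ["a", "a", "b", "open", "open", "open"]
metrics = open_intent_metrics(y_true, y_pred)
```

Because "F1-score" averages over every class with equal weight, the single open class counts as much as any one known class, which is why "F1-score" stays close to "Known" when there are many known intents.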

Reproduced results:

       Banking             OOS                 StackOverflow
ratio  Accuracy  F1-score  Accuracy  F1-score  Accuracy  F1-score
25%    63.8      65.6195   70.4      64.9428   49.7      56.4304
50%    71.4      79.6125   77.63     80.8123   72.72     77.53
75%    81.27     87.5629   84.91     88.596    77.67     83.2664

Reproduced results with ernie-2.0-base-en:

       Banking             OOS                 StackOverflow
ratio  Open      Known     Open      Known     Open      Known
25%    69.0889   65.4369   77.5117   64.6121   51.8410   57.3483
50%    64.2706   80.0163   78.5085   80.8431   68.1969   78.4633
75%    60.8696   88.0231   80.2738   88.6703   50.8588   85.4269

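With ernie-2.0-base-en, the results fall somewhat short of those reported in the paper; only the OOS results are reasonably close. Testing with BERT weights is left for later.
First, the open class is not classified very well.
Second, the method is quite sensitive to fewer samples and fewer classes, rather than insensitive as the paper claims.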

%%capture
for choice_dataset in ['oos','stackoverflow','banking']:
    for choice_know_ratio in [0.25,0.5,0.75]:
        !python run.py \
            --seed 0 \
            --start_Pretrain \
            --do_train \
            --do_eval \
            --device gpu \
            --dataset {choice_dataset} \
            --known_cls_ratio {choice_know_ratio} \
            --labeled_ratio 1.0 \
            --output_dir "done_model" \
            --model_name_or_path "ernie-2.0-base-en" \
            --overwrite_output_dir \
            --warmup_ratio 0.1 \
            --learning_rate 2e-5 \
            --num_train_epochs 100 \
            --per_device_train_batch_size 128 \
            --per_device_eval_batch_size 64 \
            --metric_for_best_model "eval_macro_f1" \
            --early_stopping \
            --evaluation_strategy epoch \
            --load_best_model_at_end \
            --save_strategy epoch \
            --save_total_limit 1 \
            --disable_tqdm True \
            --start_ADB \
            --lr_boundary 0.05 \
            --adb_device_train_batch_size 128 \
            --adb_device_eval_batch_size 64     
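The loop above relies on IPython's `!` syntax, so it only works inside a notebook. For running the same grid from an ordinary script, a plain-Python equivalent might look like the sketch below. The flags are copied from the cell above; `build_command` and `run_all` are hypothetical helper names, and `dry_run=True` only prints the commands instead of launching training.

```python
import itertools
import subprocess
import sys

DATASETS = ["oos", "stackoverflow", "banking"]
KNOWN_CLS_RATIOS = [0.25, 0.5, 0.75]


def build_command(dataset, known_cls_ratio):
    """Argument list for one run.py call, using the flags from the notebook cell."""
    return [
        sys.executable, "run.py",
        "--seed", "0",
        "--start_Pretrain", "--do_train", "--do_eval",
        "--device", "gpu",
        "--dataset", dataset,
        "--known_cls_ratio", str(known_cls_ratio),
        "--labeled_ratio", "1.0",
        "--output_dir", "done_model",
        "--model_name_or_path", "ernie-2.0-base-en",
        "--overwrite_output_dir",
        "--warmup_ratio", "0.1",
        "--learning_rate", "2e-5",
        "--num_train_epochs", "100",
        "--per_device_train_batch_size", "128",
        "--per_device_eval_batch_size", "64",
        "--metric_for_best_model", "eval_macro_f1",
        "--early_stopping",
        "--evaluation_strategy", "epoch",
        "--load_best_model_at_end",
        "--save_strategy", "epoch",
        "--save_total_limit", "1",
        "--disable_tqdm", "True",
        "--start_ADB",
        "--lr_boundary", "0.05",
        "--adb_device_train_batch_size", "128",
        "--adb_device_eval_batch_size", "64",
    ]


def run_all(dry_run=True):
    """Run (or just print) one command per dataset and known-class ratio."""
    for dataset, ratio in itertools.product(DATASETS, KNOWN_CLS_RATIOS):
        cmd = build_command(dataset, ratio)
        if dry_run:
            print(" ".join(cmd))
        else:
            subprocess.run(cmd, check=True)  # stop at the first failing run


example_cmd = build_command("oos", 0.25)
```

Passing the arguments as a list avoids shell quoting issues, and `check=True` stops the grid at the first failed run rather than silently continuing.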
import pandas as pd
test_result = pd.read_csv("./outputs/results.csv")
test_result
     Known     Open  F1-score  Accuracy        dataset  known_cls_ratio  labeled_ratio  seed
0  64.6121  77.5117   64.9428     70.40            oos             0.25            1.0     0
1  80.8431  78.5085   80.8123     77.63            oos             0.50            1.0     0
2  88.6703  80.2738   88.5960     84.91            oos             0.75            1.0     0
3  57.3483  51.8410   56.4304     49.70  stackoverflow             0.25            1.0     0
4  78.4633  68.1969   77.5300     72.72  stackoverflow             0.50            1.0     0
5  85.4269  50.8588   83.2664     77.67  stackoverflow             0.75            1.0     0
6  65.4369  69.0889   65.6195     63.80        banking             0.25            1.0     0
7  80.0163  64.2706   79.6125     71.40        banking             0.50            1.0     0
8  88.0231  60.8696   87.5629     81.27        banking             0.75            1.0     0
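The per-ratio tables earlier in this article can be rebuilt from this file. The sketch below uses only the standard library and assumes the CSV header matches the column names shown in the DataFrame output; `table_rows` is a hypothetical helper, and the inline sample copies two rows from the output above so that the example does not depend on the real file.

```python
import csv
import io


def table_rows(csv_file, metrics, datasets=("banking", "oos", "stackoverflow")):
    """Group the chosen metrics by known-class ratio, then by dataset."""
    cells = {}
    for rec in csv.DictReader(csv_file):
        key = (float(rec["known_cls_ratio"]), rec["dataset"])
        cells[key] = [float(rec[m]) for m in metrics]
    ratios = sorted({ratio for ratio, _ in cells})
    return {
        ratio: {d: cells[(ratio, d)] for d in datasets if (ratio, d) in cells}
        for ratio in ratios
    }


# Two rows copied from the output above, standing in for ./outputs/results.csv.
sample = io.StringIO(
    "Known,Open,F1-score,Accuracy,dataset,known_cls_ratio,labeled_ratio,seed\n"
    "64.6121,77.5117,64.9428,70.40,oos,0.25,1.0,0\n"
    "65.4369,69.0889,65.6195,63.80,banking,0.25,1.0,0\n"
)
table = table_rows(sample, ["Accuracy", "F1-score"])
```

With the real file, the same function can be called as `table_rows(open("./outputs/results.csv", newline=""), ["Open", "Known"])` to get the Open/Known layout instead.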

👨‍💻 Author: Armor
📧 Email: htkstudy163.com
✨ AI Studio homepage: https://aistudio.baidu.com/aistudio/personalcenter/thirdview/392748

This article is a repost of the original AI Studio project.
