Comparing BERT-Family Models on a Sentiment Analysis Task

福将～白鹿 · Published 2022-01-19 14:51:12

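Experiment Overview

Dataset information

Data source: GitHub
Task: sentiment analysis (binary classification)
Training set size: 9600
Validation set size: 1200
Test set size: 1200
Class balance: balanced
BERT-family models compared: BERT, FinBERT, RoBERTa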

Dataset profile

Text length
Minimum length: 4
Maximum length: 1992
Average length: 108
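
Length statistics like the ones above can be reproduced with a few lines of Python. The sketch below is illustrative only: the function name length_stats and the limit parameter are assumptions introduced here, and it assumes length is counted in characters. The share_over_limit field is included because the commands in this post use --max_seq_length=64 while the average text length is 108, so a sizeable share of the texts is likely to be truncated.

```python
from statistics import mean


def length_stats(texts, limit=64):
    """Return min/max/mean character length and the share of texts longer than `limit`.

    `limit` is a hypothetical parameter; 64 mirrors the --max_seq_length value
    used in the commands of this post.
    """
    lengths = [len(t) for t in texts]
    return {
        "min": min(lengths),
        "max": max(lengths),
        "mean": mean(lengths),
        "share_over_limit": sum(n > limit for n in lengths) / len(lengths),
    }


if __name__ == "__main__":
    # Toy input; replace with the review texts loaded from the dataset.
    sample = ["房间很干净", "服务太差了，再也不会来了"]
    print(length_stats(sample, limit=8))
```

Running this on the training texts before picking max_seq_length gives a quick estimate of how much of the input the model never sees.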

BERT

The parameters used are as follows.
Training command and parameters

python run_classifier.py \
  --task_name=emlo \
  --do_train=true \
  --do_eval=true \
  --data_dir=./ChnSentiCorp_data \
  --vocab_file=./uncased/chinese_L-12_H-768_A-12/vocab.txt \
  --bert_config_file=./uncased/chinese_L-12_H-768_A-12/bert_config.json \
  --init_checkpoint=./uncased/chinese_L-12_H-768_A-12/bert_model.ckpt \
  --max_seq_length=64 \
  --train_batch_size=16 \
  --learning_rate=2e-5 \
  --num_train_epochs=3.0 \
  --output_dir=./tmp/bert_out/

Prediction command and parameters

python run_classifier.py \
  --task_name=emlo \
  --do_predict=true \
  --data_dir=./ChnSentiCorp_data \
  --vocab_file=./uncased/chinese_L-12_H-768_A-12/vocab.txt \
  --bert_config_file=./uncased/chinese_L-12_H-768_A-12/bert_config.json \
  --init_checkpoint=./tmp/bert_out/ \
  --max_seq_length=64 \
  --output_dir=./tmp/bert_emotion/

FinBERT

The parameters used are as follows.
Training command and parameters

python run_classifier.py \
  --task_name=emlo \
  --do_train=true \
  --do_eval=true \
  --data_dir=./ChnSentiCorp_data \
  --vocab_file=./uncased/FinBERT_L-12_H-768_A-12_tf/vocab.txt \
  --bert_config_file=./uncased/FinBERT_L-12_H-768_A-12_tf/bert_config.json \
  --init_checkpoint=./uncased/FinBERT_L-12_H-768_A-12_tf/bert_model.ckpt \
  --max_seq_length=64 \
  --train_batch_size=16 \
  --learning_rate=2e-5 \
  --num_train_epochs=3.0 \
  --output_dir=./tmp/finbert_out/

Prediction command and parameters

python run_classifier.py \
  --task_name=emlo \
  --do_predict=true \
  --data_dir=./ChnSentiCorp_data \
  --vocab_file=./uncased/FinBERT_L-12_H-768_A-12_tf/vocab.txt \
  --bert_config_file=./uncased/FinBERT_L-12_H-768_A-12_tf/bert_config.json \
  --init_checkpoint=./tmp/finbert_out/ \
  --max_seq_length=64 \
  --output_dir=./tmp/finbert_emotion/

RoBERTa

The parameters used are as follows.
Training command and parameters

python run_classifier.py \
  --task_name=emlo \
  --do_train=true \
  --do_eval=true \
  --data_dir=./ChnSentiCorp_data \
  --vocab_file=./uncased/roberta_zh_l12/vocab.txt \
  --bert_config_file=./uncased/roberta_zh_l12/bert_config.json \
  --init_checkpoint=./uncased/roberta_zh_l12/bert_model.ckpt \
  --max_seq_length=64 \
  --train_batch_size=16 \
  --learning_rate=2e-5 \
  --num_train_epochs=3.0 \
  --output_dir=./tmp/roberta_out/

Prediction command and parameters

python run_classifier.py \
  --task_name=emlo \
  --do_predict=true \
  --data_dir=./ChnSentiCorp_data \
  --vocab_file=./uncased/roberta_zh_l12/vocab.txt \
  --bert_config_file=./uncased/roberta_zh_l12/bert_config.json \
  --init_checkpoint=./tmp/roberta_out/ \
  --max_seq_length=64 \
  --output_dir=./tmp/roberta_emotion/
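
The training and prediction commands for the three models differ only in the pretrained model directory and the output directories. As a rough sketch, they could be generated from a single template. MODEL_DIRS and build_commands are hypothetical names introduced here for illustration; the flag values and paths are copied from the commands in this post, although the flags come out in a slightly different order.

```python
import shlex

# Model name -> pretrained checkpoint directory, copied from the commands in this post.
MODEL_DIRS = {
    "bert": "./uncased/chinese_L-12_H-768_A-12",
    "finbert": "./uncased/FinBERT_L-12_H-768_A-12_tf",
    "roberta": "./uncased/roberta_zh_l12",
}


def build_commands(name, data_dir="./ChnSentiCorp_data", max_seq_length=64):
    """Return (train_command, predict_command) as shell strings for one model."""
    model_dir = MODEL_DIRS[name]
    train_out = f"./tmp/{name}_out/"
    # Flags shared by training and prediction.
    base = [
        "python", "run_classifier.py",
        "--task_name=emlo",
        f"--data_dir={data_dir}",
        f"--vocab_file={model_dir}/vocab.txt",
        f"--bert_config_file={model_dir}/bert_config.json",
        f"--max_seq_length={max_seq_length}",
    ]
    # Training starts from the pretrained checkpoint.
    train = base + [
        "--do_train=true",
        "--do_eval=true",
        f"--init_checkpoint={model_dir}/bert_model.ckpt",
        "--train_batch_size=16",
        "--learning_rate=2e-5",
        "--num_train_epochs=3.0",
        f"--output_dir={train_out}",
    ]
    # Prediction starts from the fine-tuned output of the training run.
    predict = base + [
        "--do_predict=true",
        f"--init_checkpoint={train_out}",
        f"--output_dir=./tmp/{name}_emotion/",
    ]
    return shlex.join(train), shlex.join(predict)


if __name__ == "__main__":
    for model_name in MODEL_DIRS:
        train_cmd, predict_cmd = build_commands(model_name)
        print(train_cmd)
        print(predict_cmd)
```

Keeping the shared hyperparameters in one place makes it harder for the three runs to drift apart, which matters when the point of the experiment is a like-for-like comparison.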

Results

sentence_length = 64

BERT	P	R	F1
negative	0.9324	0.9324	0.9324
positive	0.9342	0.9342	0.9342
total	0.9333	0.9333	0.9333

FinBERT	P	R	F1
negative	0.9254	0.9645	0.9445
positive	0.9639	0.9243	0.9437
total	0.9441	0.9441	0.9441

RoBERTa	P	R	F1
negative	0.9206	0.9408	0.9306
positive	0.9411	0.9210	0.9310
total	0.9308	0.9308	0.9308
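
As a sanity check on the per-class rows, F1 is the harmonic mean of precision and recall, F1 = 2PR / (P + R). The sketch below recomputes F1 from the reported P and R values. The names f1_score, REPORTED and max_f1_deviation are illustrative choices, not part of the original experiment. Because the published figures are rounded to four decimals, the recomputed value can differ from the table in the last digit.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1) per model and class, copied from the tables.
REPORTED = {
    ("BERT", "negative"): (0.9324, 0.9324, 0.9324),
    ("BERT", "positive"): (0.9342, 0.9342, 0.9342),
    ("FinBERT", "negative"): (0.9254, 0.9645, 0.9445),
    ("FinBERT", "positive"): (0.9639, 0.9243, 0.9437),
    ("RoBERTa", "negative"): (0.9206, 0.9408, 0.9306),
    ("RoBERTa", "positive"): (0.9411, 0.9210, 0.9310),
}


def max_f1_deviation(rows=REPORTED):
    """Largest absolute gap between the recomputed and the reported F1."""
    return max(abs(f1_score(p, r) - f1) for p, r, f1 in rows.values())


if __name__ == "__main__":
    for (model, label), (p, r, f1) in REPORTED.items():
        print(f"{model:8s} {label:9s} reported={f1:.4f} recomputed={f1_score(p, r):.4f}")
    print(f"max deviation: {max_f1_deviation():.6f}")
```

On the total row, FinBERT scores highest at 0.9441, ahead of BERT (0.9333) and RoBERTa (0.9308).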
