Paper link: Automatic ICD-10-CM coding via Lambda-Scaled attention based deep learning model - ScienceDirect

The English here is typed entirely by hand! It is a summarizing and paraphrasing of the original paper. Some unavoidable spelling and grammar mistakes may appear; if you spot any, feel free to point them out in the comments! This post is more of a personal note, so take it with a grain of salt.

Contents

1. Takeaways

2. Section-by-Section Close Reading

2.1. Abstract

2.2. Introduction

2.3. Methodology

2.3.1. Overview of workflow

2.3.2. Creating Clinical Pool of Liver Transplant (CPLT) database

2.3.3. Model architecture

2.3.4. Web application deployment

2.4. Experiments

2.4.1. Datasets

2.4.2. Evaluation metrics

2.4.3. Parameter setting

2.5. Experimental results

2.5.1. Baseline models

2.5.2. Results

2.6. Discussion

2.7. Conclusion

1. Takeaways

(1)The design is fairly simple

2. Section-by-Section Close Reading

2.1. Abstract

        ①Task: automatic International Classification of Diseases, 10th Revision, Clinical Modification (ICD-10-CM) coding

2.2. Introduction

        ①Versions of ICD: ICD-9, ICD-10, ICD-11, etc.

        ②Challenge: the lack of a knowledge system can cause suboptimal diagnostic results

        ③Reviewed relevant works

2.3. Methodology

2.3.1. Overview of workflow

        ①Pipeline:

2.3.2. Creating Clinical Pool of Liver Transplant (CPLT) database

        ①The authors annotated 1380 samples from MIMLT with ICD-9-CM codes and named the resulting dataset the “Clinical Pool of Liver Transplant” (CPLT)

        ②They converted ICD-9-CM codes to ICD-10-CM via https://www.aapc.com/icd-10/codes/ (hmm, is that reliable?)

        ③They directly accepted one-to-one mappings but brought in experts to precisely classify the one-to-many cases (see the sketch below)
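A minimal sketch of what this conversion logic looks like, assuming a hypothetical hand-made mapping table (the AAPC site is a web lookup tool, not an API, and the codes below are only illustrative):

```python
# Illustrative mapping table only (NOT the AAPC data).
# One-to-one entries are accepted directly; one-to-many entries are
# flagged for expert review, mirroring step ③ above.
icd9_to_icd10 = {
    "570": ["K72.00"],                         # one-to-one
    "996.82": ["T86.41", "T86.42", "T86.43"],  # one-to-many
}

def convert(icd9_code: str):
    candidates = icd9_to_icd10.get(icd9_code, [])
    if len(candidates) == 1:
        return candidates[0], "accepted automatically"
    return candidates, "sent to expert review"

for code in ["570", "996.82"]:
    print(code, "->", convert(code))
```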

2.3.3. Model architecture

(1)Embedding layer

        ①Word2Vec is employed to map the original codes/words w=\{w_{1},w_{2},\cdots,w_{i}\} in one clinical text \mathbb{C} to i vectors E=\{ew_1,ew_2,ew_3,\cdots,ew_i\}\in\mathbb{R}^{i\times d^e} with embedding dimension d^e=100
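A minimal sketch of this embedding step using gensim's Word2Vec (the paper does not name a library; the toy corpus is a stand-in, and vector_size=100 matches d^e):

```python
from gensim.models import Word2Vec

# Toy pre-tokenized "clinical texts"; the real corpus is far larger.
corpus = [
    ["patient", "underwent", "liver", "transplant"],
    ["acute", "hepatic", "failure", "noted"],
]

# vector_size=100 matches the embedding dimension d^e = 100.
model = Word2Vec(sentences=corpus, vector_size=100, window=5, min_count=1)

# Map one clinical text to its i x 100 embedding matrix E.
text = ["liver", "transplant"]
E = [model.wv[w] for w in text]
print(len(E), E[0].shape)  # 2 (100,)
```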

(2)Deep bi-directional LSTM layer

       ①The bi-LSTM computation:

\overrightarrow{h_f}=\overrightarrow{LSTM}\left(x_n,\overrightarrow{h_{n-1}}\right)

\overleftarrow{h_b}=\overleftarrow{LSTM}\left(x_n,\overleftarrow{h_{n+1}}\right)

h_{bi-lstm}=\overrightarrow{h_f}\oplus\overleftarrow{h_b}

where H=[h_1,h_2,h_3,\cdots,h_m]\in\mathbb{R}^{2q\times m} (is m adaptive to the number of time steps? Yes: m is simply the input sequence length, one hidden state per token; see the sketch below)
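A minimal PyTorch sketch of this layer, assuming hidden size q per direction so that concatenation yields 2q features per time step (the layer count for the "deep" bi-LSTM is an assumption):

```python
import torch
import torch.nn as nn

d_e, q = 100, 64  # embedding dim d^e; hidden size q (64 for CPLT, Sec 2.4.3)
bilstm = nn.LSTM(input_size=d_e, hidden_size=q, num_layers=2,
                 batch_first=True, bidirectional=True)

x = torch.randn(8, 150, d_e)  # (batch, m time steps, d^e)
H, _ = bilstm(x)              # forward and backward states concatenated per step
print(H.shape)                # torch.Size([8, 150, 128]), i.e. 2q per step
```

As the output shape shows, m just tracks the input length: one 2q-dimensional state per token.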

(3)Multi-scale CNN layer

        ①The MS-CNN is built from CNN branches, each followed by a max-pooling layer and a mean-pooling layer

        ②Concatenating all the MS-CNN outputs \mathcal{C}_{r}\in\mathbb{R}^{1\times\sum d_{k}^{r}} gives:

\mathcal{C}=[\mathcal{C}_1;\mathcal{C}_2;\mathcal{C}_3;\mathcal{C}_4;\mathcal{C}_5]
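A minimal sketch of a multi-scale CNN with max- and mean-pooling per branch, as described above; five branches match the notation \mathcal{C}_1 to \mathcal{C}_5, but the kernel sizes and filter count are assumptions:

```python
import torch
import torch.nn as nn

class MSCNN(nn.Module):
    """Parallel convolutions at several kernel sizes; each branch is
    reduced by global max- and mean-pooling, then all are concatenated."""
    def __init__(self, in_dim=128, n_filters=64, kernel_sizes=(1, 2, 3, 4, 5)):
        super().__init__()
        self.convs = nn.ModuleList(
            [nn.Conv1d(in_dim, n_filters, k, padding=k // 2) for k in kernel_sizes]
        )

    def forward(self, H):                      # H: (batch, 2q, m) from the bi-LSTM
        outs = []
        for conv in self.convs:
            c = torch.relu(conv(H))            # (batch, n_filters, m')
            outs.append(c.max(dim=2).values)   # max pooling over time
            outs.append(c.mean(dim=2))         # mean pooling over time
        return torch.cat(outs, dim=1)          # C = [C_1; C_2; C_3; C_4; C_5]

H = torch.randn(8, 128, 150)
print(MSCNN()(H).shape)  # torch.Size([8, 640]) = 5 scales x 2 pools x 64 filters
```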

(4)Lambda-Scaled attention layer

        ①They further scale features:

\begin{aligned} & u_{j}=\tanh(w_{o}\mathcal{C}_{r}+b_{o}) \\ & a_{t}=\mathrm{softmax}(w_{a}u_{j}) \\ & a_{t}^{max}=\max\left[a_{1},a_{2},a_{3},\cdots,a_{t}\right] \\ & a_{t}^{\prime}=\frac{a_{t}}{a_{t}^{max}},\quad\text{s.t. }a_{t}^{\prime}\leq1 \\ & T_{R}=\sum_{t=1}^{l}a_{t}^{\prime}\mathcal{C}_{r} \end{aligned}
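A minimal PyTorch sketch of these equations: a tanh projection scores each feature slice, softmax gives a_t, dividing by the maximum gives a'_t ≤ 1, and the rescaled weights pool the features into T_R (all dimensions are assumptions):

```python
import torch
import torch.nn as nn

class LambdaScaledAttention(nn.Module):
    def __init__(self, d_in, d_att=64):
        super().__init__()
        self.proj = nn.Linear(d_in, d_att)            # w_o, b_o
        self.score = nn.Linear(d_att, 1, bias=False)  # w_a

    def forward(self, C):                    # C: (batch, t, d_in) feature slices
        u = torch.tanh(self.proj(C))                  # u_j
        a = torch.softmax(self.score(u), dim=1)       # a_t
        a = a / a.max(dim=1, keepdim=True).values     # a'_t = a_t / a_t^max
        return (a * C).sum(dim=1)                     # T_R = sum_t a'_t C_r

C = torch.randn(8, 10, 640)
print(LambdaScaledAttention(640)(C).shape)  # torch.Size([8, 640])
```

Dividing by the maximum pins the largest weight to exactly 1, which keeps the weights from shrinking toward 1/t on long inputs; presumably this is the motivation for the scaling.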

(5)Classification

        ①Classification is performed by fully connected layers with a Sigmoid activation

        ②Optimizer: Adam

        ③Binary cross entropy loss:

Loss=-\frac{1}{G}\sum_{l=1}^{G}\left[y_{l}\odot\log\bar{y}_{l}+(1-y_{l})\odot\log\left(1-\bar{y}_{l}\right)\right]
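A minimal sketch of the head and loss: a fully connected layer with Sigmoid produces per-code probabilities, and the multi-label binary cross-entropy above corresponds to nn.BCELoss optimized with Adam (the input width 640 and other shapes are assumptions; 34 is the CPLT label count from Sec 2.5.2):

```python
import torch
import torch.nn as nn

n_codes = 34  # ICD-10-CM codes in CPLT (Sec 2.5.2)
head = nn.Sequential(nn.Linear(640, n_codes), nn.Sigmoid())
criterion = nn.BCELoss()  # the multi-label binary cross-entropy above
optimizer = torch.optim.Adam(head.parameters())

T_R = torch.randn(8, 640)                      # attention output
y = torch.randint(0, 2, (8, n_codes)).float()  # multi-hot ICD labels

loss = criterion(head(T_R), y)
loss.backward()
optimizer.step()
print(float(loss))
```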

2.3.4. Web application deployment

        ①They built a web application to predict ICD codes
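The paper gives no implementation details for the web app; a minimal Flask sketch of such an endpoint (Flask and the predict_codes wrapper are assumptions, not the authors' stack):

```python
from flask import Flask, request, jsonify

app = Flask(__name__)

def predict_codes(text: str) -> list[str]:
    """Hypothetical wrapper around the trained model."""
    return ["K72.00"]  # placeholder prediction

@app.route("/predict", methods=["POST"])
def predict():
    text = request.get_json()["clinical_text"]
    return jsonify({"icd10cm": predict_codes(text)})

if __name__ == "__main__":
    app.run()
```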

2.4. Experiments

2.4.1. Datasets

        ①Pretrained on MIMIC III; tested on MIMIC III-Top 50 and CPLT

        ②Statistics:

(1)MIMIC III

        ①Samples: 53,423

        ②Preprocessing: stopword removal, tokenization, lowercase conversion, and removal of numbers, punctuation, and symbols with the Natural Language Toolkit (NLTK) library (see the sketch after this list)

        ③Maximum record length: 2500
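A minimal sketch of this preprocessing with NLTK, combining the steps in ② with the record-length cap in ③:

```python
import re
import nltk
from nltk.corpus import stopwords
from nltk.tokenize import word_tokenize

for pkg in ("punkt", "punkt_tab", "stopwords"):  # tokenizer/stopword data
    try:
        nltk.download(pkg, quiet=True)
    except Exception:
        pass

stop_words = set(stopwords.words("english"))

def preprocess(text: str, max_len: int = 2500) -> list[str]:
    text = re.sub(r"[^a-zA-Z\s]", " ", text.lower())  # drop numbers/punct/symbols
    tokens = [t for t in word_tokenize(text) if t not in stop_words]
    return tokens[:max_len]                           # cap the record length

print(preprocess("Patient admitted on 03/05 with acute hepatic failure!"))
```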

(2)CPLT

        ①Samples: 1380

        ②Preview: 

        ③Data split: 1104 for training, 138 for validation, and 138 for testing

        ④Max record length: 150

2.4.2. Evaluation metrics

        ①Micro F1 and Macro F1 for imbalanced data
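Micro-F1 pools TP/FP/FN over all labels (so frequent codes dominate), while macro-F1 averages per-label F1 (so rare codes count equally), which is why both are reported for imbalanced data; a quick sklearn illustration on made-up labels:

```python
import numpy as np
from sklearn.metrics import f1_score

y_true = np.array([[1, 0, 1], [0, 1, 0], [1, 1, 0]])  # multi-hot ground truth
y_pred = np.array([[1, 0, 0], [0, 1, 0], [1, 0, 0]])  # multi-hot predictions

print("micro F1:", f1_score(y_true, y_pred, average="micro"))                   # 0.75
print("macro F1:", f1_score(y_true, y_pred, average="macro", zero_division=0))  # ~0.56
```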

2.4.3. Parameter setting

        ①Hidden dim of bi-LSTM: 64 for CPLT and 128 for MIMIC III-Top 50

        ②Batch size: 8

        ③Epoch: 50

        ④Dropout rate: 0.5
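Collected into a config dict for quick reference (a note-keeping convenience, not the authors' code):

```python
# Hyperparameters reported in Sec 2.4.3.
config = {
    "bilstm_hidden_dim": {"CPLT": 64, "MIMIC III-Top 50": 128},
    "batch_size": 8,
    "epochs": 50,
    "dropout": 0.5,
}
```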

2.5. Experimental results

2.5.1. Baseline models

        ~

        

2.5.2. Results

(1)Complete distribution of ICD-10-CM codes in the CPLT database

        ①The 34 ICD codes in the CPLT dataset:

(2)Comparison of DRCNN-ATT model with baselines on CPLT database

        ①Performance comparison table on CPLT:

(3)Comparison of DRCNN-ATT model with baselines on MIMIC III-Top 50 database

        ①Performance on MIMIC III-Top 50:

(4)Medical code Predictor web application

        ①Application preview:

(5)Ablation study

        ①Attention module ablation:

2.6. Discussion

        ~

2.7. Conclusion

        ~
