# Named Entity Recognition with TensorFlow
This repo implements a NER model using TensorFlow (LSTM + CRF + character embeddings).
State-of-the-art performance (F1 score between 90 and 91 on CoNLL-2003).
For a full walkthrough, check the [blog post](https://guillaumegenthial.github.io/sequence-tagging-with-tensorflow.html).
## Task
Given a sentence, assign a tag to each word. A classical application is Named Entity Recognition (NER). Here is an example:
```
John lives in New York
B-PER O O B-LOC I-LOC
```
## Model
Similar to [Lample et al.](https://arxiv.org/abs/1603.01360) and [Ma and Hovy](https://arxiv.org/pdf/1603.01354.pdf).
- concatenate the final states of a bi-LSTM over character embeddings to get a character-based representation of each word
- concatenate this representation with a standard word vector representation (GloVe here)
- run a bi-LSTM over each sentence to extract a contextual representation of each word
- decode with a linear-chain CRF (a minimal sketch of this pipeline follows the list)
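What follows is a minimal sketch of this pipeline using the TensorFlow 1.x API, not the repo's exact code: all sizes, placeholder shapes, and variable names are illustrative, and the real implementation lives in `model/ner_model.py`.
```python
import tensorflow as tf

# Illustrative sizes: char/word/tag vocabularies and embedding/hidden dims.
nchars, nwords, ntags = 100, 20000, 9
dim_char, dim_word, hc, hw = 100, 300, 100, 300

char_ids = tf.placeholder(tf.int32, [None, None, None])  # batch x words x chars
word_ids = tf.placeholder(tf.int32, [None, None])        # batch x words
word_lengths = tf.placeholder(tf.int32, [None, None])    # chars per word
sequence_lengths = tf.placeholder(tf.int32, [None])      # words per sentence
labels = tf.placeholder(tf.int32, [None, None])          # gold tag ids

# 1. bi-LSTM over character embeddings -> character-based word representation
char_emb = tf.nn.embedding_lookup(
    tf.get_variable("char_embeddings", [nchars, dim_char]), char_ids)
s = tf.shape(char_emb)  # flatten words into the batch dimension
char_emb = tf.reshape(char_emb, [s[0] * s[1], s[2], dim_char])
_, ((_, fw), (_, bw)) = tf.nn.bidirectional_dynamic_rnn(
    tf.nn.rnn_cell.LSTMCell(hc), tf.nn.rnn_cell.LSTMCell(hc), char_emb,
    sequence_length=tf.reshape(word_lengths, [-1]),
    dtype=tf.float32, scope="chars")
char_rep = tf.reshape(tf.concat([fw, bw], axis=-1), [s[0], s[1], 2 * hc])

# 2. concatenate with word vectors (initialized from GloVe in practice)
word_emb = tf.nn.embedding_lookup(
    tf.get_variable("word_embeddings", [nwords, dim_word]), word_ids)
x = tf.concat([word_emb, char_rep], axis=-1)

# 3. bi-LSTM over the sentence -> contextual representation of each word
(out_fw, out_bw), _ = tf.nn.bidirectional_dynamic_rnn(
    tf.nn.rnn_cell.LSTMCell(hw), tf.nn.rnn_cell.LSTMCell(hw), x,
    sequence_length=sequence_lengths, dtype=tf.float32, scope="words")
context = tf.concat([out_fw, out_bw], axis=-1)

# 4. project to per-tag scores and train a linear-chain CRF;
#    at test time, decode the scores with tf.contrib.crf.viterbi_decode
scores = tf.layers.dense(context, ntags)
log_likelihood, transition_params = tf.contrib.crf.crf_log_likelihood(
    scores, labels, sequence_lengths)
loss = tf.reduce_mean(-log_likelihood)
```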
## Getting started
1. Download the GloVe vectors with
```
make glove
```
Alternatively, you can download them manually [here](https://nlp.stanford.edu/projects/glove/) and update the `glove_filename` entry in `model/config.py`. You can also choose not to load pretrained word vectors by setting `use_pretrained` to `False` in `model/config.py` (see the snippet after these steps).
2. Build the training data, train and evaluate the model with
```
make run
```
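For reference, the two entries mentioned in step 1 might look like this in `model/config.py` (the path is illustrative; point it at wherever you stored the vectors):
```python
# model/config.py -- values are illustrative, adjust to your setup
glove_filename = "data/glove.6B/glove.6B.300d.txt"  # downloaded GloVe file
use_pretrained = True  # set to False to train word embeddings from scratch
```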
## Details
Here is the breakdown of the commands executed in `make run`:
1. [DO NOT MISS THIS STEP] Build the vocab from the data and extract trimmed GloVe vectors according to the config in `model/config.py`:
```
python build_data.py
```
2. Train the model with
```
python train.py
```
3. Evaluate and interact with the model with
```
python evaluate.py
```
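Once training has finished, `evaluate.py` reports test-set metrics and then lets you type sentences at an interactive prompt. A session might look like the following (prompt and output alignment are illustrative):
```
input> I love Paris
I     love  Paris
O     O     B-LOC
```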
Data iterators and utils are in `model/data_utils.py`, and the model with its training/test procedures is in `model/ner_model.py`.
Training time on an NVIDIA Tesla K80 is 110 seconds per epoch on the CoNLL train set with character embeddings and the CRF enabled.
## Training Data
The training data must be in the following format (identical to the CoNLL-2003 dataset): one word and its tag per line, with a blank line separating sentences.
A default test file is provided to help you get started.
```
John B-PER
lives O
in O
New B-LOC
York I-LOC
. O

This O
is O
another O
sentence O
```
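A minimal reader for this format, assuming one `word tag` pair per line and blank lines between sentences (illustrative only; the repo's actual iterators live in `model/data_utils.py`, as noted above):
```python
def read_conll(filename):
    """Parse a CoNLL-style file into a list of (words, tags) sentence pairs."""
    sentences, words, tags = [], [], []
    with open(filename) as f:
        for line in f:
            line = line.strip()
            if not line:  # a blank line marks the end of a sentence
                if words:
                    sentences.append((words, tags))
                    words, tags = [], []
            else:
                word, tag = line.split()[:2]
                words.append(word)
                tags.append(tag)
    if words:  # handle a missing trailing blank line
        sentences.append((words, tags))
    return sentences
```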
Once you have produced your data files, change the parameters in `model/config.py` accordingly, for example:
```
# dataset
dev_filename = "data/coNLL/eng/eng.testa.iob"
test_filename = "data/coNLL/eng/eng.testb.iob"
train_filename = "data/coNLL/eng/eng.train.iob"
```
## License
This project is licensed under the terms of the Apache 2.0 license (as are TensorFlow and its derivatives). If you use it for research, a citation would be appreciated.