Named Entity Recognition (NER) is the task of identifying and classifying named entities in text.
The goal of this project is to build a model that finds entities in raw text and determines the category each one belongs to. There are four categories: people, organizations, places, and everything else, identified by the labels PER, ORG, LOC, and O respectively.
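To make the label scheme concrete, here is a minimal sketch of what labeled output might look like. It assumes one label per token, which this README does not state explicitly; the sentence and the `group_entities` helper are made up for illustration and are not part of the project code. Note that this simple grouping merges adjacent tokens with the same label, so two neighbouring entities of the same type would be joined.

```python
from typing import List, Tuple

# The four labels described above.
LABELS = ("PER", "ORG", "LOC", "O")


def group_entities(tokens: List[str], labels: List[str]) -> List[Tuple[str, str]]:
    """Merge consecutive tokens that share the same non-O label into one span."""
    spans = []
    current_tokens = []
    current_label = None
    for token, label in zip(tokens, labels):
        if label == current_label and label != "O":
            # Same entity type as the previous token: extend the open span.
            current_tokens.append(token)
            continue
        if current_label not in (None, "O"):
            # The label changed, so close the span that was open.
            spans.append((" ".join(current_tokens), current_label))
        current_tokens, current_label = [token], label
    if current_label not in (None, "O"):
        spans.append((" ".join(current_tokens), current_label))
    return spans


# Hypothetical example sentence with one label per token.
tokens = ["Ada", "Lovelace", "visited", "London", "with", "IBM"]
labels = ["PER", "PER", "O", "LOC", "O", "ORG"]
entities = group_entities(tokens, labels)
```

Tokens labeled O are dropped from the result, so `entities` holds only the person, place, and organization spans.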
conda create --name ner python=3.7.11
conda activate ner
pip install -r requirements.txt
cd dataset
python download_glove.py
python main.py --char-embedding-dim CHAR-EMB-DIM --char-len CHAR-LEN --hidden-dim HIDDEN-DIM --embedding-dim EMB-DIM --epochs EPOCHS --batch-size BATCH-SIZE --lr LEARNING-RATE --dropout DROPOUT --bidirectional BIDIRECTIONAL --num-layers NUM-LAYERS --only-test ONLY-TEST
where
CHAR-EMB-DIM is the dimension of the char embedding, default is 10
CHAR-LEN is the maximum length of the char sequence, default is 8
HIDDEN-DIM is the dimension of the hidden layer, default is 256
EMB-DIM is the dimension of the word embedding, default is 300
EPOCHS is the number of epochs, default is 50
BATCH-SIZE is the batch size, default is 64
LEARNING-RATE is the learning rate, default is 0.001
DROPOUT is the dropout rate, default is 0.5
BIDIRECTIONAL is the bidirectional flag, default is True
NUM-LAYERS is the number of layers, default is 2
ONLY-TEST is the only test flag, default is False
For example, to train with the default settings:
python main.py
To run testing only:
python main.py --only-test True
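
The options and defaults listed above could be parsed along the following lines. This is only a sketch using the standard-library `argparse` module; the actual parsing code in `main.py` is not shown in this README and may differ. The `str2bool` and `build_parser` helpers are hypothetical names introduced here, and `str2bool` exists because the flags are passed as explicit values such as `--only-test True`.

```python
import argparse


def str2bool(value):
    """Convert strings such as 'True' or 'False' to a bool (hypothetical helper)."""
    if isinstance(value, bool):
        return value
    lowered = value.strip().lower()
    if lowered in ("true", "1", "yes"):
        return True
    if lowered in ("false", "0", "no"):
        return False
    raise argparse.ArgumentTypeError("expected True or False, got %r" % value)


def build_parser():
    """Build a parser mirroring the documented flags and defaults (assumed layout)."""
    parser = argparse.ArgumentParser(description="Train or test the NER model")
    parser.add_argument("--char-embedding-dim", type=int, default=10)
    parser.add_argument("--char-len", type=int, default=8)
    parser.add_argument("--hidden-dim", type=int, default=256)
    parser.add_argument("--embedding-dim", type=int, default=300)
    parser.add_argument("--epochs", type=int, default=50)
    parser.add_argument("--batch-size", type=int, default=64)
    parser.add_argument("--lr", type=float, default=0.001)
    parser.add_argument("--dropout", type=float, default=0.5)
    parser.add_argument("--bidirectional", type=str2bool, default=True)
    parser.add_argument("--num-layers", type=int, default=2)
    parser.add_argument("--only-test", type=str2bool, default=False)
    return parser


# Equivalent to running: python main.py --only-test True
args = build_parser().parse_args(["--only-test", "True"])
```

With this layout, `args.only_test` is a real boolean rather than the string `"True"`, which avoids the common pitfall where any non-empty string, including `"False"`, is treated as true.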
