The current implementation supports Hugging Face (HF) Llama and BERT models.
We cover only BERT in this section, since Llama usage is identical apart from the imports.
# Install medcat
! pip install medcat~=1.16.0
import logging
from medcat.cdb import CDB
from medcat.config_rel_cat import ConfigRelCAT
from medcat.rel_cat import RelCAT
from medcat.utils.relation_extraction.base_component import BaseComponent_RelationExtraction
from medcat.utils.relation_extraction.bert.model import BaseModel_RelationExtraction
from medcat.utils.relation_extraction.bert.config import BaseConfig_RelationExtraction
from medcat.utils.relation_extraction.tokenizer import BaseTokenizerWrapper_RelationExtraction
Training RelCAT models with custom datasets from scratch.
1. Create the RelCAT config and set the parameters
config = ConfigRelCAT()
config.general.log_level = logging.INFO
config.general.model_name = "bert-base-uncased"  # base model to use; here, the HuggingFace bert-base-uncased model
1.1 Depending on the base model you use, you may need to adjust config.model.hidden_size, config.model.model_size and config.model.hidden_layers
config.model.hidden_size = 256
config.model.model_size = 2304  # 4096 for llama
1.2 Other notable configurations
config.general.cntx_left = 15  # number of tokens to select to the left of the start entity
config.general.cntx_right = 15  # number of tokens to select to the right of the end entity
config.general.window_size = 300  # maximum distance (in characters) between two entities for them to be considered a relation
config.train.nclasses = 2  # number of classes in your MedCAT export / dataset
config.train.nepochs = 10  # number of epochs to train for
config.model.freeze_layers = False  # whether to freeze the layers of the base model
config.general.limit_samples_per_class = 300  # cap on the number of training samples per class, to avoid overfitting on unbalanced datasets
config.train.batch_size = 32  # batch size
config.train.lr = 3e-5
config.train.adam_epsilon = 1e-8
config.train.adam_weight_decay = 0.0005
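To make the effect of limit_samples_per_class concrete, here is a minimal sketch of capping the number of samples per label. The helper cap_samples_per_class and the toy data are hypothetical and not part of MedCAT; how RelCAT applies the cap internally (for example, before or after the train/test split) is not shown here.

```python
import random
from collections import Counter, defaultdict


def cap_samples_per_class(samples, limit, seed=0):
    """Keep at most `limit` randomly chosen samples for each label.

    `samples` is a list of (text, label) pairs.
    Hypothetical helper for illustration only, not a MedCAT function.
    """
    by_label = defaultdict(list)
    for text, label in samples:
        by_label[label].append((text, label))

    rng = random.Random(seed)
    capped = []
    for label, items in by_label.items():
        if len(items) > limit:
            # Over-represented class: draw a random subset of size `limit`.
            items = rng.sample(items, limit)
        capped.extend(items)
    return capped


# Toy, unbalanced dataset: 5 samples of one class, 2 of the other.
toy_samples = [(f"ae sentence {i}", "DRUG-AE") for i in range(5)]
toy_samples += [(f"dose sentence {i}", "DRUG-DOSE") for i in range(2)]

capped = cap_samples_per_class(toy_samples, limit=3)
counts = Counter(label for _, label in capped)
print(dict(counts))
```

The larger class is cut down to the limit while the smaller class is left untouched, which keeps the majority label from dominating each epoch.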
2. Create a CDB. It can be a CDB from another model of your choice or an empty one.
The CDB is used only when filtering by concept unique identifiers (CUI) or concept type IDs (TUI).
cdb = CDB()
3. Create a tokenizer
tokenizer = BaseTokenizerWrapper_RelationExtraction.load(
    tokenizer_path=config.general.model_name,
    relcat_config=config)
4. Add the entity tag tokens to the tokenizer.
This step is optional because the [s1], [e1], [s2], [e2] tags are already defined in the default ConfigRelCAT.
If you are using a Llama-based model, you also need to add the [PAD] token to the tokenizer, as shown below.
special_ent_tokens = ["[s1]", "[e1]", "[s2]", "[e2]"]
tokenizer.hf_tokenizers.add_tokens(special_ent_tokens, special_tokens=True)
tokenizer.hf_tokenizers.add_special_tokens({'pad_token': '[PAD]'})  # needed for the Llama tokenizer
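The tags mark where each of the two entities starts and ends. As a rough illustration, the sketch below wraps two entity spans in the tags and crops the surrounding context in the spirit of cntx_left and cntx_right from step 1.2. The function tag_and_window and the example sentence are hypothetical; the exact way MedCAT builds its model input is an assumption here.

```python
def tag_and_window(tokens, ent1, ent2, cntx_left=15, cntx_right=15):
    """Wrap two entity spans in marker tokens and crop the context.

    `ent1` and `ent2` are (start, end) token index pairs, end exclusive.
    `ent1` must come before `ent2` and must not overlap it.
    Hypothetical helper for illustration only, not a MedCAT function.
    """
    (s1, e1), (s2, e2) = ent1, ent2
    left = max(0, s1 - cntx_left)             # tokens kept before the first entity
    right = min(len(tokens), e2 + cntx_right)  # tokens kept after the second entity
    return (
        tokens[left:s1]
        + ["[s1]"] + tokens[s1:e1] + ["[e1]"]
        + tokens[e1:s2]
        + ["[s2]"] + tokens[s2:e2] + ["[e2]"]
        + tokens[e2:right]
    )


sentence = "the patient developed a rash after taking amoxicillin for a week".split()
# "rash" is token 4 and "amoxicillin" is token 7.
tagged = tag_and_window(sentence, ent1=(4, 5), ent2=(7, 8), cntx_left=2, cntx_right=1)
print(" ".join(tagged))
# developed a [s1] rash [e1] after taking [s2] amoxicillin [e2] for
```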
5. Add the tokens to the ConfigRelCAT
config.general.tokenizer_relation_annotation_special_tokens_tags = special_ent_tokens
config.general.annotation_schema_tag_ids = tokenizer.hf_tokenizers.convert_tokens_to_ids(special_ent_tokens)
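The tag IDs stored in the config are simply the vocabulary indices the tokenizer assigned to the new tokens. The toy sketch below mimics that bookkeeping with a plain dictionary; the add_tokens function and the four-entry vocabulary are illustrative stand-ins, not the Hugging Face implementation.

```python
def add_tokens(vocab, new_tokens):
    """Append unseen tokens to a token -> id mapping.

    Returns the number of tokens that were actually added.
    Toy stand-in for a tokenizer vocabulary, for illustration only.
    """
    added = 0
    for token in new_tokens:
        if token not in vocab:
            vocab[token] = len(vocab)  # next free index at the end of the vocab
            added += 1
    return added


toy_vocab = {"[PAD]": 0, "[UNK]": 1, "rash": 2, "amoxicillin": 3}
tags = ["[s1]", "[e1]", "[s2]", "[e2]"]

n_added = add_tokens(toy_vocab, tags)
tag_ids = [toy_vocab[tag] for tag in tags]
n_pad_added = add_tokens(toy_vocab, ["[PAD]"])  # already present, nothing changes

print(n_added, tag_ids, n_pad_added, len(toy_vocab))
# 4 [4, 5, 6, 7] 0 8
```

New tokens land at the end of the vocabulary, so the vocabulary grows by exactly the number of tokens added, and re-adding a token that already exists has no effect.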
6. Create the RelCAT object and initialize its components
# To skip the steps in section 6.1, pass the init_model=True argument to initialize the components with the default ConfigRelCAT settings.
relCAT = RelCAT(cdb, config=config)
INFO:medcat.utils.relation_extraction.base_component:BaseComponent_RelationExtraction initialized
6.1 Use the BaseComponent object, which holds the tokenizer, model and model config. Each component has to be initialized beforehand.
Resize the token embeddings after adding tokens to the tokenizer, since the vocabulary has grown. This is not required after saving and loading a model, as the new size is retained.
model_config = BaseConfig_RelationExtraction.load(
    pretrained_model_name_or_path=config.general.model_name,
    relcat_config=config)

# update the model config with the proper vocab size, since we added special tokens to the tokenizer
model_config.hf_model_config.vocab_size = tokenizer.get_size()

# set the padding idx in the model config and the RelCAT config; this is necessary because it depends on the tokenizer you use
config.model.padding_idx = model_config.pad_token_id = tokenizer.get_pad_id()

model = BaseModel_RelationExtraction.load(
    pretrained_model_name_or_path=config.general.model_name,
    model_config=model_config,
    relcat_config=config)

# resize the model's token embeddings to match the tokenizer, since we added special tokens
model.hf_model.resize_token_embeddings(len(tokenizer.hf_tokenizers))  # type: ignore

component = BaseComponent_RelationExtraction(tokenizer=tokenizer, config=config)
component.model = model
component.model_config = model_config
component.relcat_config = config
component.tokenizer = tokenizer

relCAT.component = component
+You are using a model of type bert to instantiate a model of type . This is not supported for all configurations of models and can yield errors.
+INFO:medcat.utils.relation_extraction.config:Loaded config from : bert-base-uncased/model_config.json
+INFO:medcat.utils.relation_extraction.models:RelCAT model config: PretrainedConfig {
+ "_attn_implementation_autoset": true,
+ "architectures": [
+ "BertForMaskedLM"
+ ],
+ "attention_probs_dropout_prob": 0.1,
+ "gradient_checkpointing": false,
+ "hidden_act": "gelu",
+ "hidden_dropout_prob": 0.1,
+ "hidden_size": 768,
+ "initializer_range": 0.02,
+ "intermediate_size": 3072,
+ "layer_norm_eps": 1e-12,
+ "max_position_embeddings": 512,
+ "num_attention_heads": 12,
+ "num_hidden_layers": 12,
+ "pad_token_id": 0,
+ "position_embedding_type": "absolute",
+ "transformers_version": "4.51.3",
+ "type_vocab_size": 2,
+ "use_cache": true,
+ "vocab_size": 30526
+}
+
+INFO:medcat.utils.relation_extraction.bert.model:RelCAT model config: PretrainedConfig {
+ "_attn_implementation_autoset": true,
+ "architectures": [
+ "BertForMaskedLM"
+ ],
+ "attention_probs_dropout_prob": 0.1,
+ "gradient_checkpointing": false,
+ "hidden_act": "gelu",
+ "hidden_dropout_prob": 0.1,
+ "hidden_size": 768,
+ "initializer_range": 0.02,
+ "intermediate_size": 3072,
+ "layer_norm_eps": 1e-12,
+ "max_position_embeddings": 512,
+ "num_attention_heads": 12,
+ "num_hidden_layers": 12,
+ "pad_token_id": 0,
+ "position_embedding_type": "absolute",
+ "transformers_version": "4.51.3",
+ "type_vocab_size": 2,
+ "use_cache": true,
+ "vocab_size": 30526
+}
+
+Some weights of BertModel were not initialized from the model checkpoint at bert-base-uncased and are newly initialized because the shapes did not match:
+- embeddings.word_embeddings.weight: found shape torch.Size([30522, 768]) in the checkpoint and torch.Size([30526, 768]) in the model instantiated
+You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
+INFO:medcat.utils.relation_extraction.bert.model:Loaded model from pretrained: bert-base-uncased
+INFO:medcat.utils.relation_extraction.models:Loaded BertModel_RelationExtraction from pretrained_model_name_or_path: bert-base-uncased
+INFO:medcat.utils.relation_extraction.base_component:BaseComponent_RelationExtraction initialized
+
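The "newly initialized because the shapes did not match" message in the log above reflects the embedding matrix growing from 30522 to 30526 rows, one row per added tag token. The sketch below shows the general idea of such a resize using plain Python lists: existing rows are kept and new rows are appended. The function resize_embeddings is hypothetical, and the random initialisation with standard deviation 0.02 is an assumption chosen to echo initializer_range in the printed config, not a statement of what the library does.

```python
import random


def resize_embeddings(weights, new_size, dim, seed=0):
    """Return a copy of `weights` with exactly `new_size` rows.

    Existing rows are preserved; any extra rows get small random values.
    Hypothetical helper for illustration only.
    """
    rng = random.Random(seed)
    resized = [row[:] for row in weights[:new_size]]
    while len(resized) < new_size:
        resized.append([rng.gauss(0.0, 0.02) for _ in range(dim)])
    return resized


# Toy sizes: 5 known tokens with 3-dimensional embeddings.
old_embeddings = [[0.1 * i, 0.2 * i, 0.3 * i] for i in range(5)]

# Four tag tokens were added, so the table needs 9 rows.
new_embeddings = resize_embeddings(old_embeddings, new_size=9, dim=3)

print(len(old_embeddings), len(new_embeddings))
# 5 9
```

Because the new rows start out untrained, the model needs fine-tuning before the tag tokens carry any meaning, which is what the training step that follows provides.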
7. Train the model on the ADE dataset.
! rm -rf "./ade_relcat_model"
! mkdir -p "./ade_relcat_model"
relCAT.train(train_csv_path="./data/rel_cat_ADE_V2.tsv", checkpoint_path="./ade_relcat_model")

# For MedCAT Trainer exports, use the export_data_path argument instead: relCAT.train(export_data_path="./data/MedCAT_Export_relation_extraction.json")
+INFO:medcat.utils.relation_extraction.rel_dataset:CSV dataset | No. of relations detected:7093| from : ./data/rel_cat_ADE_V2.tsv | nclasses: 2 | idx2label: {0: 'DRUG-DOSE', 1: 'DRUG-AE'}
+INFO:medcat.utils.relation_extraction.rel_dataset:Samples per class:
+INFO:medcat.utils.relation_extraction.rel_dataset: label: DRUG-DOSE | samples: 279
+INFO:medcat.utils.relation_extraction.rel_dataset: label: DRUG-AE | samples: 6814
+INFO:root:Relations after train, test split : train - 524 | test - 115
+INFO:root: label: DRUG-AE samples | train 300 | test 60
+INFO:root: label: DRUG-DOSE samples | train 224 | test 55
+INFO:root:Attempting to load RelCAT model on device: cpu
+INFO:medcat.rel_cat:Starting training process...
+INFO:medcat.rel_cat:Total epochs on this model: 10 | currently training epoch 0
+huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
+To disable this warning, you can either:
+ - Avoid using `tokenizers` before the fork if possible
+ - Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
+ 0%| | 0/524 [00:00<?, ?it/s]/Users/vladd/Library/Python/3.11/lib/python/site-packages/torch/utils/data/dataloader.py:683: UserWarning: 'pin_memory' argument is set as true but not supported on MPS now, then device pinned memory won't be used.
+ warnings.warn(warn_msg)
+100%|██████████| 524/524 [00:16<00:00, 32.36it/s]
+INFO:medcat.rel_cat:Losses at Epoch 0: 0.02401
+INFO:medcat.rel_cat:Train accuracy at Epoch 0: 0.52512
+INFO:medcat.rel_cat:======================== TRAIN SET TEST RESULTS ========================
+INFO:medcat.rel_cat:Evaluating test samples...
+INFO:medcat.rel_cat:==================== Evaluation Results ===================
+INFO:medcat.rel_cat: no. of batches:17
+INFO:medcat.rel_cat: accuracy = 0.573
+INFO:medcat.rel_cat: f1 = 0.573
+INFO:medcat.rel_cat: loss = 0.686
+INFO:medcat.rel_cat: precision = 0.573
+INFO:medcat.rel_cat: recall = 0.573
+INFO:medcat.rel_cat:----------------------- class stats -----------------------
+INFO:medcat.rel_cat:label: DRUG-DOSE | f1: 0.000 | prec : 0.000 | acc: 0.573 | recall: 0.000
+INFO:medcat.rel_cat:label: DRUG-AE | f1: 0.727 | prec : 1.000 | acc: 0.573 | recall: 0.573
+INFO:medcat.rel_cat:-----------------------------------------------------------
+INFO:medcat.rel_cat:===========================================================
+INFO:medcat.rel_cat:======================== TEST SET TEST RESULTS ========================
+INFO:medcat.rel_cat:Evaluating test samples...
+INFO:medcat.rel_cat:==================== Evaluation Results ===================
+INFO:medcat.rel_cat: no. of batches:4
+INFO:medcat.rel_cat: accuracy = 0.544
+INFO:medcat.rel_cat: f1 = 0.544
+INFO:medcat.rel_cat: loss = 0.690
+INFO:medcat.rel_cat: precision = 0.544
+INFO:medcat.rel_cat: recall = 0.544
+INFO:medcat.rel_cat:----------------------- class stats -----------------------
+INFO:medcat.rel_cat:label: DRUG-DOSE | f1: 0.000 | prec : 0.000 | acc: 0.544 | recall: 0.000
+INFO:medcat.rel_cat:label: DRUG-AE | f1: 0.697 | prec : 1.000 | acc: 0.544 | recall: 0.544
+INFO:medcat.rel_cat:-----------------------------------------------------------
+INFO:medcat.rel_cat:===========================================================
+INFO:medcat.rel_cat:Epoch finished, took 0:00:16.206846 seconds
+INFO:medcat.rel_cat:Total epochs on this model: 10 | currently training epoch 1
+100%|██████████| 524/524 [00:15<00:00, 32.91it/s]
+INFO:medcat.rel_cat:Losses at Epoch 1: 0.02317
+INFO:medcat.rel_cat:Train accuracy at Epoch 1: 0.58211
+INFO:medcat.rel_cat:======================== TRAIN SET TEST RESULTS ========================
+INFO:medcat.rel_cat:Evaluating test samples...
+INFO:medcat.rel_cat:==================== Evaluation Results ===================
+INFO:medcat.rel_cat: no. of batches:17
+INFO:medcat.rel_cat: accuracy = 0.579
+INFO:medcat.rel_cat: f1 = 0.579
+INFO:medcat.rel_cat: loss = 0.661
+INFO:medcat.rel_cat: precision = 0.579
+INFO:medcat.rel_cat: recall = 0.579
+INFO:medcat.rel_cat:----------------------- class stats -----------------------
+INFO:medcat.rel_cat:label: DRUG-DOSE | f1: 0.000 | prec : 0.000 | acc: 0.579 | recall: 0.000
+INFO:medcat.rel_cat:label: DRUG-AE | f1: 0.728 | prec : 1.000 | acc: 0.579 | recall: 0.579
+INFO:medcat.rel_cat:-----------------------------------------------------------
+INFO:medcat.rel_cat:===========================================================
+INFO:medcat.rel_cat:======================== TEST SET TEST RESULTS ========================
+INFO:medcat.rel_cat:Evaluating test samples...
+INFO:medcat.rel_cat:==================== Evaluation Results ===================
+INFO:medcat.rel_cat: no. of batches:4
+INFO:medcat.rel_cat: accuracy = 0.538
+INFO:medcat.rel_cat: f1 = 0.538
+INFO:medcat.rel_cat: loss = 0.687
+INFO:medcat.rel_cat: precision = 0.538
+INFO:medcat.rel_cat: recall = 0.538
+INFO:medcat.rel_cat:----------------------- class stats -----------------------
+INFO:medcat.rel_cat:label: DRUG-DOSE | f1: 0.000 | prec : 0.000 | acc: 0.538 | recall: 0.000
+INFO:medcat.rel_cat:label: DRUG-AE | f1: 0.696 | prec : 1.000 | acc: 0.538 | recall: 0.538
+INFO:medcat.rel_cat:-----------------------------------------------------------
+INFO:medcat.rel_cat:===========================================================
+INFO:medcat.rel_cat:Epoch finished, took 0:00:15.921998 seconds
+INFO:medcat.rel_cat:Total epochs on this model: 10 | currently training epoch 2
+100%|██████████| 524/524 [00:15<00:00, 33.77it/s]
+INFO:medcat.rel_cat:Losses at Epoch 2: 0.02237
+INFO:medcat.rel_cat:Train accuracy at Epoch 2: 0.57782
+INFO:medcat.rel_cat:======================== TRAIN SET TEST RESULTS ========================
+INFO:medcat.rel_cat:Evaluating test samples...
+INFO:medcat.rel_cat:==================== Evaluation Results ===================
+INFO:medcat.rel_cat: no. of batches:17
+INFO:medcat.rel_cat: accuracy = 0.573
+INFO:medcat.rel_cat: f1 = 0.573
+INFO:medcat.rel_cat: loss = 0.624
+INFO:medcat.rel_cat: precision = 0.573
+INFO:medcat.rel_cat: recall = 0.573
+INFO:medcat.rel_cat:----------------------- class stats -----------------------
+INFO:medcat.rel_cat:label: DRUG-DOSE | f1: 0.000 | prec : 0.000 | acc: 0.573 | recall: 0.000
+INFO:medcat.rel_cat:label: DRUG-AE | f1: 0.726 | prec : 1.000 | acc: 0.573 | recall: 0.573
+INFO:medcat.rel_cat:-----------------------------------------------------------
+INFO:medcat.rel_cat:===========================================================
+INFO:medcat.rel_cat:======================== TEST SET TEST RESULTS ========================
+INFO:medcat.rel_cat:Evaluating test samples...
+INFO:medcat.rel_cat:==================== Evaluation Results ===================
+INFO:medcat.rel_cat: no. of batches:4
+INFO:medcat.rel_cat: accuracy = 0.517
+INFO:medcat.rel_cat: f1 = 0.517
+INFO:medcat.rel_cat: loss = 0.681
+INFO:medcat.rel_cat: precision = 0.517
+INFO:medcat.rel_cat: recall = 0.517
+INFO:medcat.rel_cat:----------------------- class stats -----------------------
+INFO:medcat.rel_cat:label: DRUG-DOSE | f1: 0.000 | prec : 0.000 | acc: 0.517 | recall: 0.000
+INFO:medcat.rel_cat:label: DRUG-AE | f1: 0.677 | prec : 1.000 | acc: 0.517 | recall: 0.517
+INFO:medcat.rel_cat:-----------------------------------------------------------
+INFO:medcat.rel_cat:===========================================================
+INFO:medcat.rel_cat:Epoch finished, took 0:00:15.518625 seconds
+INFO:medcat.rel_cat:Total epochs on this model: 10 | currently training epoch 3
+100%|██████████| 524/524 [00:15<00:00, 33.70it/s]
+INFO:medcat.rel_cat:Losses at Epoch 3: 0.02200
+INFO:medcat.rel_cat:Train accuracy at Epoch 3: 0.60846
+INFO:medcat.rel_cat:======================== TRAIN SET TEST RESULTS ========================
+INFO:medcat.rel_cat:Evaluating test samples...
+INFO:medcat.rel_cat:==================== Evaluation Results ===================
+INFO:medcat.rel_cat: no. of batches:17
+INFO:medcat.rel_cat: accuracy = 0.661
+INFO:medcat.rel_cat: f1 = 0.661
+INFO:medcat.rel_cat: loss = 0.582
+INFO:medcat.rel_cat: precision = 0.661
+INFO:medcat.rel_cat: recall = 0.661
+INFO:medcat.rel_cat:----------------------- class stats -----------------------
+INFO:medcat.rel_cat:label: DRUG-DOSE | f1: 0.323 | prec : 0.205 | acc: 0.661 | recall: 0.856
+INFO:medcat.rel_cat:label: DRUG-AE | f1: 0.765 | prec : 0.993 | acc: 0.661 | recall: 0.629
+INFO:medcat.rel_cat:-----------------------------------------------------------
+INFO:medcat.rel_cat:===========================================================
+INFO:medcat.rel_cat:======================== TEST SET TEST RESULTS ========================
+INFO:medcat.rel_cat:Evaluating test samples...
+INFO:medcat.rel_cat:==================== Evaluation Results ===================
+INFO:medcat.rel_cat: no. of batches:4
+INFO:medcat.rel_cat: accuracy = 0.533
+INFO:medcat.rel_cat: f1 = 0.533
+INFO:medcat.rel_cat: loss = 0.690
+INFO:medcat.rel_cat: precision = 0.533
+INFO:medcat.rel_cat: recall = 0.533
+INFO:medcat.rel_cat:----------------------- class stats -----------------------
+INFO:medcat.rel_cat:label: DRUG-DOSE | f1: 0.082 | prec : 0.046 | acc: 0.533 | recall: 0.375
+INFO:medcat.rel_cat:label: DRUG-AE | f1: 0.685 | prec : 0.970 | acc: 0.533 | recall: 0.532
+INFO:medcat.rel_cat:-----------------------------------------------------------
+INFO:medcat.rel_cat:===========================================================
+INFO:medcat.rel_cat:Epoch finished, took 0:00:15.550811 seconds
+INFO:medcat.rel_cat:Total epochs on this model: 10 | currently training epoch 4
+100%|██████████| 524/524 [00:16<00:00, 32.74it/s]
+INFO:medcat.rel_cat:Losses at Epoch 4: 0.02016
+INFO:medcat.rel_cat:Train accuracy at Epoch 4: 0.69301
+INFO:medcat.rel_cat:======================== TRAIN SET TEST RESULTS ========================
+INFO:medcat.rel_cat:Evaluating test samples...
+INFO:medcat.rel_cat:==================== Evaluation Results ===================
+INFO:medcat.rel_cat: no. of batches:17
+INFO:medcat.rel_cat: accuracy = 0.818
+INFO:medcat.rel_cat: f1 = 0.818
+INFO:medcat.rel_cat: loss = 0.529
+INFO:medcat.rel_cat: precision = 0.818
+INFO:medcat.rel_cat: recall = 0.818
+INFO:medcat.rel_cat:----------------------- class stats -----------------------
+INFO:medcat.rel_cat:label: DRUG-DOSE | f1: 0.788 | prec : 0.826 | acc: 0.818 | recall: 0.763
+INFO:medcat.rel_cat:label: DRUG-AE | f1: 0.833 | prec : 0.811 | acc: 0.818 | recall: 0.864
+INFO:medcat.rel_cat:-----------------------------------------------------------
+INFO:medcat.rel_cat:===========================================================
+INFO:medcat.rel_cat:======================== TEST SET TEST RESULTS ========================
+INFO:medcat.rel_cat:Evaluating test samples...
+INFO:medcat.rel_cat:==================== Evaluation Results ===================
+INFO:medcat.rel_cat: no. of batches:4
+INFO:medcat.rel_cat: accuracy = 0.713
+INFO:medcat.rel_cat: f1 = 0.713
+INFO:medcat.rel_cat: loss = 0.634
+INFO:medcat.rel_cat: precision = 0.713
+INFO:medcat.rel_cat: recall = 0.713
+INFO:medcat.rel_cat:----------------------- class stats -----------------------
+INFO:medcat.rel_cat:label: DRUG-DOSE | f1: 0.716 | prec : 0.783 | acc: 0.713 | recall: 0.678
+INFO:medcat.rel_cat:label: DRUG-AE | f1: 0.710 | prec : 0.670 | acc: 0.713 | recall: 0.783
+INFO:medcat.rel_cat:-----------------------------------------------------------
+INFO:medcat.rel_cat:===========================================================
+INFO:medcat.rel_cat:Epoch finished, took 0:00:16.005069 seconds
+INFO:medcat.rel_cat:Total epochs on this model: 10 | currently training epoch 5
+100%|██████████| 524/524 [00:15<00:00, 32.87it/s]
+INFO:medcat.rel_cat:Losses at Epoch 5: 0.01814
+INFO:medcat.rel_cat:Train accuracy at Epoch 5: 0.78615
+INFO:medcat.rel_cat:======================== TRAIN SET TEST RESULTS ========================
+INFO:medcat.rel_cat:Evaluating test samples...
+INFO:medcat.rel_cat:==================== Evaluation Results ===================
+INFO:medcat.rel_cat: no. of batches:17
+INFO:medcat.rel_cat: accuracy = 0.852
+INFO:medcat.rel_cat: f1 = 0.852
+INFO:medcat.rel_cat: loss = 0.457
+INFO:medcat.rel_cat: precision = 0.852
+INFO:medcat.rel_cat: recall = 0.852
+INFO:medcat.rel_cat:----------------------- class stats -----------------------
+INFO:medcat.rel_cat:label: DRUG-DOSE | f1: 0.826 | prec : 0.868 | acc: 0.852 | recall: 0.796
+INFO:medcat.rel_cat:label: DRUG-AE | f1: 0.865 | prec : 0.838 | acc: 0.852 | recall: 0.898
+INFO:medcat.rel_cat:-----------------------------------------------------------
+INFO:medcat.rel_cat:===========================================================
+INFO:medcat.rel_cat:======================== TEST SET TEST RESULTS ========================
+INFO:medcat.rel_cat:Evaluating test samples...
+INFO:medcat.rel_cat:==================== Evaluation Results ===================
+INFO:medcat.rel_cat: no. of batches:4
+INFO:medcat.rel_cat: accuracy = 0.723
+INFO:medcat.rel_cat: f1 = 0.723
+INFO:medcat.rel_cat: loss = 0.600
+INFO:medcat.rel_cat: precision = 0.723
+INFO:medcat.rel_cat: recall = 0.723
+INFO:medcat.rel_cat:----------------------- class stats -----------------------
+INFO:medcat.rel_cat:label: DRUG-DOSE | f1: 0.715 | prec : 0.739 | acc: 0.723 | recall: 0.701
+INFO:medcat.rel_cat:label: DRUG-AE | f1: 0.728 | prec : 0.710 | acc: 0.723 | recall: 0.759
+INFO:medcat.rel_cat:-----------------------------------------------------------
+INFO:medcat.rel_cat:===========================================================
+INFO:medcat.rel_cat:Epoch finished, took 0:00:15.942705 seconds
+INFO:medcat.rel_cat:Total epochs on this model: 10 | currently training epoch 6
+100%|██████████| 524/524 [00:16<00:00, 31.98it/s]
+INFO:medcat.rel_cat:Losses at Epoch 6: 0.01778
+INFO:medcat.rel_cat:Train accuracy at Epoch 6: 0.80270
+INFO:medcat.rel_cat:======================== TRAIN SET TEST RESULTS ========================
+INFO:medcat.rel_cat:Evaluating test samples...
+INFO:medcat.rel_cat:==================== Evaluation Results ===================
+INFO:medcat.rel_cat: no. of batches:17
+INFO:medcat.rel_cat: accuracy = 0.849
+INFO:medcat.rel_cat: f1 = 0.849
+INFO:medcat.rel_cat: loss = 0.427
+INFO:medcat.rel_cat: precision = 0.849
+INFO:medcat.rel_cat: recall = 0.849
+INFO:medcat.rel_cat:----------------------- class stats -----------------------
+INFO:medcat.rel_cat:label: DRUG-DOSE | f1: 0.836 | prec : 0.934 | acc: 0.849 | recall: 0.762
+INFO:medcat.rel_cat:label: DRUG-AE | f1: 0.852 | prec : 0.785 | acc: 0.849 | recall: 0.939
+INFO:medcat.rel_cat:-----------------------------------------------------------
+INFO:medcat.rel_cat:===========================================================
+INFO:medcat.rel_cat:======================== TEST SET TEST RESULTS ========================
+INFO:medcat.rel_cat:Evaluating test samples...
+INFO:medcat.rel_cat:==================== Evaluation Results ===================
+INFO:medcat.rel_cat: no. of batches:4
+INFO:medcat.rel_cat: accuracy = 0.692
+INFO:medcat.rel_cat: f1 = 0.692
+INFO:medcat.rel_cat: loss = 0.593
+INFO:medcat.rel_cat: precision = 0.692
+INFO:medcat.rel_cat: recall = 0.692
+INFO:medcat.rel_cat:----------------------- class stats -----------------------
+INFO:medcat.rel_cat:label: DRUG-DOSE | f1: 0.711 | prec : 0.822 | acc: 0.692 | recall: 0.637
+INFO:medcat.rel_cat:label: DRUG-AE | f1: 0.665 | prec : 0.587 | acc: 0.692 | recall: 0.787
+INFO:medcat.rel_cat:-----------------------------------------------------------
+INFO:medcat.rel_cat:===========================================================
+INFO:medcat.rel_cat:Epoch finished, took 0:00:16.388052 seconds
+INFO:medcat.rel_cat:Total epochs on this model: 10 | currently training epoch 7
+100%|██████████| 524/524 [00:16<00:00, 32.51it/s]
+INFO:medcat.rel_cat:Losses at Epoch 7: 0.01548
+INFO:medcat.rel_cat:Train accuracy at Epoch 7: 0.84436
+INFO:medcat.rel_cat:======================== TRAIN SET TEST RESULTS ========================
+INFO:medcat.rel_cat:Evaluating test samples...
+INFO:medcat.rel_cat:==================== Evaluation Results ===================
+INFO:medcat.rel_cat: no. of batches:17
+INFO:medcat.rel_cat: accuracy = 0.894
+INFO:medcat.rel_cat: f1 = 0.894
+INFO:medcat.rel_cat: loss = 0.359
+INFO:medcat.rel_cat: precision = 0.894
+INFO:medcat.rel_cat: recall = 0.894
+INFO:medcat.rel_cat:----------------------- class stats -----------------------
+INFO:medcat.rel_cat:label: DRUG-DOSE | f1: 0.873 | prec : 0.882 | acc: 0.894 | recall: 0.873
+INFO:medcat.rel_cat:label: DRUG-AE | f1: 0.903 | prec : 0.904 | acc: 0.894 | recall: 0.908
+INFO:medcat.rel_cat:-----------------------------------------------------------
+INFO:medcat.rel_cat:===========================================================
+INFO:medcat.rel_cat:======================== TEST SET TEST RESULTS ========================
+INFO:medcat.rel_cat:Evaluating test samples...
+INFO:medcat.rel_cat:==================== Evaluation Results ===================
+INFO:medcat.rel_cat: no. of batches:4
+INFO:medcat.rel_cat: accuracy = 0.773
+INFO:medcat.rel_cat: f1 = 0.773
+INFO:medcat.rel_cat: loss = 0.552
+INFO:medcat.rel_cat: precision = 0.773
+INFO:medcat.rel_cat: recall = 0.773
+INFO:medcat.rel_cat:----------------------- class stats -----------------------
+INFO:medcat.rel_cat:label: DRUG-DOSE | f1: 0.747 | prec : 0.725 | acc: 0.773 | recall: 0.776
+INFO:medcat.rel_cat:label: DRUG-AE | f1: 0.769 | prec : 0.791 | acc: 0.773 | recall: 0.752
+INFO:medcat.rel_cat:-----------------------------------------------------------
+INFO:medcat.rel_cat:===========================================================
+INFO:medcat.rel_cat:Epoch finished, took 0:00:16.119574 seconds
+INFO:medcat.rel_cat:Total epochs on this model: 10 | currently training epoch 8
+100%|██████████| 524/524 [00:15<00:00, 33.21it/s]
+INFO:medcat.rel_cat:Losses at Epoch 8: 0.01448
+INFO:medcat.rel_cat:Train accuracy at Epoch 8: 0.86152
+INFO:medcat.rel_cat:======================== TRAIN SET TEST RESULTS ========================
+INFO:medcat.rel_cat:Evaluating test samples...
+INFO:medcat.rel_cat:==================== Evaluation Results ===================
+INFO:medcat.rel_cat: no. of batches:17
+INFO:medcat.rel_cat: accuracy = 0.900
+INFO:medcat.rel_cat: f1 = 0.900
+INFO:medcat.rel_cat: loss = 0.336
+INFO:medcat.rel_cat: precision = 0.900
+INFO:medcat.rel_cat: recall = 0.900
+INFO:medcat.rel_cat:----------------------- class stats -----------------------
+INFO:medcat.rel_cat:label: DRUG-DOSE | f1: 0.890 | prec : 0.967 | acc: 0.900 | recall: 0.827
+INFO:medcat.rel_cat:label: DRUG-AE | f1: 0.904 | prec : 0.844 | acc: 0.900 | recall: 0.977
+INFO:medcat.rel_cat:-----------------------------------------------------------
+INFO:medcat.rel_cat:===========================================================
+INFO:medcat.rel_cat:======================== TEST SET TEST RESULTS ========================
+INFO:medcat.rel_cat:Evaluating test samples...
+INFO:medcat.rel_cat:==================== Evaluation Results ===================
+INFO:medcat.rel_cat: no. of batches:4
+INFO:medcat.rel_cat: accuracy = 0.721
+INFO:medcat.rel_cat: f1 = 0.721
+INFO:medcat.rel_cat: loss = 0.556
+INFO:medcat.rel_cat: precision = 0.721
+INFO:medcat.rel_cat: recall = 0.721
+INFO:medcat.rel_cat:----------------------- class stats -----------------------
+INFO:medcat.rel_cat:label: DRUG-DOSE | f1: 0.731 | prec : 0.792 | acc: 0.721 | recall: 0.697
+INFO:medcat.rel_cat:label: DRUG-AE | f1: 0.701 | prec : 0.658 | acc: 0.721 | recall: 0.772
+INFO:medcat.rel_cat:-----------------------------------------------------------
+INFO:medcat.rel_cat:===========================================================
+INFO:medcat.rel_cat:Epoch finished, took 0:00:15.782063 seconds
+INFO:medcat.rel_cat:Total epochs on this model: 10 | currently training epoch 9
+100%|██████████| 524/524 [00:15<00:00, 32.89it/s]
+INFO:medcat.rel_cat:Losses at Epoch 9: 0.01364
+INFO:medcat.rel_cat:Train accuracy at Epoch 9: 0.87255
+INFO:medcat.rel_cat:======================== TRAIN SET TEST RESULTS ========================
+INFO:medcat.rel_cat:Evaluating test samples...
+INFO:medcat.rel_cat:==================== Evaluation Results ===================
+INFO:medcat.rel_cat: no. of batches:17
+INFO:medcat.rel_cat: accuracy = 0.918
+INFO:medcat.rel_cat: f1 = 0.918
+INFO:medcat.rel_cat: loss = 0.295
+INFO:medcat.rel_cat: precision = 0.918
+INFO:medcat.rel_cat: recall = 0.918
+INFO:medcat.rel_cat:----------------------- class stats -----------------------
+INFO:medcat.rel_cat:label: DRUG-DOSE | f1: 0.905 | prec : 0.967 | acc: 0.918 | recall: 0.860
+INFO:medcat.rel_cat:label: DRUG-AE | f1: 0.925 | prec : 0.882 | acc: 0.918 | recall: 0.977
+INFO:medcat.rel_cat:-----------------------------------------------------------
+INFO:medcat.rel_cat:===========================================================
+INFO:medcat.rel_cat:======================== TEST SET TEST RESULTS ========================
+INFO:medcat.rel_cat:Evaluating test samples...
+INFO:medcat.rel_cat:==================== Evaluation Results ===================
+INFO:medcat.rel_cat: no. of batches:4
+INFO:medcat.rel_cat: accuracy = 0.734
+INFO:medcat.rel_cat: f1 = 0.734
+INFO:medcat.rel_cat: loss = 0.558
+INFO:medcat.rel_cat: precision = 0.734
+INFO:medcat.rel_cat: recall = 0.734
+INFO:medcat.rel_cat:----------------------- class stats -----------------------
+INFO:medcat.rel_cat:label: DRUG-DOSE | f1: 0.742 | prec : 0.793 | acc: 0.734 | recall: 0.699
+INFO:medcat.rel_cat:label: DRUG-AE | f1: 0.718 | prec : 0.674 | acc: 0.734 | recall: 0.770
+INFO:medcat.rel_cat:-----------------------------------------------------------
+INFO:medcat.rel_cat:===========================================================
+INFO:medcat.rel_cat:Epoch finished, took 0:00:15.932087 seconds
+
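In the log above, the overall accuracy, precision, recall and F1 are identical within every evaluation block. That pattern is what micro-averaging produces for single-label classification, where every wrong prediction counts as one false positive and one false negative. Whether RelCAT computes its overall scores this way is an assumption; the sketch below only demonstrates the arithmetic on made-up labels.

```python
def micro_scores(y_true, y_pred):
    """Compute accuracy and micro-averaged precision, recall and F1.

    Counts true/false positives and false negatives per label,
    then sums them across labels before dividing.
    """
    labels = set(y_true) | set(y_pred)
    tp = fp = fn = 0
    for label in labels:
        for truth, pred in zip(y_true, y_pred):
            if pred == label and truth == label:
                tp += 1
            elif pred == label and truth != label:
                fp += 1
            elif pred != label and truth == label:
                fn += 1

    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
    return accuracy, precision, recall, f1


# Made-up labels: 3 of 5 predictions are correct.
y_true = ["DRUG-AE", "DRUG-AE", "DRUG-DOSE", "DRUG-DOSE", "DRUG-AE"]
y_pred = ["DRUG-AE", "DRUG-DOSE", "DRUG-DOSE", "DRUG-AE", "DRUG-AE"]

accuracy, precision, recall, f1 = micro_scores(y_true, y_pred)
print(round(accuracy, 3), round(precision, 3), round(recall, 3), round(f1, 3))
# 0.6 0.6 0.6 0.6
```

If this reading is right, the per-class rows in the log are the more informative ones, since the overall numbers collapse into a single value.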
# Save the model
relCAT.save(save_path="./ade_relcat_model")