Rerunning fine tuning with normal cell type atlas samples #27

@alvinwt

Hi, I am trying to fine-tune MethylBERT on a combination of colon and human leukocyte samples from the normal cell-type methylation atlas. I have converted the .pat files to reads.csv using wgbs_atlas_simulation.

I combined the reads.csv files for different blood and colon samples, but the code also fails when I use a single sample.

Are there additional steps to process the reads prior to fine-tuning, or am I running with the wrong parameters?

command:

methylbert finetune \
-c ~/software/wgbs_atlas_simulation/res/GSM5652299_Blood-NK-Z000000TM.hg38_reads.csv \
-t ~/software/methylbert/test/data/processed/test_seq.csv \
-o model/ \
-l 4 \
-s 161 \
-b 256 \
--gradient_accumulation_steps 4 \
-e 600 \
-w 2 \
--log_freq 1 \
--eval_freq 1 \
--warm_up 1 \
--lr 1e-4 \
--decrease_steps 200 \
--loss focal_bce \
--with_cuda > stdout.txt 2> stderr.txt

and got

stdout.txt

MethylBERT v2.0.2
Create a tokenizer for 3-mers
Building Vocab
Vocab Size:  69
CPU info: 40 80
Loading Train Dataset: /home/alvin.ngwt/software/wgbs_atlas_simulation/res/GSM5652299_Blood-NK-Z000000TM.hg38_reads.csv
Total number of sequences :  122276
# of reads in each label:  [103658.  18618.]
122276 seqs with 1497 labels 
Loading Test Dataset: /home/alvin.ngwt/software/methylbert/test/data/processed/test_seq.csv
Total number of sequences :  612
# of reads in each label:  [319. 293.]
Creating Dataloader
Local step batch size :  64
Creating BERT Trainer
The model is loaded on GPU
Pre-trained MethylBERT model for 4 encoder blocks is selected.
Restore the pretrained model hanyangii/methylbert_hg19_4l
Focal loss assigned
Total Parameters: 49817090
Training Start
False

stderr.txt

File "/home/alvin.ngwt/miniconda3/envs/methylbert/bin/methylbert", line 7, in <module>
    sys.exit(main())
             ^^^^^^
  File "/home/alvin.ngwt/miniconda3/envs/methylbert/lib/python3.11/site-packages/methylbert/cli.py", line 315, in main
    run_finetune(args)
  File "/home/alvin.ngwt/miniconda3/envs/methylbert/lib/python3.11/site-packages/methylbert/cli.py", line 182, in run_finetune
    trainer.train(args.steps)
  File "/home/alvin.ngwt/miniconda3/envs/methylbert/lib/python3.11/site-packages/methylbert/trainer.py", line 123, in train
    return self._iteration(steps, self.train_data, verbose)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/alvin.ngwt/miniconda3/envs/methylbert/lib/python3.11/site-packages/methylbert/trainer.py", line 503, in _iteration
    for i, batch in enumerate(data_loader):
  File "/home/alvin.ngwt/miniconda3/envs/methylbert/lib/python3.11/site-packages/torch/utils/data/dataloader.py", line 630, in __next__
    data = self._next_data()
           ^^^^^^^^^^^^^^^^^
  File "/home/alvin.ngwt/miniconda3/envs/methylbert/lib/python3.11/site-packages/torch/utils/data/dataloader.py", line 1344, in _next_data
    return self._process_data(data)
           ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/alvin.ngwt/miniconda3/envs/methylbert/lib/python3.11/site-packages/torch/utils/data/dataloader.py", line 1370, in _process_data
    data.reraise()
  File "/home/alvin.ngwt/miniconda3/envs/methylbert/lib/python3.11/site-packages/torch/_utils.py", line 706, in reraise
    raise exception
RuntimeError: Caught RuntimeError in DataLoader worker process 0.

Original Traceback (most recent call last):
  File "/home/alvin.ngwt/miniconda3/envs/methylbert/lib/python3.11/site-packages/torch/utils/data/_utils/worker.py", line 309, in _worker_loop
    data = fetcher.fetch(index)  # type: ignore[possibly-undefined]
           ^^^^^^^^^^^^^^^^^^^^
  File "/home/alvin.ngwt/miniconda3/envs/methylbert/lib/python3.11/site-packages/torch/utils/data/_utils/fetch.py", line 55, in fetch
    return self.collate_fn(data)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/home/alvin.ngwt/miniconda3/envs/methylbert/lib/python3.11/site-packages/torch/utils/data/_utils/collate.py", line 317, in default_collate
    return collate(batch, collate_fn_map=default_collate_fn_map)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/alvin.ngwt/miniconda3/envs/methylbert/lib/python3.11/site-packages/torch/utils/data/_utils/collate.py", line 155, in collate
    clone.update({key: collate([d[key] for d in batch], collate_fn_map=collate_fn_map) for key in elem})
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/alvin.ngwt/miniconda3/envs/methylbert/lib/python3.11/site-packages/torch/utils/data/_utils/collate.py", line 155, in <dictcomp>
    clone.update({key: collate([d[key] for d in batch], collate_fn_map=collate_fn_map) for key in elem})
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/alvin.ngwt/miniconda3/envs/methylbert/lib/python3.11/site-packages/torch/utils/data/_utils/collate.py", line 142, in collate
    return collate_fn_map[elem_type](batch, collate_fn_map=collate_fn_map)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/alvin.ngwt/miniconda3/envs/methylbert/lib/python3.11/site-packages/torch/utils/data/_utils/collate.py", line 213, in collate_tensor_fn
    out = elem.new(storage).resize_(len(batch), *list(elem.size()))
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
RuntimeError: Trying to resize storage that is not resizable
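
One thing that may be worth checking: this `collate_tensor_fn` error is commonly reported when the per-sample tensors in a batch do not all have the same shape, which inside DataLoader worker processes tends to surface as a storage-resize failure rather than a clearer size-mismatch message. The following is only a minimal sketch for tallying sequence lengths in a reads.csv. It assumes a tab-delimited file with string columns for the sequences; the column names `dna_seq` and `methyl_seq` and the helper `length_histogram` are hypothetical and not confirmed against the MethylBERT or wgbs_atlas_simulation output format.

```python
import csv
import io
from collections import Counter


def length_histogram(rows, columns):
    """Count how often each value length occurs in the given columns."""
    hist = {col: Counter() for col in columns}
    for row in rows:
        for col in columns:
            value = (row.get(col) or "").strip()
            # Space-separated k-mers are counted as tokens, plain strings as characters.
            n = len(value.split()) if " " in value else len(value)
            hist[col][n] += 1
    return hist


# Tiny in-memory stand-in for a reads.csv (assumed layout, made-up rows).
# A real file would be read with: open(path, newline="")
sample = io.StringIO(
    "dna_seq\tmethyl_seq\n"
    "ACG CGT GTA\t202\n"
    "ACG CGT\t20\n"
)
reader = csv.DictReader(sample, delimiter="\t")
hist = length_histogram(reader, ["dna_seq", "methyl_seq"])
for col, counts in hist.items():
    print(col, dict(counts))
```

If more than one length shows up for a column, the reads may not be padded or truncated uniformly before batching. Whether that is the actual cause of the failure here is not confirmed.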
