Training YOLO v8 Segment Model with COCO Panoptic Dataset type #12680

Open
1 task done
danicannt opened this issue May 14, 2024 · 11 comments
Labels
question Further information is requested

Comments

@danicannt

Search before asking

Question

Hello everyone,

I'm currently trying to train a YOLOv8 segmentation model using the RailSem19 dataset, which follows the format of the COCO Panoptic dataset. However, I've encountered some challenges during the conversion process.

There are certain object types labeled as "polyline pair" within the RailSem19 dataset that I'm unsure how to properly convert for training with YOLO v8. I found some converters but I haven't been able to achieve successful conversion due to these object types.

I'm reaching out to inquire if anyone in the community has attempted or achieved success in training YOLO v8 with datasets similar to RailSem19 or has insights on how to handle "polyline pair" objects during conversion. Any guidance you can share would be helpful.

Thank you in advance

Additional

No response

@danicannt danicannt added the question Further information is requested label May 14, 2024

👋 Hello @danicannt, thank you for your interest in Ultralytics YOLOv8 🚀! We recommend a visit to the Docs for new users where you can find many Python and CLI usage examples and where many of the most common questions may already be answered.

If this is a 🐛 Bug Report, please provide a minimum reproducible example to help us debug it.

If this is a custom training ❓ Question, please provide as much information as possible, including dataset image examples and training logs, and verify you are following our Tips for Best Training Results.

Join the vibrant Ultralytics Discord 🎧 community for real-time conversations and collaborations. This platform offers a perfect space to inquire, showcase your work, and connect with fellow Ultralytics users.

Install

Pip install the ultralytics package including all requirements in a Python>=3.8 environment with PyTorch>=1.8.

pip install ultralytics

Environments

YOLOv8 may be run in any of the following up-to-date verified environments (with all dependencies including CUDA/CUDNN, Python and PyTorch preinstalled):

Status

Ultralytics CI

If this badge is green, all Ultralytics CI tests are currently passing. CI tests verify correct operation of all YOLOv8 Modes and Tasks on macOS, Windows, and Ubuntu every 24 hours and on every commit.

@glenn-jocher
Member

Hello!

Training a YOLO v8 segment model with datasets like RailSem19, which use COCO Panoptic formatting, can indeed pose unique challenges, especially for complex object types like "polyline pair."

For handling conversion of "polyline pair" objects, you might need to adapt them into a compatible format that YOLOv8 can process. One common approach is converting these polyline pairs into segmentation masks if your converter allows this transformation. Each polyline could be treated as a boundary of an object for segmentation purposes.

If you're writing a custom script for conversion, consider using polygon-based segmentation masks, where each "polyline pair" defines the vertices of a polygon. Libraries such as OpenCV in Python can help you fill polygons based on these vertices, creating masks that YOLO can train on.

Here’s a quick snippet to create a mask from polygons using OpenCV:

import cv2
import numpy as np

# Example polyline pairs (replace these with actual data)
polyline_pairs = [[[10, 10], [150, 200], [300, 150]], [[50, 50], [200, 250], [350, 200]]]
height, width = 480, 640  # Image dimensions in pixels (example values; use your image's size)
mask = np.zeros((height, width), dtype=np.uint8)  # Empty single-channel mask

for polyline in polyline_pairs:
    cv2.fillPoly(mask, [np.array(polyline, dtype=np.int32)], 255)

# Now `mask` is your segmentation mask that YOLO can use

I hope this helps! Keep exploring, and feel free to share any successful methods you come up with in the community. Happy training! 🚀
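
As a complement to the rasterised mask above, polygon vertices can also be written straight into a label text file. The sketch below assumes the Ultralytics segmentation label layout of one object per line, `<class> x1 y1 x2 y2 ...`, with coordinates normalised by image width and height. The function name `polygon_to_yolo_seg_line` is hypothetical and not part of any library.

```python
def polygon_to_yolo_seg_line(class_index, points, img_width, img_height):
    """Format one polygon as a single segmentation label line.

    Assumed layout: <class> x1 y1 x2 y2 ... with every coordinate
    normalised to the range [0, 1].
    """
    if len(points) < 3:
        raise ValueError("a polygon needs at least three vertices")
    coords = []
    for x, y in points:
        # Clamp so vertices slightly outside the frame stay in range.
        coords.append(min(max(x / img_width, 0.0), 1.0))
        coords.append(min(max(y / img_height, 0.0), 1.0))
    return f"{class_index} " + " ".join(f"{c:.6f}" for c in coords)


# Example: the first triangle from the snippet above, in a 400x400 image.
line = polygon_to_yolo_seg_line(4, [(10, 10), (150, 200), (300, 150)], 400, 400)
```

Each image would then get one `.txt` file containing one such line per object.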

@danicannt
Author

Hello @glenn-jocher

Sorry for the late reply,
I have been trying to create my own conversion script for this dataset which I am sharing here:

import json
import os
from scipy.spatial import ConvexHull
import numpy as np

class_indices = {
    "fence": 0,
    "pole": 1,
    "traffic-light": 2,
    "traffic-sign": 3,
    "rail": 4,
}

allowed_classes = ["fence", "pole", "traffic-light", "traffic-sign", "rail"]

def calculate_centroid(points):
    x_center = sum(point[0] for point in points) / len(points)
    y_center = sum(point[1] for point in points) / len(points)
    return x_center, y_center

def convert_to_yolo_v8(json_file_path, output_dir):
    with open(json_file_path, 'r') as f:
        data = json.load(f)

    if isinstance(data, dict):
        data = [data]

    for frame_data in data:
        frame_name = frame_data['frame']
        image_name = os.path.splitext(os.path.basename(frame_name))[0]
        img_width = frame_data['imgWidth']
        img_height = frame_data['imgHeight']
        objects = frame_data['objects']
        
        yolo_labels = []
        for obj in objects:
            label = obj['label']
            if label not in allowed_classes:
                continue  # Skip this object if the label is not in the allowed list
            class_index = class_indices[label]
            
            if 'boundingbox' in obj:
                bounding_box = obj['boundingbox']
                x_center = (bounding_box[0] + bounding_box[2]) / (2 * img_width)
                y_center = (bounding_box[1] + bounding_box[3]) / (2 * img_height)
                width = (bounding_box[2] - bounding_box[0]) / img_width
                height = (bounding_box[3] - bounding_box[1]) / img_height
                yolo_label = f"{class_index} {x_center} {y_center} {width} {height}"
                yolo_labels.append(yolo_label)
            
            elif 'polygon' in obj:
                points = obj['polygon']
                centroid = calculate_centroid(points)
                x_center = centroid[0] / img_width
                y_center = centroid[1] / img_height
                min_x = min(point[0] for point in points)
                max_x = max(point[0] for point in points)
                min_y = min(point[1] for point in points)
                max_y = max(point[1] for point in points)
                width = (max_x - min_x) / img_width
                height = (max_y - min_y) / img_height
                yolo_label = f"{class_index} {x_center} {y_center} {width} {height}"
                for point in points:
                    x = point[0] / img_width
                    y = point[1] / img_height
                    yolo_label += f" {x} {y}"
                yolo_labels.append(yolo_label)
                
            elif 'polyline' in obj:
                points = obj['polyline']
                centroid = calculate_centroid(points)
                x_center = centroid[0] / img_width
                y_center = centroid[1] / img_height
                min_x = min(point[0] for point in points)
                max_x = max(point[0] for point in points)
                min_y = min(point[1] for point in points)
                max_y = max(point[1] for point in points)
                width = (max_x - min_x) / img_width
                height = (max_y - min_y) / img_height
                yolo_label = f"{class_index} {x_center} {y_center} {width} {height}"
                for point in points:
                    x = point[0] / img_width
                    y = point[1] / img_height
                    yolo_label += f" {x} {y}"
                yolo_labels.append(yolo_label)
                
            elif 'polyline-pair' in obj:
                points = []
                for polyline in obj.get('polyline-pair'):
                    points.extend(polyline)
                
                centroid = calculate_centroid(points)
                hull = ConvexHull(points)
                sorted_points = [points[vertex] for vertex in hull.vertices]
                
                x_center = centroid[0] / img_width
                y_center = centroid[1] / img_height
                min_x = min(point[0] for point in sorted_points)
                max_x = max(point[0] for point in sorted_points)
                min_y = min(point[1] for point in sorted_points)
                max_y = max(point[1] for point in sorted_points)
                width = (max_x - min_x) / img_width
                height = (max_y - min_y) / img_height
                yolo_label = f"{class_index} {x_center} {y_center} {width} {height}"
                for point in sorted_points:
                    x = point[0] / img_width
                    y = point[1] / img_height
                    yolo_label += f" {x} {y}"
                yolo_labels.append(yolo_label)
        
        output_file_path = os.path.join(output_dir, f"{image_name}.txt")
        with open(output_file_path, 'w') as f:
            f.write('\n'.join(yolo_labels))

path = "data/rs19_val/jsons/rs19_val"
output_dir = "data/conver_labels"

for filename in os.listdir(path):
    file_path = os.path.join(path, filename)
    if os.path.isfile(file_path) and file_path.endswith('.json'):
        print(file_path)
        convert_to_yolo_v8(file_path, output_dir)

I still have some difficulties with the polyline-pair object type.
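
One possible alternative to the convex hull for `polyline-pair` objects is sketched below. It assumes that the two polylines of a pair trace opposite edges of the same region and run in the same direction, which should be checked against the dataset. Walking along the first polyline and back along the second in reverse gives a closed outline that keeps curved or concave shapes, which a convex hull would fill in. The helper name `polyline_pair_to_polygon` is hypothetical.

```python
def polyline_pair_to_polygon(first, second):
    """Join two polylines into one closed polygon outline.

    Walks along the first polyline, then back along the second in
    reverse, so the vertices trace the region between them without
    crossing (assuming both polylines run in the same direction).
    """
    return [tuple(p) for p in first] + [tuple(p) for p in reversed(second)]


# Example: two vertical edges, 2 px apart, become a thin rectangle.
outline = polyline_pair_to_polygon([[0, 0], [0, 10]], [[2, 0], [2, 10]])
```

Note also that, as far as I understand the Ultralytics segmentation label format, a line holds only the class index followed by polygon coordinates. The box values written before the points in the script above would therefore likely be read as two extra vertices.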

Reading your message made me realize that, instead of converting the JSON files, I could perhaps use the uint8 mask files that are also in the dataset. What I am not sure about is how to train YOLO using these masks, as you mentioned.

@glenn-jocher
Member

Hello @danicannt,

Thanks for sharing your conversion script! It looks like you've made significant progress. For handling the "polyline-pair" objects, your approach of using the Convex Hull to approximate the object's boundary is quite innovative. 👍

Regarding your idea of using 8-bit uint mask files directly for training, YOLOv8's segmentation models can indeed be trained using mask images if they are properly formatted. Each pixel value in your mask should correspond to a class ID, which the model will use to learn the segmentation task.

Here’s a brief outline on how you might adjust your training setup:

  1. Prepare Mask Images: Ensure each mask image is in grayscale where each pixel's intensity value corresponds to a class ID.
  2. Modify Dataset Configuration: Adjust your dataset configuration to point to where your images and corresponding mask files are stored.
  3. Training: Use the YOLOv8 segmentation model suitable for your task, ensuring the model configuration aligns with your dataset specifics.

Here's a simple example of how you might configure your dataset:

# dataset.yaml
train: /path/to/train/images
val: /path/to/val/images
nc: 5  # number of classes
names: ['fence', 'pole', 'traffic-light', 'traffic-sign', 'rail']

And then, you can start training using:

yolo segment train data=dataset.yaml model=yolov8n-seg.pt

This should help you leverage those mask files directly, potentially simplifying your preprocessing pipeline. Keep experimenting, and don't hesitate to reach out if you hit any snags! 🚀

@danicannt
Author

Hi again @glenn-jocher,

I'm still struggling with this, sorry to bother you again.

I have been able to successfully train YOLO with my label files, but the results are not the best as some of the labels are not well converted.

Now I am trying to use the PNG mask files, in which each pixel's greyscale value identifies its class:

  0: road
  1: sidewalk
  2: construction
  3: tram-track
  4: fence
  5: pole
  6: traffic-light
  7: traffic-sign
  8: vegetation
  9: terrain
  10: sky
  11: human
  12: rail-track
  13: car
  14: truck
  15: trackbed
  16: on-rails
  17: rail-raised
  18: rail-embedded
  19: switch-indicator
  20: crossing
  21: switch-left
  22: switch-right
  23: rail
  24: track-sign-front
  25: track-signal-front
  26: platform
  27: buffer-stop
  28: guard-rail
  29: train-car
  30: switch-unknown
  31: switch-static
  32: track-signal-back
  33: rail-occluder
  34: person-group
  35: person

I am not interested in all of the classes and will create a new script to alter these images and just keep the ones that I want, but at the moment just being able to train the model would be a success.
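
A script of that kind could be sketched with a NumPy lookup table, as below. The kept IDs (4, 5, 6, 7 and 23) are taken from the class list above and mapped to the five class indices used in the earlier conversion script. Sending every other value to 255 as an "ignore" value is an assumption, and the function name `remap_mask` is hypothetical.

```python
import numpy as np


def remap_mask(mask, keep, ignore_value=255):
    """Remap class IDs in a uint8 mask.

    keep: dict mapping old class ID -> new class ID.
    Every ID not listed in `keep` is replaced by `ignore_value`.
    """
    lut = np.full(256, ignore_value, dtype=np.uint8)
    for old_id, new_id in keep.items():
        lut[old_id] = new_id
    return lut[mask]  # Index the table with the mask, pixel by pixel.


# fence, pole, traffic-light, traffic-sign, rail -> indices 0..4
keep_ids = {4: 0, 5: 1, 6: 2, 7: 3, 23: 4}
sample = np.array([[4, 5, 0], [23, 255, 7]], dtype=np.uint8)
remapped = remap_mask(sample, keep_ids)
```

The lookup table avoids a Python loop over pixels, which matters when the dataset has thousands of masks.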

My folder structure is the following:

dataset/
├── images/
│   ├── train/
│   │   ├── image1.jpg
│   │   ├── image2.jpg
│   │   └── ...
│   └── val/
│       ├── image1.jpg
│       ├── image2.jpg
│       └── ...
└── masks/
    ├── train/
    │   ├── image1.png
    │   ├── image2.png
    │   └── ...
    └── val/
        ├── image1.png
        ├── image2.png
        └── ...
  

This is my YAML file structure:

train: /home/usuario/workspaces/autotram_ws/data/rs19_val/dataset/images/train # train images (relative to 'path') 4 images
val: /home/usuario/workspaces/autotram_ws/data/rs19_val/dataset/images/val # val images (relative to 'path') 4 images

nc: 36
names: ['road', 'sidewalk', 'construction', 'tram-track', 'fence', 'pole', 'traffic-light', 'traffic-sign', 'vegetation', 'terrain', 'sky', 'human', 'rail-track', 'car', 'truck', 'trackbed', 'on-rails', 'rail-raised', 'rail-embedded', 'switch-indicator', 'crossing', 'switch-left', 'switch-right', 'rail', 'track-sign-front', 'track-signal-front', 'platform', 'buffer-stop', 'guard-rail', 'train-car', 'switch-unknown', 'switch-static', 'track-signal-back', 'rail-occluder', 'person-group', 'person']

And I am training it with this configuration:

usuario@L22308x:~/workspaces/autotram_ws/data$ yolo segment train data=/home/usuario/workspaces/autotram_ws/data/tram_config.yaml model=/home/usuario/workspaces/autotram_ws/data/models/yolov8x-seg.pt workers=16 device=cuda imgsz=416 lr0=0.005 lrf=0.01 cache=True

But I am getting this output when it starts to train; first, it says that my images are background. Do I have to point to where my masks are as well? Maybe I am missing some steps.

usuario@L22308x:~/workspaces/autotram_ws/data$ yolo segment train data=/home/usuario/workspaces/autotram_ws/data/tram_config.yaml model=/home/usuario/workspaces/autotram_ws/data/models/yolov8x-seg.pt workers=16 device=cuda imgsz=416 lr0=0.005 lrf=0.01 cache=True
New https://pypi.org/project/ultralytics/8.2.19 available 😃 Update with 'pip install -U ultralytics'
Ultralytics YOLOv8.2.2 🚀 Python-3.8.10 torch-2.0.1+cu117 CUDA:0 (NVIDIA RTX A4000, 16086MiB)
engine/trainer: task=segment, mode=train, model=/home/usuario/workspaces/autotram_ws/data/models/yolov8x-seg.pt, data=/home/usuario/workspaces/autotram_ws/data/tram_config.yaml, epochs=100, time=None, patience=100, batch=16, imgsz=416, save=True, save_period=-1, cache=True, device=cuda, workers=16, project=None, name=train4, exist_ok=False, pretrained=True, optimizer=auto, verbose=True, seed=0, deterministic=True, single_cls=False, rect=False, cos_lr=False, close_mosaic=10, resume=False, amp=True, fraction=1.0, profile=False, freeze=None, multi_scale=False, overlap_mask=True, mask_ratio=4, dropout=0.0, val=True, split=val, save_json=False, save_hybrid=False, conf=None, iou=0.7, max_det=300, half=False, dnn=False, plots=True, source=None, vid_stride=1, stream_buffer=False, visualize=False, augment=False, agnostic_nms=False, classes=None, retina_masks=False, embed=None, show=False, save_frames=False, save_txt=False, save_conf=False, save_crop=False, show_labels=True, show_conf=True, show_boxes=True, line_width=None, format=torchscript, keras=False, optimize=False, int8=False, dynamic=False, simplify=False, opset=None, workspace=4, nms=False, lr0=0.005, lrf=0.01, momentum=0.937, weight_decay=0.0005, warmup_epochs=3.0, warmup_momentum=0.8, warmup_bias_lr=0.1, box=7.5, cls=0.5, dfl=1.5, pose=12.0, kobj=1.0, label_smoothing=0.0, nbs=64, hsv_h=0.015, hsv_s=0.7, hsv_v=0.4, degrees=0.0, translate=0.1, scale=0.5, shear=0.0, perspective=0.0, flipud=0.0, fliplr=0.5, bgr=0.0, mosaic=1.0, mixup=0.0, copy_paste=0.0, auto_augment=randaugment, erasing=0.4, crop_fraction=1.0, cfg=None, tracker=botsort.yaml, save_dir=runs/segment/train4
Overriding model.yaml nc=80 with nc=36

                   from  n    params  module                                       arguments                     
  0                  -1  1      2320  ultralytics.nn.modules.conv.Conv             [3, 80, 3, 2]                 
  1                  -1  1    115520  ultralytics.nn.modules.conv.Conv             [80, 160, 3, 2]               
  2                  -1  3    436800  ultralytics.nn.modules.block.C2f             [160, 160, 3, True]           
  3                  -1  1    461440  ultralytics.nn.modules.conv.Conv             [160, 320, 3, 2]              
  4                  -1  6   3281920  ultralytics.nn.modules.block.C2f             [320, 320, 6, True]           
  5                  -1  1   1844480  ultralytics.nn.modules.conv.Conv             [320, 640, 3, 2]              
  6                  -1  6  13117440  ultralytics.nn.modules.block.C2f             [640, 640, 6, True]           
  7                  -1  1   3687680  ultralytics.nn.modules.conv.Conv             [640, 640, 3, 2]              
  8                  -1  3   6969600  ultralytics.nn.modules.block.C2f             [640, 640, 3, True]           
  9                  -1  1   1025920  ultralytics.nn.modules.block.SPPF            [640, 640, 5]                 
 10                  -1  1         0  torch.nn.modules.upsampling.Upsample         [None, 2, 'nearest']          
 11             [-1, 6]  1         0  ultralytics.nn.modules.conv.Concat           [1]                           
 12                  -1  3   7379200  ultralytics.nn.modules.block.C2f             [1280, 640, 3]                
 13                  -1  1         0  torch.nn.modules.upsampling.Upsample         [None, 2, 'nearest']          
 14             [-1, 4]  1         0  ultralytics.nn.modules.conv.Concat           [1]                           
 15                  -1  3   1948800  ultralytics.nn.modules.block.C2f             [960, 320, 3]                 
 16                  -1  1    922240  ultralytics.nn.modules.conv.Conv             [320, 320, 3, 2]              
 17            [-1, 12]  1         0  ultralytics.nn.modules.conv.Concat           [1]                           
 18                  -1  3   7174400  ultralytics.nn.modules.block.C2f             [960, 640, 3]                 
 19                  -1  1   3687680  ultralytics.nn.modules.conv.Conv             [640, 640, 3, 2]              
 20             [-1, 9]  1         0  ultralytics.nn.modules.conv.Concat           [1]                           
 21                  -1  3   7379200  ultralytics.nn.modules.block.C2f             [1280, 640, 3]                
 22        [15, 18, 21]  1  12350876  ultralytics.nn.modules.head.Segment          [36, 32, 320, [320, 640, 640]]
YOLOv8x-seg summary: 401 layers, 71785516 parameters, 71785500 gradients, 344.7 GFLOPs

Transferred 651/657 items from pretrained weights
Freezing layer 'model.22.dfl.conv.weight'
AMP: running Automatic Mixed Precision (AMP) checks with YOLOv8n...
AMP: checks passed ✅
train: Scanning /home/usuario/workspaces/autotram_ws/data/rs19_val/training.cache... 0 images, 11898 backgrounds, 0 corrupt: 100%|██████████| 11898/11898 [00:00<?, ?it/s]
WARNING ⚠️ No labels found in /home/usuario/workspaces/autotram_ws/data/rs19_val/training.cache, training may not work correctly. See https://docs.ultralytics.com/datasets/detect for dataset formatting guidance.
train: Caching images (3.2GB RAM): 100%|██████████| 11898/11898 [00:17<00:00, 676.91it/s]
val: Scanning /home/usuario/workspaces/autotram_ws/data/rs19_val/validation.cache... 0 images, 5100 backgrounds, 0 corrupt: 100%|██████████| 5100/5100 [00:00<?, ?it/s]
WARNING ⚠️ No labels found in /home/usuario/workspaces/autotram_ws/data/rs19_val/validation.cache, training may not work correctly. See https://docs.ultralytics.com/datasets/detect for dataset formatting guidance.
val: Caching images (1.4GB RAM): 100%|██████████| 5100/5100 [00:07<00:00, 686.91it/s]
Plotting labels to runs/segment/train4/labels.jpg... 
zero-size array to reduction operation maximum which has no identity
optimizer: 'optimizer=auto' found, ignoring 'lr0=0.005' and 'momentum=0.937' and determining best 'optimizer', 'lr0' and 'momentum' automatically... 
optimizer: SGD(lr=0.01, momentum=0.9) with parameter groups 106 weight(decay=0.0), 117 weight(decay=0.0005), 116 bias(decay=0.0)
Image sizes 416 train, 416 val
Using 16 dataloader workers
Logging results to runs/segment/train4
Starting training for 100 epochs...

      Epoch    GPU_mem   box_loss   seg_loss   cls_loss   dfl_loss  Instances       Size
      1/100      6.88G          0          0      7.288          0          0        416: 100%|██████████| 744/744 [04:00<00:00,  3.09it/s]
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95)     Mask(P          R      mAP50  mAP50-95): 100%|██████████| 160/160 [00:20<00:00,  7.74it/s]
Traceback (most recent call last):
  File "/home/usuario/.local/bin/yolo", line 8, in <module>
    sys.exit(entrypoint())
  File "/home/usuario/.local/lib/python3.8/site-packages/ultralytics/cfg/__init__.py", line 582, in entrypoint
    getattr(model, mode)(**overrides)  # default args from model
  File "/home/usuario/.local/lib/python3.8/site-packages/ultralytics/engine/model.py", line 673, in train
    self.trainer.train()
  File "/home/usuario/.local/lib/python3.8/site-packages/ultralytics/engine/trainer.py", line 199, in train
    self._do_train(world_size)
  File "/home/usuario/.local/lib/python3.8/site-packages/ultralytics/engine/trainer.py", line 419, in _do_train
    self.metrics, self.fitness = self.validate()
  File "/home/usuario/.local/lib/python3.8/site-packages/ultralytics/engine/trainer.py", line 560, in validate
    metrics = self.validator(self)
  File "/home/usuario/.local/lib/python3.8/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/home/usuario/.local/lib/python3.8/site-packages/ultralytics/engine/validator.py", line 195, in __call__
    stats = self.get_stats()
  File "/home/usuario/.local/lib/python3.8/site-packages/ultralytics/models/yolo/detect/val.py", line 170, in get_stats
    stats = {k: torch.cat(v, 0).cpu().numpy() for k, v in self.stats.items()}  # to numpy
  File "/home/usuario/.local/lib/python3.8/site-packages/ultralytics/models/yolo/detect/val.py", line 170, in <dictcomp>
    stats = {k: torch.cat(v, 0).cpu().numpy() for k, v in self.stats.items()}  # to numpy
RuntimeError: torch.cat(): expected a non-empty list of Tensors

@glenn-jocher
Member

Hi @danicannt,

It looks like the issue might be related to the dataset configuration, particularly how the masks are linked in your YAML file. In your current setup, the model doesn't seem to recognize the mask files, which is why it's treating all images as background.

To resolve this, you should specify the path to your masks in the YAML configuration file. Here's how you can adjust your YAML to include masks:

train: /home/usuario/workspaces/autotram_ws/data/rs19_val/dataset/images/train
val: /home/usuario/workspaces/autotram_ws/data/rs19_val/dataset/images/val
mask_train: /home/usuario/workspaces/autotram_ws/data/rs19_val/dataset/masks/train
mask_val: /home/usuario/workspaces/autotram_ws/data/rs19_val/dataset/masks/val

nc: 36
names: ['road', 'sidewalk', 'construction', 'tram-track', 'fence', 'pole', 'traffic-light', 'traffic-sign', 'vegetation', 'terrain', 'sky', 'human', 'rail-track', 'car', 'truck', 'trackbed', 'on-rails', 'rail-raised', 'rail-embedded', 'switch-indicator', 'crossing', 'switch-left', 'switch-right', 'rail', 'track-sign-front', 'track-signal-front', 'platform', 'buffer-stop', 'guard-rail', 'train-car', 'switch-unknown', 'switch-static', 'track-signal-back', 'rail-occluder', 'person-group', 'person']

Make sure that your mask files are correctly formatted and correspond to each image file by name. This should help the model recognize and use the masks during training.

If the issue persists, double-check the format of your mask files to ensure they are compatible with YOLOv8's expectations (i.e., each pixel value directly corresponds to a class ID).

Hope this helps! Let me know how it goes. 🚀

@danicannt
Author

danicannt commented May 24, 2024

Hi @glenn-jocher, I am still getting the same issue.

I have adapted my dataset because some pixels had the value 255. I iterated through the whole dataset and changed them to 36, and in the YAML file I added a new class, 'other', as the last class.

This is my YAML file now:


train: /home/usuario/workspaces/autotram_ws/data/rs19_val/dataset/images/train # train images (relative to 'path') 4 images
val: /home/usuario/workspaces/autotram_ws/data/rs19_val/dataset/images/val # val images (relative to 'path') 4 images
mask_train: /home/usuario/workspaces/autotram_ws/data/rs19_val/dataset/masks/train
mask_val: /home/usuario/workspaces/autotram_ws/data/rs19_val/dataset/masks/val

# Number of classes in the dataset
nc: 37
names: ['road', 'sidewalk', 'construction', 'tram-track', 'fence', 'pole', 'traffic-light', 'traffic-sign', 'vegetation', 'terrain', 'sky', 'human', 'rail-track', 'car', 'truck', 'trackbed', 'on-rails', 'rail-raised', 'rail-embedded', 'switch-indicator', 'crossing', 'switch-left', 'switch-right', 'rail', 'track-sign-front', 'track-signal-front', 'platform', 'buffer-stop', 'guard-rail', 'train-car', 'switch-unknown', 'switch-static', 'track-signal-back', 'rail-occluder', 'person-group', 'person', 'other']

I have added an additional check on my files to verify that the mask format is correct, printing the minimum and maximum pixel values of each image. This is the output:

usuario@L22308x:~/workspaces/workspace$ /bin/python /home/usuario/workspaces/workspaces/IMAGES.py
The image has 1 channel(s).
The minimum value is 0
The maximum value is 36

But when training the model I am still getting this output:

usuario@L22308x:~/workspaces/autotram_ws/data$ yolo segment train data=/home/usuario/workspaces/autotram_ws/data/tram_config.yaml model=/home/usuario/workspaces/autotram_ws/data/models/yolov8x-seg.pt workers=16 device=cuda imgsz=416 epochs=50
New https://pypi.org/project/ultralytics/8.2.20 available 😃 Update with 'pip install -U ultralytics'
Ultralytics YOLOv8.2.2 🚀 Python-3.8.10 torch-2.0.1+cu117 CUDA:0 (NVIDIA RTX A4000, 16086MiB)
engine/trainer: task=segment, mode=train, model=/home/usuario/workspaces/autotram_ws/data/models/yolov8x-seg.pt, data=/home/usuario/workspaces/autotram_ws/data/tram_config.yaml, epochs=50, time=None, patience=100, batch=16, imgsz=416, save=True, save_period=-1, cache=False, device=cuda, workers=16, project=None, name=train20, exist_ok=False, pretrained=True, optimizer=auto, verbose=True, seed=0, deterministic=True, single_cls=False, rect=False, cos_lr=False, close_mosaic=10, resume=False, amp=True, fraction=1.0, profile=False, freeze=None, multi_scale=False, overlap_mask=True, mask_ratio=4, dropout=0.0, val=True, split=val, save_json=False, save_hybrid=False, conf=None, iou=0.7, max_det=300, half=False, dnn=False, plots=True, source=None, vid_stride=1, stream_buffer=False, visualize=False, augment=False, agnostic_nms=False, classes=None, retina_masks=False, embed=None, show=False, save_frames=False, save_txt=False, save_conf=False, save_crop=False, show_labels=True, show_conf=True, show_boxes=True, line_width=None, format=torchscript, keras=False, optimize=False, int8=False, dynamic=False, simplify=False, opset=None, workspace=4, nms=False, lr0=0.01, lrf=0.01, momentum=0.937, weight_decay=0.0005, warmup_epochs=3.0, warmup_momentum=0.8, warmup_bias_lr=0.1, box=7.5, cls=0.5, dfl=1.5, pose=12.0, kobj=1.0, label_smoothing=0.0, nbs=64, hsv_h=0.015, hsv_s=0.7, hsv_v=0.4, degrees=0.0, translate=0.1, scale=0.5, shear=0.0, perspective=0.0, flipud=0.0, fliplr=0.5, bgr=0.0, mosaic=1.0, mixup=0.0, copy_paste=0.0, auto_augment=randaugment, erasing=0.4, crop_fraction=1.0, cfg=None, tracker=botsort.yaml, save_dir=runs/segment/train20
Overriding model.yaml nc=80 with nc=37

                   from  n    params  module                                       arguments                     
  0                  -1  1      2320  ultralytics.nn.modules.conv.Conv             [3, 80, 3, 2]                 
  1                  -1  1    115520  ultralytics.nn.modules.conv.Conv             [80, 160, 3, 2]               
  2                  -1  3    436800  ultralytics.nn.modules.block.C2f             [160, 160, 3, True]           
  3                  -1  1    461440  ultralytics.nn.modules.conv.Conv             [160, 320, 3, 2]              
  4                  -1  6   3281920  ultralytics.nn.modules.block.C2f             [320, 320, 6, True]           
  5                  -1  1   1844480  ultralytics.nn.modules.conv.Conv             [320, 640, 3, 2]              
  6                  -1  6  13117440  ultralytics.nn.modules.block.C2f             [640, 640, 6, True]           
  7                  -1  1   3687680  ultralytics.nn.modules.conv.Conv             [640, 640, 3, 2]              
  8                  -1  3   6969600  ultralytics.nn.modules.block.C2f             [640, 640, 3, True]           
  9                  -1  1   1025920  ultralytics.nn.modules.block.SPPF            [640, 640, 5]                 
 10                  -1  1         0  torch.nn.modules.upsampling.Upsample         [None, 2, 'nearest']          
 11             [-1, 6]  1         0  ultralytics.nn.modules.conv.Concat           [1]                           
 12                  -1  3   7379200  ultralytics.nn.modules.block.C2f             [1280, 640, 3]                
 13                  -1  1         0  torch.nn.modules.upsampling.Upsample         [None, 2, 'nearest']          
 14             [-1, 4]  1         0  ultralytics.nn.modules.conv.Concat           [1]                           
 15                  -1  3   1948800  ultralytics.nn.modules.block.C2f             [960, 320, 3]                 
 16                  -1  1    922240  ultralytics.nn.modules.conv.Conv             [320, 320, 3, 2]              
 17            [-1, 12]  1         0  ultralytics.nn.modules.conv.Concat           [1]                           
 18                  -1  3   7174400  ultralytics.nn.modules.block.C2f             [960, 640, 3]                 
 19                  -1  1   3687680  ultralytics.nn.modules.conv.Conv             [640, 640, 3, 2]              
 20             [-1, 9]  1         0  ultralytics.nn.modules.conv.Concat           [1]                           
 21                  -1  3   7379200  ultralytics.nn.modules.block.C2f             [1280, 640, 3]                
 22        [15, 18, 21]  1  12351839  ultralytics.nn.modules.head.Segment          [37, 32, 320, [320, 640, 640]]
YOLOv8x-seg summary: 401 layers, 71786479 parameters, 71786463 gradients, 344.7 GFLOPs

Transferred 651/657 items from pretrained weights
Freezing layer 'model.22.dfl.conv.weight'
AMP: running Automatic Mixed Precision (AMP) checks with YOLOv8n...
AMP: checks passed ✅
train: Scanning /home/usuario/workspaces/autotram_ws/data/rs19_val/dataset/labels/train... 0 images, 5949 backgrounds, 0 corrupt: 100%|██████████| 5949/5949 [00:01<00:00, 5392.95it/s]
train: WARNING ⚠️ No labels found in /home/usuario/workspaces/autotram_ws/data/rs19_val/dataset/labels/train.cache. See https://docs.ultralytics.com/datasets/detect for dataset formatting guidance.
train: WARNING ⚠️ Cache directory /home/usuario/workspaces/autotram_ws/data/rs19_val/dataset/labels is not writeable, cache not saved.
WARNING ⚠️ No labels found in /home/usuario/workspaces/autotram_ws/data/rs19_val/dataset/labels/train.cache, training may not work correctly. See https://docs.ultralytics.com/datasets/detect for dataset formatting guidance.
val: Scanning /home/usuario/workspaces/autotram_ws/data/rs19_val/dataset/labels/val... 0 images, 2550 backgrounds, 0 corrupt: 100%|██████████| 2550/2550 [00:00<00:00, 6433.10it/s]
val: WARNING ⚠️ No labels found in /home/usuario/workspaces/autotram_ws/data/rs19_val/dataset/labels/val.cache. See https://docs.ultralytics.com/datasets/detect for dataset formatting guidance.
val: WARNING ⚠️ Cache directory /home/usuario/workspaces/autotram_ws/data/rs19_val/dataset/labels is not writeable, cache not saved.
WARNING ⚠️ No labels found in /home/usuario/workspaces/autotram_ws/data/rs19_val/dataset/labels/val.cache, training may not work correctly. See https://docs.ultralytics.com/datasets/detect for dataset formatting guidance.
Plotting labels to runs/segment/train20/labels.jpg... 
zero-size array to reduction operation maximum which has no identity
optimizer: 'optimizer=auto' found, ignoring 'lr0=0.01' and 'momentum=0.937' and determining best 'optimizer', 'lr0' and 'momentum' automatically... 
optimizer: AdamW(lr=0.000244, momentum=0.9) with parameter groups 106 weight(decay=0.0), 117 weight(decay=0.0005), 116 bias(decay=0.0)
Image sizes 416 train, 416 val
Using 16 dataloader workers
Logging results to runs/segment/train20
Starting training for 50 epochs...

      Epoch    GPU_mem   box_loss   seg_loss   cls_loss   dfl_loss  Instances       Size
       1/50      6.91G          0          0      23.76          0          0        416: 100%|██████████| 372/372 [02:02<00:00,  3.04it/s]
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95)     Mask(P          R      mAP50  mAP50-95): 100%|██████████| 80/80 [00:10<00:00,  7.76it/s]
                   all       2550          0          0          0          0          0          0          0          0          0
WARNING ⚠️ no labels found in segment set, can not compute metrics without labels

      Epoch    GPU_mem   box_loss   seg_loss   cls_loss   dfl_loss  Instances       Size
       2/50      7.23G          0          0     0.6782          0          0        416: 100%|██████████| 372/372 [02:05<00:00,  2.98it/s]
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95)     Mask(P          R      mAP50  mAP50-95): 100%|██████████| 80/80 [00:10<00:00,  7.63it/s]
Traceback (most recent call last):
  File "/home/usuario/.local/bin/yolo", line 8, in <module>
    sys.exit(entrypoint())
  File "/home/usuario/.local/lib/python3.8/site-packages/ultralytics/cfg/__init__.py", line 582, in entrypoint
    getattr(model, mode)(**overrides)  # default args from model
  File "/home/usuario/.local/lib/python3.8/site-packages/ultralytics/engine/model.py", line 673, in train
    self.trainer.train()
  File "/home/usuario/.local/lib/python3.8/site-packages/ultralytics/engine/trainer.py", line 199, in train
    self._do_train(world_size)
  File "/home/usuario/.local/lib/python3.8/site-packages/ultralytics/engine/trainer.py", line 419, in _do_train
    self.metrics, self.fitness = self.validate()
  File "/home/usuario/.local/lib/python3.8/site-packages/ultralytics/engine/trainer.py", line 560, in validate
    metrics = self.validator(self)
  File "/home/usuario/.local/lib/python3.8/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/home/usuario/.local/lib/python3.8/site-packages/ultralytics/engine/validator.py", line 195, in __call__
    stats = self.get_stats()
  File "/home/usuario/.local/lib/python3.8/site-packages/ultralytics/models/yolo/detect/val.py", line 170, in get_stats
    stats = {k: torch.cat(v, 0).cpu().numpy() for k, v in self.stats.items()}  # to numpy
  File "/home/usuario/.local/lib/python3.8/site-packages/ultralytics/models/yolo/detect/val.py", line 170, in <dictcomp>
    stats = {k: torch.cat(v, 0).cpu().numpy() for k, v in self.stats.items()}  # to numpy
RuntimeError: torch.cat(): expected a non-empty list of Tensors
Exception in thread Thread-26:
Traceback (most recent call last):
  File "/usr/lib/python3.8/threading.py", line 932, in _bootstrap_inner
    self.run()
  File "/usr/lib/python3.8/threading.py", line 870, in run
    self._target(*self._args, **self._kwargs)
  File "/home/usuario/.local/lib/python3.8/site-packages/torch/utils/data/_utils/pin_memory.py", line 51, in _pin_memory_loop
    do_one_step()
  File "/home/usuario/.local/lib/python3.8/site-packages/torch/utils/data/_utils/pin_memory.py", line 28, in do_one_step
    r = in_queue.get(timeout=MP_STATUS_CHECK_INTERVAL)
  File "/usr/lib/python3.8/multiprocessing/queues.py", line 116, in get
    return _ForkingPickler.loads(res)
  File "/home/usuario/.local/lib/python3.8/site-packages/torch/multiprocessing/reductions.py", line 307, in rebuild_storage_fd
    fd = df.detach()
  File "/usr/lib/python3.8/multiprocessing/resource_sharer.py", line 57, in detach
    with _resource_sharer.get_connection(self._id) as conn:
  File "/usr/lib/python3.8/multiprocessing/resource_sharer.py", line 87, in get_connection
    c = Client(address, authkey=process.current_process().authkey)
  File "/usr/lib/python3.8/multiprocessing/connection.py", line 508, in Client
    answer_challenge(c, authkey)
  File "/usr/lib/python3.8/multiprocessing/connection.py", line 752, in answer_challenge
    message = connection.recv_bytes(256)         # reject large message
  File "/usr/lib/python3.8/multiprocessing/connection.py", line 216, in recv_bytes
    buf = self._recv_bytes(maxlength)
  File "/usr/lib/python3.8/multiprocessing/connection.py", line 414, in _recv_bytes
    buf = self._recv(4)
  File "/usr/lib/python3.8/multiprocessing/connection.py", line 379, in _recv
    chunk = read(handle, remaining)
ConnectionResetError: [Errno 104] Connection reset by peer

Sorry to bother you with this problem, but learning how to properly train YOLO with masks would be really useful. I am also attaching one mask image as an example of what they look like:
[image: rs02131]

@danicannt
Author

danicannt commented May 24, 2024

Also, I accidentally moved the masks directories while the model was training, and it started printing the warnings shown in the log below. So it does seem to be reading the PNG files, but it still crashes:

usuario@L22308x:~/workspaces/autotram_ws/data$ yolo segment train data=/home/usuario/workspaces/autotram_ws/data/tram_config.yaml model=/home/usuario/workspaces/autotram_ws/data/models/yolov8x-seg.pt workers=16 device=cuda imgsz=416 epochs=50
New https://pypi.org/project/ultralytics/8.2.20 available 😃 Update with 'pip install -U ultralytics'
Ultralytics YOLOv8.2.2 🚀 Python-3.8.10 torch-2.0.1+cu117 CUDA:0 (NVIDIA RTX A4000, 16086MiB)
engine/trainer: task=segment, mode=train, model=/home/usuario/workspaces/autotram_ws/data/models/yolov8x-seg.pt, data=/home/usuario/workspaces/autotram_ws/data/tram_config.yaml, epochs=50, time=None, patience=100, batch=16, imgsz=416, save=True, save_period=-1, cache=False, device=cuda, workers=16, project=None, name=train25, exist_ok=False, pretrained=True, optimizer=auto, verbose=True, seed=0, deterministic=True, single_cls=False, rect=False, cos_lr=False, close_mosaic=10, resume=False, amp=True, fraction=1.0, profile=False, freeze=None, multi_scale=False, overlap_mask=True, mask_ratio=4, dropout=0.0, val=True, split=val, save_json=False, save_hybrid=False, conf=None, iou=0.7, max_det=300, half=False, dnn=False, plots=True, source=None, vid_stride=1, stream_buffer=False, visualize=False, augment=False, agnostic_nms=False, classes=None, retina_masks=False, embed=None, show=False, save_frames=False, save_txt=False, save_conf=False, save_crop=False, show_labels=True, show_conf=True, show_boxes=True, line_width=None, format=torchscript, keras=False, optimize=False, int8=False, dynamic=False, simplify=False, opset=None, workspace=4, nms=False, lr0=0.01, lrf=0.01, momentum=0.937, weight_decay=0.0005, warmup_epochs=3.0, warmup_momentum=0.8, warmup_bias_lr=0.1, box=7.5, cls=0.5, dfl=1.5, pose=12.0, kobj=1.0, label_smoothing=0.0, nbs=64, hsv_h=0.015, hsv_s=0.7, hsv_v=0.4, degrees=0.0, translate=0.1, scale=0.5, shear=0.0, perspective=0.0, flipud=0.0, fliplr=0.5, bgr=0.0, mosaic=1.0, mixup=0.0, copy_paste=0.0, auto_augment=randaugment, erasing=0.4, crop_fraction=1.0, cfg=None, tracker=botsort.yaml, save_dir=runs/segment/train25
Overriding model.yaml nc=80 with nc=37

                   from  n    params  module                                       arguments                     
  0                  -1  1      2320  ultralytics.nn.modules.conv.Conv             [3, 80, 3, 2]                 
  1                  -1  1    115520  ultralytics.nn.modules.conv.Conv             [80, 160, 3, 2]               
  2                  -1  3    436800  ultralytics.nn.modules.block.C2f             [160, 160, 3, True]           
  3                  -1  1    461440  ultralytics.nn.modules.conv.Conv             [160, 320, 3, 2]              
  4                  -1  6   3281920  ultralytics.nn.modules.block.C2f             [320, 320, 6, True]           
  5                  -1  1   1844480  ultralytics.nn.modules.conv.Conv             [320, 640, 3, 2]              
  6                  -1  6  13117440  ultralytics.nn.modules.block.C2f             [640, 640, 6, True]           
  7                  -1  1   3687680  ultralytics.nn.modules.conv.Conv             [640, 640, 3, 2]              
  8                  -1  3   6969600  ultralytics.nn.modules.block.C2f             [640, 640, 3, True]           
  9                  -1  1   1025920  ultralytics.nn.modules.block.SPPF            [640, 640, 5]                 
 10                  -1  1         0  torch.nn.modules.upsampling.Upsample         [None, 2, 'nearest']          
 11             [-1, 6]  1         0  ultralytics.nn.modules.conv.Concat           [1]                           
 12                  -1  3   7379200  ultralytics.nn.modules.block.C2f             [1280, 640, 3]                
 13                  -1  1         0  torch.nn.modules.upsampling.Upsample         [None, 2, 'nearest']          
 14             [-1, 4]  1         0  ultralytics.nn.modules.conv.Concat           [1]                           
 15                  -1  3   1948800  ultralytics.nn.modules.block.C2f             [960, 320, 3]                 
 16                  -1  1    922240  ultralytics.nn.modules.conv.Conv             [320, 320, 3, 2]              
 17            [-1, 12]  1         0  ultralytics.nn.modules.conv.Concat           [1]                           
 18                  -1  3   7174400  ultralytics.nn.modules.block.C2f             [960, 640, 3]                 
 19                  -1  1   3687680  ultralytics.nn.modules.conv.Conv             [640, 640, 3, 2]              
 20             [-1, 9]  1         0  ultralytics.nn.modules.conv.Concat           [1]                           
 21                  -1  3   7379200  ultralytics.nn.modules.block.C2f             [1280, 640, 3]                
 22        [15, 18, 21]  1  12351839  ultralytics.nn.modules.head.Segment          [37, 32, 320, [320, 640, 640]]
YOLOv8x-seg summary: 401 layers, 71786479 parameters, 71786463 gradients, 344.7 GFLOPs

Transferred 651/657 items from pretrained weights
Freezing layer 'model.22.dfl.conv.weight'
AMP: running Automatic Mixed Precision (AMP) checks with YOLOv8n...
AMP: checks passed ✅
train: Scanning /home/usuario/workspaces/autotram_ws/data/rs19_val/dataset_/train/labels.cache... 0 images, 11898 backgrounds, 0 corrupt: 100%|██████████| 11898/11898 [00:00<?, ?it/s]
WARNING ⚠️ No labels found in /home/usuario/workspaces/autotram_ws/data/rs19_val/dataset_/train/labels.cache, training may not work correctly. See https://docs.ultralytics.com/datasets/detect for dataset formatting guidance.
val: Scanning /home/usuario/workspaces/autotram_ws/data/rs19_val/dataset_/val/labels... 0 images, 5099 backgrounds, 0 corrupt: 100%|██████████| 5099/5099 [00:00<00:00, 6962.24it/s]
val: WARNING ⚠️ No labels found in /home/usuario/workspaces/autotram_ws/data/rs19_val/dataset_/val/labels.cache. See https://docs.ultralytics.com/datasets/detect for dataset formatting guidance.
val: New cache created: /home/usuario/workspaces/autotram_ws/data/rs19_val/dataset_/val/labels.cache
WARNING ⚠️ No labels found in /home/usuario/workspaces/autotram_ws/data/rs19_val/dataset_/val/labels.cache, training may not work correctly. See https://docs.ultralytics.com/datasets/detect for dataset formatting guidance.
Plotting labels to runs/segment/train25/labels.jpg... 
zero-size array to reduction operation maximum which has no identity
optimizer: 'optimizer=auto' found, ignoring 'lr0=0.01' and 'momentum=0.937' and determining best 'optimizer', 'lr0' and 'momentum' automatically... 
optimizer: AdamW(lr=0.000244, momentum=0.9) with parameter groups 106 weight(decay=0.0), 117 weight(decay=0.0005), 116 bias(decay=0.0)
Image sizes 416 train, 416 val
Using 16 dataloader workers
Logging results to runs/segment/train25
Starting training for 50 epochs...

      Epoch    GPU_mem   box_loss   seg_loss   cls_loss   dfl_loss  Instances       Size
       1/50       6.9G          0          0      36.21          0          0        416:  37%|███▋      | 273/744 [01:31<02:41,  2.92it/s][ WARN:0@97.620] global loadsave.cpp:248 findDecoder imread_('/home/usuario/workspaces/autotram_ws/data/rs19_val/dataset_/train/masks/rs04500.png'): can't open/read file: check file path/integrity
       1/50       6.9G          0          0      36.14          0          0        416:  37%|███▋      | 274/744 [01:31<02:42,  2.88it/s][ WARN:0@97.889] global loadsave.cpp:248 findDecoder imread_('/home/usuario/workspaces/autotram_ws/data/rs19_val/dataset_/train/masks/rs01872.png'): can't open/read file: check file path/integrity
       1/50       6.9G          0          0      36.07          0          0        416:  37%|███▋      | 275/744 [01:32<02:42,  2.89it/s][ WARN:0@98.253] global loadsave.cpp:248 findDecoder imread_('/home/usuario/workspaces/autotram_ws/data/rs19_val/dataset_/train/masks/rs05441.png'): can't open/read file: check file path/integrity
       1/50       6.9G          0          0      35.99          0          0        416:  37%|███▋      | 276/744 [01:32<02:41,  2.90it/s][ WARN:0@98.576] global loadsave.cpp:248 findDecoder imread_('/home/usuario/workspaces/autotram_ws/data/rs19_val/dataset_/train/masks/rs08233.png'): can't open/read file: check file path/integrity
       1/50       6.9G          0          0      35.92          0          0        416:  37%|███▋      | 277/744 [01:32<02:40,  2.90it/s][ WARN:0@98.919] global loadsave.cpp:248 findDecoder imread_('/home/usuario/workspaces/autotram_ws/data/rs19_val/dataset_/train/masks/rs07968.png'): can't open/read file: check file path/integrity
       1/50       6.9G          0          0      35.87          0          0        416:  37%|███▋      | 278/744 [01:33<02:40,  2.90it/s][ WARN:0@99.295] global loadsave.cpp:248 findDecoder imread_('/home/usuario/workspaces/autotram_ws/data/rs19_val/dataset_/train/masks/rs06223.png'): can't open/read file: check file path/integrity
       1/50       6.9G          0          0       35.8          0          0        416:  38%|███▊      | 279/744 [01:33<02:40,  2.90it/s][ WARN:0@99.608] global loadsave.cpp:248 findDecoder imread_('/home/usuario/workspaces/autotram_ws/data/rs19_val/dataset_/train/masks/rs00912.png'): can't open/read file: check file path/integrity
       1/50       6.9G          0          0      35.73          0          0        416:  38%|███▊      | 280/744 [01:33<02:39,  2.90it/s][ WARN:0@99.952] global loadsave.cpp:248 findDecoder imread_('/home/usuario/workspaces/autotram_ws/data/rs19_val/dataset_/train/masks/r

The folder structure that I am using now is this one:

dataset/
├── train/
│   ├── images/
│   │   ├── image1.jpg
│   │   ├── image2.jpg
│   │   └── ...
│   └── masks/
│       ├── image1.png
│       ├── image2.png
│       └── ...
└── val/
    ├── images/
    │   ├── image1.jpg
    │   ├── image2.jpg
    │   └── ...
    └── masks/
        ├── image1.png
        ├── image2.png
        └── ...
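
For reference, the "0 images, N backgrounds" lines in the logs above mean the scanner found image files but no label files for them. A minimal sketch for checking this before training, assuming the usual Ultralytics convention that each image under an images/ folder has a same-named .txt file under a sibling labels/ folder. The function name and the "dataset" root are hypothetical:

```python
from pathlib import Path

IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".bmp"}


def find_unlabeled_images(split_dir):
    """Return images in split_dir/images that lack a non-empty split_dir/labels/<stem>.txt."""
    split_dir = Path(split_dir)
    images_dir = split_dir / "images"
    labels_dir = split_dir / "labels"
    missing = []
    for image_path in sorted(images_dir.iterdir()):
        if image_path.suffix.lower() not in IMAGE_SUFFIXES:
            continue  # skip anything that is not an image file
        label_path = labels_dir / (image_path.stem + ".txt")
        if not label_path.is_file() or label_path.stat().st_size == 0:
            missing.append(image_path)
    return missing


if __name__ == "__main__":
    for split in ("train", "val"):
        root = Path("dataset") / split  # hypothetical dataset root
        if (root / "images").is_dir():
            unlabeled = find_unlabeled_images(root)
            print(f"{split}: {len(unlabeled)} images without a label file")
```

If every image is reported as unlabeled, the trainer will most likely treat the whole split as background, which matches the zero box and segmentation losses in the logs.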

@glenn-jocher
Member

@danicannt hi there!

It seems like the model is indeed attempting to access the mask files, but there are issues with file paths or file integrity based on the warnings you're seeing. This could be due to incorrect paths in your dataset configuration or issues with the mask files themselves.

Here are a couple of things to check and try:

  1. Verify File Paths: Ensure that the paths specified in your YAML file accurately reflect your current directory structure and that all files are accessible. Double-check for typos or mismatches in file names and paths.

  2. Check File Integrity: Make sure that the mask files are not corrupted and can be opened. You might want to manually open a few of the mask files to confirm they are intact.

  3. Permissions: Ensure that you have the necessary read/write permissions for the directories and files involved in training.

  4. YAML Configuration: As far as I know, the Ultralytics dataset YAML has no keys for mask folders. It lists only the image folders, and the loader looks for one .txt label file per image in a sibling labels/ folder (the same path with images replaced by labels). PNG masks would therefore need to be converted to polygon labels first. Based on your folder structure, your YAML should look something like this:

train: /path/to/dataset/train/images
val: /path/to/dataset/val/images

nc: 37
names: ['road', 'sidewalk', 'construction', 'tram-track', 'fence', 'pole', 'traffic-light', 'traffic-sign', 'vegetation', 'terrain', 'sky', 'human', 'rail-track', 'car', 'truck', 'trackbed', 'on-rails', 'rail-raised', 'rail-embedded', 'switch-indicator', 'crossing', 'switch-left', 'switch-right', 'rail', 'track-sign-front', 'track-signal-front', 'platform', 'buffer-stop', 'guard-rail', 'train-car', 'switch-unknown', 'switch-static', 'track-signal-back', 'rail-occluder', 'person-group', 'person', 'other']

Make sure the paths are correct and point directly to the folders containing the images, with the converted label files in the matching labels/ folders.

If these steps don't resolve the issue, it might be helpful to look at the logs for any additional errors or warnings that could give more insight into what might be going wrong. Keep us posted on your progress! 🚀
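
The path, integrity, and permission checks above can be sketched in a few lines of standard-library Python. This is only a rough first pass: it confirms that each mask exists, is readable, and starts with the PNG signature, which catches missing or grossly corrupted files but does not prove the image decodes fully. The function names are hypothetical:

```python
import os
from pathlib import Path

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"  # first 8 bytes of every valid PNG file


def check_png(path):
    """Return a short problem description for path, or None if it looks readable."""
    path = Path(path)
    if not path.is_file():
        return "missing"
    if not os.access(path, os.R_OK):
        return "not readable"
    with path.open("rb") as handle:
        if handle.read(8) != PNG_SIGNATURE:
            return "not a PNG (bad signature)"
    return None


def check_mask_dir(masks_dir):
    """Map each problematic .png file name under masks_dir to its problem description."""
    problems = {}
    for path in sorted(Path(masks_dir).glob("*.png")):
        problem = check_png(path)
        if problem is not None:
            problems[path.name] = problem
    return problems
```

Calling check_mask_dir on the train and val mask folders and printing the result should show quickly whether the "can't open/read file" warnings come from files that are missing or damaged.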

@danicannt
Author

Hi @glenn-jocher, how can I activate the logs to check what is happening?

Also, do you know of any dataset that trains successfully with masks, so that I can compare its structure with mine?

I am going to keep trying this week and will keep you updated.

@glenn-jocher
Member

Hi @danicannt,

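To get detailed logging, you can pass verbose=True in your training command, although the engine/trainer line in your log shows it is already enabled. The warnings printed during the label scan are the most useful place to look for what's happening before training starts.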

Regarding datasets, you might want to look at the COCO dataset, which is a well-structured and commonly used dataset for segmentation tasks. It can serve as a good reference for your dataset structure.

Feel free to keep us updated on your progress, and we're here to help if you have any more questions! 🚀
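
For comparison with a reference dataset: as far as I know, Ultralytics segmentation labels are plain-text files with one object per line, a class index followed by the polygon's x y coordinates normalized to [0, 1]. A minimal sketch of formatting one such line, assuming the polygon has already been extracted from the mask in pixel coordinates (for example by contour tracing, which is not shown here). The function name is hypothetical:

```python
def polygon_to_yolo_seg_line(class_id, polygon, width, height):
    """Format one polygon as 'class x1 y1 x2 y2 ...' with coordinates normalized to [0, 1]."""
    if len(polygon) < 3:
        raise ValueError("a segmentation polygon needs at least 3 points")
    values = []
    for x, y in polygon:
        # Clamp so points on the image border cannot leave the valid range.
        values.append(min(max(x / width, 0.0), 1.0))
        values.append(min(max(y / height, 0.0), 1.0))
    return " ".join([str(class_id)] + [f"{v:.6f}" for v in values])


if __name__ == "__main__":
    # Hypothetical triangle in a 1920x1080 image, class index 12.
    print(polygon_to_yolo_seg_line(12, [(0, 0), (960, 540), (1920, 1080)], 1920, 1080))
```

Each image would then get one .txt file with the same stem as the image, containing one such line per object.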
