This repository provides a VideoMAE-based segmentation model. It includes:
Backbone: A ViT-Base encoder pre-trained with VideoMAE on EchoNet-Dynamic.
Segmentation Head: A decoder that predicts pixel-wise masks of the left ventricle at end-systole and end-diastole. The input is a 16-frame video.
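
As a rough illustration of what a 16-frame input implies, the sketch below picks 16 evenly spaced frames from a longer video and counts the spatiotemporal tokens a ViT encoder would see. It is not this repository's preprocessing code. The helper names `sample_frame_indices` and `token_count` are hypothetical, and the 224x224 resolution, 16x16 patch size, and tubelet size of 2 are assumed defaults borrowed from common VideoMAE ViT-Base configurations; the settings actually used here may differ.

```python
def sample_frame_indices(total_frames: int, clip_len: int = 16) -> list[int]:
    """Return clip_len evenly spaced frame indices for a video of total_frames frames.

    Hypothetical helper: uniform sampling is an assumption, not necessarily
    the sampling strategy used by this repository.
    """
    if total_frames <= 0 or clip_len <= 0:
        raise ValueError("total_frames and clip_len must be positive")
    if clip_len == 1 or total_frames == 1:
        return [0] * clip_len
    # Spread indices from the first to the last frame; indices repeat
    # when the video has fewer frames than clip_len.
    step = (total_frames - 1) / (clip_len - 1)
    return [round(i * step) for i in range(clip_len)]


def token_count(clip_len: int = 16, height: int = 224, width: int = 224,
                tubelet: int = 2, patch: int = 16) -> int:
    """Count the tubelet tokens produced by a VideoMAE-style patch embedding.

    All default values are assumptions for illustration only.
    """
    if clip_len % tubelet or height % patch or width % patch:
        raise ValueError("clip dimensions must be divisible by tubelet/patch size")
    return (clip_len // tubelet) * (height // patch) * (width // patch)


if __name__ == "__main__":
    print(sample_frame_indices(100))           # 16 indices spanning frames 0..99
    print(token_count())                       # tokens at the assumed 224x224 resolution
    print(token_count(height=112, width=112))  # tokens if frames were kept at 112x112
```

Under these assumptions, a 16-frame clip is grouped into 8 temporal slices of 2 frames each, so the token count depends only on the spatial resolution and patch size chosen.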