---
title: "MONAI Model Zoo - Pre-trained Models for Medical Imaging"
description: "Explore MONAI Model Zoo - a collection of pre-trained models for medical imaging tasks. Find and use state-of-the-art models for your healthcare AI applications."
canonical: https://project-monai.github.io/model-zoo.html
audience: [engineer]
last_updated: 2026-06-11
source: model-zoo.html
---

# Pre-trained models for medical AI

Every model ships as a MONAI Bundle: weights, training config, and inference code in one reproducible unit. The full catalog is listed below.
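
Because each entry is a bundle, fetching one can be scripted. The following is a minimal sketch, assuming MONAI is installed and the machine has network access. It relies on `monai.bundle.download`; the helper names `unescape_identifier` and `fetch_bundle` and the `./bundles` directory are illustrative choices, not part of the Model Zoo.

```python
from pathlib import Path


def unescape_identifier(catalog_id: str) -> str:
    """Turn a Markdown-escaped catalog identifier into a plain bundle name."""
    return catalog_id.replace("\\_", "_")


def fetch_bundle(catalog_id: str, version: str, bundle_dir: str = "./bundles") -> Path:
    """Download one bundle and return the directory it is expected to land in.

    Assumes MONAI is installed and the requested version is available
    from the default download source.
    """
    # Imported lazily so the pure-string helper above works without MONAI.
    from monai.bundle import download

    name = unescape_identifier(catalog_id)
    download(name=name, version=version, bundle_dir=bundle_dir)
    return Path(bundle_dir) / name
```

For example, `fetch_bundle(r"spleen\_ct\_segmentation", "0.6.1")` would request the Spleen CT Segmentation bundle at the version listed in the catalog and is expected to place it under `./bundles/spleen_ct_segmentation`.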

-   ## Brain MRI Latent Diffusion Synthesis (v1.0.3)
    
    A latent diffusion model that generates 160x224x160 voxel T1-weighted brain MRI volumes with 1mm isotropic resolution. The model accepts conditional inputs for age, gender, ventricular volume, and brain volume, enabling controlled generation of brain images with specific demographic and morphological characteristics.
    
    brain\_image\_synthesis\_latent\_diffusion\_model
    
-   ## BraTS MRI Axial Slices Latent Diffusion Generation (v1.1.4)
    
    A latent diffusion model that synthesizes 2D brain MRI axial slices (240x240 pixels) from Gaussian noise, trained on the BraTS dataset. The model processes 1-channel latent space features (64x64) and generates FLAIR sequences with 1mm in-plane resolution, capturing diverse tumor and brain tissue appearances.
    
    brats\_mri\_axial\_slices\_generative\_diffusion
    
-   ## BraTS MRI Latent Diffusion Generation (v1.1.4)
    
    A volumetric latent diffusion model that generates 3D brain MRI volumes (112x128x80 voxels) with tumor features from Gaussian noise, trained on the BraTS multimodal MRI dataset.
    
    brats\_mri\_generative\_diffusion
    
-   ## BraTS MRI segmentation (v0.5.4)
    
    A 3D segmentation model for delineating brain tumor subregions from multimodal MRI scans (T1, T1c, T2, FLAIR). The model processes 4-channel input volumes with 1mm isotropic resolution and outputs 3-channel segmentation masks for tumor core (TC), whole tumor (WT), and enhancing tumor (ET).
    
    brats\_mri\_segmentation
    
-   ## Breast density classification (v0.1.8)
    
    A deep learning model for automated classification of breast tissue density in mammograms according to the BI-RADS density categories (A through D). The model processes 299x299 pixel images and classifies breast tissue into four categories: fatty, scattered fibroglandular, heterogeneously dense, and extremely dense.
    
    breast\_density\_classification
    
-   ## Chest X-ray Latent Diffusion Synthesis (v1.0.2)
    
    A latent diffusion model that generates 512x512 pixel chest X-ray images from a 64x64x77 dimensional latent space. The model processes text-based condition inputs through a 1024-dimensional context vector, enabling controlled generation of X-rays with specific pathological features.
    
    cxr\_image\_synthesis\_latent\_diffusion\_model
    
-   ## CT-CHAT (v1.1.0)
    
    CT-CHAT is a multimodal AI assistant specifically designed for 3D chest CT imaging interpretation and analysis. The model excels at tasks including visual question answering, report generation, and multiple-choice questions, leveraging full 3D spatial information for superior performance compared to 2D-based approaches.
    
    hf\_ct\_chat
    
-   ## Endoscopic In-Body Classification (v0.5.1)
    
    A binary classification model based on SENet that distinguishes between inside-body and outside-body frames in endoscopic videos. The model processes 256x256 pixel RGB images and filters irrelevant frames, enabling automated procedure analysis.
    
    endoscopic\_inbody\_classification
    
-   ## Endoscopic Tool Segmentation (v0.6.2)
    
    A 2D segmentation model that identifies and delineates surgical instruments in endoscopic video frames. The model processes 736x480 pixel RGB images and provides binary segmentation masks. Based on an EfficientNet-UNet architecture, the model supports real-time analysis of surgical procedures.
    
    endoscopic\_tool\_segmentation
    
-   ## EXAONE Path 2.0 (v1.0.0)
    
    EXAONE Path 2.0 is a pathology foundation model that learns patch-level representations under direct slide-level supervision. Using only 37k whole-slide images (WSIs) for training, it achieves state-of-the-art average performance across 10 biomarker prediction tasks, demonstrating remarkable data efficiency.
    
    hf\_exaonepath\_2.0
    
-   ## EXAONEPath (v1.1.0)
    
    EXAONEPath is a patch-level pathology foundation model that achieves state-of-the-art performance across multiple pathology tasks while maintaining computational efficiency. It excels in tissue classification, tumor detection, and microsatellite instability assessment.
    
    hf\_exaonepath
    
-   ## EXAONEPath-CRC-MSI-Predictor (v1.0.0)
    
    A classifier that predicts microsatellite instability (MSI) in colorectal cancer (CRC) tumors using EXAONEPath, a patch-level foundation model for pathology.
    
    hf\_exaonepath-crc-msi-predictor
    
-   ## HoVer-Net: Nuclear Segmentation and Classification (v0.2.8)
    
    A multi-task learning model based on the HoVer-Net architecture that simultaneously performs nuclei segmentation and type classification in H&E-stained histology images. The model processes 256x256 pixel RGB patches and outputs three complementary predictions: binary nuclear segmentation (Dice score: 0.83), hover maps for instance separation, and pixel-level nuclear type classification.
    
    pathology\_nuclei\_segmentation\_classification
    
-   ## Llama3-VILA-M3-13B (v1.1.0)
    
    VILA-M3 is a medical visual language model built on Llama 3 and the VILA architecture. This 13B-parameter model performs medical image analysis including segmentation, classification, visual question answering, and report generation across multiple imaging modalities.
    
    hf\_llama3\_vila\_m3\_13b
    
-   ## Llama3-VILA-M3-3B (v1.1.0)
    
    VILA-M3 is a medical visual language model built on Llama 3 and the VILA architecture. This 3B-parameter model performs medical image analysis including segmentation, classification, visual question answering, and report generation across multiple imaging modalities.
    
    hf\_llama3\_vila\_m3\_3b
    
-   ## Llama3-VILA-M3-8B (v1.1.0)
    
    VILA-M3 is a medical visual language model built on Llama 3 and the VILA architecture. This 8B-parameter model performs medical image analysis including segmentation, classification, visual question answering, and report generation across multiple imaging modalities.
    
    hf\_llama3\_vila\_m3\_8b
    
-   ## Lung Nodule CT Detection (v0.6.10)
    
    A 3D detection model for identifying pulmonary nodules in CT scans. The model processes variable-sized patches and outputs detection boxes with classification scores. Trained on the LUNA16 challenge dataset, it provides automated screening capabilities for pulmonary nodule detection in chest CT examinations.
    
    lung\_nodule\_ct\_detection
    
-   ## MAISI: Medical AI for Synthetic Imaging (v1.0.2)
    
    MAISI is a diffusion-based model for generating synthetic 3D CT images with anatomical control. The model produces realistic CT volumes up to 512×512×768 voxels and can generate images conditioned on organ segmentations of 127 anatomical structures.
    
    maisi\_ct\_generative
    
-   ## Medical Image Classification Template (v0.0.4)
    
    A comprehensive template for developing 2D medical image classification models, featuring a modular architecture and standardized training pipeline. The template supports single-channel 128x128 pixel input images and outputs 4-class probability distributions, serving as a foundation for custom medical image classification tasks.
    
    classification\_template
    
-   ## Medical Image Segmentation Template (v0.0.4)
    
    A comprehensive 3D segmentation framework designed as a foundation for developing custom medical volumetric segmentation models. The template includes a configurable architecture and preprocessing pipeline, processing 128x128x128 voxel volumes with single-channel input and producing 4-class segmentation outputs. It also supports random sphere generation for demonstration and testing purposes.
    
    segmentation\_template
    
-   ## MedNIST DDPM Hand X-ray Generation (v1.0.3)
    
    A denoising diffusion probabilistic model (DDPM) that synthesizes hand X-ray images based on the MedNIST dataset. The model learns the underlying distribution of the dataset through an iterative denoising process, demonstrating the capabilities of diffusion models in medical image synthesis. It features progressive noise-to-image generation with fine-grained control over the generation process.
    
    mednist\_ddpm
    
-   ## MedNIST GAN (v0.4.4)
    
    A generative adversarial network (GAN) that synthesizes hand X-ray images based on the MedNIST dataset. The model generates 64x64 pixel hand radiographs with varying appearances and orientations. The generated images maintain anatomical plausibility and can be used for data augmentation and educational purposes.
    
    mednist\_gan
    
-   ## MedNIST Hand X-ray Registration (v0.0.7)
    
    A ResNet-based spatial transformer model for precise registration of hand X-ray images from the MedNIST dataset. The model processes 64x64 pixel input pairs (moving and fixed images) and outputs registered images, demonstrating the application of deep learning in medical image registration.
    
    mednist\_reg
    
-   ## Multi-organ Abdominal Segmentation (v0.0.6)
    
    A 3D segmentation model optimized through Neural Architecture Search (DiNTS) that processes 96x96x96 voxel patches from CT scans to segment eight abdominal organs and structures. The model achieves a mean Dice score of 0.88 across all structures, including liver, spleen, pancreas, stomach, gallbladder, and vascular structures (artery and portal vein).
    
    multi\_organ\_segmentation
    
-   ## Pancreas and Tumor DiNTS Segmentation (v0.5.2)
    
    A 3D segmentation model optimized through Neural Architecture Search (DiNTS) that processes 96x96x96 voxel patches from CT scans to segment pancreas and pancreatic tumors. The model architecture was automatically discovered to balance accuracy and computational efficiency, achieving a mean Dice score of 0.62 across both structures.
    
    pancreas\_ct\_dints\_segmentation
    
-   ## Pathology Nuclei Classification (v0.2.2)
    
    A deep learning model based on the HoVer-Net architecture that classifies nuclei in H&E-stained histology images. The model processes 128x128 pixel RGB images with nuclei masks and classifies four distinct cell types: inflammatory, epithelial, spindle-shaped, and other nuclei.
    
    pathology\_nuclei\_classification
    
-   ## Pathology NuClick Annotation (v0.2.3)
    
    An interactive nuclei segmentation model based on the NuClick framework. The model processes 128x128 pixel RGB images with positive and negative click signals to generate nuclei segmentation masks. It was trained on the CoNSeP dataset.
    
    pathology\_nuclick\_annotation
    
-   ## Pathology Tumor Detection (v0.6.4)
    
    A deep learning model for detecting metastatic tissue in whole-slide pathology images. The model processes 224x224 pixel RGB patches and provides probability scores for metastasis detection. It was trained on the Camelyon16 dataset.
    
    pathology\_tumor\_detection
    
-   ## Pediatric Abdominal CT Segmentation (v0.4.6)
    
    A 3D segmentation model for liver, spleen, and pancreas in pediatric abdominal CT images. The model processes 96x96x96 voxel patches and provides segmentation masks. It was pre-trained on the TotalSegmentator, TCIA, and BTCV datasets and fine-tuned on the Cincinnati Children's Healthy Pediatric Dataset.
    
    pediatric\_abdominal\_ct\_segmentation
    
-   ## Prostate MRI Anatomy (v0.3.6)
    
    A 3D segmentation model that differentiates between central gland and peripheral zone within the prostate in MRI images. The model processes 96x96x96 voxel patches and provides segmentation masks.
    
    prostate\_mri\_anatomy
    
-   ## Renal Structures CECT Segmentation (v0.2.3)
    
    A 3D UNet-based segmentation model for comprehensive renal structure analysis in contrast-enhanced CT scans. The model processes 96x96x96 voxel patches and identifies six anatomical structures: arteries, veins, ureters, parenchyma, cysts, and tumors.
    
    renalStructures\_CECT\_segmentation
    
-   ## Renal Structures UNEST Segmentation (v0.2.7)
    
    A transformer-based 3D segmentation model that delineates kidney cortex, medulla, and pelvicalyceal system in CT images. The model processes 96x96x96 voxel patches and provides segmentation masks for detailed morphological analysis.
    
    renalStructures\_UNEST\_segmentation
    
-   ## retinalOCT\_RPD\_segmentation (v0.0.1)
    
    This network detects and segments reticular pseudodrusen (RPD) instances in optical coherence tomography (OCT) B-scans, which can be supplied in vol or DICOM format.
    
    retinalOCT\_RPD\_segmentation
    
-   ## Spleen CT Segmentation (v0.6.1)
    
    A 3D segmentation model for spleen delineation in CT images. The model processes 96x96x96 voxel patches and provides segmentation masks for spleen tissue. It was trained on the Medical Segmentation Decathlon dataset.
    
    spleen\_ct\_segmentation
    
-   ## Spleen DeepEdit Interactive Segmentation (v0.5.8)
    
    An interactive 3D segmentation model that processes 128x128x128 voxel patches from CT scans to segment the spleen. The model incorporates user-provided point annotations through the DeepEdit framework. It accepts positive and negative click inputs to refine segmentation boundaries in real time.
    
    spleen\_deepedit\_annotation
    
-   ## Swin UNETR BTCV Multi-organ Segmentation (v0.5.8)
    
    A 3D segmentation model based on the Swin UNETR architecture that processes 96x96x96 voxel patches from CT scans to segment 13 abdominal organs and structures. The model utilizes self-supervised pre-training and hierarchical transformer blocks.
    
    swin\_unetr\_btcv\_segmentation
    
-   ## Valve Landmarks Regression (v0.5.2)
    
    A cardiac valve landmark detection model that localizes 10 valve insertion points throughout the cardiac cycle in long-axis MR images. The model processes 256x256 pixel images and outputs 2D coordinates for mitral, aortic, and tricuspid valve insertion points, enabling 3D finite element modeling for cardiac simulation.
    
    valve\_landmarks
    
-   ## Ventricular Short Axis 3-Label Segmentation (v0.3.5)
    
    A cardiac MRI segmentation model that delineates three key structures in 2D short-axis images: left ventricle blood pool, myocardium, and right ventricle blood pool. The model processes 256x256 pixel images and provides segmentation masks for functional assessment of cardiac structures throughout the cardiac cycle.
    
    ventricular\_short\_axis\_3label
    
-   ## VISTA-2D: Cell Instance Segmentation (v0.4.0)
    
    VISTA-2D is a flow-based cell instance segmentation model for microscopy images. It processes 256x256 pixel RGB images and generates instance masks with unique labels for each cell. The model supports brightfield, fluorescence, and phase contrast imaging, handling touching cells and overlapping instances.
    
    vista2d
    
-   ## VISTA-3D: Versatile Imaging SegmenTation and Annotation (v0.5.11)
    
    A 3D segmentation model that processes 128x128x128 voxel patches from CT scans to identify and delineate over 130 anatomical structures. The model employs zero-shot learning capabilities to adapt to new anatomical targets without retraining, supporting comprehensive volumetric analysis of organs, bones, muscles, and pathological findings.
    
    vista3d
    
-   ## Whole Body CT Segmentation (v0.2.7)
    
    A SegResNet-based volumetric segmentation model that segments 104 distinct anatomical structures from CT scans. The model processes 96x96x96 voxel patches and provides segmentation masks for major organs, bones, muscles, and vascular structures throughout the body, trained on TotalSegmentator data.
    
    wholeBody\_ct\_segmentation
    
-   ## Whole Brain Large UNEST Segmentation (v0.2.7)
    
    A transformer-based 3D segmentation model that identifies 133 distinct brain structures in T1W MRI scans. The model processes 96x96x96 voxel patches and provides segmentation masks for comprehensive neuroanatomical analysis.
    
    wholeBrainSeg\_Large\_UNEST\_segmentation
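
Many of the 3D segmentation entries above process fixed-size patches such as 96x96x96, so a full scan is typically tiled into overlapping windows at inference time. The sketch below estimates how many windows that takes. The volume size, the 25% overlap, and the function name `count_windows` are illustrative assumptions, and the stride rule is a common sliding-window convention rather than a statement of how any particular bundle runs inference.

```python
import math


def count_windows(volume_shape, patch_shape=(96, 96, 96), overlap=0.25):
    """Estimate how many overlapping patches cover a volume.

    Assumed convention: stride = floor(patch * (1 - overlap)) per axis, with
    enough windows along each axis to reach the far edge of the volume.
    """
    if not 0 <= overlap < 1:
        raise ValueError("overlap must be in [0, 1)")
    total = 1
    for size, patch in zip(volume_shape, patch_shape):
        stride = max(1, int(patch * (1 - overlap)))
        # A volume smaller than the patch still needs one (padded) window.
        steps = math.ceil(max(size - patch, 0) / stride) + 1
        total *= steps
    return total


# Hypothetical 512x512x300 CT volume, not taken from any bundle's documentation.
print(count_windows((512, 512, 300)))  # prints 196
```

With a stride of 72 voxels, the two 512-voxel axes need 7 windows each and the 300-voxel axis needs 4, which gives 7 x 7 x 4 = 196 forward passes for this hypothetical scan.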
