Model Zoo

We provide a spectrum of pre-trained models on different datasets.

Example usage with Detectron2:

import layoutparser as lp

model = lp.Detectron2LayoutModel(
    config_path="lp://PubLayNet/faster_rcnn_R_50_FPN_3x/config",  # From the Model Catalog
    label_map={0: "Text", 1: "Title", 2: "List", 3: "Table", 4: "Figure"},  # From the Model label_map section
    extra_config=["MODEL.ROI_HEADS.SCORE_THRESH_TEST", 0.8],  # Optional
)
model.detect(image)  # image: a page image loaded beforehand

Example usage with PaddleDetection:

import layoutparser as lp

model = lp.PaddleDetectionLayoutModel(
    config_path="lp://PubLayNet/ppyolov2_r50vd_dcn_365e_publaynet/config",  # From the Model Catalog
    label_map={0: "Text", 1: "Title", 2: "List", 3: "Table", 4: "Figure"},  # From the Model label_map section
    threshold=0.5,  # Optional
)
model.detect(image)  # image: a page image loaded beforehand

Model Catalog

| Dataset | Model | Config Path | Eval Result (mAP) |
| --- | --- | --- | --- |
| HJDataset | faster_rcnn_R_50_FPN_3x | lp://HJDataset/faster_rcnn_R_50_FPN_3x/config | |
| HJDataset | mask_rcnn_R_50_FPN_3x | lp://HJDataset/mask_rcnn_R_50_FPN_3x/config | |
| HJDataset | retinanet_R_50_FPN_3x | lp://HJDataset/retinanet_R_50_FPN_3x/config | |
| PubLayNet | faster_rcnn_R_50_FPN_3x | lp://PubLayNet/faster_rcnn_R_50_FPN_3x/config | |
| PubLayNet | mask_rcnn_R_50_FPN_3x | lp://PubLayNet/mask_rcnn_R_50_FPN_3x/config | |
| PubLayNet | mask_rcnn_X_101_32x8d_FPN_3x | lp://PubLayNet/mask_rcnn_X_101_32x8d_FPN_3x/config | 88.98 eval.csv |
| PubLayNet | ppyolov2_r50vd_dcn_365e_publaynet | lp://PubLayNet/ppyolov2_r50vd_dcn_365e_publaynet/config | 93.6 eval.csv |
| PrimaLayout | mask_rcnn_R_50_FPN_3x | lp://PrimaLayout/mask_rcnn_R_50_FPN_3x/config | 69.35 eval.csv |
| NewspaperNavigator | faster_rcnn_R_50_FPN_3x | lp://NewspaperNavigator/faster_rcnn_R_50_FPN_3x/config | |
| TableBank | faster_rcnn_R_50_FPN_3x | lp://TableBank/faster_rcnn_R_50_FPN_3x/config | 89.78 eval.csv |
| TableBank | faster_rcnn_R_101_FPN_3x | lp://TableBank/faster_rcnn_R_101_FPN_3x/config | 91.26 eval.csv |
| TableBank | ppyolov2_r50vd_dcn_365e_tableBank_word | lp://TableBank/ppyolov2_r50vd_dcn_365e_tableBank_word/config | 96.2 eval.csv |
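
Every config path in the catalog follows the same `lp://<dataset>/<model>/config` pattern, so a path can be assembled from the first two columns. The sketch below is illustrative only: `catalog_config_path` and `backend_for` are hypothetical helpers, not part of layoutparser, and the backend guess assumes that only the `ppyolov2_*` entries are PaddleDetection models.

```python
def catalog_config_path(dataset: str, model: str) -> str:
    """Build an lp:// config path from a catalog dataset and model name."""
    return f"lp://{dataset}/{model}/config"


def backend_for(model: str) -> str:
    """Guess which layoutparser model class a catalog entry belongs to.

    Assumption: ppyolov2_* entries are PaddleDetection models and every
    other entry in the catalog is a Detectron2 model.
    """
    if model.startswith("ppyolov2_"):
        return "PaddleDetectionLayoutModel"
    return "Detectron2LayoutModel"


model_name = "mask_rcnn_X_101_32x8d_FPN_3x"
print(catalog_config_path("PubLayNet", model_name))
print(backend_for(model_name))
```

The resulting string can be passed as `config_path` in the usage examples above.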
  • For PubLayNet models, we suggest the mask_rcnn_X_101_32x8d_FPN_3x model: it is trained on the whole training set, while the others are trained only on the validation set (about 1/50 the size). You can expect roughly a 15% AP improvement with mask_rcnn_X_101_32x8d_FPN_3x.
  • The tables below compare the time cost of Detectron2 and PaddleDetection (the ppyolov2_* models in the table above):

PubLayNet Dataset:

| Framework | Model | mAP | CPU time cost | GPU time cost |
| --- | --- | --- | --- | --- |
| Detectron2 | mask_rcnn_X_101_32x8d_FPN_3x | 89.0 | 16545.5ms | 209.5ms |
| PaddleDetection | ppyolov2_r50vd_dcn_365e | 93.6 | 1713.7ms | 66.6ms |

TableBank Dataset:

| Framework | Model | mAP | CPU time cost | GPU time cost |
| --- | --- | --- | --- | --- |
| Detectron2 | faster_rcnn_R_101_FPN_3x | 91.3 | 7623.2ms | 104.2ms |
| PaddleDetection | ppyolov2_r50vd_dcn_365e | 96.2 | 1968.4ms | 65.1ms |

Environment:

CPU: Intel(R) Xeon(R) CPU E5-2650 v4 @ 2.20GHz, 24 cores

GPU: a single NVIDIA Tesla P40
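
To make the timing comparison easier to read, the following sketch turns the reported numbers into speedup ratios (Detectron2 time divided by PaddleDetection time). The figures are copied from the tables above; the TableBank GPU entry is read as 104.2 ms, and `TIMINGS_MS` and `speedup` are names introduced here for illustration.

```python
# Per-image time cost in milliseconds, copied from the tables above.
# Assumption: the TableBank Detectron2 GPU figure is 104.2 ms.
TIMINGS_MS = {
    "PubLayNet": {
        "Detectron2": {"cpu": 16545.5, "gpu": 209.5},
        "PaddleDetection": {"cpu": 1713.7, "gpu": 66.6},
    },
    "TableBank": {
        "Detectron2": {"cpu": 7623.2, "gpu": 104.2},
        "PaddleDetection": {"cpu": 1968.4, "gpu": 65.1},
    },
}


def speedup(dataset: str, device: str) -> float:
    """Return how many times faster PaddleDetection is than Detectron2."""
    row = TIMINGS_MS[dataset]
    return row["Detectron2"][device] / row["PaddleDetection"][device]


for dataset in TIMINGS_MS:
    for device in ("cpu", "gpu"):
        print(f"{dataset} {device}: {speedup(dataset, device):.1f}x")
```

On these numbers the PaddleDetection models are roughly 9.7x (CPU) and 3.1x (GPU) faster on PubLayNet, and roughly 3.9x (CPU) and 1.6x (GPU) faster on TableBank. The ratios only hold for the hardware listed above.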

Model label_map

| Dataset | Label Map |
| --- | --- |
| HJDataset | {1: "Page Frame", 2: "Row", 3: "Title Region", 4: "Text Region", 5: "Title", 6: "Subtitle", 7: "Other"} |
| PubLayNet | {0: "Text", 1: "Title", 2: "List", 3: "Table", 4: "Figure"} |
| PrimaLayout | {1: "TextRegion", 2: "ImageRegion", 3: "TableRegion", 4: "MathsRegion", 5: "SeparatorRegion", 6: "OtherRegion"} |
| NewspaperNavigator | {0: "Photograph", 1: "Illustration", 2: "Map", 3: "Comics/Cartoon", 4: "Editorial Cartoon", 5: "Headline", 6: "Advertisement"} |
| TableBank | {0: "Table"} |
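
Note that the label maps do not all start at the same index: HJDataset and PrimaLayout begin at 1, while the others begin at 0. A minimal sketch of keeping the maps in one place and looking up class names safely is shown below; `LABEL_MAPS` and `label_name` are hypothetical names used for illustration and are not provided by layoutparser.

```python
# Label maps copied from the table above, keyed by dataset name.
LABEL_MAPS = {
    "HJDataset": {1: "Page Frame", 2: "Row", 3: "Title Region", 4: "Text Region",
                  5: "Title", 6: "Subtitle", 7: "Other"},
    "PubLayNet": {0: "Text", 1: "Title", 2: "List", 3: "Table", 4: "Figure"},
    "PrimaLayout": {1: "TextRegion", 2: "ImageRegion", 3: "TableRegion",
                    4: "MathsRegion", 5: "SeparatorRegion", 6: "OtherRegion"},
    "NewspaperNavigator": {0: "Photograph", 1: "Illustration", 2: "Map",
                           3: "Comics/Cartoon", 4: "Editorial Cartoon",
                           5: "Headline", 6: "Advertisement"},
    "TableBank": {0: "Table"},
}


def label_name(dataset: str, class_id: int) -> str:
    """Map a predicted class id to its name, flagging ids missing from the map."""
    return LABEL_MAPS[dataset].get(class_id, f"unknown({class_id})")


print(label_name("PubLayNet", 3))
print(label_name("HJDataset", 0))  # HJDataset ids start at 1, so 0 is not mapped
```

The dictionary for a given dataset can be passed directly as the `label_map` argument in the usage examples above.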