  • PDFium Library Trial-and-Error Notes - 4
    2019. 12. 13. 17:16

    Using PDFium with GDAL (2)

    Date written: December 13, 2019

    Author: N3


    0. Advantages of the new PDFium PDF engine

    GDAL describes the advantages of the new PDFium engine as follows.
    • Significantly higher performance (compared to the previous PoDoFo and Poppler engines)
    • Support for larger PDF files with smaller memory footprint - even large AutoCAD plans or huge GeoPDFs can be processed efficiently now.
    • A non-restrictive BSD license! The copy-left GPL prevented the existence of applications supporting both PDF and MrSID/ECW formats for example.

    1. GDAL version upgrade migration guides

    MIGRATION GUIDE FROM GDAL 2.3 to GDAL 2.4
    -----------------------------------------

    1) Out-of-tree drivers: RawRasterBand() constructor changes

    RawRasterBand now only accepts a VSILFILE* file. Consequently the void* fpRaw
    argument has become a VSILFILE* one. And the bIsVSIL = FALSE argument has
    been removed. The int bOwnsFP = FALSE has seen its default value suppressed,
    and has seen its type changed to the RawRasterBand::OwnFP::YES/NO enumeration,
    to detect places where your code must be changed.

    Caution: code like RawRasterBand(..., bNativeOrder, TRUE) must be changed to
    RawRasterBand(..., bNativeOrder, RawRasterBand::OwnFP::NO), the TRUE value
    being the bIsVSIL value, and the default argument being bOwnsFP == FALSE.


    MIGRATION GUIDE FROM GDAL 2.4 to GDAL 3.0
    -----------------------------------------

    - Unix Build: ./configure arguments --without-bsb, --without-grib,
      and --without-mrf have been renamed to --disable-driver-bsb,
      --disable-driver-grib and --disable-driver-mrf

    - Substantial changes, sometimes backward incompatible, in coordinate reference
      system and coordinate transformations have been introduced per
      https://trac.osgeo.org/gdal/wiki/rfc73_proj6_wkt2_srsbarn
        * OSRImportFromEPSG() takes into account official axis order.
          Traditional GIS-friendly axis order can be restored with
          OGRSpatialReference::SetAxisMappingStrategy(OAMS_TRADITIONAL_GIS_ORDER);
        * Same for SetWellKnownGeogCS("WGS84") / SetFromUserInput("WGS84")
        * removal of OPTGetProjectionMethods(), OPTGetParameterList() and OPTGetParameterInfo()
          No equivalent.
        * removal of OSRFixup() and OSRFixupOrdering(): no longer needed since objects
          constructed are always valid
        * removal of OSRStripCTParms(). Use OSRExportToWktEx() instead with the
          FORMAT=SFSQL option
        * exportToWkt() outputs AXIS nodes
        * OSRIsSame(): now takes into account data axis to CRS axis mapping, unless
          IGNORE_DATA_AXIS_TO_SRS_AXIS_MAPPING=YES is set as an option to OSRIsSameEx()
        * ogr_srs_api.h: SRS_WKT_WGS84 macro is no longer declared by default since
          WKT without AXIS is too ambiguous. Preferred remediation: use SRS_WKT_WGS84_LAT_LONG.
          Or #define USE_DEPRECATED_SRS_WKT_WGS84 before including ogr_srs_api.h

    Out-of-tree drivers:
    * GDALDataset::GetProjectionRef() made non-virtual.
      Replaced by GetSpatialRef() virtual method.
      Compatibility emulation possible by defining:
        const char* _GetProjectionRef() override; // note leading underscore
        const OGRSpatialReference* GetSpatialRef() const override {
            return GetSpatialRefFromOldGetProjectionRef();
        }

    * GDALDataset::SetProjection() made non-virtual.
      Replaced by SetSpatialRef() virtual method.
      Compatibility emulation possible by defining:
        CPLErr _SetProjection(const char*) override; // note leading underscore
        CPLErr SetSpatialRef(const OGRSpatialReference* poSRS) override {
            return OldSetProjectionFromSetSpatialRef(poSRS);
        }

    * GDALDataset::GetGCPProjection() made non-virtual.
      Replaced by GetGCPSpatialRef() virtual method.
      Compatibility emulation possible by defining:
        const char* _GetGCPProjection() override; // note leading underscore
        const OGRSpatialReference* GetGCPSpatialRef() const override {
            return GetGCPSpatialRefFromOldGetGCPProjection();
        }

    * GDALDataset::SetGCPs(..., const char* pszWKT) made non-virtual.
      Replaced by SetGCPs(..., const OGRSpatialReference* poSRS) virtual method.
        CPLErr _SetGCPs( int nGCPCount, const GDAL_GCP *pasGCPList,
                        const char *pszGCPProjection ) override; // note leading underscore
        CPLErr SetGCPs( int nGCPCountIn, const GDAL_GCP *pasGCPListIn,
                        const OGRSpatialReference* poSRS ) override {
            return OldSetGCPsFromNew(nGCPCountIn, pasGCPListIn, poSRS);
        }

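    2. Patching GDAL to use PDFium

    First, take the sources under gdal-3.1/frmts/pdf/ from the GDAL 3.1 git tree, which has been updated for the latest pdfium, and copy them over the same location in the current GDAL version.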

    $ cd gdal-2.3.2/frmts/pdf

    $ cp ~/gdal/gdal/frmts/pdf/* .


    The file pdfcreatefromcomposition.cpp (the GDALPDFComposerWriter class) appears to be new.

    To keep the workload down, this class is removed from the build for now and will be added back later.


    GNUmakefile

    OBJ     =       pdfdataset.o pdfio.o pdfobject.o pdfcreatecopy.o ogrpdflayer.o pdfwritabledataset.o pdfreadvectors.o pdfcreatefromcomposition.o

    ..
    $(O_OBJ):       pdfobject.h pdfio.h pdfcreatecopy.h pdfcreatefromcomposition.h gdal_pdf.h ../../ogr/ogrsf_frmts/mem/ogr_mem.h
    ..


    Try building.

    ./configure \

    ...

            --with-pdfium           \

            --with-pdfium-extra-lib-for-test="-lpthread -lm -lc -lstdc++ -lz -ljpeg -lopenjp2 -llcms2 -lpng " \

    ...


    This assumes the libpdfium-devel and libpdfium RPM packages are already installed on the system (see the earlier packaging post).

    checking if we have Poppler >= 0.20.0... yes

    checking if we have Poppler >= 0.23.0... yes

    checking for podofo... disabled

    checking for pdfium... no

    configure: error: pdfium requested but not found


    Of course it was never going to work on the first try.


    Fix the headers in the code that tests whether the pdfium library is present.

    configure and configure.ac

         if test "x$with_pdfium_lib" = "x" ; then

             rm -f testpdfium.*

    -        echo '#include <fpdfview.h>' > testpdfium.cpp

    -        echo '#include <core/include/fpdfapi/fpdf_page.h>' >> testpdfium.cpp

    +        echo '#include <public/fpdfview.h>' > testpdfium.cpp

    +        echo '#include <core/fpdfapi/page/cpdf_page.h>' >> testpdfium.cpp

             echo 'int main(int argc, char** argv) { FPDF_InitLibrary(); FPDF_DestroyLibrary(); return 0; } ' >> testpdfium.cpp

             TEST_CXX_FLAGS="-std=c++0x"

             if test ! -z "`uname | grep Darwin`" ; then
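
    The same probe can be reproduced by hand, which makes it easier to see which header path is wrong without rerunning the whole ./configure. A minimal sketch, assuming a POSIX shell; write_pdfium_probe is a made-up helper name, and the compile line in the comment assumes that PDFIUM_INC points at the pdfium include root and that the library is called libpdfium, neither of which is taken from the GDAL build scripts:

```shell
#!/bin/sh
# Recreate the three-line test program that the patched configure writes.
# write_pdfium_probe is a hypothetical helper name, not part of GDAL.
write_pdfium_probe() {
    out="$1"
    {
        echo '#include <public/fpdfview.h>'
        echo '#include <core/fpdfapi/page/cpdf_page.h>'
        echo 'int main(int argc, char** argv) { FPDF_InitLibrary(); FPDF_DestroyLibrary(); return 0; }'
    } > "$out"
}

# Example (PDFIUM_INC and the -l flags are assumptions; adjust to your install):
#   write_pdfium_probe testpdfium.cpp
#   g++ -std=c++0x -I"$PDFIUM_INC" testpdfium.cpp -lpdfium -lpthread -o testpdfium
```

    If the compile step fails on one of the two #include lines, the header layout of the installed pdfium differs from what configure expects.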


    Building again, the configure test now passes. Run make.

    The complicated Poppler version-check macros have been replaced by the POPPLER_MAJOR_VERSION and POPPLER_MINOR_VERSION defines (in frmts/pdf/).


    Add the Poppler defines.

    diff -urN gdal-2.3.2/GDALmake.opt.in gdal-2.3.2-pdf/GDALmake.opt.in

    --- gdal-2.3.2/GDALmake.opt.in  2018-09-21 18:01:50.000000000 +0900

    +++ gdal-2.3.2-pdf/GDALmake.opt.in      2019-12-12 13:24:28.223648374 +0900

    @@ -468,6 +468,8 @@

     #


     HAVE_POPPLER = @HAVE_POPPLER@

    +POPPLER_MAJOR_VERSION = @POPPLER_MAJOR_VERSION@

    +POPPLER_MINOR_VERSION = @POPPLER_MINOR_VERSION@

     POPPLER_HAS_OPTCONTENT = @POPPLER_HAS_OPTCONTENT@

     POPPLER_BASE_STREAM_HAS_TWO_ARGS = @POPPLER_BASE_STREAM_HAS_TWO_ARGS@

     POPPLER_0_20_OR_LATER = @POPPLER_0_20_OR_LATER@

    diff -urN gdal-2.3.2/configure gdal-2.3.2-pdf/configure

    --- gdal-2.3.2/configure        2019-12-12 12:40:51.807185686 +0900

    +++ gdal-2.3.2-pdf/configure    2019-12-12 13:13:35.928570492 +0900

    @@ -663,6 +663,8 @@

     PODOFO_INC

     HAVE_PODOFO

     POPPLER_PLUGIN_LIB

    +POPPLER_MINOR_VERSION

    +POPPLER_MAJOR_VERSION

     POPPLER_INC

     POPPLER_0_58_OR_LATER

     POPPLER_0_23_OR_LATER

    @@ -34381,6 +34383,8 @@



     HAVE_POPPLER=no

    +POPPLER_MAJOR_VERSION=

    +POPPLER_MINOR_VERSION=

     POPPLER_HAS_OPTCONTENT=no

     POPPLER_BASE_STREAM_HAS_TWO_ARGS=no

     POPPLER_0_20_OR_LATER=no

    @@ -34538,8 +34542,21 @@

     $as_echo "disabled" >&6; }

     fi


    +if test "$HAVE_POPPLER" = "yes"; then

    +    POPPLER_VERSION=`$PKG_CONFIG --modversion poppler`

    +    if test "$POPPLER_VERSION" != ""; then

    +        HAVE_POPPLER=yes

    +        POPPLER_MAJOR_VERSION=`expr $POPPLER_VERSION : '\([0-9]*\)'`

    +        POPPLER_MINOR_VERSION=`expr $POPPLER_VERSION : '[0-9]*\.\([0-9]*\)'`

    +    fi

    +fi

    +

     HAVE_POPPLER=$HAVE_POPPLER


    +POPPLER_MAJOR_VERSION=$POPPLER_MAJOR_VERSION

    +

    +POPPLER_MINOR_VERSION=$POPPLER_MINOR_VERSION

    +

     POPPLER_HAS_OPTCONTENT=$POPPLER_HAS_OPTCONTENT


     POPPLER_BASE_STREAM_HAS_TWO_ARGS=$POPPLER_BASE_STREAM_HAS_TWO_ARGS
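
    The two expr patterns used in the patch are easy to try outside of configure. A small sketch, assuming a POSIX shell; poppler_major and poppler_minor are hypothetical helper names, and 0.26.5 is only a sample version string standing in for the output of pkg-config --modversion poppler:

```shell
#!/bin/sh
# Split a Poppler version string the same way the configure patch does.
# The version is passed in as an argument so the sketch runs without pkg-config.
poppler_major() { expr "$1" : '\([0-9]*\)'; }
poppler_minor() { expr "$1" : '[0-9]*\.\([0-9]*\)'; }

# Sample value; configure itself would use `$PKG_CONFIG --modversion poppler`.
POPPLER_VERSION="0.26.5"
echo "major=$(poppler_major "$POPPLER_VERSION") minor=$(poppler_minor "$POPPLER_VERSION")"
# prints: major=0 minor=26
```

    One detail to keep in mind: expr exits with status 1 when the value it prints is 0, so the major-version call would stop a script running under set -e if it were used in a plain assignment.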



    Build again.

    ..

    does not override

         virtual const char* _GetProjectionRef() override;

    ..

      GDALPamDataset::_SetProjection(pszWKTIn);

    ..


    The errors come from the functions listed in the GDAL 2.4 to 3.0 migration guide above.

    Find those functions and either change them back to the old style or fill in the corresponding code.


    gdal_pdf.h

    +#if 1
    +    // Emulation for GDAL 2.x. Each call allocates a new object,
    +    // so the caller is responsible for deleting it.
    +    OGRSpatialReference* GetSpatialRef() {
    +       const char* pWKT = GetProjectionRef();
    +       if( !pWKT || pWKT[0] == '\0')
    +       {
    +           return nullptr;
    +       }
    +       OGRSpatialReference *m_pSRS = new OGRSpatialReference();
    +       if( m_pSRS->importFromWkt(pWKT) != OGRERR_NONE )
    +       {
    +           delete m_pSRS;  // do not leak the object when the WKT is invalid
    +           return nullptr;
    +       }
    +        return m_pSRS;
    +    }
    +
    +    CPLErr SetSpatialRef(const OGRSpatialReference* poSRS) {
    +       if( !poSRS )
    +       {
    +           return SetProjection("");
    +       }
    +       char* pWKT = nullptr;
    +       if( poSRS->exportToWkt(&pWKT) != OGRERR_NONE )
    +       {
    +           CPLFree(pWKT);
    +           return CE_Failure;
    +       }
    +       auto ret = SetProjection(pWKT);
    +       CPLFree(pWKT);
    +       return ret;
    +    }
    +#else
    +    // Since GDAL 3.0
    +    const OGRSpatialReference* GetSpatialRef() const override {
    +        return GetSpatialRefFromOldGetProjectionRef();
    +    }
    +    CPLErr SetSpatialRef(const OGRSpatialReference* poSRS) override {
    +        return OldSetProjectionFromSetSpatialRef(poSRS);
    +    }
    +#endif
    +
    +#if 0
    +    const OGRSpatialReference* GetGCPSpatialRef() const override {
    +        return GetGCPSpatialRefFromOldGetGCPProjection();
    +    }
    +#endif




    +#if 0 // temporary
    +    using GDALPamDataset::SetGCPs;
    +    CPLErr SetGCPs( int nGCPCountIn, const GDAL_GCP *pasGCPListIn,
    +                    const OGRSpatialReference* poSRS ) override {
    +        return OldSetGCPsFromNew(nGCPCountIn, pasGCPListIn, poSRS);
    +    }
    +#endif


    Disable every use of the SetAxisMappingStrategy functions. (On 2.x it is safe to comment them out.)

    +    // OSRSetAxisMappingStrategy(hSRS, OAMS_TRADITIONAL_GIS_ORDER); // Since GDAL 3.0


    +    // poSRS->SetAxisMappingStrategy(OAMS_TRADITIONAL_GIS_ORDER); // Since GDAL 3.0



    Build again.

    /home/respiro/rpmbuild/BUILD/gdal-2.3.2-fedora/.libs/libgdal.so: undefined reference to `CPDF_OCContext::CPDF_OCContext(CPDF_Document*, CPDF_OCContext::UsageType)'

    /home/respiro/rpmbuild/BUILD/gdal-2.3.2-fedora/.libs/libgdal.so: undefined reference to `CPDF_RenderContext::AppendLayer(CPDF_PageObjectHolder*, CFX_Matrix const*)'

    /home/respiro/rpmbuild/BUILD/gdal-2.3.2-fedora/.libs/libgdal.so: undefined reference to `CPDF_Document::GetPageDictionary(int)'

    /home/respiro/rpmbuild/BUILD/gdal-2.3.2-fedora/.libs/libgdal.so: undefined reference to `CPDFPageFromFPDFPage(fpdf_page_t__*)'



    Hmm!!!

    Something went wrong when building the libpdfium shared library.


    [respiro@localhost shared]$ readelf -a libpdfium.so |grep CPDF_OCContext

       246: 000000000006bbb0   494 FUNC    LOCAL  DEFAULT   12 _ZNK14CPDF_OCContext8GetO

      3998: 000000000006af90   172 FUNC   LOCAL  HIDDEN    12 _ZNK23CPDF_OCContextInter

      3999: 000000000006b0c0    61 FUNC    LOCAL  HIDDEN    12 _ZN14CPDF_OCContextC2EP13

      4000: 00000000003f35e8    48 OBJECT  LOCAL  HIDDEN    18 _ZTV14CPDF_OCContext

      4001: 000000000006b0c0    61 FUNC    LOCAL  HIDDEN    12 _ZN14CPDF_OCContextC1EP13

      4002: 000000000006b100  1146 FUNC    LOCAL  HIDDEN    12 _ZNK14CPDF_OCContext22Loa

      4003: 000000000006b580   704 FUNC    LOCAL  HIDDEN    12 _ZNK14CPDF_OCContext12Loa

      4005: 000000000006b890    24 FUNC    LOCAL  HIDDEN    12 _ZN14CPDF_OCContextD2Ev

      4006: 000000000006b890    24 FUNC    LOCAL  HIDDEN    12 _ZN14CPDF_OCContextD1Ev

      4007: 000000000006b8b0    55 FUNC    LOCAL  HIDDEN    12 _ZN14CPDF_OCContextD0Ev

      4010: 000000000006ba60   323 FUNC    LOCAL  HIDDEN    12 _ZNK14CPDF_OCContext13Get

      4011: 000000000006bda0    19 FUNC    LOCAL  HIDDEN    12 _ZNK14CPDF_OCContext8GetO

      4012: 000000000006bdc0   730 FUNC    LOCAL  HIDDEN    12 _ZNK14CPDF_OCContext13Loa

      4013: 000000000006c0a0   177 FUNC    LOCAL  HIDDEN    12 _ZNK14CPDF_OCContext15Che

      4014: 00000000002496d0    26 OBJECT  LOCAL  HIDDEN    14 _ZTS23CPDF_OCContextInter

      4015: 00000000003fd1f8    24 OBJECT  LOCAL  HIDDEN    21 _ZTI23CPDF_OCContextInter

      4016: 00000000002496f0    17 OBJECT  LOCAL  HIDDEN    14 _ZTS14CPDF_OCContext

      4017: 00000000003fd210    24 OBJECT  LOCAL  HIDDEN    21 _ZTI14CPDF_OCContext


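    Oh dear.

    In the middle of all this digging, I found out that the classes of the pdfium library are not exported.


    Their visibility is set to HIDDEN.

    Find the corresponding setting in the build options and remove it. Then rebuild the pdfium shared library, repackage it, and reinstall it.


    This fix has been added to part 2 of these notes.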
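
    After rebuilding, it is worth checking the symbol table again before repackaging. A rough sketch, assuming binutils readelf and awk are available; count_hidden is a hypothetical helper that counts the lines of readelf -s output that mention a given name and are marked HIDDEN in the visibility column:

```shell
#!/bin/sh
# Count symbol-table lines that mention a name and have HIDDEN visibility.
# Input is the text printed by `readelf -s --wide <library>`; in that layout
# the sixth whitespace-separated column is the visibility (DEFAULT, HIDDEN, ...).
# count_hidden is a hypothetical helper name.
count_hidden() {
    awk -v name="$1" 'index($0, name) && $6 == "HIDDEN" { n++ } END { print n + 0 }'
}

# Example against the rebuilt library (the path is an assumption):
#   readelf -s --wide libpdfium.so | count_hidden CPDF_OCContext
# A non-zero count means those symbols are still not visible to libgdal.
```

    The helper only reads text, so it can also be pointed at a saved copy of the readelf output.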


    Build again.

    /home/respiro/rpmbuild/BUILD/gdal-2.3.2-fedora/.libs/libgdal.so: undefined reference to `GDALPDFCreateFromCompositionFile(char const*, char const*)'

    collect2: error: ld returned 1 exit status

    make[1]: *** [gdalinfo] Error 1


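    The error occurs because the linker cannot find the class that was excluded from the build at the beginning.

    Now either add that source file back and migrate it,

    or temporarily disable the code below as follows.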


    pdfwritabledataset.cpp

    ..

    GDALDataset* PDFWritableVectorDataset::Create( const char * pszName,

                                                   int nXSize,

                                                   int nYSize,

                                                   int nBands,

                                                   GDALDataType eType,

                                                   char ** papszOptions )

    {

        if( nBands == 0 && nXSize == 0 && nYSize == 0 && eType == GDT_Unknown )

        {

            const char* pszFilename = CSLFetchNameValue(papszOptions, "COMPOSITION_FILE");

            if( pszFilename )

            {

                //if( CSLCount(papszOptions) != 1 )

                {

                    CPLError(CE_Warning, CPLE_AppDefined,

                             "All others options than COMPOSITION_FILE are ignored");

                }

                //return GDALPDFCreateFromCompositionFile(pszName, pszFilename);

            }

    }

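    This feature was added in the 3.1 code, so it should be fine to disable it and carry on.

    Judging from the code, migrating it would mean reading through quite a lot of source, so let's skip it.


    Building again gives: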

    [respiro@localhost .libs]$ ldd libgdal.so.20.4.2  | grep pdf

            libpdfium.so => /lib64/libpdfium.so (0x00007ff21ee7d000)



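    When the applications are built along with the library, the result can be checked with the test programs under the apps folder.

    Run the utilities generated in the apps folder.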

    [respiro@localhost apps]$ ./gdalinfo  --formats | grep PDF

      PDF -raster,vector- (rw+vs): Geospatial PDF


    [respiro@localhost apps]$ ./gdalinfo --format PDF

    Format Details:

      Short Name: PDF

      Long Name: Geospatial PDF

      Supports: Raster

      Supports: Vector

      Extension: pdf

      Help Topic: frmt_pdf.html

      Supports: Subdatasets

      Supports: Open() - Open existing dataset.

      Supports: Create() - Create writable dataset.

      Supports: CreateCopy() - Create dataset by copying another.

      Supports: Virtual IO - eg. /vsimem/

      Creation Datatypes: Byte

      Supports: Feature styles.


    <CreationOptionList>

      <Option name="COMPRESS" type="string-select" description="Compression method for raster data" default="DEFLATE">

        <Value>NONE</Value>

        <Value>DEFLATE</Value>

        <Value>JPEG</Value>

        <Value>JPEG2000</Value>

      </Option>

      <Option name="STREAM_COMPRESS" type="string-select" description="Compression method for stream objects" default="DEFLATE">

        <Value>NONE</Value>

        <Value>DEFLATE</Value>

      </Option>

      <Option name="GEO_ENCODING" type="string-select" description="Format of geo-encoding" default="ISO32000">

        <Value>NONE</Value>

        <Value>ISO32000</Value>

        <Value>OGC_BP</Value>

        <Value>BOTH</Value>

      </Option>

      <Option name="NEATLINE" type="string" description="Neatline" />

      <Option name="DPI" type="float" description="DPI" default="72" />

      <Option name="WRITE_USERUNIT" type="boolean" description="Whether the UserUnit parameter must be written" />

      <Option name="PREDICTOR" type="int" description="Predictor Type (for DEFLATE compression)" />

      <Option name="JPEG_QUALITY" type="int" description="JPEG quality 1-100" default="75" />

      <Option name="JPEG2000_DRIVER" type="string" />

      <Option name="TILED" type="boolean" description="Switch to tiled format" default="NO" />

      <Option name="BLOCKXSIZE" type="int" description="Block Width" />

      <Option name="BLOCKYSIZE" type="int" description="Block Height" />

      <Option name="LAYER_NAME" type="string" description="Layer name for raster content" />

      <Option name="CLIPPING_EXTENT" type="string" description="Clipping extent for main and extra rasters. Format: xmin,ymin,xmax,ymax" />

      <Option name="EXTRA_RASTERS" type="string" description="List of extra (georeferenced) rasters." />

      <Option name="EXTRA_RASTERS_LAYER_NAME" type="string" description="List of layer names for the extra (georeferenced) rasters." />

      <Option name="EXTRA_STREAM" type="string" description="Extra data to insert into the page content stream" />

      <Option name="EXTRA_IMAGES" type="string" description="List of image_file_name,x,y,scale[,link=some_url] (possibly repeated)" />

      <Option name="EXTRA_LAYER_NAME" type="string" description="Layer name for extra content" />

      <Option name="MARGIN" type="int" description="Margin around image in user units" />

      <Option name="LEFT_MARGIN" type="int" description="Left margin in user units" />

      <Option name="RIGHT_MARGIN" type="int" description="Right margin in user units" />

      <Option name="TOP_MARGIN" type="int" description="Top margin in user units" />

      <Option name="BOTTOM_MARGIN" type="int" description="Bottom margin in user units" />

      <Option name="OGR_DATASOURCE" type="string" description="Name of OGR datasource to display on top of the raster layer" />

      <Option name="OGR_DISPLAY_FIELD" type="string" description="Name of field to use as the display field in the feature tree" />

      <Option name="OGR_DISPLAY_LAYER_NAMES" type="string" description="Comma separated list of OGR layer names to display in the feature tree" />

      <Option name="OGR_WRITE_ATTRIBUTES" type="boolean" description="Whether to write attributes of OGR features" default="YES" />

      <Option name="OGR_LINK_FIELD" type="string" description="Name of field to use as the URL field to make objects clickable." />

      <Option name="XMP" type="string" description="xml:XMP metadata" />

      <Option name="WRITE_INFO" type="boolean" description="to control whether a Info block must be written" default="YES" />

      <Option name="AUTHOR" type="string" />

      <Option name="CREATOR" type="string" />

      <Option name="CREATION_DATE" type="string" />

      <Option name="KEYWORDS" type="string" />

      <Option name="PRODUCER" type="string" />

      <Option name="SUBJECT" type="string" />

      <Option name="TITLE" type="string" />

      <Option name="OFF_LAYERS" type="string" description="Comma separated list of layer names that should be initially hidden" />

      <Option name="EXCLUSIVE_LAYERS" type="string" description="Comma separated list of layer names, such that only one of those layers can be ON at a time." />

      <Option name="JAVASCRIPT" type="string" description="Javascript script to embed and run at file opening" />

      <Option name="JAVASCRIPT_FILE" type="string" description="Filename of the Javascript script to embed and run at file opening" />

    </CreationOptionList>



    <LayerCreationOptionList />


    <OpenOptionList>

      <Option name="RENDERING_OPTIONS" type="string-select" description="Which graphical elements to render" default="RASTER,VECTOR,TEXT" alt_config_option="GDAL_PDF_RENDERING_OPTIONS">

        <Value>RASTER,VECTOR,TEXT</Value>

        <Value>RASTER,VECTOR</Value>

        <Value>RASTER,TEXT</Value>

        <Value>RASTER</Value>

        <Value>VECTOR,TEXT</Value>

        <Value>VECTOR</Value>

        <Value>TEXT</Value>

      </Option>

      <Option name="DPI" type="float" description="Resolution in Dot Per Inch" default="72" alt_config_option="GDAL_PDF_DPI" />

      <Option name="USER_PWD" type="string" description="Password" alt_config_option="PDF_USER_PWD" />

      <Option name="LAYERS" type="string" description="List of layers (comma separated) to turn ON (or ALL to turn all layers ON)" alt_config_option="GDAL_PDF_LAYERS" />

      <Option name="LAYERS_OFF" type="string" description="List of layers (comma separated) to turn OFF" alt_config_option="GDAL_PDF_LAYERS_OFF" />

      <Option name="BANDS" type="string-select" description="Number of raster bands" default="3" alt_config_option="GDAL_PDF_BANDS">

        <Value>3</Value>

        <Value>4</Value>

      </Option>

      <Option name="NEATLINE" type="string" description="The name of the neatline to select" alt_config_option="GDAL_PDF_NEATLINE" />

    </OpenOptionList>


      Other metadata items:

        HAVE_POPPLER=YES


    What is this? config.log clearly shows HAVE_PDFIUM='yes', and I confirmed that the library is linked in...

    Where did HAVE_PDFIUM=YES go?


    pdfdataset.cpp

    #if defined(HAVE_PDFIUM) && defined(HAVE_POPPLER)

    #define HAVE_MULTIPLE_PDF_BACKENDS

    #elif defined(HAVE_PDFIUM) && defined(HAVE_PODOFO)

    #define HAVE_MULTIPLE_PDF_BACKENDS

    #elif defined(HAVE_POPPLER) && defined(HAVE_PODOFO)

    #define HAVE_MULTIPLE_PDF_BACKENDS

    #endif


    #ifdef HAVE_MULTIPLE_PDF_BACKENDS

    "  <Option name='PDF_LIB' type='string-select' description='Which underlying PDF library to use' "

    #if defined(HAVE_PDFIUM)

      "default='PDFIUM'"

    #elif defined(HAVE_POPPLER)

      "default='POPPLER'"

    #elif defined(HAVE_PODOFO)

      "default='PODOFO'"

    #endif  // ~ default PDF_LIB

      "alt_config_option='GDAL_PDF_LIB'>"

    #if defined(HAVE_POPPLER)

    "     <Value>POPPLER</Value>\n"

    #endif  // HAVE_POPPLER

    #if defined(HAVE_PODOFO)

    "     <Value>PODOFO</Value>\n"

    #endif  // HAVE_PODOFO

    #if defined(HAVE_PDFIUM)

    "     <Value>PDFIUM</Value>\n"

    #endif  // HAVE_PDFIUM

    "  </Option>"

    #endif // HAVE_MULTIPLE_PDF_BACKENDS


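    It looks like HAVE_MULTIPLE_PDF_BACKENDS is not being defined properly at compile time, which is odd.

    Force the define and build again.


    Same result.


    What a fool! The binary has to pick up the freshly built shared library through LD_LIBRARY_PATH; otherwise ./gdalinfo apparently loads the libgdal that is already installed on the system. (This is exactly the kind of wasted effort these notes are named after.)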
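
    Before chasing preprocessor defines, it helps to check which libgdal the binary actually resolves. A minimal sketch, assuming a POSIX shell and ldd, and assuming ./gdalinfo is a real ELF binary rather than a libtool wrapper script; prepend_libdir is a hypothetical helper that puts a directory in front of an existing LD_LIBRARY_PATH value:

```shell
#!/bin/sh
# Build an LD_LIBRARY_PATH value with a directory in front. When the old
# value is empty, no trailing ':' is added, because an empty entry is
# treated as the current directory by the dynamic loader.
# prepend_libdir is a hypothetical helper name.
prepend_libdir() {
    if [ -n "$2" ]; then
        printf '%s:%s\n' "$1" "$2"
    else
        printf '%s\n' "$1"
    fi
}

# Example (paths follow the build tree used in this post):
#   LD_LIBRARY_PATH=$(prepend_libdir ../.libs "$LD_LIBRARY_PATH") ldd ./gdalinfo | grep libgdal
# The libgdal line should point into ../.libs rather than /lib64 or /usr/lib64.
```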


    [respiro@localhost apps]$ cd apps

    [respiro@localhost apps]$ LD_LIBRARY_PATH=../.libs:$LD_LIBRARY_PATH  ./gdalinfo --format PDF

    Format Details:


    ..

      <Option name="USER_PWD" type="string" description="Password" alt_config_option="PDF_USER_PWD" />

      <Option name="PDF_LIB" type="string-select" description="Which underlying PDF library to use" default="PDFIUM" default="POPPLER" alt_config_option="GDAL_PDF_LIB">

        <Value>POPPLER</Value>

        <Value>PDFIUM</Value>

      </Option>

      <Option name="LAYERS" type="string" description="TEST by KJI List of layers (comma separated) to turn ON (or ALL to turn all layers ON)" alt_config_option="GDAL_PDF_LAYERS" />

      <Option name="LAYERS_OFF" type="string" description="List of layers (comma separated) to turn OFF" alt_config_option="GDAL_PDF_LAYERS_OFF" />

      <Option name="BANDS" type="string-select" description="Number of raster bands" default="3" alt_config_option="GDAL_PDF_BANDS">

        <Value>3</Value>

        <Value>4</Value>

      </Option>

      <Option name="NEATLINE" type="string" description="The name of the neatline to select" alt_config_option="GDAL_PDF_NEATLINE" />

    </OpenOptionList>


      Other metadata items:

        HAVE_PDFIUM=YES

        HAVE_POPPLER=YES


    The open options now include PDF_LIB, which selects the PDF backend engine when a file is opened.

    [respiro@localhost apps]$ LD_LIBRARY_PATH=../.libs:$LD_LIBRARY_PATH ./gdalinfo -oo PDF_LIB=POPPLER ../gdalautotest-2.3.2/gdrivers/data/adobe_style_geospatial.pdf

    Driver: PDF/Geospatial PDF

    Files: ../gdalautotest-2.3.2/gdrivers/data/adobe_style_geospatial.pdf

    Size is 1275, 1650

    Coordinate System is:

    PROJCS["WGS_1984_UTM_Zone_20N",

        GEOGCS["GCS_WGS_1984",

            DATUM["WGS_1984",

                SPHEROID["WGS_84",6378137.0,298.257223563]],

            PRIMEM["Greenwich",0.0],

            UNIT["Degree",0.0174532925199433]],

        PROJECTION["Transverse_Mercator"],

        PARAMETER["False_Easting",500000.0],

        PARAMETER["False_Northing",0.0],

        PARAMETER["Central_Meridian",-63.0],

        PARAMETER["Scale_Factor",0.9996],

        PARAMETER["Latitude_Of_Origin",0.0],

        UNIT["Meter",1.0]]

    Origin = (333274.616544058371801,4940391.759349998086691)

    Pixel Size = (42.353069656601626,-42.392994002225727)

    Metadata:

      CREATION_DATE=D:20101021125101-07

      CREATOR=ESRI ArcMap 10.0.0.2414

      NEATLINE=POLYGON ((338304.150126181 4896673.63942063,338304.177293829 4933414.79937582,382774.271384474 4933414.54626367,382774.767330031 4896674.27358034,338304.150126181 4896673.63942063))

    Corner Coordinates:

    Upper Left  (  333274.617, 4940391.759) ( 65d 6' 2.64"W, 44d35'51.19"N)

    Lower Left  (  333274.617, 4870443.319) ( 65d 4'42.29"W, 43d58' 5.60"N)

    Upper Right (  387274.780, 4940391.759) ( 64d25'14.13"W, 44d36'28.95"N)

    Lower Right (  387274.780, 4870443.319) ( 64d24'19.77"W, 43d58'42.54"N)

    Center      (  360274.698, 4905417.539) ( 64d45' 4.60"W, 44d17'18.91"N)

    Band 1 Block=1275x1 Type=Byte, ColorInterp=Red

    Band 2 Block=1275x1 Type=Byte, ColorInterp=Green

    Band 3 Block=1275x1 Type=Byte, ColorInterp=Blue



    [respiro@localhost apps]$ LD_LIBRARY_PATH=../.libs:$LD_LIBRARY_PATH ./gdalinfo -oo PDF_LIB=PDFIUM ../gdalautotest-2.3.2/gdrivers/data/adobe_style_geospatial.pdf
    Driver: PDF/Geospatial PDF
    Files: ../gdalautotest-2.3.2/gdrivers/data/adobe_style_geospatial.pdf
    Size is 1275, 1650
    Coordinate System is:
    PROJCS["WGS_1984_UTM_Zone_20N",
        GEOGCS["GCS_WGS_1984",
            DATUM["WGS_1984",
                SPHEROID["WGS_84",6378137.0,298.257223563]],
            PRIMEM["Greenwich",0.0],
            UNIT["Degree",0.0174532925199433]],
        PROJECTION["Transverse_Mercator"],
        PARAMETER["False_Easting",500000.0],
        PARAMETER["False_Northing",0.0],
        PARAMETER["Central_Meridian",-63.0],
        PARAMETER["Scale_Factor",0.9996],
        PARAMETER["Latitude_Of_Origin",0.0],
        UNIT["Meter",1.0]]
    Origin = (333275.124066242307890,4940392.123364951461554)
    Pixel Size = (42.352600157603341,-42.393311561151322)
    Metadata:
      CREATION_DATE=D:20101021125101-07
      CREATOR=ESRI ArcMap 10.0.0.2414
      NEATLINE=POLYGON ((338304.285365684 4896674.10591548,338304.812551275 4933414.85396058,382774.246895812 4933414.85514894,382774.983309293 4896673.9572296,338304.285365684 4896674.10591548))
    Corner Coordinates:
    Upper Left  (  333275.124, 4940392.123) ( 65d 6' 2.61"W, 44d35'51.20"N)
    Lower Left  (  333275.124, 4870443.159) ( 65d 4'42.27"W, 43d58' 5.60"N)
    Upper Right (  387274.689, 4940392.123) ( 64d25'14.13"W, 44d36'28.96"N)
    Lower Right (  387274.689, 4870443.159) ( 64d24'19.78"W, 43d58'42.54"N)
    Center      (  360274.907, 4905417.641) ( 64d45' 4.59"W, 44d17'18.91"N)
    Band 1 Block=1275x1 Type=Byte, ColorInterp=Red
      Overviews: 638x825, 319x413, 160x207
      Mask Flags: PER_DATASET ALPHA
      Overviews of mask band: 638x825, 319x413, 160x207
    Band 2 Block=1275x1 Type=Byte, ColorInterp=Green
      Overviews: 638x825, 319x413, 160x207
      Mask Flags: PER_DATASET ALPHA
      Overviews of mask band: 638x825, 319x413, 160x207
    Band 3 Block=1275x1 Type=Byte, ColorInterp=Blue
      Overviews: 638x825, 319x413, 160x207
      Mask Flags: PER_DATASET ALPHA
      Overviews of mask band: 638x825, 319x413, 160x207
    Band 4 Block=1275x1 Type=Byte, ColorInterp=Alpha
      Overviews: 638x825, 319x413, 160x207


    When the file is read with PDFium, even the alpha band is recognized.
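
    The two runs above also report slightly different origins (333274.616... with Poppler, 333275.124... with PDFium). To put the backends side by side without scrolling through the full output, the Origin line can be pulled out of the gdalinfo text. A small sketch, assuming a POSIX sed; origin_of is a hypothetical helper and in.pdf is a placeholder file name:

```shell
#!/bin/sh
# Print the "x y" origin found in gdalinfo text output read from stdin.
# Prints nothing when there is no Origin line.
# origin_of is a hypothetical helper name.
origin_of() {
    sed -n 's/^ *Origin = (\([^,]*\),\([^)]*\)).*/\1 \2/p'
}

# Example (in.pdf is a placeholder):
#   gdalinfo -oo PDF_LIB=POPPLER in.pdf | origin_of
#   gdalinfo -oo PDF_LIB=PDFIUM  in.pdf | origin_of
```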


    Poppler is the default GDAL backend on CentOS 7, so only pdfium is tested here.


    Test converting a GeoTIFF file to PDF with gdal_translate.

    PDF creation has no option for selecting PDF_LIB, so it is unclear which backend is used. Is more digging through the source needed?

    Looking at the pdfdataset.cpp source...


    if(bHasLib.count() != 1) {

            const char* pszDefaultLib =

                    bHasLib.test(PDFLIB_PDFIUM) ? "PDFIUM" :

                    bHasLib.test(PDFLIB_POPPLER) ? "POPPLER" : "PODOFO";

            const char* pszPDFLib = GetOption(poOpenInfo->papszOpenOptions, "PDF_LIB", pszDefaultLib );

            while( true )

            {

                if (EQUAL(pszPDFLib, "POPPLER"))

                    bUseLib.set(PDFLIB_POPPLER);

                else if (EQUAL(pszPDFLib, "PODOFO"))

                    bUseLib.set(PDFLIB_PODOFO);

                else if (EQUAL(pszPDFLib, "PDFIUM"))

                    bUseLib.set(PDFLIB_PDFIUM);


                if(bUseLib.count() != 1 || (bHasLib & bUseLib) == 0)

                {

                    CPLDebug("PDF", "Invalid value for GDAL_PDF_LIB config option: %s. Fallback to %s",

                            pszPDFLib, pszDefaultLib);

                    pszPDFLib = pszDefaultLib;

                    bUseLib.reset();

                }

                else

                    break;

            }

        }


    When several PDF libraries are available, the default library is chosen in the following order. If the option is not specified, PDFIUM is used.


    PDFIUM > POPPLER > PODOFO
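
    The selection logic from pdfdataset.cpp can be modelled in a few lines to make the fallback behaviour concrete. This is only a sketch of the rule, assuming a POSIX shell; pick_pdf_lib is a hypothetical helper and not a GDAL command, and unlike GDAL's EQUAL() it compares names case-sensitively:

```shell
#!/bin/sh
# Model of the backend choice: use the requested backend if it is compiled
# in, otherwise fall back to the first available one in the order
# PDFIUM, POPPLER, PODOFO.
# Usage: pick_pdf_lib REQUESTED AVAILABLE...
pick_pdf_lib() {
    requested="$1"
    shift
    for lib in "$@"; do
        if [ "$lib" = "$requested" ]; then
            echo "$lib"
            return 0
        fi
    done
    for default in PDFIUM POPPLER PODOFO; do
        for lib in "$@"; do
            if [ "$lib" = "$default" ]; then
                echo "$lib"
                return 0
            fi
        done
    done
    return 1
}

pick_pdf_lib "" POPPLER PDFIUM        # prints PDFIUM
pick_pdf_lib POPPLER POPPLER PDFIUM   # prints POPPLER
pick_pdf_lib PODOFO POPPLER PDFIUM    # prints PDFIUM (PODOFO is not built in)
```

    With the real tools, the open option shown earlier (-oo PDF_LIB=POPPLER) or the GDAL_PDF_LIB config option listed in the driver metadata is what supplies the requested value.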


    $ gdal_translate utm.tif utm-poppler.pdf -of PDF





    Test converting a GeoPDF file to GeoTIFF with gdal_translate.


    [respiro@localhost data]$ cd ~/gdalautotest/gdrivers/data

    [respiro@localhost data]$ gdal_translate adobe_style_geospatial.pdf adobe_pdfium.tif -oo PDF_LIB=PDFIUM -of GTiff

    Input file size is 1275, 1650

    0...10...20...30...40...50...60...70...80...90...100 - done.


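    Opening the result in a tool such as QGIS shows that the coordinates are recognized.


    Now update the GDAL packaging.


    The GDAL currently used on my system is a modified version of the Fedora build at


    https://koji.fedoraproject.org/koji/buildinfo?buildID=1186932


    The pdfium patch and the SPEC file changes used for it are as follows.


    gdal-2.3.2 pdfium patch


    If you use the latest GDAL git sources (3.x), the patch should be easy to adapt, and the pdfcreatefromcomposition.cpp source file that was commented out above can be used as well.


    Because of the size of the source RPM, only the spec file changes are posted here.


    gdal.spec changes: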

    %global with_pdfium 1   // added


    ...


    Patch12:        %{name}-2.3.2-pdfium.patch    // added


    ...

    %if 0%{?with_pdfium}

    BuildRequires:  libpdfium-devel   // see the pdfium RPM packaging post

    BuildRequires:  lcms2-devel

    BuildRequires:  libjpeg-devel

    BuildRequires:  libpng-devel

    BuildRequires:  zlib-devel

    %endif



    %if 0%{?with_pdfium}
    Requires:       libpdfium
    Requires:       lcms2
    Requires:       libjpeg
    Requires:       libpng
    Requires:       zlib
    %endif

    %setup ..
    ..
    %patch12 -p1 -b .pdfium

    %configure \
    ..
    %if 0%{?with_pdfium}
            --with-pdfium           \
            --with-pdfium-extra-lib-for-test="-lpthread -lm -lc -lstdc++ -lz -ljpeg -lopenjp2 -llcms2 -lpng " \
    %endif
    ..

    ## check the additional variables
    POPPLER_OPTS="POPPLER_MAJOR_VERSION=0 POPPLER_MINOR_VERSION=26 POPPLER_0_20_OR_LATER=yes POPPLER_0_23_OR_LATER=yes POPPLER_BASE_STREAM_HAS_TWO_ARGS=yes"


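    SPEC file for CentOS 7 RPM packaging

    (Adjust the proj_somaj value to match the name of the PROJ library you use; for example, use 15 if the library is /usr/lib64/libproj.so.15.)


    The basic operation of GDAL with pdfium has been confirmed to work.


    As always, if the time that never seems to turn up does turn up, let's backport the functions that were commented out above.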

