Join this group if you use the Hadoop version of DMX.

77 Members
Join Us!

Dear All,

Could you please a sample DTL with Apache Avro output or Parquet output file format.

Basically I am looking following scenarios:

INPUT could be

a.) Text file or b.) Avro file or c.) Parquet file format

OUTPUT could be

a.) Text file or b.) Avro file or c.) Parquet file format

What I am looking is a sample DTL to configure above /REFORMAT or /DATADICTIONARY etc in a DTL approach.

I tried my best could not succeed so seeking help here.

Thanks,

Kishore Veleti A.V.K.

You need to be a member of Syncsort Community to add comments!

Join Syncsort Community

Email me when people reply –

Replies

  • Hi All,

    Here is the sample DTL I developed and I am getting compilation error for this. I understand the error but don't know how to fix it.

    DMExpress : (SCHMFRM) Invalid schema format in the file "/home/dmxdemo/./all_layouts.dtl"
                (AVROINV) Invalid Avro JSON schema format
                (AVROLIBERR) Error parsing JSON: '[' or '{' expected near '/'

    /DTL RUNTIMEVARIABLES ON ENCODING UTF-8
        /TASKTYPE COPY
        /INFILE /data/user/users_2016-01-01_1.txt SERVERCONNECTION DL_HDFS_CONNECTION_INFO_ALIAS STLF FIELDSEPARATOR "," LAYOUT user_layout_input
        /INFILE /data/user/users_2016-01-01_2.txt SERVERCONNECTION DL_HDFS_CONNECTION_INFO_ALIAS STLF FIELDSEPARATOR "," LAYOUT user_layout_input

        /OUTFILE /data/avro/users_2016-01-01.avro SERVERCONNECTION DL_HDFS_CONNECTION_INFO_ALIAS STLF OVERWRITE
        /OUTFILE /data/parquet/users_2016-01-01.parquet SERVERCONNECTION DL_HDFS_CONNECTION_INFO_ALIAS STLF OVERWRITE

        /DATADICTIONARY all_layouts.dtl DMEXPRESS INCLUDE LAYOUTS (user_layout_input)
        /DATADICTIONARY all_layouts.dtl  APACHEAVRO INCLUDE LAYOUTS (user_layout_output)

        /VALUE etl_workflow_id_value_variable $DL_ETL_WORKFLOW_ID
        /VALUE created_dt_iso_formatted user_layout_input.created_dt
        /VALUE updated_dt_iso_formatted user_layout_input.updated_dt

        /REFORMAT MAPTO LAYOUT user_layout_output MAPPINGS
        user_layout_output.id = user_layout_input.userid,
        user_layout_output.first_name = user_layout_input.first_name,
        user_layout_output.last_name = user_layout_input.last_name,
        user_layout_output.email = user_layout_input.email,
        user_layout_output.salary = user_layout_input.salary,
        user_layout_output.created_dt = user_layout_input.created_dt,
        user_layout_output.updated_dt = user_layout_input.updated_dt,
        user_layout_output.created_dt_iso = user_layout_input.created_dt,
        user_layout_output.updated_dt_iso = user_layout_input.updated_dt,
        user_layout_output.etl_workflow_id = etl_workflow_id_value_variable

        /SERVERCONNECTION $DL_HDFS_SERVER_NAME HDFS ALIAS DL_HDFS_CONNECTION_INFO_ALIAS

    /END

    • Hi Kishore,

      The issue is that you have specified that "all_layouts.dtl" contains Apache Avro metadata in your second /DATADICTIONARY option. This is incorrect, because it contains DMEXPRESS (DTL) metadata, as indicated by the statement before it.

      It looks like you want APACHEAVRO in your /OUTFILE statement instead, to set it as the target file format.

    • Here is the content of all_layouts.dtl

      /DTL RUNTIMEVARIABLES ON   
          /DELIMITEDRECORDLAYOUT user_layout_input  {
          /* This is definitely a comment */
          userid en NOTNULLABLE, /* This is a comment */
          first_name character  NOTNULLABLE  ,
          last_name character  NOTNULLABLE  ,
          email character NOTNULLABLE  ,
          salary EN,
          created_dt datetime (DD0/MN/YEAR:HH0:MI0:SE0 ZH0ZM0) NOTNULLABLE,
          updated_dt datetime (DD0/MN/YEAR:HH0:MI0:SE0 ZH0ZM0) NOTNULLABLE
          }
          
          /DELIMITEDRECORDLAYOUT user_layout_output  {
          /* This is definitely a comment */
          userid en NOTNULLABLE, /* This is a comment */
          first_name  NOTNULLABLE ,
          last_name  NOTNULLABLE ,
          email NOTNULLABLE ,
          salary EN,
          created_dt datetime (DD0/MN/YEAR:HH0:MI0:SE0 ZH0ZM0) NOTNULLABLE,
          updated_dt datetime (DD0/MN/YEAR:HH0:MI0:SE0 ZH0ZM0) NOTNULLABLE,
          created_dt_iso datetime (DD0/MN/YEAR:HH0:MI0:SE0 ZH0ZM0) NOTNULLABLE,
          updated_dt_iso datetime (DD0/MN/YEAR:HH0:MI0:SE0 ZH0ZM0) NOTNULLABLE,
          etl_workflow_id NOTNULLABLE
          }
      /END

This reply was deleted.

To access Syncsort Knowledge Base, visit:

My Support