EDI/Edifact Plugin

From the expecco Wiki (Version 2.x)

The Edifact Plugin provides extensive support for EDIFACT message processing and adds a hierarchical structure editor to the GUIBrowser user interface. There, EDIFACT messages can be edited interactively, and action sequences can be constructed to operate on them.

Background Information

EDIFACT is an international standard for the exchange of business transaction messages, such as invoices, orders, quotes etc. EDIFACT transactions are heavily used in B2B communications, and supported by SAP, Oracle, IBM and other business processing systems.

EDIFACT messages consist of a number of records (called "segments"), which themselves contain a number of data elements. Both non-composite and composite data elements are possible.

Each EDIFACT message is surrounded by a header segment and a trailer segment.

An EDIFACT transaction may contain multiple individual messages, and is itself surrounded by a transaction header segment and a transaction trailer segment.
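The nesting described above can be sketched in a few lines of Python. This is an illustration only, not the plugin's implementation: the function name `split_interchange` and the sample tag sequence are hypothetical, functional groups are ignored, and the input is assumed to be the flat list of segment tags that follows an optional UNA segment. The tags UNB/UNZ (transaction header and trailer) and UNH/UNT (message header and trailer) are the ones named in the API sections of this page.

```python
def split_interchange(tags):
    """Return (header, messages, trailer) for a flat list of segment tags."""
    if len(tags) < 2 or tags[0] != "UNB" or tags[-1] != "UNZ":
        raise ValueError("interchange must start with UNB and end with UNZ")
    messages, current = [], None
    for tag in tags[1:-1]:
        if tag == "UNH":
            if current is not None:
                raise ValueError("UNH inside a message that is still open")
            current = [tag]          # a new message starts at its header
        elif current is None:
            raise ValueError(f"segment {tag} outside of a message")
        else:
            current.append(tag)
            if tag == "UNT":         # the trailer closes the message
                messages.append(current)
                current = None
    if current is not None:
        raise ValueError("message not closed by UNT")
    return tags[0], messages, tags[-1]


# Hypothetical tag sequence: one transaction containing two messages.
tags = ["UNB", "UNH", "BGM", "DTM", "UNT", "UNH", "BGM", "UNT", "UNZ"]
header, messages, trailer = split_interchange(tags)
```

Here `messages` holds two lists, each running from a UNH tag to the matching UNT tag.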

For comprehensive documentation and a full specification, see United Nations Economic Commission for Europe in general, and Introducing UN/EDIFACT in particular.

EDIFACT Messages

UN/EDIFACT defines the overall format and layout of EDIFACT messages. Standard specifications are typically published twice a year. New revisions usually add data elements, segment types and messages, and are normally backward compatible. That is, previous definitions are seldom (if ever) made obsolete.

All messages and segments contain a tag (or message type) field, which allows for the type of message/segment to be identified. Thus, in theory, it is possible to extract field values from arbitrary messages, even if the concrete message layout is unknown.

However, the semantic interpretation of a particular datum highly depends on its position inside the message. For example, a DTM-segment's datetime value can be extracted easily, but its interpretation (delivery date, invoice date, quotation date etc.) is only possible if the overall message structure is known.
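As a small illustration of the "easy" part, the following Python sketch converts the composite value of a DTM segment into a date object. It is not the plugin's code. The composite '4:20060620:102' is the value used in the DTM example later on this page; the table `DTM_FORMATS` and the function `parse_dtm` are hypothetical names, and only two of the many format codes (102 for CCYYMMDD, 203 for CCYYMMDDHHMM) are covered.

```python
from datetime import datetime

# Format codes (third component) mapped to strptime patterns; only two are shown.
DTM_FORMATS = {
    "102": "%Y%m%d",      # CCYYMMDD
    "203": "%Y%m%d%H%M",  # CCYYMMDDHHMM
}


def parse_dtm(composite):
    """Split 'qualifier:value:format' and convert the value to a datetime."""
    qualifier, value, format_code = composite.split(":")
    return qualifier, datetime.strptime(value, DTM_FORMATS[format_code])


qualifier, when = parse_dtm("4:20060620:102")
```

The sketch returns the qualifier unchanged: what the date means still has to be derived from the qualifier and from the segment's position in the message.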

Message Subsets

The definitions published by UNECE define the overall structure of messages and segments. However, most concrete applications do not need, use or support the full set of possible segments in a message. Many segments are specified as optional and are neither supported nor expected by many concrete users.

For this reason, typical B2B transaction systems expect and handle only a subset of the overall message set, and only a subset of the possible segments within those messages.

Meta Descriptions

For correct semantic interpretation, modification and verification of data elements, a meta description is required. It specifies the format of a message (segment order and grouping) and of its data elements (alphanumeric, numeric, minimum/maximum size and value etc.).
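To make the idea of an element format concrete, here is a minimal Python sketch that checks a value against the representation notation used in the UN directories, where "an..35" means up to 35 alphanumeric characters, "n..3" up to 3 digits and "a2" exactly 2 letters. The function name `conforms` is hypothetical, only these three kinds are handled, and the sketch is not the plugin's verifier.

```python
import re

# kind ('an', 'a' or 'n'), optional '..' for variable length, then the size
SPEC = re.compile(r"^(an|a|n)(\.\.)?(\d+)$")


def conforms(value, spec):
    """Return True if value satisfies a representation such as 'an..35' or 'n3'."""
    match = SPEC.match(spec)
    if match is None:
        raise ValueError(f"unsupported representation: {spec}")
    kind = match.group(1)
    variable = match.group(2) is not None
    size = int(match.group(3))
    if kind == "n":
        if re.fullmatch(r"-?\d+(\.\d+)?", value) is None:
            return False
        # the sign and the decimal mark do not count towards the length
        length = sum(ch.isdigit() for ch in value)
    else:
        if kind == "a" and not value.isalpha():
            return False
        length = len(value)
    return length <= size if variable else length == size
```

For example, `conforms("Strasse", "an..35")` holds, while `conforms("1000", "n..3")` does not because the value has four digits.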

Such specifications are published by UNECE for the overall framework, and by companies for their B2B partners. Specifications exist in both formal (machine-readable) and non-formal (human-readable) formats. Formal specification formats are SEF (Standard Exchange Format) and XSD (XML Schema Definition). Non-formal specifications are published as PDF, HTML and text.

The biggest challenge in dealing with EDIFACT is obtaining formal (machine-readable) specifications.

SEF files provide all required information and would be well suited to this task. However, SEF appears to be obsolete, and SEF specifications for newer versions are hard to find. SEF files are not available from UNECE.

XSD seems to be the way to go, but official XSD specifications are rare and are not published by the UN.

DFDL is a new XML-based definition standard, which is very promising in theory. However, it is relatively new. Only a small subset of EDIFACT messages is formally specified in this format.

Official documents from UNECE include PDF, HTML and text files. Although informal, the existing HTML and text files seem to follow a common layout and structure.

EDIFACT MessageBrowser Plugin

This plugin adds a structured EDIFACT message inspector and editor to expecco. It provides an interface very similar to the GUIBrowser's interface, showing the hierarchical structure of EDIFACT messages. It also provides a UI to interactively compose action sequences to analyze, modify and test an EDIFACT message.

EDIFACT Plugin Components

EDIFACT Smalltalk Class Library

This library contains the actual functionality of the EDIFACT plugin. The expecco action library (edifact.ets) contains interfacing blocks to the functions found there.

EDIFACT Message Decoder

Includes a general message/segment/data decoder, which is able to read arbitrary (unknown) messages. Such textual messages (wire data) are decoded and presented as a hierarchy of objects, which model the message, its segments and its elements. The decoder can decode arbitrary messages, i.e. a meta description is not required for basic decoding.

EDIFACT Message Encoder

This encodes a message object back into the external EDIFACT wire representation. It takes care of the character set, separator definitions etc. In general, the original input is regenerated (with minor differences) when a decoded EDIFACT message is encoded back to wire data.

EDIFACT Object Representation Framework

These classes model EDIFACT objects as messages, segments, and composite and non-composite data elements. Without a meta description of the message set, data elements can be accessed by numeric indices (segment#, element#). With a meta description, elements are accessible by more user-friendly names, such as the official UN standard field names (C009, C123, 4056 etc.), or user-assigned application field names (deliveryDate, deliveryLocation, street, city etc.).
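The difference between indexed and named access can be sketched as follows. This Python fragment is an illustration under stated assumptions, not the plugin's object model: a segment is represented as a (tag, elements) pair, indices are 1-based as in the Smalltalk API, and the table `QTY_NAMES` is a hand-written stand-in for what a schema would provide. The element names C186, 6063 and 6060 and the alias "quantity" are taken from the QTY accessor description on this page; the helper names are hypothetical.

```python
# Name -> (element index, component index); None means "the whole composite".
QTY_NAMES = {
    "C186": (1, None),
    "6063": (1, 1),
    "6060": (1, 2),
    "quantity": (1, 2),  # application-level alias for 6060
}


def element_at(segment, element_index, component_index=None):
    """Indexed access: works without any schema knowledge."""
    _tag, elements = segment
    element = elements[element_index - 1]
    if component_index is None:
        return ":".join(element)
    return element[component_index - 1]


def element_named(segment, name, names):
    """Named access: the name table translates a name into indices."""
    return element_at(segment, *names[name])


# The segment QTY+1:1000 as a (tag, elements) pair.
qty = ("QTY", [["1", "1000"]])
```

With this representation, `element_at(qty, 1, 2)` and `element_named(qty, "quantity", QTY_NAMES)` both yield '1000'.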

EDIFACT Schema Definition Framework

This set of classes represents a meta description of EDIFACT objects. Meta descriptions can be hand-written, dynamically constructed or read in from external specification files in various formats.

EDIFACT Schema Definition Readers

These classes read external meta specifications in various formats and create a schema object. Schema objects are needed to correctly interpret element data, to access elements by user friendly names, and to create new messages which have the correct segment and segment group structure. Readers exist for the following formats:

  • XSD
  • DFDL
  • SEF
  • HTML (as in UN docs)
  • Text (as in UN docs)
  • bots (imports specs written in the Python programming language for the bots processor)
  • internal private format

Smalltalk Library API

The EDIFACT class library contains functions that can be called from elementary code. The blocks of the EDIFACT block library (described below) call the same functions.

Included are parsers for various formats (regular EDIFACT and XML-EDIFACT), generators for those formats, parsers for meta definitions, and data objects to represent EDIFACT messages (segments, groups, fields etc.).

Decoding a Message

Given the following EDIFACT message in wire data form:

UNA:+.? '
LIN+1++Produkt Schrauben:SA'

Assuming the above message (as a String) is held in the variable "msg", it can be decoded with:

    document := EdifactDecoder decode:msg.

This returns a structured (hierarchical) message object in "document", which consists of segments and data fields.
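To show what such a decoding step involves at the wire level, here is a minimal stand-alone Python sketch. It is not the EdifactDecoder: it only reads the six service characters that follow the UNA tag (component separator, element separator, decimal mark, release character, reserved, segment terminator), splits the rest into segments, elements and components, and honours the release character. All function names are hypothetical, and the result is a plain list of (tag, elements) pairs rather than a document object.

```python
def split_escaped(text, separator, release):
    """Split at separator, skipping separators preceded by the release character.
    Release characters are kept so that inner levels can still be split."""
    parts, current, i = [], [], 0
    while i < len(text):
        ch = text[i]
        if ch == release and i + 1 < len(text):
            current.append(text[i:i + 2])  # keep the escaped pair together
            i += 2
        elif ch == separator:
            parts.append("".join(current))
            current = []
            i += 1
        else:
            current.append(ch)
            i += 1
    parts.append("".join(current))
    return parts


def unescape(text, release):
    """Drop release characters, keeping the character each one protects."""
    out, i = [], 0
    while i < len(text):
        if text[i] == release and i + 1 < len(text):
            i += 1
        out.append(text[i])
        i += 1
    return "".join(out)


def decode(wire):
    """Return a list of (tag, elements); each element is a list of components."""
    comp, elem, release, term = ":", "+", "?", "'"  # defaults without UNA
    if wire.startswith("UNA"):
        comp, elem, _decimal, release, _reserved, term = wire[3:9]
        wire = wire[9:]
    segments = []
    for raw in split_escaped(wire, term, release):
        raw = raw.strip("\r\n")
        if not raw:
            continue
        elements = split_escaped(raw, elem, release)
        data = [[unescape(c, release) for c in split_escaped(e, comp, release)]
                for e in elements[1:]]
        segments.append((elements[0], data))
    return segments


msg = "UNA:+.? '\nLIN+1++Produkt Schrauben:SA'"
segments = decode(msg)
```

For the message shown above, `segments` contains a single LIN segment with three elements, the last of which is a composite with the components 'Produkt Schrauben' and 'SA'.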

Encoding a Message

For the opposite direction, a number of EdifactCoder classes exist, which take a decoded or constructed EDIFACT message and generate an external representation from it:

    EdifactCoder encode: document

This generates a standard EDIFACT representation (as shown above).
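The reverse step can be sketched in the same spirit. The following Python fragment is an assumption-laden illustration, not the EdifactCoder: it takes a list of (tag, elements) pairs, where each element is a list of component strings, writes a UNA segment with the chosen service characters, and prefixes every service character that occurs inside data with the release character. The function names and the choice to put one segment per line are hypothetical.

```python
def escape(text, specials, release="?"):
    """Prefix every service character in text with the release character."""
    return "".join(release + ch if ch in specials else ch for ch in text)


def encode(segments, comp=":", elem="+", decimal=".", release="?", term="'"):
    """Return wire data for a list of (tag, elements) pairs."""
    specials = comp + elem + release + term
    lines = ["UNA" + comp + elem + decimal + release + " " + term]
    for tag, data in segments:
        fields = [comp.join(escape(c, specials, release) for c in element)
                  for element in data]
        lines.append(elem.join([tag] + fields) + term)
    return "\n".join(lines)


segments = [("LIN", [["1"], [""], ["Produkt Schrauben", "SA"]])]
wire = encode(segments)
```

Escaping on output is what allows data such as 'a+b' to survive a round trip: it is written as 'a?+b' and read back unchanged by a decoder that honours the release character.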

For XML output, various different formats are supported, which use different tags for the elements. For example,

    EdifactXMLCoder encode: document

generates the following XML representation:

 <?xml version="1.0" encoding="UTF-8"?>


    EdifactXMLEdifactCoder encode: document

would generate:

 <?xml version="1.0" encoding="UTF-8"?>
               <D_7140>Produkt Schrauben</D_7140>

Many more (customizable) formats are supported; please contact eXept for details.

Accessing Document and Message Components

EdifactDocument API

"document" will be an instance of EdifactDocument. It represents a single transmission or document (possibly containing multiple messages). It supports the API:

  • aDocument numberOfMessages
    - returns the number of individual messages in the transaction
  • aDocument messages
    - a collection of messages contained in the transaction
  • aDocument adviceSegment
    - returns the transmission's advice segment (the 'UNA' segment)
  • aDocument headerSegment
    - returns the transaction's header segment (the 'UNB' segment)
  • aDocument trailerSegment
    - returns the transaction's trailer segment (the 'UNZ' segment)

Individual messages are fetched as instances of EdifactMessage via "document messages at: index" or enumerated with "document messages do:[:each | ...]".

   bgmSegment := document messages first segments first.
   dtmSegment := document messages first segments at:2.

EdifactMessage API

An EdifactMessage is a single message inside an EdifactDocument (or transmission). It consists of segments, and supports the API:

  • aMessage segments
    - a collection of individual segments inside the message (excludes header and trailer). This is a "raw" (e.g. flat) list of segments, as present in the original message. No segment group information is reflected in this list.
  • aMessage groupedSegments
    - a hierarchical collection of segments and segment groups (excludes header and trailer). In this hierarchical list, grouped segments are found under segment group objects.
  • aMessage headerSegment
    - returns the message header segment (the 'UNH' segment)
  • aMessage trailerSegment
    - returns the message trailer segment (the 'UNT' segment)
  • aMessage messageType
    - extracts the message type from the message header as a string. In the above example, this would be 'ORDERS'.
    The list of version d14b message types is found at [1] and [2].
  • aMessage controllingAgency
    - extracts the standards controlling agency which defined the message structure. In the above, this would be 'UN'.
  • aMessage version
    - extracts the message structure standard version number. In the above message, this is '96A'.
  • aMessage release
    - extracts the message structure standard release number. In the above message, this is 'D'.
  • aMessage messageSchema
    - return the message's schema definition (see below)
  • aMessage edifactVersion
    - return the messageSet which contains the message and segment schemas (see below)

The "controllingAgency", "version" and "release" fields are required to determine which meta schema definition is to be used (see below for details).

EdifactSegment API

Segments, as returned by any of the above, consist of a tag (segment type) and a number of data elements. They understand the following API:

  • aSegment tag
    - the segment's id (e.g. 'UNA', 'DTM', 'BGM' etc.). The list of version d14b tags is found at [3].
  • aSegment dataElementAt: index
    - return a possibly composite data element. This is a wrapper object, which holds on both the schema definition (if known) and the data value.
  • aSegment dataElementStringAt: index
    - return the datum as a string. If the element is not present, an empty string is returned.
  • aSegment dataElementValueAt: index
    - return the datum as a string or number, or nil if the element is not present.
  • aSegment dataElementAt: index1 at: index2
    - similar to the above, but return a composite element's subcomponent
  • aSegment dataElementStringAt: index1 at: index2
    - similar to the above, but return a composite element's subcomponent's string
  • aSegment dataElementValueAt: index1 at: index2
    - similar to the above, but return a composite element's subcomponent's value
  • aSegment dataElementAt: index put: newElement
  • aSegment dataElementStringAt: index put: newString
  • aSegment dataElementValueAt: index put: newObject
  • aSegment dataElementAt: index1 at: index2 put: newElement
  • aSegment dataElementStringAt: index1 at: index2 put: newString
  • aSegment dataElementValueAt: index1 at: index2 put: newObject


   dtmSegment tag -> 'DTM'
   (dtmSegment dataElementStringAt:1) -> '4:20060620:102'
   (dtmSegment dataElementValueAt:1) -> #('4' '20060620' '102')
   (dtmSegment dataElementStringAt:1 at:2) -> '20060620'

Well-known Segment Types

Although any data element can be accessed either by an index or its UN-standard name (e.g. C107 or 4063), for some well-known, often used segment types, additional user friendly accessors are provided. The idea is to make program code more readable, by using human readable field names, instead of the more technical UN-standard names.

UNB (Interchange Header) Segment API
  • unbSegment interchangeRecipient
  • unbSegment interchangeSender
  • unbSegment processingPriorityCode
  • unbSegment testIndicator
  • -- more to be documented --

UNH (Message Header) Segment API
  • unhSegment messageIdentifier
    - the contents of the S009 element as a composite string (with colons)
  • unhSegment messageType
    - the contents of the S009:0065 element
  • unhSegment messageVersionNumber
    - the contents of the S009:0052 element
  • unhSegment messageReleaseNumber
    - the contents of the S009:0054 element
  • unhSegment controllingAgency
    - the contents of the S009:0051 element
  • unhSegment associationAssignedCode
    - the contents of the S009:0057 element
  • -- more to be documented --

UNT (Message Trailer) Segment API
  • untSegment messageReferenceNumber
  • -- more to be documented --

UNZ (Interchange Trailer) Segment API
  • unzSegment numberOfMessages
    - the number of messages in the transmission (document)
  • unzSegment interchangeControlReference
  • -- more to be documented --

DTM (Date Time) Segment API
  • dtmSegment dateTimeString
    - retrieves the 2nd component, which is the date time as a string
  • dtmSegment dateTimeFormat
    - retrieves the 3rd component, which specifies the format of the dateTimeString
  • dtmSegment dateTime
    - converts the dateTimeString as specified by dateTimeFormat and returns a Date, Time, Timestamp, TimeDuration or similar object.

NAD (Name and Address) Segment API
  • nadSegment cityName
    - retrieves the city component
  • nadSegment countryNameCode
  • nadSegment nameAndAddress
  • nadSegment partyName
  • nadSegment postalIDCode
  • nadSegment street
    - retrieves the street component
  • -- more to be documented --

QTY (Quantity) Segment API
  • qtySegment quantity
    - the contents of the 6060 element as a string
  • qtySegment quantityDetails
    - the contents of the C186 element as a composite string (with colons)
  • qtySegment quantityTypeCodeQualifier
    - the contents of the 6063 element
  • -- more to be documented --

MOA (Monetary Amount) Segment API
  • moaSegment amount
  • moaSegment currencyIdCode
  • moaSegment currencyTypeCodeQualifier
  • moaSegment statusDescriptionCode
  • moaSegment typeCode
  • -- more to be documented --

Getting a Schema Definition

Schema definitions are normally fetched automatically whenever a segment schema or message schema is needed. For example, when a message is asked for its groupedSegments, or a segment is asked for a named field, the message is consulted for its messageIdentifier, and an attempt is made to load a corresponding schema definition from a folder containing the schema definition files.

However, you can also import schemas manually via the schema parsers. These parsers read specifications in various formats and return an instance of EdifactMessageSet, which holds the definitions of segments and messages. Because schema parsing may be a relatively expensive operation (depending on the format), these schema definitions are cached in memory and, optionally, in external files (schema cache folder).

Assuming a schema definition is present in SEF format in the file "~expecco/definitions/un/d10a/sef/INVOIC.SEF", read it with:

   messageSet := EdifactSEFParser parseFile: '~expecco/definitions/un/d10a/sef/INVOIC.SEF'.

For XSD schema definitions use the EdifactXSDParser, as in:

   messageSet := EdifactXSDParser parseFile: '~expecco/definitions/un/d10a/xsd/INVOIC.xsd'.

The expecco plugin contains definition files for all UN versions d93a to d14b in the "definitions" subfolder of the plugin.

For standard definitions, a more convenient automatic parsing is possible, by asking the utility class "EdifactVersion" for a particular schema. Give it the controlling agency, version and release, as in:

   messageSet := EdifactVersion controllingAgency:'UN' version:'03A' release:'D'.

Notice that this is also the proper way to get an incoming message's schema definition; it is what the "EdifactMessage >> messageSchema" method does. This automatically extracts the correct schema definition (from the message identifier) and loads it via an appropriate parser as required. The same mechanism is used when a schema definition is needed by the XPath accessors or when asking for the segment group structure.

MessageSet API

A message set object contains the definitions of one or more messages (as described in the schema file) and of all segments they refer to. You can fetch individual message schemas with:

  • aMessageSet messageSchemaAt:messageType
    where messageType is one of 'INVOIC', 'ORDERS', etc.

and segment schemas with:

  • aMessageSet segmentSchemaAt:segmentID
    where segmentID is one of 'BGM', 'DTM' etc.

and element schemas with:

  • aMessageSet elementSchemaAt:elementName
    where elementName is one of 'C107', '4063' etc.

MessageSets are organized in a hierarchy, and missing schemas are also searched in a messageSet's baseSchema. Typically, messageSets for custom applications have the more general standard UN messageSet as their baseSchema.

MessageSets are cached internally, to prevent parsing overhead.
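The lookup through a base schema and the caching can be pictured with a small Python sketch. It is only a model of the behaviour described here, not the EdifactMessageSet class: the class `MessageSet`, its method `segment_schema_at`, the function `message_set_for` and the schema placeholders (plain strings) are all hypothetical.

```python
class MessageSet:
    """Holds segment schemas and falls back to a base set for missing ones."""

    def __init__(self, segment_schemas, base=None):
        self.segment_schemas = dict(segment_schemas)
        self.base = base

    def segment_schema_at(self, tag):
        node = self
        while node is not None:          # walk up the chain of base sets
            if tag in node.segment_schemas:
                return node.segment_schemas[tag]
            node = node.base
        raise KeyError(tag)


_cache = {}


def message_set_for(key, loader):
    """Return the cached message set for key, calling loader only on a miss."""
    if key not in _cache:
        _cache[key] = loader(key)
    return _cache[key]


# A custom message set that overrides NAD and inherits everything else.
standard = MessageSet({"DTM": "standard DTM schema", "NAD": "standard NAD schema"})
custom = MessageSet({"NAD": "partner NAD schema"}, base=standard)
```

In this model, `custom.segment_schema_at("NAD")` returns the partner-specific definition, while `custom.segment_schema_at("DTM")` is answered by the standard set.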

Accessing fields with Schema Information Present

If a schema is known, fields, segments and groups of segments can be accessed by name. Assuming the standard schema for the ORDERS message, the above message has the following group structure:

       UNA:+.? '
SG2      NAD+BY+++Bestellername+Strasse+Stadt++23436+xx'
SG25       LIN+1++Produkt Schrauben:SA'
SG25       QTY+1:1000'

That is, the NAD segment belongs to SG2, and the LIN and QTY segments belong to SG25. Without knowing which segment group a segment belongs to, it is usually difficult or impossible to interpret the values inside the message correctly (consider that a message normally contains multiple NAD, LIN, QTY etc. segments).
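How a flat segment list turns into groups can be sketched with a deliberately simplified model in Python. The table `GROUPS` is a hypothetical stand-in for a real message schema: it only records that NAD opens an SG2 and that LIN opens an SG25 which may also contain QTY. Real schemas nest groups, restrict repetitions and depend on segment position, none of which is modelled here.

```python
# Trigger tag -> (group name, tags allowed inside that group); simplified on purpose.
GROUPS = {
    "NAD": ("SG2", {"NAD"}),
    "LIN": ("SG25", {"LIN", "QTY"}),
}


def group_segments(tags):
    """Turn a flat tag list into a list of (group name, tags) pairs and bare tags."""
    grouped, current = [], None
    for tag in tags:
        if tag in GROUPS:
            name, members = GROUPS[tag]
            current = (name, [tag], members)   # a trigger segment opens a new group
            grouped.append(current)
        elif current is not None and tag in current[2]:
            current[1].append(tag)             # the segment belongs to the open group
        else:
            current = None                     # any other segment closes the group
            grouped.append(tag)
    return [(g[0], g[1]) if isinstance(g, tuple) else g for g in grouped]


result = group_segments(["NAD", "LIN", "QTY", "LIN", "QTY"])
```

Each LIN starts its own SG25 instance, so the two QTY segments end up in different groups and can be told apart.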

Accessing fields with XPath like Accessors

When a message knows its schema, you can access elements by XPath:

  • message dataElementStringAt: '/SG25/QTY'

or, as a convenient shortcut:

  • message xPathGet: '/SG25/QTY'

Many of the XPath search and filter functions are supported. For example, to find the second quantity in a message which contains multiple instances of SG25, use:

  • message xPathGet: '/SG25[2]/QTY'

or, to find all fields which contain a particular value, use:

  • message xPathGet: '//QTY[contains(text(),"100.0")]'

The same scheme can be used to change fields, when EDIFACT objects are constructed:

  • message xPathSet: '/SG25[2]/QTY/6060' value: 100
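The effect of such path expressions can be tried out with Python's standard xml.etree.ElementTree module, which supports a limited XPath subset including positional predicates. The sketch below is not how the plugin resolves paths: it builds a small XML tree from hypothetical grouped data (the second SG25 instance and its values are invented for illustration) and then reads and changes the second QTY. Predicates such as contains(text(), ...) are outside ElementTree's subset and are not shown.

```python
import xml.etree.ElementTree as ET


def build_tree(groups):
    """Build <message><SGn><TAG>value</TAG>...</SGn>...</message> from grouped data."""
    root = ET.Element("message")
    for group_name, segments in groups:
        group = ET.SubElement(root, group_name)
        for tag, value in segments:
            ET.SubElement(group, tag).text = value
    return root


# Hypothetical grouped message with two SG25 instances.
groups = [
    ("SG2", [("NAD", "BY")]),
    ("SG25", [("LIN", "1"), ("QTY", "1:1000")]),
    ("SG25", [("LIN", "2"), ("QTY", "1:50")]),
]
root = build_tree(groups)

second_qty = root.find("./SG25[2]/QTY")   # positional predicate, 1-based
before = second_qty.text
second_qty.text = "1:100"                 # the same path also serves for updates
all_qty = [q.text for q in root.findall(".//QTY")]
```

The path './SG25[2]/QTY' plays the role of '/SG25[2]/QTY' in the examples above; the leading dot is required because ElementTree paths are relative to the element they are applied to.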

Edifact Plugin Library for Expecco

The plugin includes a library of action blocks, to call the above described functions. For example, the following activity decodes an EDIFACT message and extracts a particular field by XPath from it:


where the attachment contains an INVOIC as EDIFACT:

UNA:+.? '
NAD+MR+9900111111113::293++Lieferant:Kreditoren- und Energiedatenmanagem::::Z02+Musterstraße::42+Musterhausen++12345+DE'

The INVOIC's segment structure is:


Edifact Plugin Library for Expecco

The library contains blocks for:

  • EDIFACT document decoding and encoding
  • message, segment, field extraction, replacement and adding
  • analysis and verification against a schema definition, grouping of segment groups

Message Decoding
Message Encoding
Message Extraction
Segment and Segment Group Extraction
Field Extraction
Field Modification
Segment Insertion
Message Insertion

Copyright © 2014-2016 eXept Software AG