EDI/Edifact Plugin: Difference between revisions
Revision as of 24 June 2015, 11:26
Contents
- 1 Edi/Edifact Message Plugin
- 1.1 Background Information
- 1.2 Edifact Messages
- 1.3 Message Subsets
- 1.4 Meta Descriptions
- 1.5 Edifact Plugin Components
- 1.6 Edifact Smalltalk Class Library
- 1.6.1 Edifact Message Decoder
- 1.6.2 Edifact Message Encoder
- 1.6.3 Edifact Object Representation Framework
- 1.6.4 Edifact Schema Definition Framework
- 1.6.5 Edifact Schema Definition Readers
- 1.6.6 Smalltalk Library API
- 1.6.6.1 Decoding a Message
- 1.6.6.2 Accessing Document and Message Components
- 1.6.6.2.1 EdifactDocument API
- 1.6.6.2.2 EdifactMessage API
- 1.6.6.2.3 EdifactSegment API
- 1.6.6.2.4 Well-known Segment Types
- 1.6.6.2.5 UNB (Interchange Header) Segment API
- 1.6.6.2.6 UNH (Message Header) Segment API
- 1.6.6.2.7 UNT (Message Trailer) Segment API
- 1.6.6.2.8 UNZ (Interchange Trailer) Segment API
- 1.6.6.2.9 DTM (Date Time) Segment API
- 1.6.6.2.10 NAD (Name and Address) Segment API
- 1.6.6.2.11 QTY (Quantity) Segment API
- 1.6.6.2.12 MOA (Monetary Amount) Segment API
 
- 1.6.6.3 Getting a Schema Definition
- 1.6.6.4 MessageSet API
- 1.6.6.5 Accessing fields with Schema Information Present
- 1.6.6.6 Accessing fields with XPath like Accessors
 
 
- 1.7 Edifact Plugin Library for Expecco
 
Edi/Edifact Message Plugin
The Edifact Message plugin provides extensive support for Edifact message processing.
Background Information
Edifact is an international standard for the exchange of business transaction messages, such as invoices, orders, quotes etc. Edifact is heavily used in B2B communications, and is supported by SAP, Oracle, IBM and other business processing systems.
Edifact messages consist of a number of records (called "segments"), which themselves contain a number of data elements. Both non-composite and composite data elements are possible.
Edifact messages are surrounded by a header and a trailer segment.
An edifact transaction may contain multiple individual messages, and is itself surrounded by a transaction header segment and a transaction trailer segment.
For comprehensive documentation and a full specification, see United Nations Economic Commission for Europe in general, and Introducing UN/Edifact in particular.
Edifact Messages
UN/Edifact defines the overall format and layout of edifact messages. Standard specifications are typically published twice a year. New revisions typically add data elements, segment types and messages, and are typically backward compatible. That is, previous definitions are seldom (if ever) made obsolete.
All messages and segments contain a tag (or message type) field, which allows the type of message or segment to be identified. Thus, in theory, it is possible to extract field values from arbitrary messages, even if the concrete message layout is unknown.
However, the semantic interpretation of a particular datum highly depends on its position inside the message. For example, a DTM-segment's datetime value can be extracted easily, but its interpretation (delivery date, invoice date, quotation date etc.) is only possible if the overall message structure is known.
Message Subsets
The definitions as published by UNECE define the overall structure of messages and segments. However, most concrete applications do not need, use or support the full set of possible segments in a message. Many segments are specified as optional and are neither supported nor expected by many concrete users.
For this reason, typical B2B transaction systems expect and handle only subsets of the overall message set, and subsets of the possible segment set within those messages.
Meta Descriptions
For correct semantic interpretation, modification and verification of data elements, a meta description is required, which specifies the format of a message (segment order and grouping) and of data elements (alphanumeric, numeric, minimum/maximum size and value etc.).
Such specifications are published by UNECE for the overall framework, and by companies for their B2B partners. Specifications exist in both formal (machine readable) formats and in non-formal (human readable) formats. Formal specification formats are SEF (Standard Exchange Format) and XSD (XML Schema definitions). Non-formal specifications come as PDF, HTML and text files.
The biggest challenge in dealing with Edifact is obtaining formal (machine readable) specifications.
SEF format files provide all required information and would be perfect for this task. However, SEF seems to be obsolete and SEF specifications for newer versions are hard to find (which is a pity). SEF files are not available from UNECE.
XSD seems to be the way to go, but official XSD specifications are rare and not published by the UN.
DFDL is a new XML-based definition standard, which is very promising in theory. However, it is relatively new. Only a small subset of Edifact messages is formally specified in this format.
Official documents from UNECE include PDF, HTML and text files. Although informal, the existing HTML and text files seem to follow a common layout and structure.
Edifact Plugin Components
Edifact Smalltalk Class Library
This library contains the actual functionality of the edifact plugin. The expecco action library (edifact.ets) contains interfacing blocks to the functions found there.
Edifact Message Decoder
Includes a general message/segment/data decoder, which is able to read arbitrary (unknown) messages. Such textual messages (wire data) are decoded and presented as a hierarchy of objects, which model the message, segments and elements. The decoder can decode arbitrary messages, i.e. a meta description is not required for basic decoding.
Edifact Message Encoder
This encodes a message object back into the external edifact wire representation. It takes care of the character set, separator definitions etc. In general, the original input is regenerated (with minor differences) when a decoded edifact message is encoded back to wire data.
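To illustrate what encoding back to wire data involves, here is a minimal standalone sketch in Python. It is not the plugin's Smalltalk encoder; the function names are hypothetical and it assumes the default Edifact service characters (":" for components, "+" for elements, "?" as release character, "'" as segment terminator).

```python
def escape(text, specials=":+'?", release="?"):
    """Prefix every service character in text with the release character."""
    return "".join(release + ch if ch in specials else ch for ch in text)


def encode_segment(tag, elements, comp=":", elem="+", term="'"):
    """Encode a tag plus a list of elements (each a list of component strings)."""
    parts = [tag]
    for components in elements:
        parts.append(comp.join(escape(c) for c in components))
    return elem.join(parts) + term


print(encode_segment("DTM", [["4", "20060620", "102"]]))
```

A real encoder would additionally honor separators declared in a UNA segment and the selected character set; this sketch leaves both out.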
Edifact Object Representation Framework
These classes model edifact objects as messages, segments, composite and non-composite data elements. Without a meta description of the message set, data elements can be accessed by numeric indices (segment#, element#). With a meta description, elements are accessible by more user friendly names, such as the official UN standard field names (C009, C123, 4056 etc.), or user assigned application field names (deliveryDate, deliveryLocation, street, city etc.).
Edifact Schema Definition Framework
This set of classes represents a meta description of edifact objects. Schema definitions can be hand-written, dynamically constructed or read in from external specification files in various formats.
Edifact Schema Definition Readers
These classes read external meta specifications in various formats and create a schema object. Schema objects are needed to correctly interpret element data, to access elements by user friendly names, and to create new messages which have the correct segment and segment group structure. Readers exist for the following formats:
- XSD
- DFDL
- SEF
- HTML (as in UN docs)
- Text (as in UN docs)
- bots (import specs written in the Python programming language for the bots processor)
- internal private format
Smalltalk Library API
Decoding a Message
Given the following Edifact message in wire data form:
UNA:+.? '
UNB+UNOC:3+Senderkennung+Empfaengerkennung+060620:0931+1++1234567'
UNH+1+ORDERS:D:96A:UN'
BGM+220+B10001'
DTM+4:20060620:102'
NAD+BY+++Bestellername+Strasse+Stadt++23436+xx'
LIN+1++Produkt Schrauben:SA'
QTY+1:1000'
UNS+S'
CNT+2:1'
UNT+9+1'
UNZ+1+1234567'
Assuming the above message (as a String) is held in the "msg" variable, it can be decoded with:
    document := EdifactDecoder decode:msg.
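For readers who want to see what schema-less decoding amounts to, the following standalone Python sketch splits wire data into segments, elements and components. It is an illustration only, not the plugin's EdifactDecoder; all function names are hypothetical. It reads the separators from a leading UNA segment when present and respects the release character.

```python
def split_unescaped(text, sep, release="?"):
    """Split text at sep, skipping separators that follow the release character."""
    parts, current, i = [], [], 0
    while i < len(text):
        ch = text[i]
        if ch == release and i + 1 < len(text):
            current.append(ch + text[i + 1])  # keep the escape pair for later splits
            i += 2
        elif ch == sep:
            parts.append("".join(current))
            current = []
            i += 1
        else:
            current.append(ch)
            i += 1
    parts.append("".join(current))
    return parts


def unescape(text, release="?"):
    """Drop release characters, keeping the character each one protects."""
    out, i = [], 0
    while i < len(text):
        if text[i] == release and i + 1 < len(text):
            out.append(text[i + 1])
            i += 2
        else:
            out.append(text[i])
            i += 1
    return "".join(out)


def decode(wire):
    """Return a list of (tag, elements); each element is a list of component strings."""
    comp, elem, release, term = ":", "+", "?", "'"  # defaults without UNA
    if wire.startswith("UNA"):
        comp, elem, release, term = wire[3], wire[4], wire[6], wire[8]
        wire = wire[9:]
    segments = []
    for raw in split_unescaped(wire, term, release):
        raw = raw.strip()
        if not raw:
            continue
        fields = split_unescaped(raw, elem, release)
        elements = [[unescape(c, release) for c in split_unescaped(f, comp, release)]
                    for f in fields[1:]]
        segments.append((fields[0], elements))
    return segments


for tag, elements in decode("UNA:+.? 'UNH+1+ORDERS:D:96A:UN'DTM+4:20060620:102'"):
    print(tag, elements)
```

Note that the result carries no meaning beyond positions: without a schema, the components of the DTM segment are just three strings.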
Accessing Document and Message Components
EdifactDocument API
"document" will be an instance of EdifactDocument. It represents a single transmission or document (possibly containing multiple messages). It supports the API:
- aDocument numberOfMessages
 - returns the number of individual messages in the transaction
- aDocument messages
 - a collection of messages contained in the transaction
- aDocument headerSegment
 - returns the transmission advice segment (the 'UNA' segment)
- aDocument headerSegment
 - returns the transaction header segment (the 'UNB' segment)
- aDocument trailerSegment
 - returns the transaction trailer segment (the 'UNZ' segment)
Individual messages are fetched as instances of EdifactMessage via "document messages at: index" or enumerated with "document messages do:[:each | ...]".
Examples:
    bgmSegment := document messages first segments first.
    dtmSegment := document messages first segments at:2.
EdifactMessage API
EdifactMessage is a single message inside an EdifactDocument. It consists of segments, and supports the API:
- aMessage segments
 - a collection of individual segments inside the message (excludes header and trailer). This is a "raw" (i.e. flat) list of segments, as present in the original message. No segment group information is reflected in this list.
- aMessage groupedSegments
 - a hierarchical collection of segments and segment groups (excludes header and trailer). In this hierarchical list, grouped segments are found under segment group objects.
- aMessage headerSegment
 - returns the message header segment (the 'UNH' segment)
- aMessage trailerSegment
 - returns the message trailer segment (the 'UNT' segment)
- aMessage messageType
 - extracts the message type from the message header as a string. In the above example, this would be 'ORDERS'.
 The list of version d14b message types is found at [1] and [2].
- aMessage controllingAgency
 - extracts the standards controlling agency which defined the message structure. In the above, this would be 'UN'.
- aMessage version
 - extracts the message structure standard version number. In the above message, this is '96A'.
- aMessage release
 - extracts the message structure standard release number. In the above message, this is 'D'.
- aMessage messageSchema
 - return the message's schema definition (see below)
- aMessage edifactVersion
 - return the messageSet which contains the message and segment schemas (see below)
The "controllingAgency", "version" and "release" fields are required to determine which meta schema definition is to be used (see below for details).
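The following standalone Python sketch shows how these identifying fields can be pulled out of a UNH segment's S009 composite. It is not the plugin's code. The function names are hypothetical, the sketch assumes default separators without release characters, and the folder key it derives (for example "d96a") is only an assumption modeled on the schema folder names used elsewhere on this page.

```python
# Component data elements of the S009 composite, in wire order.
S009_ELEMENTS = ["0065", "0052", "0054", "0051", "0057"]


def message_identifier(unh_segment):
    """Map the S009 components of a UNH segment to their UN element numbers."""
    elements = unh_segment.rstrip("'").split("+")
    s009 = elements[2].split(":")
    return dict(zip(S009_ELEMENTS, s009))


def folder_key(identifier):
    """Assumed naming: elements 0052 and 0054 joined and lowercased."""
    return (identifier["0052"] + identifier["0054"]).lower()


ident = message_identifier("UNH+1+ORDERS:D:96A:UN'")
print(ident["0065"], ident["0051"], folder_key(ident))
```

Components missing from the segment (here 0057) are simply absent from the resulting dictionary.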
EdifactSegment API
Segments, as returned by any of the above, consist of a tag (segment type) and a number of data elements. They understand the following API:
- aSegment tag
 - the segment's id (e.g. 'UNA', 'DTM', 'BGM' etc.). The list of version d14b tags is found at [3].
- aSegment dataElementAt: index
 - return a possibly composite data element. This is a wrapper object, which holds both the schema definition (if known) and the data value.
- aSegment dataElementStringAt: index
 - return the datum as a string. If the element is not present, an empty string is returned.
- aSegment dataElementValueAt: index
 - return the datum as a string or number or nil, if the element is not present.
- aSegment dataElementAt: index1 at: index2
 - similar to the above, but return a composite element's subcomponent
- aSegment dataElementStringAt: index1 at: index2
 - similar to the above, but return a composite element's subcomponent's string
- aSegment dataElementValueAt: index1 at: index2
 - similar to the above, but return a composite element's subcomponent's value
- aSegment dataElementAt: index put: newElement
 -
- aSegment dataElementStringAt: index put: newString
 -
- aSegment dataElementValueAt: index put: newObject
 -
- aSegment dataElementAt: index1 at: index2 put: newElement
 -
- aSegment dataElementStringAt: index1 at: index2 put: newString
 -
- aSegment dataElementValueAt: index1 at: index2 put: newObject
 -
Examples:
   dtmSegment tag -> 'DTM'
   (dtmSegment dataElementStringAt:1) -> '4:20060620:102'
   (dtmSegment dataElementValueAt:1) -> #('4' '20060620' '102')
   (dtmSegment dataElementStringAt:1 at:2) -> '20060620'
Well-known Segment Types
Although any data element can be accessed either by an index or by its UN-standard name (e.g. C107 or 4063), additional user friendly accessors are provided for some well-known, frequently used segment types. The idea is to make program code more readable by using human readable field names instead of the more technical UN-standard names.
UNB (Interchange Header) Segment API
- unbSegment interchangeRecipient
- unbSegment interchangeSender
- unbSegment processingPriorityCode
- unbSegment testIndicator
- -- more to be documented --
UNH (Message Header) Segment API
- unhSegment messageIdentifier
 - the contents of the S009 element as a composite string (with colons)
- unhSegment messageType
 - the contents of the S009:0065 element
- unhSegment messageVersionNumber
 - the contents of the S009:0052 element
- unhSegment messageReleaseNumber
 - the contents of the S009:0054 element
- unhSegment controllingAgency
 - the contents of the S009:0051 element
- unhSegment associationAssignedCode
 - the contents of the S009:0057 element
- -- more to be documented --
UNT (Message Trailer) Segment API
- untSegment messageReferenceNumber
 -
- -- more to be documented --
UNZ (Interchange Trailer) Segment API
- unzSegment numberOfMessages
 - the number of messages in the transmission (document)
- unzSegment interchangeControlReference
 -
- -- more to be documented --
DTM (Date Time) Segment API
- dtmSegment dateTimeString
 - retrieves the 2nd component, which is the date time as a string
- dtmSegment dateTimeFormat
 - retrieves the 3rd component, which specifies the format of the dateTimeString
- dtmSegment dateTime
 - converts the dateTimeString as specified by dateTimeFormat and returns a Date, Time, Timestamp, TimeDuration or similar object.
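As an illustration of what such a conversion involves, this standalone Python sketch maps a few format qualifier codes to parsing patterns. It is not the plugin's implementation; the function and table names are hypothetical, and only four of the many qualifier values defined by the UN code list are covered.

```python
from datetime import datetime

# A small selection of date/time format qualifier codes and their layouts.
FORMATS = {
    "101": "%y%m%d",        # YYMMDD
    "102": "%Y%m%d",        # CCYYMMDD
    "203": "%Y%m%d%H%M",    # CCYYMMDDHHMM
    "204": "%Y%m%d%H%M%S",  # CCYYMMDDHHMMSS
}


def dtm_to_datetime(value, format_code):
    """Parse a DTM value string according to its format qualifier."""
    pattern = FORMATS.get(format_code)
    if pattern is None:
        raise ValueError(f"unsupported format qualifier: {format_code}")
    return datetime.strptime(value, pattern)


print(dtm_to_datetime("20060620", "102"))
```

Qualifiers that denote periods or durations need a different result type, which is why the accessor described above may return objects other than a timestamp.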
NAD (Name and Address) Segment API
- nadSegment cityName
 - retrieves the city component
- nadSegment countryNameCode
 -
- nadSegment nameAndAddress
 -
- nadSegment partyName
 -
- nadSegment postalIDCode
 -
- nadSegment street
 - retrieves the street component
- -- more to be documented --
QTY (Quantity) Segment API
- qtySegment quantity
 - the contents of the 6060 element as a string
- qtySegment quantityDetails
 - the contents of the C186 element as a composite string (with colons)
- qtySegment quantityTypeCodeQualifier
 - the contents of the 6063 element
- -- more to be documented --
MOA (Monetary Amount) Segment API
- moaSegment amount
 -
- moaSegment currencyIdCode
 -
- moaSegment currencyTypeCodeQualifier
 -
- moaSegment statusDescriptionCode
 -
- moaSegment typeCode
 -
- -- more to be documented --
Getting a Schema Definition
Schema definitions are normally fetched automatically as required, whenever a segment or message schema is needed. For example, when asking a message for its groupedSegments, or when asking a segment for a named field, the message is consulted for its messageIdentifier, and the plugin tries to load a corresponding schema definition from a folder containing the schema definition files.
However, you can also import schemas manually via the schema parsers. These parsers read specifications in various formats and return an instance of EdifactMessageSet. This holds the definitions of segments and messages. Because schema parsing may be a relatively expensive operation (depending on the format), these schema definitions are cached both in memory and, optionally, in external files (schema cache folder).
Assuming a schema definition is present in SEF format in the file "~expecco/definitions/un/d10a/sef/INVOIC.SEF", read it with:
messageSet := EdifactSEFParser parseFile: '~expecco/definitions/un/d10a/sef/INVOIC.SEF'.
For XSD schema definitions use the EdifactXSDParser, as in:
messageSet := EdifactXSDParser parseFile: '~expecco/definitions/un/d10a/xsd/INVOIC.xsd'.
The expecco plugin contains definition files for all UN versions d93a to d14b in the "definitions" subfolder of the plugin.
For standard definitions, a more convenient automatic parsing is possible, by asking the utility class "EdifactVersion" for a particular schema. Give it the controlling agency, version and release, as in:
messageSet := EdifactVersion controllingAgency:'UN' version:'03A' release:'D'.
Notice that this is also the proper way to get an incoming message's schema definition, and it is what the "EdifactMessage >> messageSchema" method does. This will automatically determine the correct schema definition (from the message identifier) and load it via an appropriate parser as required. This is also the mechanism used when a schema definition is needed by the XPath accessors or when asking for the segment group structure.
MessageSet API
A message set object contains the definitions of one or more messages (as described in the schema file) and of all required (i.e. referred to) segments. You can fetch individual messageSchemas with:
- aMessageSet messageSchemaAt:messageType
 where messageType is one of 'INVOIC', 'ORDERS', etc.
and segment schemas with:
- aMessageSet segmentSchemaAt:segmentID
 where segmentID is one of 'BGM', 'DTM' etc.
MessageSets are organized in a hierarchy, and missing schemas are also searched in a messageSet's baseSchema. Typically, messageSets for custom applications have the more general standard UN messageSet as their baseSchema.
MessageSets are cached internally, to prevent parsing overhead.
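The lookup through a base schema can be pictured with the following standalone Python sketch. It is not the plugin's EdifactMessageSet; the class, its attribute names and the placeholder schema values are hypothetical, and only the fallback idea is taken from the description above.

```python
class MessageSet:
    """Holds message and segment schemas, with an optional base set as fallback."""

    def __init__(self, message_schemas=None, segment_schemas=None, base=None):
        self.message_schemas = dict(message_schemas or {})
        self.segment_schemas = dict(segment_schemas or {})
        self.base = base

    def message_schema_at(self, message_type):
        return self._lookup("message_schemas", message_type)

    def segment_schema_at(self, segment_id):
        return self._lookup("segment_schemas", segment_id)

    def _lookup(self, table, key):
        # Walk up the chain of base sets until the key is found.
        current = self
        while current is not None:
            found = getattr(current, table).get(key)
            if found is not None:
                return found
            current = current.base
        raise KeyError(key)


standard = MessageSet(segment_schemas={"DTM": "standard DTM", "NAD": "standard NAD"})
custom = MessageSet(segment_schemas={"NAD": "partner NAD"}, base=standard)
print(custom.segment_schema_at("NAD"), "/", custom.segment_schema_at("DTM"))
```

With this arrangement a partner-specific set only needs to define what differs from the standard set.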
Accessing fields with Schema Information Present
If a schema is known, fields, segments and groups of segments can be accessed by name. Assuming the standard schema for the ORDERS message, the above message has the following group structure:
       UNA:+.? '   
       UNB+UNOC:3+Senderkennung+Empfaengerkennung+060620:0931+1++1234567'
       UNH+1+ORDERS:D:96A:UN'
       BGM+220+B10001'
       DTM+4:20060620:102'
SG2    NAD+BY+++Bestellername+Strasse+Stadt++23436+xx'
SG25   LIN+1++Produkt Schrauben:SA'
SG25   QTY+1:1000'
       UNS+S'
       CNT+2:1'
       UNT+9+1'
       UNZ+1+1234567'
That is, the NAD belongs to SG2, and the LIN and QTY belong to SG25. Without knowing which segment group a segment belongs to, it is usually hard or even impossible to correctly interpret the values inside the message (consider that there are normally multiple NAD, LIN, QTY etc. segments present).
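To show how a flat segment list turns into groups, here is a deliberately simplified standalone Python sketch. It is not the plugin's algorithm. The GROUPS table is a hand-written assumption covering only the two groups shown above: each group is opened by a trigger segment and absorbs the member segments that follow it. Real schemas also define nesting, repeat counts and mandatory/optional status, all of which are ignored here.

```python
# Hypothetical mini schema: group name -> trigger tag and allowed member tags.
GROUPS = {
    "SG2": {"trigger": "NAD", "members": {"NAD"}},
    "SG25": {"trigger": "LIN", "members": {"LIN", "QTY"}},
}


def group_segments(tags, groups=GROUPS):
    """Turn a flat list of segment tags into tags and (group, [tags]) tuples."""
    triggers = {g["trigger"]: name for name, g in groups.items()}
    result, current = [], None
    for tag in tags:
        if tag in triggers:
            # A trigger segment always opens a new group instance.
            current = (triggers[tag], [tag])
            result.append(current)
        elif current is not None and tag in groups[current[0]]["members"]:
            current[1].append(tag)
        else:
            current = None
            result.append(tag)
    return result


print(group_segments(["BGM", "DTM", "NAD", "LIN", "QTY", "UNS", "CNT"]))
```

Because every LIN opens a new SG25 instance, a message with several order lines yields several SG25 groups, each with its own QTY.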
Accessing fields with XPath like Accessors
When a message knows its schema, you can access elements by xpath:
- message dataElementStringAt: '/SG25/QTY'
or, as a convenient shortcut:
- message xPathGet: '/SG25/QTY'
Many of the XPath search and filter functions are possible, for example, to find the second quantity in a message which contains multiple instances of SG25, use:
- message xPathGet: '/SG25[2]/QTY'
or, to find all fields which contain a particular value, use:
- message xPathGet: '//QTY[contains(text(),"100.0")]'
The same scheme can be used to change fields, when edifact objects are constructed:
- message xPathSet: '/SG25[2]/QTY/6060' value: 100
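The path steps used above can be resolved roughly as in the following standalone Python sketch. It is not the plugin's XPath engine and supports only plain names with an optional 1-based index. The function name, the nested tuple representation and the second order line in the sample data are made up for illustration.

```python
import re


def xpath_get(nodes, path):
    """Resolve paths such as '/SG25[2]/QTY' against nested (name, value) tuples.

    A value is either a string (segment data) or a list of child tuples (group).
    """
    found = None
    for step in path.strip("/").split("/"):
        match = re.fullmatch(r"(\w+)(?:\[(\d+)\])?", step)
        if match is None:
            raise ValueError(f"unsupported path step: {step}")
        name, index = match.group(1), int(match.group(2) or 1)
        candidates = [node for node in nodes if node[0] == name]
        if index < 1 or index > len(candidates):
            return None
        found = candidates[index - 1]
        # Descend only into groups; plain segments have no children.
        nodes = found[1] if isinstance(found[1], list) else []
    return found[1] if found else None


message = [
    ("BGM", "220+B10001"),
    ("SG25", [("LIN", "1++Produkt Schrauben:SA"), ("QTY", "1:1000")]),
    ("SG25", [("LIN", "2++Produkt Muttern:SA"), ("QTY", "1:500")]),  # invented line
]
print(xpath_get(message, "/SG25[2]/QTY"))
```

As with the accessors above, a step without an index selects the first matching instance.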
Edifact Plugin Library for Expecco
The plugin includes a library of action blocks to call the functions described above. For example, the following activity decodes an edifact message and extracts a particular field by XPath from it:
where the attachment contains an INVOIC as edifact:
UNA:+.? '
UNB+UNOC:3+9900111111112:500+9900111111113:500+150407:0813+000011606231'
UNH+000610018555+INVOIC:D:06A:UN:2.6a'
BGM+457+PRN780101466824+9'
DTM+137:20150405:102'
DTM+9:20150405:102'
DTM+155:20120731:102'
DTM+156:20121127:102'
IMD++JVR'
RFF+Z13:31004'
RFF+OI:PRN771200331164'
DTM+171:20121204:102'
NAD+MS+9900111111112::293++VNB:::::Z02+Musterstraße::42+Musterhausen++12345+DE'
RFF+VA:DE814588308'
NAD+MR+9900111111113::293++Lieferant:Kreditoren- und Energiedatenmanagem::::Z02+Musterstraße::42+Musterhausen++12345+DE'
NAD+DP++++Musterstr.::41+Musterhausen++12345+DE'
LOC+172+DE0001231234500000000000004700823'
CUX+2:EUR:4'
PYT+3'
DTM+265:20121220:102'
UNS+S'
MOA+77:-53.08'
MOA+113:-26'
MOA+9:-27.08'
TAX+7+VAT+++:::0+O'
MOA+125:-53.08'
MOA+161:0'
MOA+113:-26'
MOA+115:0'
UNT+28+000610018555'
UNZ+1+000011606231'
The INVOIC's segment structure is:
Back to Online Documentation.


