HelpWizard Pages Documents HTML/en

From expecco Wiki (Version 2.x)

Current revision as of 29 August 2022, 11:21

HTML Documents

Reading/Parsing

Typically, documents are parsed into a so-called DOM (document object model), which is then processed (searched or manipulated). Actions for parsing are found in the standard library.

In elementary Smalltalk code, a DOM tree can be obtained via:

HTML::HTMLParser parse:aStringOrStream

or

HTML::HTMLParser parseFile:aFilename

The resulting document node (DOM) can then be processed:

aDocNode head
aDocNode body

XPath-like access:

aDocNode / tagName
aDocNode // tagName

attribute access:

aDocNode @ attribName

For example, to get all anchors:

|doc anchors|
doc := HTML::HTMLParser parseFile:'myFile.html'.
anchors := doc // 'a'.
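
The same access operators also work on a document parsed from a string. The following is a minimal sketch that combines them to gather the link targets of all anchors; the inline HTML string is made up for illustration, and the upper-case attribute name 'HREF' simply follows the usage elsewhere on this page (how the parser treats attribute-name case is not stated here):

```smalltalk
|doc hrefs|
"Parse from a string instead of a file (the string is a made-up sample)."
doc := HTML::HTMLParser parse:'<html><body><a href="misc/a.html">A</a></body></html>'.
"For each anchor element, fetch its HREF attribute value."
hrefs := (doc // 'a') collect:[:a | a @ 'HREF'].
```

Inspecting hrefs shows what the attribute access actually returns for each element.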

To extract only the anchors whose URL satisfies a check (in this case, those whose URL starts with a given prefix):

|doc anchors anchorsMatching|
doc := HTML::HTMLParser parseFile:'myFile.html'.
anchors := doc // 'a'.
anchorsMatching := anchors select:[:a | (a @ 'HREF') startsWith:'misc']
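
Not every anchor necessarily carries an HREF attribute (named anchors, for instance). As a minimal sketch, and assuming that `@` answers nil for a missing attribute (this page does not confirm it; check with an inspector), the filter can guard against that case before testing the prefix:

```smalltalk
|doc anchors anchorsMatching|
doc := HTML::HTMLParser parseFile:'myFile.html'.
anchors := doc // 'a'.
"Assumption: (a @ 'HREF') answers nil when the attribute is absent."
anchorsMatching := anchors select:[:a |
    |href|
    href := a @ 'HREF'.
    href notNil and:[href startsWith:'misc']
].
```

Fetching the attribute once into a block-local variable avoids a second lookup and keeps the nil check next to the prefix test.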


Use the class browser to see all methods provided by HTML elements and the parser. Use an inspector on the results to explore what can be done with an element.



Copyright © 2014-2024 eXept Software AG