de.l3s.boilerpipe.sax
Class CommonTagActions

java.lang.Object
  extended by de.l3s.boilerpipe.sax.CommonTagActions

public abstract class CommonTagActions
extends java.lang.Object

Defines an action that is to be performed whenever a particular tag occurs during HTML parsing.

Author:
Christian Kohlschütter

Nested Class Summary
static class CommonTagActions.BlockTagLabelAction
          CommonTagActions for block-level elements, which triggers some LabelAction on the generated TextBlock.
static class CommonTagActions.Chained
           
static class CommonTagActions.InlineTagLabelAction
          CommonTagActions for inline elements, which triggers some LabelAction on the generated TextBlock.
 
Field Summary
static TagAction TA_ANCHOR_TEXT
          Marks this tag as "anchor" (this should usually only be set for the <A> tag).
static TagAction TA_BODY
          Marks this tag as the body element (this should usually only be set for the <BODY> tag).
static TagAction TA_FONT
          Special TagAction for the <FONT> tag, which keeps track of the absolute and relative font size.
static TagAction TA_IGNORABLE_ELEMENT
          Marks this tag as "ignorable", i.e. all its inner content is silently skipped.
static TagAction TA_INLINE
          Deprecated. Use TA_INLINE_WHITESPACE instead.
static TagAction TA_INLINE_NO_WHITESPACE
          Marks this tag as a simple "inline" element, which generates neither whitespace nor a new block.
static TagAction TA_INLINE_WHITESPACE
          Marks this tag as a simple "inline" element, which generates whitespace but no new block.
 
Method Summary
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

TA_IGNORABLE_ELEMENT

public static final TagAction TA_IGNORABLE_ELEMENT
Marks this tag as "ignorable", i.e. all its inner content is silently skipped.
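
The skipping behaviour described above can be pictured with a depth counter. The following is a minimal sketch under stated assumptions, not boilerpipe's actual implementation: the class IgnorableSketch, its methods, and the choice of SCRIPT and STYLE as ignorable tags are all hypothetical and exist only for illustration.

```java
import java.util.Set;

/** Hypothetical, simplified model of an "ignorable" tag action. */
final class IgnorableSketch {
    // Assumption: tags this sketch treats as ignorable.
    private static final Set<String> IGNORABLE = Set.of("SCRIPT", "STYLE");

    // Greater than zero while the parser is inside an ignorable element.
    private int ignorableDepth = 0;
    private final StringBuilder text = new StringBuilder();

    void startElement(String tag) {
        if (IGNORABLE.contains(tag)) {
            ignorableDepth++;
        }
    }

    void endElement(String tag) {
        if (IGNORABLE.contains(tag) && ignorableDepth > 0) {
            ignorableDepth--;
        }
    }

    void characters(String s) {
        // Inner content of an ignorable element is dropped without any error.
        if (ignorableDepth == 0) {
            text.append(s);
        }
    }

    String text() {
        return text.toString();
    }

    public static void main(String[] args) {
        IgnorableSketch h = new IgnorableSketch();
        h.characters("Hello ");
        h.startElement("SCRIPT");
        h.characters("var x = 1;");
        h.endElement("SCRIPT");
        h.characters("world");
        System.out.println(h.text()); // prints "Hello world"
    }
}
```

A counter rather than a boolean flag keeps the sketch correct when ignorable elements are nested inside each other.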


TA_ANCHOR_TEXT

public static final TagAction TA_ANCHOR_TEXT
Marks this tag as "anchor" (this should usually only be set for the <A> tag). Anchor tags must not be nested. A bug in certain versions of NekoHTML nevertheless lets nested anchor tags through; if boilerpipe encounters such nesting, it throws a SAXException.
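
As an illustration of the nesting rule, the sketch below rejects an anchor that opens while another is still open. It is a simplified model, not the library's code: AnchorNestingSketch and its methods are hypothetical names, and only org.xml.sax.SAXException is a real JDK class.

```java
import org.xml.sax.SAXException;

/** Hypothetical model of a nesting check for anchor tags. */
final class AnchorNestingSketch {
    // Number of anchor elements currently open.
    private int anchorDepth = 0;

    void startAnchor() throws SAXException {
        // A second <A> before the first one closes counts as nesting.
        if (anchorDepth++ > 0) {
            throw new SAXException("Nested <A> elements are not allowed");
        }
    }

    void endAnchor() {
        if (anchorDepth > 0) {
            anchorDepth--;
        }
    }

    /** Replays "start"/"end" events; returns true if nesting was rejected. */
    static boolean rejects(String... events) {
        AnchorNestingSketch s = new AnchorNestingSketch();
        try {
            for (String e : events) {
                if (e.equals("start")) {
                    s.startAnchor();
                } else {
                    s.endAnchor();
                }
            }
            return false;
        } catch (SAXException ex) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(rejects("start", "end", "start", "end")); // prints "false"
        System.out.println(rejects("start", "start"));               // prints "true"
    }
}
```

Callers that parse untrusted HTML may want to catch the SAXException and decide whether to skip the document or retry with a different parser configuration.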


TA_BODY

public static final TagAction TA_BODY
Marks this tag as the body element (this should usually only be set for the <BODY> tag).


TA_INLINE_WHITESPACE

public static final TagAction TA_INLINE_WHITESPACE
Marks this tag as a simple "inline" element, which generates whitespace but no new block.


TA_INLINE

@Deprecated
public static final TagAction TA_INLINE
Deprecated. Use TA_INLINE_WHITESPACE instead.

TA_INLINE_NO_WHITESPACE

public static final TagAction TA_INLINE_NO_WHITESPACE
Marks this tag as a simple "inline" element, which generates neither whitespace nor a new block.
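
The difference between the two inline variants shows up in the extracted text at the tag boundaries. The sketch below is a hypothetical simplification: InlineWhitespaceSketch and joinAcrossInlineTag are invented names, and boilerpipe's real whitespace handling may differ in detail (for example, in how it avoids duplicate spaces).

```java
/** Hypothetical model of how an inline tag affects surrounding text. */
final class InlineWhitespaceSketch {

    /**
     * Joins the text before, inside, and after an inline element.
     * Neither variant starts a new block; the flag only decides whether
     * a space is emitted at the element's start and end.
     */
    static String joinAcrossInlineTag(String before, String inner, String after,
                                      boolean generatesWhitespace) {
        String sep = generatesWhitespace ? " " : "";
        return before + sep + inner + sep + after;
    }

    public static void main(String[] args) {
        // Whitespace-generating inline element: words stay separated.
        System.out.println(joinAcrossInlineTag("foo", "bar", "baz", true));  // prints "foo bar baz"
        // Non-whitespace inline element: text runs together.
        System.out.println(joinAcrossInlineTag("foo", "bar", "baz", false)); // prints "foobarbaz"
    }
}
```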


TA_FONT

public static final TagAction TA_FONT
Special TagAction for the <FONT> tag, which keeps track of the absolute and relative font size.
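
One way to picture absolute and relative font-size tracking is a stack of effective sizes. This is a sketch under assumptions, not the library's implementation: FontSizeSketch, its methods, the default size of 3, and the accepted attribute syntax (for example "5", "+1", "-2") are all illustrative choices.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical model of font-size tracking across nested FONT tags. */
final class FontSizeSketch {
    // Optional sign followed by one or two digits, e.g. "5", "+1", "-2".
    private static final Pattern SIZE = Pattern.compile("([+-]?)([0-9]{1,2})");
    // Assumption: size used when no FONT tag is open.
    private static final int DEFAULT_SIZE = 3;

    // Effective sizes of the currently open FONT elements, innermost first.
    private final Deque<Integer> sizes = new ArrayDeque<>();

    int current() {
        return sizes.isEmpty() ? DEFAULT_SIZE : sizes.peek();
    }

    void startFont(String sizeAttr) {
        int size = current(); // a missing or unparsable attribute inherits the size
        if (sizeAttr != null) {
            Matcher m = SIZE.matcher(sizeAttr.trim());
            if (m.matches()) {
                int n = Integer.parseInt(m.group(2));
                String sign = m.group(1);
                if (sign.isEmpty()) {
                    size = n;              // absolute size
                } else if (sign.equals("+")) {
                    size = current() + n;  // relative to the enclosing size
                } else {
                    size = current() - n;
                }
            }
        }
        sizes.push(size);
    }

    void endFont() {
        if (!sizes.isEmpty()) {
            sizes.pop();
        }
    }

    public static void main(String[] args) {
        FontSizeSketch f = new FontSizeSketch();
        f.startFont("5");
        f.startFont("+1");
        System.out.println(f.current()); // prints "6"
        f.endFont();
        System.out.println(f.current()); // prints "5"
    }
}
```

Pushing the resolved size rather than the raw attribute means a closing tag restores the enclosing size with a single pop.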