Class Org_Heigl_Hyphenator

Description

This class implements word hyphenation.

Word hyphenation is implemented on the basis of the algorithm developed by Franklin Mark Liang for TeX, as described in his dissertation at the Department of Computer Science at Stanford University.
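
The core idea of Liang's method can be illustrated with a minimal sketch. The function liangHyphenate() and the nine-entry pattern list are chosen for this illustration and are not part of the class; real TeX pattern files contain thousands of entries. Each pattern is a letter sequence with interleaved digits. For every gap between two letters the highest digit of any matching pattern wins, and an odd result marks an allowed hyphenation point.

```php
<?php
// Illustrative sketch only: liangHyphenate() is a hypothetical helper,
// not a method of Org_Heigl_Hyphenator. It handles single-byte letters only.
function liangHyphenate($word, array $patterns, $hyphen = '-')
{
    // Dots mark the word boundaries, as in TeX pattern files.
    $padded = '.' . strtolower($word) . '.';
    // $points[$k] is the score of the gap in front of $padded[$k].
    $points = array_fill(0, strlen($padded) + 1, 0);

    foreach ($patterns as $pattern) {
        // Split the pattern into its letters and its digits per gap.
        $letters = '';
        $digits  = array();
        foreach (str_split($pattern) as $char) {
            if (ctype_digit($char)) {
                $digits[strlen($letters)] = (int) $char;
            } else {
                $letters .= $char;
            }
        }
        // Apply the digits at every occurrence; the highest value wins.
        $offset = 0;
        while (($at = strpos($padded, $letters, $offset)) !== false) {
            foreach ($digits as $gap => $value) {
                $points[$at + $gap] = max($points[$at + $gap], $value);
            }
            $offset = $at + 1;
        }
    }

    // An odd score allows a break; even scores (including 0) forbid it.
    $result = '';
    $length = strlen($word);
    for ($i = 0; $i < $length; $i++) {
        if ($i > 0 && $points[$i + 1] % 2 === 1) {
            $result .= $hyphen;
        }
        $result .= $word[$i];
    }
    return $result;
}

$patterns = array('hy3ph', 'he2n', 'hena4', 'hen5at', '1na', 'n2at', '1tio', '2io', 'o2n');
echo liangHyphenate('hyphenation', $patterns), "\n"; // hy-phen-ation
```

The sketch ignores the left, right and word minimums as well as the quality setting that the class offers.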

This package is based on an idea by Mathias Nater <mnater@mac.com>, who implemented this word-hyphenation algorithm in JavaScript.

Hyphenating means, in this case, that all possible hyphenation points in a word are marked using the soft-hyphen character (character code 173, U+00AD) or any other character set via the setHyphen() method.
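
Character code 173 is the soft hyphen in ISO-8859-1; in UTF-8 the same character (U+00AD) takes two bytes. Which encoding the class expects is not stated here, so the following is only a sketch of how the character can be produced in PHP. The variable names are chosen for this illustration.

```php
<?php
// Sketch: the soft hyphen in two common encodings.
$softHyphenLatin1 = chr(173);    // one byte, 0xAD (ISO-8859-1)
$softHyphenUtf8   = "\u{00AD}";  // two bytes, 0xC2 0xAD (UTF-8), PHP 7+ syntax

// Marking two break points by hand in a UTF-8 string.
$marked = 'hy' . $softHyphenUtf8 . 'phen' . $softHyphenUtf8 . 'ation';

echo bin2hex($softHyphenLatin1), "\n"; // ad
echo bin2hex($softHyphenUtf8), "\n";   // c2ad
echo strlen($marked), "\n";            // 15 (11 letters plus 2 x 2 bytes)
```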

A complete text is first divided into words by a regular expression that matches all characters covered by the \w character class, the '@' character, and any additional language-specific characters set via the setSpecialChars() method.
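
A word-splitting expression of this kind might look like the following sketch. The function splitWords() and the exact expression are assumptions made for illustration; the pattern used inside the class may differ. The source file is assumed to be UTF-8 encoded.

```php
<?php
// Sketch only: splitWords() and the exact expression are assumptions,
// not the pattern used inside Org_Heigl_Hyphenator.
function splitWords($text, $specialChars = '')
{
    // \w, the '@' character and the language-specific extras form one class.
    $class = '\w@' . preg_quote($specialChars, '/');
    preg_match_all('/[' . $class . ']+/u', $text, $matches);
    return $matches[0];
}

print_r(splitWords('Grüße an test@example.com, heute!', 'äöüß'));
```

For this input the sketch yields the words Grüße, an, test@example, com and heute; the dot, the comma and the exclamation mark act as separators.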

Hyphenation is done using a set of files taken from a current TeX distribution; the file for a language is determined by the getTexFile() method.

Here is an example of how to use the class:

  <?php
  $hyphenator = Org_Heigl_Hyphenator::getInstance ( 'de' );
  $hyphenator -> setHyphen ( '-' )
              // Minimum 5 characters before the first hyphenation
              -> setLeftMin ( 5 )
              // Hyphenate only words with more than 4 characters
              -> setWordMin ( 5 )
              // Set some special characters
              -> setSpecialChars ( 'äöüß' )
              // Only hyphenate with the best quality
              -> setQuality ( Org_Heigl_Hyphenator::QUALITY_HIGHEST )
              // Words that shall not be hyphenated have to start with this string
              -> setNoHyphenateMarker ( 'nbr:' )
              // Words that contain this string are custom hyphenated
              -> setCustomHyphen ( '--' );

  // Hyphenate the string $text
  $hyphenated = $hyphenator -> hyphenate ( $text );
  ?>

Located in /src/Org/Heigl/Hyphenator.php (line 89)


	
			
Class Constant Summary
Method Summary
 static Zend_Cache getCache ()
 static string getDefaultLanguage ()
 static Org_Heigl_Hyphenator getInstance ([string $language = 'en'])
 static string getTexFile (string $language)
 static string parse (string $string, [string $options = null])
 static boolean parseTexFile (string $file, string $parsedFile)
 static boolean setCache ( $cache)
 static void setDefaultLanguage (string $language)
 Org_Heigl_Hyphenator __construct ([string $language = 'en'])
 string cacheRead (string $key)
 Org_Heigl_Hyphenator cacheWrite (string $key, string $string)
 Org_Heigl_Hyphenator enableCaching ([boolean $caching = true])
 string getCustomMarker ()
 string getHyphen ()
 string getNoHyphenMarker ()
 string hyphenate (string $string)
 string hyphenateWord (string $word)
 boolean isCachingEnabled ()
 boolean markCustomization ([null|boolean $mark = null])
 Org_Heigl_Hyphenator setCustomHyphen ([string $customHyphen = null])
 Org_Heigl_Hyphenator setNoHyphenateMarker ([string $marker = null])
 Org_Heigl_Hyphenator setQuality ([int $quality = 5])
 Org_Heigl_Hyphenator setSpecialStrings ([array $specialStrings = array ()])
 string|false _handleSpecialStrings (string $word)
Methods
static getCache (line 502)

Get the cache-Object

  • access: public
static Zend_Cache getCache ()
static getDefaultLanguage (line 295)

Get the default language

  • access: public
static string getDefaultLanguage ()
static getInstance (line 308)

This method gets the hyphenator-instance for the language $language

If no instance exists, it is created and stored.

  • return: A Hyphenator-Object
  • throws: InvalidArgumentException
  • access: public
static Org_Heigl_Hyphenator getInstance ([string $language = 'en'])
  • string $language: The language to use for hyphenating
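
The "create once, then reuse" behaviour described for getInstance() can be sketched as a registry of instances keyed by language code. SketchHyphenator is a hypothetical stand-in for illustration; when exactly the real class throws InvalidArgumentException is not documented here, so the check below is an assumption.

```php
<?php
// Sketch of a per-language instance registry. SketchHyphenator is a
// hypothetical stand-in, not the real Org_Heigl_Hyphenator class.
class SketchHyphenator
{
    /** @var SketchHyphenator[] instances keyed by language code */
    private static $instances = array();

    private $language;

    private function __construct($language)
    {
        $this->language = $language;
    }

    public static function getInstance($language = 'en')
    {
        // Assumed validation; the real condition is not documented.
        if (!is_string($language) || $language === '') {
            throw new InvalidArgumentException('Language must be a non-empty string');
        }
        if (!isset(self::$instances[$language])) {
            // Created once, then reused for every later call.
            self::$instances[$language] = new self($language);
        }
        return self::$instances[$language];
    }

    public function getLanguage()
    {
        return $this->language;
    }
}

var_dump(SketchHyphenator::getInstance('de') === SketchHyphenator::getInstance('de')); // bool(true)
var_dump(SketchHyphenator::getInstance('de') === SketchHyphenator::getInstance('en')); // bool(false)
```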
static getTexFile (line 432)

This method returns the name of the TeX hyphenation file for a language code

  • access: public
static string getTexFile (string $language)
  • string $language: The language code to get the TeX file for
static parse (line 256)

This is the static way of hyphenating a string.

This method gets the appropriate Hyphenator-object and calls the method hyphenate() on it.

  • return: The hyphenated string
  • access: public
static string parse (string $string, [string $options = null])
  • string $string: The String to hyphenate
  • string $options: The Options to use for Hyphenation
static parseTexFile (line 350)

This method parses a TeX hyphenation file and creates the appropriate PHP hyphenation file

  • access: public
static boolean parseTexFile (string $file, string $parsedFile)
  • string $file: The original TeX file
  • string $parsedFile: The PHP file to be created
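
As a rough illustration of the parsing step, the sketch below pulls the entries out of the \patterns{...} block of a TeX hyphenation file. The function extractTexPatterns() and the sample content are hypothetical; how parseTexFile() reads the file and which format the generated PHP file has are not documented here.

```php
<?php
// Sketch only: extractTexPatterns() is a hypothetical helper and the
// sample below is not taken from a real TeX distribution.
function extractTexPatterns($tex)
{
    // Drop TeX comments, which run from '%' to the end of the line.
    $tex = preg_replace('/%.*$/m', '', $tex);
    // Capture everything between "\patterns{" and the closing brace.
    if (!preg_match('/\\\\patterns\s*\{([^}]*)\}/', $tex, $match)) {
        return array();
    }
    // Patterns are separated by whitespace.
    return preg_split('/\s+/', trim($match[1]), -1, PREG_SPLIT_NO_EMPTY);
}

$sampleTex = <<<'TEX'
% hypothetical excerpt
\patterns{
.ab1a % a comment
1ba
}
TEX;

print_r(extractTexPatterns($sampleTex));
```

The result is a flat list of pattern strings that could then be written to a PHP file, for example with var_export().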
static setCache (line 491)

Set an instance of Zend_Cache as Caching-Backend.

static boolean setCache ( $cache)
  • Zend_Cache $cache: The caching Backend
static setDefaultLanguage (line 286)

Set the default Language

  • access: public
static void setDefaultLanguage (string $language)
  • string $language: The language to set.
Constructor __construct (line 518)

This is the constructor, which initialises the hyphenator for the given language $language

This constructor is meant to be called only via the getInstance() method, so that the initialisation is done only once for each language.

  • throws: Exception
  • access: public
Org_Heigl_Hyphenator __construct ([string $language = 'en'])
  • string $language: The language to use for hyphenating
cacheRead (line 966)

Get the cached string for a key

  • access: public
string cacheRead (string $key)
  • string $key: The key to return the cached string for
cacheWrite (line 947)

Write string to the cache.

The string can be retrieved using the key.

  • access: public
Org_Heigl_Hyphenator cacheWrite (string $key, string $string)
  • string $key: The key under which the string can be found in the cache
  • string $string: The string to cache
enableCaching (line 922)

Enable or disable caching of hyphenated texts

  • access: public
Org_Heigl_Hyphenator enableCaching ([boolean $caching = true])
  • boolean $caching: Whether to enable caching or not. Defaults to true
getCustomizationMarker (line 1075)

Get the string that shall be prepended to a customized word.

  • access: public
string getCustomizationMarker ()
getCustomMarker (line 1028)

Get the marker for custom hyphenations

  • access: public
string getCustomMarker ()
getHyphen (line 859)

Get the hyphenation character

  • access: public
string getHyphen ()
getNoHyphenMarker (line 1037)

Get the marker for words that are not to be hyphenated

  • access: public
string getNoHyphenMarker ()
hyphenate (line 566)

This method does the actual hyphenation.

The given $string is split into chunks (i.e. words) at every blank.

After that, every chunk is hyphenated and the array of chunks is merged into a single string using blanks again.

This method does not take into account word delimiters other than blanks (e.g. line breaks or tab stops), and it will fail with texts containing any kind of markup.

  • return: The hyphenated string
  • access: public
string hyphenate (string $string)
  • string $string: The string to hyphenate
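
The split, hyphenate and join steps described above can be sketched as follows. hyphenateBlankSeparated() and the toy word hyphenator are assumptions made for illustration, not the code of the class.

```php
<?php
// Sketch only: hyphenateBlankSeparated() is a hypothetical helper that
// mirrors the documented steps; $hyphenateWord stands in for hyphenateWord().
function hyphenateBlankSeparated($string, $hyphenateWord)
{
    $chunks = explode(' ', $string);               // split at every blank
    $chunks = array_map($hyphenateWord, $chunks);  // hyphenate each chunk
    return implode(' ', $chunks);                  // merge using blanks again
}

// Toy word hyphenator: breaks words longer than six bytes after the third byte.
$toy = function ($word) {
    return strlen($word) > 6 ? substr($word, 0, 3) . '-' . substr($word, 3) : $word;
};

echo hyphenateBlankSeparated('a wonderful day', $toy), "\n"; // a won-derful day
```

Because only blanks are treated as delimiters, two words separated by a tab reach the word hyphenator as a single chunk.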
hyphenateWord (line 600)

This method hyphenates a single word

  • return: the hyphenated word
  • access: public
string hyphenateWord (string $word)
  • string $word: The Word to hyphenate
isCachingEnabled (line 933)

Check whether caching is enabled or not

  • access: public
boolean isCachingEnabled ()
markCustomization (line 1051)

Set and retrieve whether or not to mark custom hyphenations

This method always returns the current setting, so you can set AND retrieve the value with this method.

  • access: public
boolean markCustomization ([null|boolean $mark = null])
  • null|boolean $mark: Whether or not to mark
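
A combined setter and getter of this kind can be sketched as below. SketchSettings is a hypothetical class, and the default value of false is an assumption made for this illustration.

```php
<?php
// Sketch only: SketchSettings is a hypothetical class; the default
// of false is an assumption, not a documented value.
class SketchSettings
{
    private $markCustomization = false;

    public function markCustomization($mark = null)
    {
        // null means "just read"; anything else updates the setting first.
        if ($mark !== null) {
            $this->markCustomization = (bool) $mark;
        }
        return $this->markCustomization;
    }
}

$settings = new SketchSettings();
var_dump($settings->markCustomization());     // bool(false)
var_dump($settings->markCustomization(true)); // bool(true)
var_dump($settings->markCustomization());     // bool(true)
```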
setCustomHyphen (line 1004)

Set a string that will be replaced with the soft hyphen before hyphenation actually starts.

If this string is found in a word, no hyphenation is done except at the places where the custom hyphen has been found.

  • access: public
Org_Heigl_Hyphenator setCustomHyphen ([string $customHyphen = null])
  • string $customHyphen: The Custom Hyphen to set
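
The effect of a custom hyphen on a single word can be sketched as follows. applyCustomHyphen() is a hypothetical helper; returning null for words without a custom hyphen is a convention chosen for this illustration.

```php
<?php
// Sketch only: applyCustomHyphen() is a hypothetical helper, not a
// method of Org_Heigl_Hyphenator.
function applyCustomHyphen($word, $customHyphen, $hyphen)
{
    if ($customHyphen === '' || strpos($word, $customHyphen) === false) {
        return null; // no custom hyphen: regular hyphenation would apply
    }
    // Only the marked places become break points; nothing else is touched.
    return str_replace($customHyphen, $hyphen, $word);
}

echo applyCustomHyphen('Donau--dampfschiff', '--', '-'), "\n"; // Donau-dampfschiff
```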
setCustomizationMarker (line 1065)

Set the string that shall be prepended to a customized word.

  • access: public
Org_Heigl_Hyphenator setCustomizationMarker (string $marker)
  • string $marker: The Marker to set
setHyphen (line 849)

This method sets the Hyphenation-Character.

  • return: Provides fluent Interface
  • access: public
Org_Heigl_Hyphenator setHyphen (string $char)
  • string $char: The Hyphenation Character
setLeftMin (line 871)

This method sets the minimum number of characters that have to stay to the left of a hyphenation

  • return: Provides fluent Interface
  • access: public
Org_Heigl_Hyphenator setLeftMin (int $count)
  • int $count: The left minimum
setNoHyphenateMarker (line 1017)

Set a string that marks a word as not to be hyphenated

  • access: public
Org_Heigl_Hyphenator setNoHyphenateMarker ([string $marker = null])
  • string $marker: The marker that marks a word
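
A check for such a marker might look like the sketch below. stripNoHyphenateMarker() is a hypothetical helper, and removing the marker from the returned word is an assumption; the behaviour of the class is not specified here.

```php
<?php
// Sketch only: stripNoHyphenateMarker() is a hypothetical helper. Whether
// the class removes the marker from the output is an assumption.
function stripNoHyphenateMarker($word, $marker)
{
    // The marker only counts at the very start of the word.
    if ($marker !== '' && strpos($word, $marker) === 0) {
        return substr($word, strlen($marker)); // word stays unhyphenated
    }
    return null; // not marked: regular hyphenation would apply
}

echo stripNoHyphenateMarker('nbr:Hyphenator', 'nbr:'), "\n"; // Hyphenator
```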
setQuality (line 988)

Set the minimum quality that a hyphenation needs to have

The lower the number, the better the quality

  • access: public
Org_Heigl_Hyphenator setQuality ([int $quality = 5])
  • int $quality: The quality-level to set
setRightMin (line 884)

This method sets the minimum number of characters that have to stay to the right of a hyphenation

  • return: Provides fluent Interface
  • access: public
Org_Heigl_Hyphenator setRightMin (int $count)
  • int $count: The minimum characters
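
The combined effect of setLeftMin() and setRightMin() can be pictured as a filter over candidate break positions. filterBreakPositions() is a hypothetical helper for illustration; a position is taken to be the number of characters to the left of the break.

```php
<?php
// Sketch only: filterBreakPositions() is a hypothetical helper. A position
// is the number of characters that stay to the left of the break.
function filterBreakPositions(array $positions, $wordLength, $leftMin, $rightMin)
{
    $kept = array();
    foreach ($positions as $position) {
        $left  = $position;
        $right = $wordLength - $position;
        if ($left >= $leftMin && $right >= $rightMin) {
            $kept[] = $position;
        }
    }
    return $kept;
}

// "hyphenation" has 11 letters and candidate breaks after 2 and 6 letters.
print_r(filterBreakPositions(array(2, 6), 11, 3, 2)); // keeps only 6
print_r(filterBreakPositions(array(2, 6), 11, 2, 6)); // keeps only 2
```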
setSpecialChars (line 909)

This method sets the special Characters for a specified language

  • return: Provides fluent Interface
  • access: public
Org_Heigl_Hyphenator setSpecialChars (string $chars)
  • string $chars: The special characters
setSpecialStrings (line 836)

Set the special strings

These are strings that can be used for further parsing of the text.

For instance a string to be replaced with a soft return or any other symbol your application needs.

  • access: public
Org_Heigl_Hyphenator setSpecialStrings ([array $specialStrings = array ()])
  • array $specialStrings: An array of special strings.
setWordMin (line 897)

This method sets the minimum number of characters a word must have before it is hyphenated

  • return: Provides fluent Interface
  • access: public
Org_Heigl_Hyphenator setWordMin (int $count)
  • int $count: The minimum characters
_handleSpecialStrings (line 806)

Handle special strings

Hyphenate words containing special strings for further processing: a zero-width space is put after the special string, and the parts separated by the special string are hyphenated individually.

  • access: public
string|false _handleSpecialStrings (string $word)
  • string $word: The Word to hyphenate
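
The behaviour described for special strings can be sketched as follows. handleSpecialStringsSketch() is a hypothetical function; using only the first special string that occurs in the word and returning false when none is found are assumptions based on the documented return type.

```php
<?php
// Sketch only: handleSpecialStringsSketch() is hypothetical and makes the
// assumptions named above; $hyphenateWord stands in for hyphenateWord().
function handleSpecialStringsSketch($word, array $specialStrings, $hyphenateWord)
{
    $zeroWidthSpace = "\u{200B}"; // PHP 7+ syntax, UTF-8 encoded
    foreach ($specialStrings as $special) {
        if ($special === '' || strpos($word, $special) === false) {
            continue;
        }
        // Hyphenate the parts separately, then rejoin them with the
        // special string followed by a zero-width space.
        $parts = array_map($hyphenateWord, explode($special, $word));
        return implode($special . $zeroWidthSpace, $parts);
    }
    return false; // no special string found in the word
}

// strtoupper stands in for a real word hyphenator to make the parts visible.
echo handleSpecialStringsSketch('foo/bar', array('/'), 'strtoupper'), "\n";
```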
Class Constants
QUALITY_HIGH = 7 (line 93)
QUALITY_HIGHEST = 9 (line 92)
QUALITY_LOW = 3 (line 95)
QUALITY_LOWEST = 1 (line 96)
QUALITY_NORMAL = 5 (line 94)

Documentation generated on Mon, 14 Jun 2010 16:57:06 +0200 by phpDocumentor 1.4.3