Module description

rgx -- Regular expressions
The rgx module implements regular expressions. It provides words for compiling patterns and for matching and searching strings against them. The actual expression building and matching is delegated to the nfe module.
    This module uses the following syntax:                                 
     .   Match any char [incl. newline]     *   Match zero or more         
     +   Match one or more                  ?   Match zero or one          
     |   Match alternatives                 []  Class                      
     ()  Group or subexpression
    Backslash characters:                                                  
     \.  Character .                       \*   Character *                
     \+  Character +                       \?   Character ?                
     \|  Character |                       \\   Backslash                  
     \[  Character [                                                       
     \r  Carriage return                   \n   Line feed                  
     \t  Horizontal tab                    \e   Escape                     
     \d  Digits class: [0-9]               \D   Non-digits: [^0-9]
     \w  Word class: [0-9a-zA-Z_]          \W   Non-word: [^0-9a-zA-Z_]
     \s  Whitespace                        \S   Non-whitespace
     All other backslash characters simply return the trailing character,  
     but this can change in future versions.                               
      [abc]  - match a or b or c                                           
      [^abc] - match everything except a or b or c                         
      [a-z]  - match a or b or .. z                                        
      [-abc] - match - or a or b or c                                      
      []abc] - match ] or a or b or c                                      
      [\d\n] - match digit or line feed                                    
     Backslash characters in classes:                                      
      \r  Carriage return                \n    Line feed                   
      \t  Horizontal tab                 \e    Escape                      
      \]  Character ]                    \-    Character -                 
      \d  Digits class: [0-9]            \w    Word class: [0-9a-zA-Z_]    
      \s  Whitespace                                                       
     All other backslash characters simply return the trailing character,  
     but this can change in future versions.                               
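
To make the syntax above concrete, the following sketch compiles a handful of patterns that use only the constructs listed in this section. The word compiles? is a hypothetical helper, not part of the module: it builds a temporary expression on the heap with rgx-new, keeps only the success flag of rgx-compile and releases the expression again with rgx-free.

```forth
include ffl/rgx.fs

\ compiles? is a hypothetical helper, not part of the rgx module.
\ It compiles the pattern into a temporary heap expression and
\ returns only the success flag, dropping any error offset.
: compiles?  ( c-addr u -- flag )
  rgx-new >r
  r@ rgx-compile
  dup 0= if nip then          \ on failure: drop the error offset n
  r> rgx-free
;

s" a.c"         compiles? .   \ any char between a and c
s" (ab|cd)*e?"  compiles? .   \ group, alternatives, repetition, optional
s" [^abc]+"     compiles? .   \ negated class, one or more
s" []abc]"      compiles? .   \ class containing a literal ]
s" [\d\n]"      compiles? .   \ backslash characters inside a class
s" \w+\s\w+"    compiles? .   \ two words separated by one whitespace char
cr
```

Note that s" does not interpret backslashes itself, so sequences such as \d reach the regular expression compiler unchanged.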

Regular expression structure

rgx% ( -- n )
Get the required space for a rgx variable

Regular expression creation, initialisation and destruction

rgx-init ( rgx -- )
Initialise the regular expression
rgx-create ( "<spaces>name" -- ; -- rgx )
Create a named regular expression in the dictionary
rgx-new ( -- rgx )
Create a new regular expression on the heap
rgx-free ( rgx -- )
Free the regular expression from the heap
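
The ways of obtaining an expression variable can be put side by side. This is a minimal sketch: the names rx-dict, rx-raw and rx-heap are invented for the illustration, and the manual variant assumes that rgx% reports its size in address units, so that it can be passed to allot.

```forth
include ffl/rgx.fs

\ 1. Named variable in the dictionary, created and initialised in one step
rgx-create rx-dict

\ 2. The same by hand: reserve rgx% address units, then initialise them
\    (assumption: rgx% is a size suitable for ALLOT)
create rx-raw  rgx% allot
rx-raw rgx-init

\ 3. Variable on the heap; it has to be released with rgx-free
rgx-new value rx-heap

\ ... compile and match with any of the three variables here ...

rx-heap rgx-free
```

Dictionary variables live as long as the dictionary entry does, so only the heap variant needs an explicit rgx-free.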

Regular expression words

rgx-compile ( c-addr u rgx -- true | n false )
Compile a pattern as a regular expression; return true on success, or the error offset n and false on failure
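
Because rgx-compile leaves an extra cell on the stack only when it fails, both outcomes have to be handled to keep the stack balanced. The sketch below wraps this in a hypothetical word, compile-pattern, which prints the error offset and always returns a single flag. The variable names rx-ok and rx-bad are likewise invented for the illustration.

```forth
include ffl/rgx.fs

rgx-create rx-ok
rgx-create rx-bad

\ compile-pattern is a hypothetical wrapper, not part of the module.
: compile-pattern  ( c-addr u rgx -- flag )
  rgx-compile if
    true
  else
    ." Pattern rejected at offset " . cr   \ consumes the error offset n
    false
  then
;

s" (ab|cd)+" rx-ok compile-pattern drop

\ An unclosed group; if the compiler rejects it, the offset is printed
s" (ab" rx-bad compile-pattern drop
```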
rgx-cmatch? ( c-addr u rgx -- flag )
Match a string case-sensitively against the regular expression, return the match result
rgx-imatch? ( c-addr u rgx -- flag )
Match a string case-insensitively against the regular expression, return the match result
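
The difference between the two matching words shows up when the test string differs from the pattern only in case. A minimal sketch, with rx-word as an invented variable name:

```forth
include ffl/rgx.fs

rgx-create rx-word

\ Compile; on failure discard the error offset to keep the stack clean
s" hello" rx-word rgx-compile 0= [IF] drop [THEN]

s" hello" rx-word rgx-cmatch? .   \ same case as the pattern
s" HeLLo" rx-word rgx-cmatch? .   \ differs from the pattern in case only
s" HeLLo" rx-word rgx-imatch? .   \ case is ignored
cr
```

The first and third line are expected to print a true flag, the second a false flag.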
rgx-csearch ( c-addr u rgx -- n )
Search case-sensitively in a string for the first match of the regular expression; return the offset in the string, or -1 if not found
rgx-isearch ( c-addr u rgx -- n:index )
Search case-insensitively in a string for the first match of the regular expression; return the offset in the string, or -1 if not found
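
Unlike the matching words, the search words return a number, so the -1 result has to be tested explicitly. The following sketch looks for the first run of digits in two strings. The variable rx-num and the reporting word .found are hypothetical names introduced for the example.

```forth
include ffl/rgx.fs

rgx-create rx-num
s" \d+" rx-num rgx-compile 0= [IF] drop [THEN]

\ .found is a hypothetical helper that reports a search result
: .found  ( n -- )
  dup -1 = if
    drop ." not found" cr
  else
    ." found at offset " . cr
  then
;

s" abc123def" rx-num rgx-csearch .found
s" no digits" rx-num rgx-csearch .found
```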
rgx-result ( n rgx -- n1 n2 )
Get the match result of the nth group; return the match start n2 and the match end n1. Group 0 is the result of the whole match
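
After a successful match, rgx-result can be queried once per group. The sketch below prints the raw offsets of the whole match and of two groups. The variable rx-ver and the word .group are hypothetical names. Whether the end offset is inclusive or exclusive is not stated in this section, so the example only prints the values instead of cutting substrings.

```forth
include ffl/rgx.fs

rgx-create rx-ver
s" (\d+)\.(\d+)" rx-ver rgx-compile 0= [IF] drop [THEN]

\ .group is a hypothetical helper: print start and end of group n
: .group  ( n rgx -- )
  over ." group " .
  rgx-result              \ -- n1 n2 : end below, start on top
  ." start=" .  ." end=" .  cr
;

s" 12.47" rx-ver rgx-cmatch? [IF]
  0 rx-ver .group         \ the whole match
  1 rx-ver .group         \ the digits before the dot
  2 rx-ver .group         \ the digits after the dot
[THEN]
```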


rgx-dump ( rgx -- )
Dump the regular expression


\ ==============================================================================
\          rgx_expl - the regular expression example in the ffl
\               Copyright (C) 2007  Dick van Oudheusden
\  $Date: 2008-10-06 18:22:09 $ $Revision: 1.3 $
\ ==============================================================================

include ffl/rgx.fs

\ Create a regular expression variable rgx1 

rgx-create rgx1

\ Compile a regular expression and check the result

s" ((a*)b)*" rgx1 rgx-compile [IF]
  .( Expression successfully compiled) cr
[ELSE]
  .( Compilation failed on position:) . cr
[THEN]

\ Match a test string case-sensitively
s" abb" rgx1 rgx-cmatch? [IF]
  .( Test string matched) cr
[ELSE]
  .( No match) cr
[THEN]

\ Create a regular expression variable on the heap
rgx-new value rgx2

\ Compile a regular expression for matching a float number

s" [-+\s]?\d+(\.\d+)?" rgx2 rgx-compile [IF]
  .( Expression successfully compiled) cr
[ELSE]
  .( Compilation failed on position:) . cr
[THEN]

\ Match a float number

s" -12.47" rgx2 rgx-cmatch? [IF]
  .( Float number matched) cr
[ELSE]
  .( No match) cr
[THEN]

\ Free the variable from the heap

rgx2 rgx-free

