Introduction to Library Functions                    PCREBUILD(3)



NAME
     PCRE - Perl-compatible regular expressions

PCRE BUILD-TIME OPTIONS

     This document describes the optional features of  PCRE  that
     can be selected when the library is compiled. It assumes use
     of the configure script, where  the  optional  features  are
     selected  or  deselected  by  providing options to configure
     before running the make command. However, the  same  options
     can be selected in both Unix-like and non-Unix-like environ-
     ments using the GUI facility of CMakeSetup if you are  using
     CMake instead of configure to build PCRE.

     The complete list of options for configure  (which  includes
     the  standard ones such as the selection of the installation
     directory) can be obtained by running

       ./configure --help

     The following sections include descriptions of options whose
     names  begin  with  --enable  or  --disable.  These settings
     specify changes to the defaults for the  configure  command.
     Because of the way that configure works, --enable and
     --disable always come in pairs, so the complementary option
     always exists as well, but as it specifies the default, it
     is not described.
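
     The pairing rule can be expressed mechanically. The following
     is a minimal sketch only: the function name complement_option
     is hypothetical and is not part of PCRE or of configure. It
     rewrites the option name and does not consult configure.

```shell
# complement_option: hypothetical helper, not part of PCRE or configure.
# Prints the opposite member of an --enable/--disable pair.
complement_option() {
    case "$1" in
        --enable-*)  printf '%s\n' "--disable-${1#--enable-}" ;;
        --disable-*) printf '%s\n' "--enable-${1#--disable-}" ;;
        *)           echo "not an --enable/--disable option: $1" >&2
                     return 1 ;;
    esac
}

complement_option --disable-cpp    # prints --enable-cpp
complement_option --enable-utf8    # prints --disable-utf8
```

     Options of the form --with-NAME=VALUE are not part of such a
     pair, so the sketch rejects them.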

C++ SUPPORT

     By default, the configure script will search for a C++
     compiler and C++ header files. If it finds them, it
     automatically builds the C++ wrapper library for PCRE. You
     can disable this by adding

       --disable-cpp

     to the configure command.

UTF-8 SUPPORT

     To build PCRE with support for UTF-8 character strings, add

       --enable-utf8

     to the configure command. Of itself, this does not make PCRE
     treat strings as UTF-8. As well as compiling PCRE with this
     option, you also have to set the PCRE_UTF8 option when you
     call the pcre_compile() function.

UNICODE CHARACTER PROPERTY SUPPORT




     UTF-8  support  allows  PCRE  to  process  character  values
     greater than 255 in the strings that it handles. On its own,
     however, it does not provide any  facilities  for  accessing
     the properties of such characters. If you want to be able to
     use the pattern escapes \P,  \p,  and  \X,  which  refer  to
     Unicode character properties, you must add

       --enable-unicode-properties

     to the configure command. This implies UTF-8  support,  even
     if you have not explicitly requested it.

     Including Unicode property support adds around 30K of tables
     to  the  PCRE  library. Only the general category properties
     such as Lu and Nd are supported. Details are  given  in  the
     pcrepattern documentation.

CODE VALUE OF NEWLINE

     By default, PCRE interprets character 10 (linefeed,  LF)  as
     indicating  the  end  of  a line. This is the normal newline
     character on Unix-like systems. You can compile PCRE to  use
     character 13 (carriage return, CR) instead, by adding

       --enable-newline-is-cr

     to the configure command. There is also a
     --enable-newline-is-lf option, which explicitly specifies
     linefeed as the newline character.

     Alternatively, you can specify that line endings are  to  be
     indicated  by  the  two character sequence CRLF. If you want
     this, add

       --enable-newline-is-crlf

     to the configure command. There is a fourth  option,  speci-
     fied by

       --enable-newline-is-anycrlf

     which causes PCRE to recognize any of  the  three  sequences
     CR,  LF,  or  CRLF  as  indicating a line ending. Finally, a
     fifth option, specified by

       --enable-newline-is-any

     causes PCRE to recognize any Unicode newline sequence.

     Whatever line ending convention is selected when PCRE is
     built can be overridden when the library functions are
     called. At build time it is conventional to use the standard
     for your operating system.
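
     The five newline settings differ only in the suffix of the
     option name, so a build script can derive the option from a
     convention name. This is a sketch under that assumption; the
     function newline_flag is hypothetical and only prints the
     option, it does not run configure.

```shell
# newline_flag: hypothetical helper that maps a newline convention to
# the corresponding configure option named in this section.
newline_flag() {
    case "$1" in
        cr|lf|crlf|anycrlf|any)
            printf '%s\n' "--enable-newline-is-$1" ;;
        *)
            echo "unknown newline convention: $1" >&2
            return 1 ;;
    esac
}

newline_flag crlf    # prints --enable-newline-is-crlf
```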

WHAT \R MATCHES

     By default, the sequence \R in a pattern matches any Unicode
     newline  sequence,  whatever  has  been selected as the line
     ending sequence. If you specify

       --enable-bsr-anycrlf

     the default is changed so that \R matches only  CR,  LF,  or
     CRLF.  Whatever  is selected when PCRE is built can be over-
     ridden when the library functions are called.

BUILDING SHARED AND STATIC LIBRARIES

     The PCRE building process uses libtool to build both  shared
     and  static  Unix libraries by default. You can suppress one
     of these by adding one of

       --disable-shared
       --disable-static

     to the configure command, as required.

POSIX MALLOC USAGE

     When PCRE is called through the  POSIX  interface  (see  the
     pcreposix  documentation),  additional  working  storage  is
     required for holding the pointers to  capturing  substrings,
     because  PCRE requires three integers per substring, whereas
     the POSIX interface provides only  two.  If  the  number  of
     expected  substrings  is  small,  the  wrapper function uses
     space on the stack, because this is faster than  using  mal-
     loc()  for  each call. The default threshold above which the
     stack is no longer used is 10; it can be changed by adding a
     setting such as

       --with-posix-malloc-threshold=20

     to the configure command.
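
     The arithmetic behind the threshold can be modelled in a few
     lines. This is an illustrative model, not the wrapper's actual
     code: the function posix_workspace is hypothetical, and it
     assumes the decision depends only on the substring count and
     the threshold, as the paragraph above describes.

```shell
# posix_workspace: hypothetical model of the decision described above.
# $1 = number of expected substrings, $2 = threshold (default 10).
posix_workspace() {
    count=$1
    threshold=${2:-10}
    if [ "$count" -gt "$threshold" ]; then
        where=heap      # above the threshold: malloc() is used
    else
        where=stack     # at or below the threshold: stack space
    fi
    # PCRE needs three integers per substring.
    echo "$((3 * $count)) integers on the $where"
}

posix_workspace 4        # prints 12 integers on the stack
posix_workspace 11       # prints 33 integers on the heap
posix_workspace 15 20    # prints 45 integers on the stack
```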

HANDLING VERY LARGE PATTERNS

     Within a compiled pattern, offset values are used  to  point
     from  one  part  to  another  (for  example, from an opening
     parenthesis to an alternation  metacharacter).  By  default,
     two-byte  values  are  used  for these offsets, leading to a
     maximum size for a compiled pattern of around 64K.  This  is
     sufficient  to  handle  all  but the most gigantic patterns.
     Nevertheless, some people do want to process enormous
     patterns, so it is possible to compile PCRE to use three-byte
     or four-byte offsets by adding a setting such as

       --with-link-size=3

     to the configure command. The value given must be 2,  3,  or
     4.  Using  longer  offsets  slows down the operation of PCRE
     because it has to load additional bytes when handling them.
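
     The 64K figure follows from the width of the offset: two bytes
     can hold 2^16 distinct values. The sketch below checks a link
     size and computes that range. Both function names are
     hypothetical, and the range is only an upper bound on what an
     offset of that width can express, not a statement of PCRE's
     exact pattern size limit.

```shell
# link_size_flag: hypothetical helper; accepts only the values that
# this section says are valid.
link_size_flag() {
    case "$1" in
        2|3|4) printf '%s\n' "--with-link-size=$1" ;;
        *)     echo "link size must be 2, 3, or 4" >&2
               return 1 ;;
    esac
}

# offset_range: number of distinct values an N-byte offset can hold.
offset_range() {
    echo $(( 1 << (8 * $1) ))
}

link_size_flag 3    # prints --with-link-size=3
offset_range 2      # prints 65536, the "around 64K" default
offset_range 3      # prints 16777216
```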

AVOIDING EXCESSIVE STACK USAGE

     When matching with the pcre_exec() function, PCRE implements
     backtracking  by making recursive calls to an internal func-
     tion called match(). In environments where the size  of  the
     stack  is limited, this can severely limit PCRE's operation.
     (The Unix environment does  not  usually  suffer  from  this
     problem,  but  it may sometimes be necessary to increase the
     maximum stack size.  There is a discussion in the  pcrestack
     documentation.)  An  alternative  approach to recursion that
     uses memory from the heap to remember data, instead of using
     recursive function calls, has been implemented to work round
     the problem of limited stack size. If you want  to  build  a
     version of PCRE that works this way, add

       --disable-stack-for-recursion

     to the configure command. With this configuration, PCRE will
     use the pcre_stack_malloc and pcre_stack_free variables to
     call memory management functions. By default these point to
     malloc() and free(), but you can replace the pointers so
     that your own functions are used.

     Separate functions are provided rather than using
     pcre_malloc and pcre_free because the usage is very
     predictable: the block sizes requested are always the same,
     and the blocks are always freed in reverse order. A calling
     program might be able to implement optimized functions that
     perform better than malloc() and free(). PCRE runs noticeably
     more slowly when built in this way. This option affects only
     the pcre_exec() function; it is not relevant for the
     pcre_dfa_exec() function.

LIMITING PCRE RESOURCE USAGE

     Internally, PCRE has a function called match(), which it
     calls repeatedly (sometimes recursively) when matching a
     pattern with the pcre_exec() function. By controlling the
     maximum number of times this function may be called during a
     single matching operation, a limit can be placed on the
     resources used by a single call to pcre_exec(). The limit
     can be changed at run time, as described in the pcreapi
     documentation. The default is 10 million, but this can be
     changed by adding a setting such as



       --with-match-limit=500000

     to the configure command. This setting has no effect on the
     pcre_dfa_exec() matching function.

     In some environments it is desirable to limit the  depth  of
     recursive  calls  of  match()  more  strictly than the total
     number of calls, in order to restrict the maximum amount  of
     stack  (or  heap, if --disable-stack-for-recursion is speci-
     fied) that  is  used.  A  second  limit  controls  this;  it
     defaults  to  the  value that is set for --with-match-limit,
     which imposes no additional constraints.  However,  you  can
     set a lower limit by adding, for example,

       --with-match-limit-recursion=10000

     to the configure command. This value can also be  overridden
     at run time.
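
     Because the recursion limit defaults to the match limit, only
     a lower value changes anything. The sketch below builds both
     options and refuses a recursion limit above the match limit.
     The function match_limit_flags is hypothetical, and treating
     the larger value as an error is this sketch's own policy, not
     behaviour of configure.

```shell
# match_limit_flags: hypothetical helper.
# $1 = match limit, $2 = recursion depth limit.
match_limit_flags() {
    ml=$1
    mlr=$2
    if [ "$mlr" -gt "$ml" ]; then
        echo "recursion limit $mlr exceeds match limit $ml" >&2
        return 1
    fi
    printf '%s\n' "--with-match-limit=$ml --with-match-limit-recursion=$mlr"
}

match_limit_flags 500000 10000
# prints --with-match-limit=500000 --with-match-limit-recursion=10000
```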

CREATING CHARACTER TABLES AT BUILD TIME

     PCRE uses fixed tables for processing characters whose code
     values are less than 256. By default, PCRE is built with a
     set of tables that are distributed in the file
     pcre_chartables.c.dist. These tables are for ASCII codes
     only. If you add

       --enable-rebuild-chartables

     to the configure command, the distributed tables are no
     longer used. Instead, a program called dftables is compiled
     and run. This outputs the source for a new set of tables,
     created in the default locale of your C runtime system.
     (This method of replacing the tables does not work if you
     are cross compiling, because dftables is run on the local
     host. If you need to create alternative tables when cross
     compiling, you will have to do so "by hand".)

USING EBCDIC CODE

     PCRE assumes by default that it will run in an environment
     where the character code is ASCII (or Unicode, which is a
     superset of ASCII). This is the case for most computer
     operating systems. PCRE can, however, be compiled to run in
     an EBCDIC environment by adding

       --enable-ebcdic

     to the configure command. This setting implies
     --enable-rebuild-chartables. You should only use it if you
     know that you are in an EBCDIC environment (for example, an
     IBM mainframe operating system).



PCREGREP OPTIONS FOR COMPRESSED FILE SUPPORT

     By default, pcregrep reads all files as plain text. You  can
     build  it so that it recognizes files whose names end in .gz
     or .bz2, and reads them with libz or  libbz2,  respectively,
     by adding one or both of

       --enable-pcregrep-libz
       --enable-pcregrep-libbz2

     to the configure command. These  options  naturally  require
     that  the  relevant  libraries are installed on your system.
     Configuration will fail if they are not.

PCRETEST OPTION FOR LIBREADLINE SUPPORT

     If you add

       --enable-pcretest-libreadline

     to the  configure  command,  pcretest  is  linked  with  the
     libreadline  library, and when its input is from a terminal,
     it reads it using the  readline()  function.  This  provides
     line-editing  and  history facilities. Note that libreadline
     is GPL-licenced, so if you distribute a binary  of  pcretest
     linked in this way, there may be licensing issues.

     Setting this option causes the -lreadline option to be added
     to the pcretest build. In many operating environments with a
     system-installed libreadline this is sufficient. However, in
     some environments (e.g. if an unmodified distribution
     version of readline is in use), some extra configuration may
     be necessary. The INSTALL file for libreadline says this:

       "Readline uses the termcap functions, but does not link
       with the termcap or curses library itself, allowing
       applications which link with readline the to choose an
       appropriate library."

     If your environment has not been set up so that an appropri-
     ate  library  is automatically included, you may need to add
     something like

       LIBS="-lncurses"

     immediately before the configure command.
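
     Placing an assignment immediately before a command sets the
     variable in that command's environment only. The sketch below
     demonstrates this with a child shell standing in for
     configure, so it can be run outside a PCRE source tree. It
     assumes the ncurses library is the one wanted, whose linker
     option is conventionally written -lncurses.

```shell
# The child sh stands in for ./configure (an assumption made so that
# the example runs anywhere); it reports the LIBS value it inherited.
seen=$(LIBS="-lncurses" sh -c 'printf "%s" "$LIBS"')
echo "the command saw LIBS=$seen"    # prints the command saw LIBS=-lncurses
```

     In a real build the same prefix would precede the configure
     command together with --enable-pcretest-libreadline.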

SEE ALSO

     pcreapi(3), pcre_config(3).




AUTHOR

     Philip Hazel
     University Computing Service
     Cambridge CB2 3QH, England.

REVISION

     Last updated: 13 April 2008
     Copyright (c) 1997-2008 University of Cambridge.

ATTRIBUTES
     See attributes(5) for descriptions of the following
     attributes:

     ATTRIBUTE TYPE        ATTRIBUTE VALUE
     Availability          SUNWpcre
     Interface Stability   Uncommitted

NOTES
     Source for PCRE is available on http://opensolaris.org.

SunOS 5.10                Last change:                          7


