mod_perl(3) User Contributed Perl Documentation mod_perl(3)
NAME
mod_perl - Embed a Perl interpreter in the Apache HTTP server
DESCRIPTION
The Apache/Perl integration project brings together the full power of
the Perl programming language and the Apache HTTP server. This is
achieved by linking the Perl runtime library into the server and
providing an object oriented Perl interface to the server's C language
API. These pieces are seamlessly glued together by the `mod_perl'
server plugin, making it possible to write Apache modules entirely
in Perl. In addition, the persistent interpreter embedded in the
server avoids the overhead of starting an external interpreter and the
penalty of Perl start-up (compile) time.
Without question, the most popular Apache/Perl module is the
Apache::Registry module. This module emulates the CGI environment,
allowing programmers to write scripts that run under CGI or mod_perl
without change. Existing CGI scripts may require some changes, simply
because a CGI script has a very short lifetime of one HTTP request,
allowing you to get away with "quick and dirty" scripting. Using
mod_perl and Apache::Registry requires you to be more careful, but it
also gives new meaning to the word "quick"! Apache::Registry maintains
a cache of compiled scripts; a script is compiled the first time it is
accessed by a child server, and again if the file is updated on disk.
Although it may be all you need, a speedy CGI replacement is only a
small part of this project. Callback hooks are in place for each stage
of a request. Apache-Perl modules may step in during the handler,
header parser, uri translate, authentication, authorization, access,
type check, fixup and logger stages of a request.
FAQ
The mod_perl FAQ is maintained by Frank Cringle:
http://perl.apache.org/faq/
Apache/Perl API
See 'perldoc Apache' for info on how to use the Perl-Apache API.
See the lib/ directory for example modules and apache-modlist.html for
a comprehensive list.
See the eg/ directory for example scripts.
mod_perl
For using mod_perl as a CGI replacement see the cgi_to_mod_perl
document.
You may load modules at server startup via:
PerlModule Apache::SSI SomeOther::Module
Optionally:
PerlRequire perl-scripts/script_to_load_at_startup.pl
A PerlRequire file is commonly used for initialization during server
startup time. A PerlRequire file name can be absolute or relative to
ServerRoot or a path in @INC. A PerlRequire'd file must return a true
value, i.e., the end of this file should have a:
1; #return true value
See eg/startup.pl for an example to start with.
In an httpd.conf or .htaccess you need:
PerlHandler sub_routine_name
This is the name of the subroutine to call to handle each request,
e.g. in the PerlModule Apache::Registry this is
"Apache::Registry::handler".
If PerlHandler is not a defined subroutine, mod_perl assumes it is a
package name which defines a subroutine named "handler".
PerlHandler Apache::Registry
Would load Registry.pm (if it is not already) and call its subroutine
"handler".
There are several stages of a request where the Apache API allows a
module to step in and do something. The Apache documentation will tell
you all about those stages and what your modules can do. By default,
these hooks are disabled at compile time; see the INSTALL document for
information on enabling these hooks. The following configuration
directives take one argument, which is the name of the subroutine to
call. If the value is not a subroutine name, mod_perl assumes it is a
package name which implements a 'handler' subroutine.
PerlChildInitHandler (requires apache_1.3.0 or higher)
PerlPostReadRequestHandler (requires apache_1.3.0 or higher)
PerlInitHandler
PerlTransHandler
PerlHeaderParserHandler
PerlAccessHandler
PerlAuthenHandler
PerlAuthzHandler
PerlTypeHandler
PerlFixupHandler
PerlHandler
PerlLogHandler
PerlCleanupHandler
PerlChildExitHandler (requires apache_1.3.0 or higher)
Only ChildInit, ChildExit, PostReadRequest and Trans handlers are not
allowed in .htaccess files.
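As a minimal sketch of a package that any of the directives above could name, here is a module that defines a "handler" subroutine. The package name My::Hello and its output are hypothetical; the request methods are the same ones used in the examples later in this document. The handler receives the request object as its first argument and returns a status, where 0 corresponds to OK:

```perl
# Hypothetical module, e.g. saved as My/Hello.pm somewhere in @INC.
package My::Hello;
use strict;

# Called by the server with the request object as its first argument.
sub handler {
    my $r = shift;
    $r->content_type("text/plain");
    $r->send_http_header;
    $r->print("hello from My::Hello\n");
    return 0;    # 0 is the OK status
}

1;
```

With the module installed, it could be enabled for a location with 'SetHandler perl-script' and 'PerlHandler My::Hello'.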
Modules can check if the code is being run in the parent server during
startup by checking the $Apache::Server::Starting variable.
RESTARTING
PerlFreshRestart
By default, if a server is restarted (ala kill -USR1 `cat
logs/httpd.pid`), Perl scripts and modules are not reloaded. To
reload PerlRequire's, PerlModule's, other use()'d modules and flush
the Apache::Registry cache, enable with this command:
PerlFreshRestart On
PERL_DESTRUCT_LEVEL
With Apache versions 1.3.0 and higher, mod_perl will call the
perl_destruct() Perl API function during the child exit phase.
This will cause proper execution of END blocks found during server
startup along with invoking the DESTROY method on global objects
that are still alive. It is possible that this operation may take a
long time to finish, causing problems during a restart. If your
code does not contain any END blocks or DESTROY methods which need
to be run during child server shutdown, this destruction can be
avoided by setting the PERL_DESTRUCT_LEVEL environment variable to
"-1".
ENVIRONMENT
Under CGI the Perl hash %ENV is magical in that it inherits environment
variables from the parent process and will set them should a process
spawn a child. However, with mod_perl we're in the parent process that
would normally set up the common environment variables before spawning a
CGI process. Therefore, mod_perl must feed these variables to %ENV
directly. Normally, this does not happen until the response stage of a
request when "PerlHandler" is called. If you wish to set variables
that will be available before then, such as for a "PerlAuthenHandler",
you may use the "PerlSetEnv" configuration directive:
PerlSetEnv SomeKey SomeValue
You may also use the "PerlPassEnv" directive to pass an already exist-
ing environment variable to Perl's %ENV:
PerlPassEnv SomeKey
CONFIGURATION
The "PerlSetVar" and "PerlAddVar" directives provide a simple mech-
anism for passing information from configuration files to Perl mod-
ules or Registry scripts.
The "PerlSetVar" directive allows you to set a key/value pair.
PerlSetVar SomeKey SomeValue
Perl modules or scripts retrieve configuration values using the
"$r->dir_config" method.
$SomeValue = $r->dir_config('SomeKey');
The "PerlAddVar" directive allows you to emulate Perl arrays:
PerlAddVar SomeKey FirstValue
PerlAddVar SomeKey SecondValue
... ... ...
PerlAddVar SomeKey Nth-Value
In the Perl modules the values are extracted using the
"$r->dir_config->get" method.
@array = $r->dir_config->get('SomeKey');
Alternatively in your code you can extend the setting with:
$r->dir_config->add(SomeKey => 'Bar');
"PerlSetVar" and "PerlAddVar" handle keys case-insensitively.
GATEWAY_INTERFACE
The standard CGI environment variable GATEWAY_INTERFACE is set to
"CGI-Perl/1.1" when running under mod_perl.
MOD_PERL
The environment variable `MOD_PERL' is set so scripts can say:
if(exists $ENV{MOD_PERL}) {
    #we're running under mod_perl
    ...
}
else {
    #we're NOT running under mod_perl
}
BEGIN blocks
Perl executes "BEGIN" blocks during the compile time of code as soon as
possible. The same is true under mod_perl. However, since mod_perl
normally only compiles scripts and modules once, in the parent server
or once per-child, "BEGIN" blocks in that code will only be run once.
As perlmod explains, once a "BEGIN" has run, it is immediately
undefined. In the mod_perl environment, this means "BEGIN" blocks will
not be run during each incoming request unless that request happens to
be one that is compiling the code.
Modules and files pulled in via require/use which contain "BEGIN"
blocks will be executed:
- only once, if pulled in by the parent process
- once per-child process if not pulled in by the parent process
- an additional time, once per-child process if the module is pulled
in off of disk again via Apache::StatINC
- an additional time, in the parent process on each restart if Perl-
FreshRestart is On
- unpredictable if you fiddle with %INC yourself
Apache::Registry scripts which contain "BEGIN" blocks will be executed:
- only once, if pulled in by the parent process via
Apache::RegistryLoader
- once per-child process if not pulled in by the parent process
- an additional time, once per-child process if the script file has
changed on disk
- an additional time, in the parent process on each restart if pulled
in by the parent process via Apache::RegistryLoader and
PerlFreshRestart is On
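The run-once behavior can be seen in plain Perl, outside of any server. The following is only an illustrative sketch: the subroutine name handle_request is hypothetical and stands in for code that is compiled once and then called for every request by a long-lived interpreter.

```perl
use strict;
use warnings;

our ($begin_runs, $requests);

# Runs once, while this file is being compiled.
BEGIN { $begin_runs++ }

# Stand-in for the code that runs on each request (hypothetical name).
sub handle_request {
    $requests++;
    return "request $requests: BEGIN has run $begin_runs time(s)";
}

# Simulate three requests served by the same long-lived interpreter.
print handle_request(), "\n" for 1 .. 3;
```

The counter set in the "BEGIN" block stays at one no matter how many times the subroutine is called, which is why per-request initialization does not belong in a "BEGIN" block.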
END blocks
As perlmod explains, an "END" subroutine is executed as late as
possible, that is, when the interpreter is being exited. In the mod_perl
environment, the interpreter does not exit until the server is shut
down. However, mod_perl does make a special case for Apache::Registry
scripts.
Normally, "END" blocks are executed by Perl during its "perl_run()"
function, which is called once each time the Perl program is executed,
e.g. once per (mod_cgi) CGI script. However, mod_perl only calls
"perl_run()" once, during server startup. Any "END" blocks encountered
during main server startup, i.e. those pulled in by the PerlRequire or
by any PerlModule, are suspended and run at server shutdown, aka
"child_exit" (requires apache 1.3.0 or higher). Any "END" blocks that
are encountered during compilation of Apache::Registry scripts are
called after the script is done running, including subsequent
invocations when the script is cached in memory. All other "END" blocks
encountered during other Perl*Handler callbacks, e.g.
PerlChildInitHandler, will be suspended while the process is running and
called during "child_exit" when the process is shutting down. Module
authors may wish to use "$r->register_cleanup" as an alternative to
"END" blocks if this behavior is not desirable.
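A minimal sketch of the register_cleanup alternative follows. The package name My::WithCleanup and the messages are hypothetical; the point is only that the code reference is queued for the end of the current request rather than for server shutdown.

```perl
# Hypothetical handler package; the names are for illustration only.
package My::WithCleanup;
use strict;

sub handler {
    my $r = shift;

    # Queue work for the end of this request instead of relying on END.
    $r->register_cleanup(sub {
        print STDERR "cleanup for this request\n";
    });

    $r->content_type("text/plain");
    $r->send_http_header;
    $r->print("response body\n");
    return 0;    # OK
}

1;
```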
MEMORY CONSUMPTION
Don't be alarmed by the size of your httpd after you've linked with
mod_perl. No matter what, your httpd will be larger than normal to
start, simply because you've linked with perl's runtime.
Here I'm just running
% /usr/bin/perl -e '1 while 1'
PID USERNAME PRI NICE SIZE RES STATE TIME WCPU CPU COMMAND
10214 dougm 67 0 668K 212K run 0:04 71.55% 21.13% perl
Now with a few random modules:
% /usr/bin/perl -MDBI -MDBD::mSQL -MLWP::UserAgent -MFileHandle -MIO -MPOSIX -e '1 while 1'
10545 dougm 49 0 3732K 3340K run 0:05 54.59% 21.48% perl
Here's my httpd linked with libperl.a, not having served a single
request:
10386 dougm 5 0 1032K 324K sleep 0:00 0.12% 0.11% httpd-a
You can reduce this if you configure perl 5.004 or later with -Duseshrplib.
Here's my httpd linked with libperl.sl, not having served a single
request:
10393 dougm 5 0 476K 368K sleep 0:00 0.12% 0.10% httpd-s
Now, once the server starts receiving requests, the embedded inter-
preter will compile code for each 'require' file it has not seen yet,
each new Apache::Registry subroutine that's compiled, along with what-
ever modules it's use'ing or require'ing. Not to mention AUTOLOADing.
(Modules that you 'use' will be compiled when the server starts unless
they are inside an eval block.) httpd will grow just as big as our
/usr/bin/perl would, or a CGI process for that matter; it all depends
on your setup. The mod_perl_tuning document gives advice on how best
to set up your mod_perl server environment.
The mod_perl INSTALL document explains how to build the Apache::
extensions as shared libraries (with 'perl Makefile.PL DYNAMIC=1'). This
may save you some memory; however, it doesn't work on a few systems
such as aix and unixware.
However, on most systems, this strategy will only make the httpd look
smaller. In fact, an httpd with Perl linked statically will take up
less real memory and perform faster than one built with shared
libraries. See the mod_perl_tuning document for details.
MEMORY TIPS
Leaks
If you are using a module that leaks, or have code of your own that
leaks, the Apache configuration directive 'MaxRequestsPerChild' is
your best bet to keep the size down.
Perl Options
Newer Perl versions also have other options to reduce runtime
memory consumption. See Perl's INSTALL file for details on
"-DPACK_MALLOC" and "-DTWO_POT_OPTIMIZE". With these options, my
httpd shrinks down ~150K.
Server Startup
Use the PerlRequire and PerlModule directives to load commonly used
modules such as CGI.pm, DBI, etc., when the server is started. On
most systems, server children will be able to share this space.
Importing Functions
When possible, avoid importing a module's functions into your
namespace. The aliases which are created can take up quite a bit
of space. Try to use method interfaces and fully qualified
Package::function names instead. Here's a freshly started httpd that
has served one request for a script using the CGI.pm method interface:
TY PID USERNAME PRI NI SIZE RES STATE TIME %WCPU %CPU COMMAND
p4 5016 dougm 154 20 3808K 2636K sleep 0:01 9.62 4.07 httpd
Here's a freshly started httpd that has served one request for the
same script using the CGI.pm function interface:
TY PID USERNAME PRI NI SIZE RES STATE TIME %WCPU %CPU COMMAND
p4 5036 dougm 154 20 3900K 2708K sleep 0:01 3.19 2.18 httpd
Now do the math: take that difference, figure in how many other
scripts import the same functions and how many children you have
running. It adds up!
Global Variables
It's always a good idea to stay away from global variables when
possible. Some variables must be global so Perl can see them, such
as a module's @ISA or $VERSION variables. In common practice, a
combination of "use strict" and "use vars" keeps modules clean and
reduces a bit of noise. However, use vars also creates aliases as
the Exporter does, which eat up more space. When possible, try to
use fully qualified names instead of use vars. Example:
package MyPackage;
use strict;
@MyPackage::ISA = qw(...);
$MyPackage::VERSION = "1.00";
vs.
package MyPackage;
use strict;
use vars qw(@ISA $VERSION);
@ISA = qw(...);
$VERSION = "1.00";
Further Reading
In case I forgot to mention, read Vivek Khera's mod_perl_tuning
document for more tips on improving Apache/mod_perl performance.
SWITCHES
Normally when you run perl from the command line or have the shell
invoke it with `#!', you may choose to pass perl switch arguments such
as "-w" or "-T". Since the command line is only parsed once, when the
server starts, these switches are unavailable to mod_perl scripts.
However, most command line arguments have a perl special variable
equivalent. For example, the $^W variable corresponds to the "-w"
switch. Consult perlvar for more details. With mod_perl it is also
possible to turn on warnings globally via the PerlWarn directive:
PerlWarn On
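As a minimal sketch of the special-variable approach, $^W can be localized so that warnings are enabled for a single block only. This assumes the code in question is not compiled under the lexical warnings pragma, and the $SIG{__WARN__} handler is installed purely for illustration, to collect the messages:

```perl
use strict;

my @caught;
# Collect warnings instead of letting them go to the error log.
local $SIG{__WARN__} = sub { push @caught, @_ };

{
    local $^W = 1;            # like -w, but only inside this block
    my $count;
    my $total = $count + 1;   # warns: $count is undefined
}

{
    my $count;
    my $total = $count + 1;   # $^W is back to its previous value here
}

print scalar(@caught), " warning(s) caught\n";
```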
The switch which enables taint checks does not have a special variable,
so mod_perl provides the PerlTaintCheck directive to turn on taint
checks. In httpd.conf, enable with:
PerlTaintCheck On
Now, any and all code compiled inside httpd will be checked.
The environment variable PERL5OPT can be used to set additional perl
startup flags such as -d and -D. See perlrun.
PERSISTENT DATABASE CONNECTIONS
Another popular use of mod_perl is to take advantage of its
persistence to maintain open database connections. The basic idea goes
like so:
#Apache::Registry script
use strict;
use vars qw($dbh);
$dbh = SomeDbPackage->connect(...);
Since $dbh is a global variable, it will not go out of scope, keeping
the connection open for the lifetime of a server process, establishing
it during the script's first request for that process.
It's recommended that you use one of the Apache::* database connection
wrappers. Currently for DBI users there is "Apache::DBI" and for
Sybase users "Apache::Sybase::DBlib". These modules hide the peculiar
code example above. In addition, different scripts may share a connec-
tion, minimizing resource consumption. Example:
#httpd.conf has
# PerlModule Apache::DBI
#DBI scripts look exactly as they do under CGI
use strict;
my $dbh = DBI->connect(...);
Although $dbh shown here will go out of scope when the script ends, the
Apache::DBI module's reference to it does not, keeping the connection
open.
WARNING: Do not attempt to open a persistent database connection in the
parent process (via PerlRequire or PerlModule). If you do, children
will get a copy of this handle, causing clashes when the handle is used
by two processes at the same time. Each child must have its own
unique connection handle.
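One way to respect this rule in your own code, sketched here under stated assumptions, is to connect lazily and cache the handle under the current process id, so a handle copied from another process is never reused. The names open_connection and get_dbh are hypothetical, and open_connection is only a stand-in for a real call such as DBI->connect(...); this is not how Apache::DBI itself is implemented.

```perl
use strict;
use warnings;

my %handle_for_pid;    # one cached handle per process id

# Hypothetical stand-in for a real connect call such as DBI->connect(...).
sub open_connection {
    return { opened_by => $$ };
}

# Return this process's handle, connecting on first use in that process.
sub get_dbh {
    return $handle_for_pid{$$} ||= open_connection();
}

my $first  = get_dbh();
my $second = get_dbh();
print "same handle reused\n" if $first == $second;
```

Because the cache is keyed by $$, a child that inherits the hash from its parent finds no entry for its own process id and opens its own connection on first use.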
STACKED HANDLERS
With the mod_perl stacked handlers mechanism, it is possible for more
than one Perl*Handler to be defined and run during each stage of a
request.
Perl*Handler directives can define any number of subroutines, e.g. (in
config files)
PerlTransHandler OneTrans TwoTrans RedTrans BlueTrans
With the Apache->push_handlers method, callbacks can be added to the
stack at runtime by mod_perl scripts.
Apache->push_handlers takes the callback hook name as its first
argument and a subroutine name or reference as its second, e.g.:
Apache->push_handlers("PerlLogHandler", \&first_one);
$r->push_handlers("PerlLogHandler", sub {
    print STDERR "__ANON__ called\n";
    return 0;
});
After each request, this stack is cleared out.
All handlers will be called unless a handler returns a status other
than OK or DECLINED; this needs to be considered more. Post apache-1.2
will have a DONE return code to signal termination of a stage, which Rob
and I came up with a while back when first discussing the idea of
stacked handlers. 2.0 won't come for quite some time, so mod_perl will
most likely handle this before then.
example uses:
CGI.pm maintains a global object for its plain function interface.
Since the object is global, it does not go out of scope, so DESTROY is
never called. CGI->new can call:
Apache->push_handlers("PerlCleanupHandler", \&CGI::_reset_globals);
This function will be called during the final stage of a request,
refreshing CGI.pm's globals before the next request comes in.
Apache::DCELogin establishes a DCE login context which must exist for
the lifetime of a request, so the DCE::Login object is stored in a
global variable. Without stacked handlers, users must set
PerlCleanupHandler Apache::DCELogin::purge
in the configuration files to destroy the context. This is not
"user-friendly". Now, Apache::DCELogin::handler can call:
Apache->push_handlers("PerlCleanupHandler", \&purge);
Persistent database connection modules such as Apache::DBI could push a
PerlCleanupHandler handler that iterates over %Connected, refreshing
connections or just checking that none have gone stale. Remember,
by the time we get to PerlCleanupHandler, the client has what it wants
and has gone away, so we can spend as much time as we want here without
slowing down response time to the client.
PerlTransHandlers may decide, based on URI or other condition, whether
or not to handle a request, e.g. Apache::MsqlProxy. Without stacked
handlers, users must configure:
PerlTransHandler Apache::MsqlProxy::translate
PerlHandler Apache::MsqlProxy
PerlHandler is never actually invoked unless translate() sees the
request is a proxy request ($r->proxyreq). If it is a proxy request,
translate() sets $r->handler("perl-script"), and only then will
PerlHandler handle the request. Now, users do not have to specify
'PerlHandler Apache::MsqlProxy'; the translate() function can set it
with push_handlers().
Includes, footers, headers, etc., piecing together a document, imagine
(no need for SSI parsing!):
PerlHandler My::Header Some::Body A::Footer
This was my first test:
#My.pm
package My;
sub header {
    my $r = shift;
    $r->content_type("text/plain");
    $r->send_http_header;
    $r->print("header text\n");
}
sub body { shift->print("body text\n") }
sub footer { shift->print("footer text\n") }
1;
__END__
#in config
SetHandler "perl-script"
PerlHandler My::header My::body My::footer
Parsing the output of another PerlHandler? This is a little more
tricky, but consider:
SetHandler "perl-script"
PerlHandler OutputParser SomeApp
SetHandler "perl-script"
PerlHandler OutputParser AnotherApp
Now, OutputParser goes first, but it untie's *STDOUT and re-tie's to
its own package like so:
package OutputParser;
sub handler {
    my $r = shift;
    untie *STDOUT;
    tie *STDOUT => 'OutputParser', $r;
}
sub TIEHANDLE {
    my($class, $r) = @_;
    bless { r => $r}, $class;
}
sub PRINT {
    my $self = shift;
    for (@_) {
        #do whatever you want to $_
        $self->{r}->print($_ . "[insert stuff]");
    }
}
1;
__END__
To build in this feature, configure with:
% perl Makefile.PL PERL_STACKED_HANDLERS=1 [PERL_FOO_HOOK=1,etc]
Another method 'Apache->can_stack_handlers' will return TRUE if
mod_perl was configured with PERL_STACKED_HANDLERS=1, FALSE otherwise.
PERL METHOD HANDLERS
See mod_perl_method_handlers.
PERL SECTIONS
With <Perl></Perl> sections, it is possible to configure your server
entirely in Perl.
<Perl> sections can contain *any* and as much Perl code as you wish.
These sections are compiled into a special package whose symbol table
mod_perl can then walk and grind the names and values of Perl
variables/structures through the Apache core config gears. Most of the
configuration directives can be represented as $Scalars or @Lists. A
@List inside these sections is simply converted into a single-space
delimited string for you. Here's an example:
#httpd.conf
<Perl>
@PerlModule = qw(Mail::Send Devel::Peek);
#run the server as whoever starts it
$User = getpwuid($>) || $>;
$Group = getgrgid($)) || $);
$ServerAdmin = $User;
</Perl>
Block sections such as <Location></Location> are represented in a
%Hash, e.g.:
$Location{"/~dougm/"} = {
AuthUserFile => '/tmp/htpasswd',
AuthType => 'Basic',
AuthName => 'test',
DirectoryIndex => [qw(index.html index.htm)],
Limit => {
METHODS => 'GET POST',
require => 'user dougm',
},
};
#If a Directive can take say, two *or* three arguments
#you may push strings and the lowest number of arguments
#will be shifted off the @List
#or use array reference to handle any number greater than
#the minimum for that directive
push @Redirect, "/foo", "http://www.foo.com/";
push @Redirect, "/imdb", "http://www.imdb.com/";
push @Redirect, [qw(temp /here http://www.there.com)];
Other section counterparts include %VirtualHost, %Directory and %Files.
These are somewhat boring examples, but they should give you the basic
idea. You can mix in any Perl code your heart desires. See
eg/httpd.conf.pl and eg/perlsections.txt for some examples.
A tip for syntax checking outside of httpd:
<Perl>
#!perl
#... code here ...
__END__
</Perl>
Now you may run "perl -cx httpd.conf".
It may be the case that <Perl> sections are not completed or an
oversight was made in a certain area. If they do not behave as you
expect, please send a report to the mod_perl mailing list.
To configure this feature build with
'perl Makefile.PL PERL_SECTIONS=1'
mod_perl and mod_include integration
As of apache 1.2.0, mod_include can handle Perl callbacks.
A `sub' key value may be anything a Perl*Handler can be: subroutine
name, package name (defaults to package::handler), Class->method call
or anonymous sub {}
Example:
Child [an error occurred while processing this directive] accessed
[an error occurred while processing this directive] times.
[an error occurred while processing this directive]
#don't forget to escape double quotes!
Perl is
[an error occurred while processing this directive]
fun to use!
The Apache::Include module makes it simple to include Apache::Registry
scripts with the mod_include perl directive.
Example:
[an error occurred while processing this directive]
You can also use 'virtual include' to include Apache::Registry scripts
of course. However, using #perl will save the overhead of making
Apache go through the motions of creating/destroying a subrequest and
making all the necessary access checks to see that the request would be
allowed outside of a 'virtual include' context.
To enable perl in mod_include parsed files, when building apache the
following must be present in the Configuration file:
EXTRA_CFLAGS=-DUSE_PERL_SSI -I. `perl -MExtUtils::Embed -ccopts`
mod_perl's Makefile.PL script can take care of this for you as well:
perl Makefile.PL PERL_SSI=1
If you're interested in sprinkling Perl code inside your HTML
documents, you'll also want to look at the Apache::Embperl
(http://perl.apache.org/embperl/), Apache::ePerl and Apache::SSI
modules.
DEBUGGING
MOD_PERL_TRACE
To enable mod_perl debug tracing configure mod_perl with the
PERL_TRACE option:
perl Makefile.PL PERL_TRACE=1
The trace levels can then be enabled via the MOD_PERL_TRACE
environment variable which can contain any combination of:
d - Trace directive handling during configuration read
s - Trace processing of perl sections
h - Trace Perl*Handler callbacks
g - Trace global variable handling, interpreter construction, END blocks, etc.
all - all of the above
spinning httpds
To see where an httpd is "spinning", try adding this to your script
or a startup file:
use Carp ();
$SIG{'USR1'} = sub {
Carp::confess("caught SIGUSR1!");
};
Then issue the command line:
kill -USR1 <spinning_httpd_pid>
PROFILING
It is possible to profile code run under mod_perl with the Devel::DProf
module available on CPAN. However, you must have apache version 1.3.0
or higher and the "PerlChildExitHandler" enabled. When the server is
started, Devel::DProf installs an "END" block to write the tmon.out
file, which will be run when the server is shut down. Here's how to
start and stop a server with the profiler enabled:
% setenv PERL5OPT -d:DProf
% httpd -X -d `pwd` &
... make some requests to the server here ...
% kill `cat logs/httpd.pid`
% unsetenv PERL5OPT
% dprofpp
See also: Apache::DProf
BENCHMARKING
How much faster is mod_perl than CGI? There are many ways to benchmark
the two, see the "benchmark/" directory for some examples.
See also: Apache::Timeit
WARNINGS
See mod_perl_traps.
SUPPORT
See the SUPPORT file.
Win32
See INSTALL.win32 for building from sources.
Info about win32 binary distributions of mod_perl is available from:
http://perl.apache.org/distributions/
REVISION
$Id: mod_perl.pod,v 1.1.1.3 2003/10/08 21:31:40 eseidel Exp $
AUTHOR
Doug MacEachern
perl v5.8.6 2000-03-30 mod_perl(3)