Describing language—whether coded, written, or spoken, is fundamentally difficult
because in order to understand the language components (nouns, verbs, adjectives)
you also need to need to understand the semantics that convert those components in
isolation into an understandable language that allows you to communicate. Unfortunately,
it’s impossible to describe those semantics without giving examples of their use!
As a rule, Perl lets you do what you want, when you want to, and, more or less,
how you want to. Perl is far more concerned about letting you develop a solution
that works than it is about slotting your chosen solution into a set of standards and
a rigid structure.
The core of any program are the variables used to hold changeable information. You
change the contents of those variables using operators, regular expressions, and functions.
Statements help to control the flow of your program and enable you to declare certain
facts about the programs you are running. If you can’t find what you want using the
base Perl function set, you can make use of a number of modules, which export a list
of variables and functions that provide additional information and operations. If you
want to work in a structured format, modules also support objects, methods, and object
classes. You can, of course, also make your own modules that use your own functions.
We’ll have a quick look at some of the elements and components within Perl that
will help when we start to look at these individual items in more detail in future chapters.
Variables
Variables hold variable pieces of information—they are just storage containers for
numbers, strings, and compound structures (lists of numbers and strings) that we
might want to change at some future point.
Perl supports one basic variable type, the scalar. A scalar holds numbers and strings,
so we could rewrite the simple “Hello World” example at the beginning of this chapter as
$message = "Hello World\n";
print $message;
In this example, we’ve assigned a literal to a variable called $message. When you
assign a value to a variable, you are just populating that variable with some information.
A literal is a piece of static information—in this case it’s a string, but it could have been
a number. By the way, when you assign information, you are assigning the value to theright of the assignation operator (the = sign) to the lvalue on the left. The lvalue is the
name given to a variable or structure that can hold information. Normally this is a
variable, but functions and objects are also types of lvalues.
You’ll notice in the preceding example that the variable, $message, has a “funny”
character at the beginning. In this case, it’s a dollar sign, and it identifies the variable as
being a scalar. You always use a dollar sign when accessing a scalar value. The way to
remember a scalar is that the $ sign looks like an “s”, for scalar!
There are also some compound variable types—namely the array and the hash.
The array is a list of scalar variables—thus we can store a list of days using
@days = ('Mon','Tue','Wed','Thu','Fri','Sat','Sun');
The leading character for an array is an @ sign (think “a” for array), and you always
access an array of one or more values using an @ sign. You access the values in an array
by the numerical index; the first value is at index 0, so to get the first day of the week
from the preceding list, we’d use $days[0]. Note the leading $ sign—this is required
because we are accessing the scalar value at index 0 from the array.
Perl also supports a hash—this is a list that uses not numerical indices, but instead a
string “key” to access each “value”—the so-called key/value pair. Hash variables start
with a % sign—think of the two “o” characters in the % as the key and value. Thus we
could create a hash that contains month names (as the keys) and the days in that month
(as the values):
%months = ('January' => 31,
...
'November' => 30,
'December' => 31);
Now all we need to do when we want to know how many days are in November is
access the value in the %months hash with a key of “November”:
print "Days in November:",$months{'November'},"\n";
Perl also supports some other types of variables, such as filehandles (which allow
us to read from and write to files) and typeglobs (which allow us to access a variable
via the internal symbol tables). We also use references, which just point to other
variables without actually containing a value themselves.
The special characters used to access variables are a vital part of the Perl
language—they enable us to identify the variables easily and let the programmer
and Perl know what sort of variable we are expecting to use.Operators
Operators perform some sort of operation on a value or variable. For example, the +
operator adds two numbers together:
$sum = 4 + 5;
Other operators allow you to perform other basic math calculations, introduce lists
of values (for use with functions and variables), and assign values to variables and
subroutines.
There are also operators that enable us to use regular expressions that can “match”
information contained within a string against an expression, or perform a substitution
so that we can replace and translate information without having to explicitly define
its contents.
We’ll be looking at Perl operators, and the core mechanics of how Perl takes a raw
script and interprets the contents, in Chapter 3.
Statements
Statements enable us to control the execution of our script—for example, we might use
the if statement to test the value of a variable or operation so that the script can make
an informed decision about what to do next. Other statements include the loops, which
allow us to repeat a process on the same piece of data or on a sequence of data. Statements
also include declarations, such as those that allow us to define variables and subroutines.
We’ll be covering statements and control structures in Chapter 5.
Subroutines (Functions)
When you want to perform an operation on a variable a number of times, or the same
operation on a number of variables, it makes sense to place that sequence of operations
into a subroutine or function. Now when you want to perform that operation, you send
the variable to the subroutine, and then use the value returned from that subroutine.
Perl includes a number of subroutines that perform different operations—including
the print subroutine, which sends information to the screen (or to a file). Other subroutines
built into Perl include those for opening and communicating with files, talking
over a network, or accessing information about the system. Other built-ins provide
simple ways for performing different operations on variables and values.Modules
Once you have a collection of subroutines that you find useful, then you’ll probably
want to use them in other scripts and applications that you build with Perl. You could
copy them to the new scripts, but a much better solution is to make your own modules.
These are the libraries that extend the functionality of Perl.
Perl comes with its own, quite extensive, set of modules that allow you to communicate
over a network (see Chapter 12), develop user interfaces (see Chapter 17), access external
databases (see Chapter 13), and provide an interface for communicating with a web
server and a client browser when developing web solutions (see Chapter 18).
If you can’t find what you want within the standard Perl distribution, then there
is a central repository of modules built by other programmers called CPAN. This
contains literally thousands of modules to handle everything from accessing data
sources through to handling XML (Extensible Markup Language).
Thursday, December 20, 2007
Perl Under Mac OS
Compared to Unix and Windows, Mac OS has one significant missing feature: it has
no command line interface. Mac OS is a 100 percent windowed GUI environment.
This presents some potential problems when we consider the methods already
suggested for running Perl programs.
The solution is a separate “development environment.” The MacPerl application
supports both the execution of Perl scripts and the creation of the scripts in the first
place. In addition to direct access to the execution process (scripts can be started and
stopped from menus and key combinations), MacPerl also permits you interactive use
of the Perl debugger in a familiar environment, and complete control of the environment
in which the scripts are executed.
Quite aside from the simple interface problem, there are also underlying differences
between text file formats, the value of the epoch used for dates and times, and even the
supported commands and functions. There are ways of getting around these problems,
both using your own Perl scripts and modifications and using a number of modules
that are supplied as standard with the MacPerl application.
The current version of MacPerl is based on v5.004 of Perl, which makes it a couple
of years old. Although the developer, Matthias Neeracher, has promised to work on a
new version, there is currently no set date for a Perl 5.6 release. In fact, it’s possible that
MacPerl may not be updated until Perl 6 is released toward the end of 2001.Repository URL
ActiveState http://www.activestate.com/packages
Jan Krynicky http://jenda.krynicky.cz/perl
Roth Consulting http://www.roth.net/perl/packages/
Achim Bohnet http://www.xray.mpe.mpg.de/~ach/prk/ppm
RTO http://rto.dk/packages/
Table 2-2. PPM RepositoriesFor those of you interested in developing with Perl under Mac OS X, you’ll be pleased
to hear that Mac OS X’s Unix layer is used to provide the same basic functionality as
any Unix distribution. In fact, Mac OS X actually comes with Perl installed as standard.
Installation
Perl is available in a number of different guises, depending on what you want to do
with it and how extensible and expandable you want the support modules to be. The
basic distribution, “appl”, includes the MacPerl binary, all the Perl and MacPerl libraries
and modules, and the documentation. The “tool” distribution works with MPW (the
Macintosh Programmer’sWorkshop), allowing you to develop and release Perl programs
that are part of a larger overall application while presenting you with the same interface
and development environment you use for C/C++ and Pascal Mac applications. Because
MacPerl provides an almost transparent interface to the underlying Mac toolbox, you
can use Perl and C/C++/Pascal programs and code interchangeably. The source, in the
“src” distribution, including all of the toolbox interfaces, is also available.
Installing the application is a case of downloading and decompressing the installer,
and then double-clicking on the installer application. This will install all the modules,
application, and documentation you need to start programming in Perl. Starting MacPerl
is a simple case of double-clicking on the application.
Executing Scripts
Perl scripts are identified using the Mac OS Creator and Type codes. The MacPerl
environment automatically sets this information when you save the script. In fact,
MacPerl specifies three basic formats for running scripts and one additional format
for use with Mac-based web servers. The different formats are outlined in Table 2-3.File Type Description
Droplet A droplet is a mini-application that consists of the original
Perl script and a small amount of glue code that uses Apple
events to start MacPerl, if it is not already running, and then
executes the script. Using droplets is the recommended
method for distributing MacPerl scripts.
To save a script as a droplet, go to Save As under the File
menu, and choose Droplet in the file type pop-up at the
bottom of the file dialog box.
Files dragged and dropped onto a droplet’s icon in the
Finder have their names passed to the script as arguments
(within @ARGV).
If you plan on distributing your scripts to other people,
droplets require that the destination users have MacPerl
already installed. This might make initial distributions large
(about 800K), although further updates should be smaller.
Table 2-3. MacPerl Script TypesFile Type Description
Stand-alone applications A stand-alone application creates a file composed of the Perl
application and the script and related modules. This creates a
single, “double-clickable” application that runs and executes
your script.
This can be a great solution if you want to provide a single-file
solution for a client, or if you want to save clients the task of
installing MacPerl on their machines.
However, this is still an interpreted version. The script is
not compiled into an executable, just bundled with the Perl
interpreter into a single file.
Plain text file A plain text file can be opened within the MacPerl environment
and executed as a Perl script. Make sure that if the script has
come from another platform, the script is in Mac OS text format.
These files will not automatically execute when you doubleclick
them. They open either the built-in editor within MacPerl
or the editor you usually use for editing text files (for example,
SimpleText, BBEdit, or emacs).
CGI Script This creates a script suitable for execution under many
Mac-specific web servers, including the one supported
by Apple’s AppleShare IP 6.0.
Table 2-3. MacPerl Script Types (continued)When a script is executing, STDIN, STDOUT, and STDERR are supported directly
within the MacPerl environment. If you want to introduce information on a “command
line” (other than files, if you are using a droplet), you will need to use the Mac-specific
toolbox modules and functions to request the information from the user.
Installing Third-Party Modules
Installation of third-party modules under MacPerl is complicated by the lack of either
standard development tools or a command-line environment that would enable you to
execute the normal Perl makefiles for make tools.
Scripts that rely on external modules, such as those from CPAN (especially those
that require C source code to be compiled), may cause difficulties, not all of which can
be easily overcome. The process for installing a third-party module is as follows:
1. Download and then extract the module. Most modules are supplied as a gzipped
tar file. You can either use the individual tools, MacGzip and suntar, to extract
the file, or use Aladdin System’s Stuffit Expander with the Expander Extensions.Whichever application set you use, remember to switch line-feed conversion
on. This will convert the Unix-style Perl scripts into Macintosh text files, which
will be correctly parsed by the MacPerl processor.
2. Read the documentation to determine whether the module or any modules on
which it relies use XS or C source code. If they do, it’s probably best to forget
about using the module. If you have access to the MPW toolkit, you may be
able to compile the extension, but success is not guaranteed. You can also ask
another MacPerl user, via the MacPerl mailing list, to compile it for you.
3. Ignore the Makefile.PL file. Although it might run, it will probably report an
error like this:
# On MacOS, we need to build under the Perl source directory or
have the MacPerl SDK installed in the MacPerl folder.
Ignore it, because you need to install the Perl modules manually. Even if the
Makefile.PL runs successfully, it will generate a makefile that you can’t use on
the Mac without the addition of some special tools!
4. Create a new folder (if you don’t already have one) to hold your site-specific
and contributed modules. This is usually located in $ENV{MACPERL}sitelib:,
although you can create it anywhere, as long as you add the directory to the
@INC variable via the MacPerl application preferences or use the use lib
pragma within a script.
Remember to create a proper directory structure if you are installing a
hierarchical module. For example, when installing Net::FTP, you need to
install the FTP.pm module into a subdirectory called Net, right below the
site_perl or alternative installation location.
5. Copy across the individual Perl modules to the new directory. If the modules
follow a structure, copy across all the directories and subdirectories.
6. Once the modules are copied across, try using the following script, which will
automatically split the installed module, suitable for autoloading:
use AutoSplit;
my $instdir = "$ENV{MACPERL}site_perl";
autosplit("$dir:Module.pm", "$dir:auto", 0, 1, 1);
Change the $instdir and module names accordingly. See Appendix B for
more details on the AutoSplit module.
7. Once the module is installed, try running one of the test programs, or write a
small script to use one of the modules you have installed. Check the MacPerl
error window. If you get an error like this,
# Illegal character \012 (carriage return).
File 'Midkemia:MacPerl ƒ:site_perl:Net:DNS:Header.pm'; Line 1
# (Maybe you didn't strip carriage returns after a network transfer?)then the file still has Unix-style line feeds in it. You can use BBEdit or a similar
application to convert these to Macintosh text. Alternatively, you could write a
Perl script to do it!
no command line interface. Mac OS is a 100 percent windowed GUI environment.
This presents some potential problems when we consider the methods already
suggested for running Perl programs.
The solution is a separate “development environment.” The MacPerl application
supports both the execution of Perl scripts and the creation of the scripts in the first
place. In addition to direct access to the execution process (scripts can be started and
stopped from menus and key combinations), MacPerl also permits you interactive use
of the Perl debugger in a familiar environment, and complete control of the environment
in which the scripts are executed.
Quite aside from the simple interface problem, there are also underlying differences
between text file formats, the value of the epoch used for dates and times, and even the
supported commands and functions. There are ways of getting around these problems,
both using your own Perl scripts and modifications and using a number of modules
that are supplied as standard with the MacPerl application.
The current version of MacPerl is based on v5.004 of Perl, which makes it a couple
of years old. Although the developer, Matthias Neeracher, has promised to work on a
new version, there is currently no set date for a Perl 5.6 release. In fact, it’s possible that
MacPerl may not be updated until Perl 6 is released toward the end of 2001.Repository URL
ActiveState http://www.activestate.com/packages
Jan Krynicky http://jenda.krynicky.cz/perl
Roth Consulting http://www.roth.net/perl/packages/
Achim Bohnet http://www.xray.mpe.mpg.de/~ach/prk/ppm
RTO http://rto.dk/packages/
Table 2-2. PPM RepositoriesFor those of you interested in developing with Perl under Mac OS X, you’ll be pleased
to hear that Mac OS X’s Unix layer is used to provide the same basic functionality as
any Unix distribution. In fact, Mac OS X actually comes with Perl installed as standard.
Installation
Perl is available in a number of different guises, depending on what you want to do
with it and how extensible and expandable you want the support modules to be. The
basic distribution, “appl”, includes the MacPerl binary, all the Perl and MacPerl libraries
and modules, and the documentation. The “tool” distribution works with MPW (the
Macintosh Programmer’sWorkshop), allowing you to develop and release Perl programs
that are part of a larger overall application while presenting you with the same interface
and development environment you use for C/C++ and Pascal Mac applications. Because
MacPerl provides an almost transparent interface to the underlying Mac toolbox, you
can use Perl and C/C++/Pascal programs and code interchangeably. The source, in the
“src” distribution, including all of the toolbox interfaces, is also available.
Installing the application is a case of downloading and decompressing the installer,
and then double-clicking on the installer application. This will install all the modules,
application, and documentation you need to start programming in Perl. Starting MacPerl
is a simple case of double-clicking on the application.
Executing Scripts
Perl scripts are identified using the Mac OS Creator and Type codes. The MacPerl
environment automatically sets this information when you save the script. In fact,
MacPerl specifies three basic formats for running scripts and one additional format
for use with Mac-based web servers. The different formats are outlined in Table 2-3.File Type Description
Droplet A droplet is a mini-application that consists of the original
Perl script and a small amount of glue code that uses Apple
events to start MacPerl, if it is not already running, and then
executes the script. Using droplets is the recommended
method for distributing MacPerl scripts.
To save a script as a droplet, go to Save As under the File
menu, and choose Droplet in the file type pop-up at the
bottom of the file dialog box.
Files dragged and dropped onto a droplet’s icon in the
Finder have their names passed to the script as arguments
(within @ARGV).
If you plan on distributing your scripts to other people,
droplets require that the destination users have MacPerl
already installed. This might make initial distributions large
(about 800K), although further updates should be smaller.
Table 2-3. MacPerl Script TypesFile Type Description
Stand-alone applications A stand-alone application creates a file composed of the Perl
application and the script and related modules. This creates a
single, “double-clickable” application that runs and executes
your script.
This can be a great solution if you want to provide a single-file
solution for a client, or if you want to save clients the task of
installing MacPerl on their machines.
However, this is still an interpreted version. The script is
not compiled into an executable, just bundled with the Perl
interpreter into a single file.
Plain text file A plain text file can be opened within the MacPerl environment
and executed as a Perl script. Make sure that if the script has
come from another platform, the script is in Mac OS text format.
These files will not automatically execute when you doubleclick
them. They open either the built-in editor within MacPerl
or the editor you usually use for editing text files (for example,
SimpleText, BBEdit, or emacs).
CGI Script This creates a script suitable for execution under many
Mac-specific web servers, including the one supported
by Apple’s AppleShare IP 6.0.
Table 2-3. MacPerl Script Types (continued)When a script is executing, STDIN, STDOUT, and STDERR are supported directly
within the MacPerl environment. If you want to introduce information on a “command
line” (other than files, if you are using a droplet), you will need to use the Mac-specific
toolbox modules and functions to request the information from the user.
Installing Third-Party Modules
Installation of third-party modules under MacPerl is complicated by the lack of either
standard development tools or a command-line environment that would enable you to
execute the normal Perl makefiles for make tools.
Scripts that rely on external modules, such as those from CPAN (especially those
that require C source code to be compiled), may cause difficulties, not all of which can
be easily overcome. The process for installing a third-party module is as follows:
1. Download and then extract the module. Most modules are supplied as a gzipped
tar file. You can either use the individual tools, MacGzip and suntar, to extract
the file, or use Aladdin System’s Stuffit Expander with the Expander Extensions.Whichever application set you use, remember to switch line-feed conversion
on. This will convert the Unix-style Perl scripts into Macintosh text files, which
will be correctly parsed by the MacPerl processor.
2. Read the documentation to determine whether the module or any modules on
which it relies use XS or C source code. If they do, it’s probably best to forget
about using the module. If you have access to the MPW toolkit, you may be
able to compile the extension, but success is not guaranteed. You can also ask
another MacPerl user, via the MacPerl mailing list, to compile it for you.
3. Ignore the Makefile.PL file. Although it might run, it will probably report an
error like this:
# On MacOS, we need to build under the Perl source directory or
have the MacPerl SDK installed in the MacPerl folder.
Ignore it, because you need to install the Perl modules manually. Even if the
Makefile.PL runs successfully, it will generate a makefile that you can’t use on
the Mac without the addition of some special tools!
4. Create a new folder (if you don’t already have one) to hold your site-specific
and contributed modules. This is usually located in $ENV{MACPERL}sitelib:,
although you can create it anywhere, as long as you add the directory to the
@INC variable via the MacPerl application preferences or use the use lib
pragma within a script.
Remember to create a proper directory structure if you are installing a
hierarchical module. For example, when installing Net::FTP, you need to
install the FTP.pm module into a subdirectory called Net, right below the
site_perl or alternative installation location.
5. Copy across the individual Perl modules to the new directory. If the modules
follow a structure, copy across all the directories and subdirectories.
6. Once the modules are copied across, try using the following script, which will
automatically split the installed module, suitable for autoloading:
use AutoSplit;
my $instdir = "$ENV{MACPERL}site_perl";
autosplit("$dir:Module.pm", "$dir:auto", 0, 1, 1);
Change the $instdir and module names accordingly. See Appendix B for
more details on the AutoSplit module.
7. Once the module is installed, try running one of the test programs, or write a
small script to use one of the modules you have installed. Check the MacPerl
error window. If you get an error like this,
# Illegal character \012 (carriage return).
File 'Midkemia:MacPerl ƒ:site_perl:Net:DNS:Header.pm'; Line 1
# (Maybe you didn't strip carriage returns after a network transfer?)then the file still has Unix-style line feeds in it. You can use BBEdit or a similar
application to convert these to Macintosh text. Alternatively, you could write a
Perl script to do it!
Perl is a relatively unstructured language. Although it does, of course, have rules,
most of the restrictions and rules that you may be used to in other languages are
not so heavily enforced. For example, you don’t need to worry too much about
telling Perl what you are expecting to do with your program (or script or application),
or what variable’s subroutines or other elements you are either going to use or introduce.
This approach leads to what is called “There Is More Than One Way to Do It”
(TIMTOWTDI, or tim toady for short) syndrome—which refers to the fact that there
are different ways of achieving the same result, all of which are legally valid.
In fact, a Perl program is as easy as
print "Hello World\n";
Note that there’s nothing before that statement to tell Perl what it needs in order to
print out that message. Compare it to a similar C program:
#include
int main()
{
printf("Hello World\n");
}
In Perl there is no “main” (well, not in the same sense as there is in C)—execution
of a Perl script starts with the first statement in the list and continues until the end of
the file. We don’t need to tell Perl to explicitly exit the program, or give a return value
to the caller; Perl will handle all of that for us.
The rest of this chapter is given over to describing how to create Perl scripts and
use Perl to execute scripts, and to describing the basic components that make up a Perl
program. The rest of this section of the book is devoted to giving more detail on each of
these elements (and more) so that you will have a complete understanding of how to
write basic Perl programs. The rest of the book looks at more advanced topics, such as
object orientation, networking, and interface and web development.
Installing and Using Perl
Perl is available for a huge array of platforms, but the chances are that you are using Perl
on one of the main three—Unix, Windows, and Mac OS. The use of Perl under all these
platforms varies slightly, so we’ll look at the Perl implementation on each one in turn.
As a basic rule, however, Perl works the same way on every platform—you create
a text file (using your favorite editor: vi, emacs, kedit, Notepad, WordPad, SimpleText,
BBEdit, Pepper), and then use Perl to execute the statements within that file. You don’t
have to worry about compiling the file into another format first—Perl executes the
statements directly from the raw text.Writing a Perl Script
Ignoring the platform-specific issues for a moment, producing a Perl script is not as
difficult as it sounds. Perl scripts are just text files, so in order to actually “write” the
script, all you need to do is create a text file using your favorite text editor. Once you’ve
written the script, you tell Perl to execute the text file you created.
Under Unix, you would use
$ perl myscript.pl
and the same works under Windows:
C:\> perl myscript.pl
Under Mac OS, you need to drag and drop the file onto the MacPerl application.
In each case, Perl reads the contents of your text file, interpreting the file as Perl
statements and expressions.
The file-naming system is part convention and part rule. Generally, Perl scripts
have a .pl extension, even under Mac OS and Unix. This helps to identify what the file
is by name, although it’s actually a requirement. Other extensions you may come
across include .pm for a Perl module, .ph for a Perl header file, and .pod for a Perl
documentation file (POD stands for Plain Old Documentation).
Perl Under Unix
Because much of Perl’s early development originated on a Unix platform, it is not
surprising that this is still one of the most strongly supported Perl environments.
Perl under Unix is available in a number of formats and distributions. The main,
or “core,” distribution comes from the main Perl developers and is available in
precompiled binary and source format. There is also a distribution of Perl that comes
from ActiveState—the original developers of the Windows port of Perl—that comes
as a binary bundled with some additional extensions and the Perl Package Manager
(PPM), an alternative to the CPAN module distributed with the core release. Currently
the ActiveState release is available only for Linux (x86), Solaris, and, of course, the
original Windows.
Installation
Perl is available for Unix in both precompiled binary and source format. Precompiled
binaries can be downloaded, extracted, and then installed without any need to compile
the source code. They are available both as compressed tar archives, RPM (RedHat
Package Manager) packages, and Solaris packages. The best place to get Perl is from the
main Perl website, www. perl.com. You should find links to most binaries on that site.
If you want to compile from the sources (useful if you want to enable certain extensions
and options), then you need to download the source and then configure and compile it.You’ll need a C compiler installed on your system—both commercial environments such as
Sun Microsystem’s Forte for C/C++ and free systems such as GNU CC should work fine.
Once you’ve downloaded the source from www.perl.com, do the following:
1. Extract the source code from the archive using tar and gunzip, for example:
$ $ gunzip -c perl.tar.gz | tar xvf -
2. Change to the newly created directory. It’s worth checking the README and
INSTALL files, which contain general Perl information and specific details on
the installation process, respectively.
3. Run the configuration script:
$ ./configure.gnu
This is, in fact, a GNU-style execution of the real Configure script. The standard
Perl Configure script is interactive, requiring input from you on a number of
questions. The GNU-compatible execution answers the questions automatically
for you, making a number of assumptions about your system, though it still
shows the process going on behind the scenes.
The former GNU style-configuration script will probably install Perl into
/usr/local, with the interpreter and support scripts ending up in /usr/local/
bin and the Perl library being placed into /usr/local/lib/perl5. This obviously
requires suitable access privileges to install the files to this location. You can
change the install directory by using the --prefix command line option:
$ ./configure.gnu --prefix=/home/mc/local
4. Run make to build the application:
$ make
The application and support files have now been compiled. It’s a good idea
at this point to run make test, which will run a standard selection of tests to
ensure that Perl has compiled properly. If there are any problems, you want
to check the build process to see if anything failed. On the mainstream systems,
such as Linux and Solaris, it’s unlikely that you will notice any test failures.
5. Once the build has completed, install the application, scripts, and modules
using make:
$ make install
Remember that the installation prefix will by default be /usr/local/, although
the exact setting will depend on your OS. Providing you didn’t specify
different directories, the usual directory specification will install Perl into the
/usr/local/bin and /usr/local/lib/perl5 directories. You will need to add
/usr/local/bin or the installation directory you chose (specified by the
$installation_prefix/bin variable in the makefile) to your $PATH environment
variable, if it is not already in it.Executing Scripts
There are two ways of executing a Perl script under Unix. You can run the Perl application,
supplying the script’s name on the command line, as in the first example, or you can
place the second example on the first line of the file (called the shebang line),
$ perl myscript.pl #!/usr/local/bin/perl
where the path given is the path to the Perl application. You must then change the file
mode of the script to be executable (usually 0755). You can change the mode using the
chmod command:
$ chmod 755 myscript.pl
Note that it is common to have different versions of Perl on your system. In this
case, the latest version will always have been installed as /user/local/bin/perl, which
is linked to the version-specific file, for example, /user/local/bin/perl5.6.0.
The Perl libraries and support files are installed in $prefix/lib/perl5. Since version
5.004, each version of Perl has installed its own subdirectory such that the actual location
becomes /user/local/lib/perl5/5.6.0/, or whatever the version number is. User-installed
(site-specific) scripts should be placed into /user/local/lib/perl5/site-perl/5.6.0.
Whenever a script is run, unless it has been redirected, standard input, output, and
errors are sent via the terminal or window, the same as in the shell environment, except
in the case of CGI scripts, where standard input is taken from the web server, standard
output is sent back to the browser, and standard error is sent to the web server’s log file.
Installing Third-Party Modules
For most modules (particularly those from CPAN), the installation process is fairly
straightforward:
1. Download the module, and extract it using tar and gunzip, for example:
$ gunzip -c module.tar.gz | tar xf -
This should create a new directory with the module contents.
2. Change to the module directory.
3. Type
$ perl5 Makefile.PL
This will check that the module contents are complete and that the necessary
prerequisite modules are already installed. It will also create a makefile that
will compile (if necessary) and install the module.
As in the original installation process, a make test will verify that the
compilation and configuration of the package works before you come to install
it. You should report any problems to the package’s author.4. To install the module, type
$ make install
This will copy the modules and any required support files into the appropriate
directories.
A better, and less interactive, solution is to use the CPAN module to do
the downloading, building, and installation for you. See Web Appendix B at
www.osborne.com for information on how to use the CPAN module.
Perl Under Windows
Perl has been supported under Windows for some time. Originally, development
concentrated on providing a Windows-compatible version from the core libraries,
and then the development split as it became apparent that providing a lot of the
built-in support for certain functions (notably fork) was unattainable. This lead to
a “core” port and a separate development handled by a company called ActiveWare.
ActiveWare worked on providing not only Perl, but also a suite of extensions that
allowed you to perform most operations normally handled by the built-in functions
that were only supported under Unix.
ActiveWare later became ActiveState, and their changes were rolled back into the
core release. Now there is only one version of the Perl interpreter that is valid on both
platforms, but there are now two distributions. The “core” distribution is identical to
that under Unix, so it comes with the standard Perl library but not the extension set
originally developed under the original ActiveWare development.
ActiveState still provides a prepackaged version of Perl for Windows that includes
the core Perl interpreter and an extended set of modules that include the Perl Package
Manager, a number of Win32-specific modules (see Table 2-1), and some general
extensions like Graham Barr’s libnet bundle and Gisle Aas’s LWP (libwww-perl)
bundle. The main ActiveState Perl distribution is called ActivePerl (and is now also
available under Solaris and Linux x86), but they also supply a number of extras, such
as the Perl Development Kit, which provides a visual package installer and debugger,
and PerlEx, which speeds up execution of Perl scripts when used under Microsoft’s
Internet Information Server.Module Description
Archive::Tar A toolkit for opening and using Unix tar files.
Compress::Zlib An interface for decompressing information entirely
within Perl.LWP Gisle Aas’s Lib WWW Perl (LWP) toolkit. This includes
modules for processing HTML, URLs, and MIMEencoded
information, and the necessary code for
downloading files by HTTP and FTP.
Win32::ChangeNotify Interface to the NT Change/Notify system for monitoring
the status of files and directories transparently.
Win32::Clipboard Access to the global system clipboard. You can add and
remove objects from the clipboard directory.
Win32::Console Terminal control of an MSDOS or Windows NT
command console.
Win32::Event Interface to the Win32 event system for IPC.
Win32::EventLog Interface to reading from and writing to the Windows
NT event log system.
Win32::File Allows you to access and set the attributes of a file.
Win32::FileSecurity Interface to the extended file security options under
Windows NT.
Win32::Internet Interface to Win32’s built-in Internet access system for
downloading files. For a cross-platform solution see
Net::FTP, Net:HTTP or the LWP modules elsewhere
in this appendix.
Win32::IPC Base methods for the different IPC techniques
supported under Win32.
Win32::Mutex Interface to the Mutex (Mutual/Exclusive) locking and
access mechanism.
Win32::NetAdmin Network administration functions for individual
machines and entire domains.
Win32::NetResource Provides a suite of Perl functions for accessing and
controlling the individual Net resources.
Win32::ODBC ODBC interface for accessing databases. See also the
DBI and DBD toolkits.
Win32::OLE Interface to OLE automation.
Win32::PerfLib Supports an interface to the Windows NT
performance system.Module Description
Win32::Pipe Named pipes and assorted functions.
Win32::Process Allows you to create manageable Win32 processes
within Perl.
Win32::Registry Provides an interface to the Windows registry. See the
Win32API::Registry module and the Win32::TieRegistry
module for a tied interface.
Win32::Semaphore Interface to the Win32 semaphores.
Win32::Service Allows the control and access of Windows NT services.
Win32::Shortcut Access (and modification) of Win32 shortcuts.
Win32::Sound Allows you to play .WAV and other file formats within
a Perl script.
Win32::TieRegistry A tied interface to the Win32 registry system.
Win32::WinError Access to the Win32 error system.
Win32API::Net Provides a complete interface to the underlying
C++ functions for managing accounts with the
NT LanManager.
Win32API::Registry Provides a low-level interface to the core API used for
manipulating the registry.Installation
There are two ways of installing Perl—the best and recommended way is to download
the ActivePerl installer from www.activestate.com, run the installer, and then reboot your
machine. This will do everything required to get Perl working on your system, including
installing the Perl binary, its libraries and modules, and modifying your PATH so that
you can find Perl in a DOS window or at the command prompt. If you are running Perl
under Windows NT or Windows 2000, or are using Microsoft’s Personal Web Server for
Windows 95/98/Me, then the installer will also set up the web server to support Perl as
a scripting host for web development. Finally, under Windows NT and Windows 2000,
the ActivePerl installer will also modify the configuration of your machine to allow Perl
scripts ending in .pl to be executed directly—that is, without the need to pass the script
names to Perl beforehand.The alternative method is to compile Perl from the core distribution. Although
some people prefer this version, it’s important to note that core distribution does not
come with any of the Win32-specific modules. You will need to download and install
those modules separately.
If you want to install a version of the Perl binary based on the latest source code,
you will need to find a C compiler capable of compiling the application. It’s then a case
of following the instructions relevant to your C and development environment. The
supported C compilers are described here. Other versions and C compilers may work,
but it’s not guaranteed.
Borland C++, version 5.02 or later: With the Borland C++ compiler, you
will need to use a different make command, since the one supplied does
not work very well and certainly doesn’t support MakeMaker extensions.
The documentation recommends the dmake application, available from
http://www-personal.umich.edu/~gsar/dmake-4.1-win32.zip.
Microsoft Visual C++, version 4.2 or later: You can use the nmake that comes
with Visual C++ to build the distribution correctly.
Mingw32 with EGCS, versions 1.0.2 and 1.1, or Mingw32 with GCC, version
2.8.1: Both EGCS and GCC supply their own make command. You can
download a copy of the EGCS version (preferred) from ftp://ftp.xraylith.
wisc.edu/pub/khan/gnu-win32/mingw32/. The GCC version is available
from http://agnes.dida.physik.uni-essen.de/~janjaap/mingw32/.
Also, be aware that Windows 95/98 as a compilation platform is not supported.
This is because the command shell available under Windows 95/98 is not capable of
working properly with the scripts and make commands required during the building
process. The best platforms for building from the core source code are Windows NT or
Windows 2000 using the standard cmd shell.
In all cases, ignore the Configure utility that you would normally use when compiling
under Unix and Unix-like operating systems. Instead, change to the win32 directory
and run the make command for your installation. For example:
c:\perl\win32> dmake
For Microsoft’s Visual C++, you will need to execute the VCVARS32.BAT batch file,
which sets up the environment for using Visual C++ on the command line; for example:
c:\perlsrc\win32>c:\progra~1\micros~1\vc98\bin\vcvars32.bat
You may need to increase the environment memory on your command.com for
this batch file to work properly—you can do this by modifying the properties for theMS-DOS Prompt shortcut. Select the shortcut within the Start menu, and then choose
the Program tab. You should modify the “Cmd Line” field to read
C:\WINDOWS\COMMAND.COM /E:4096
This boosts the environment memory for the command prompt up to 4K—more than
enough for all the variables you should need.
Remember that compiling and installing Perl from the source distribution does not
give you the integration facilities or modules that are included as standard within the
ActiveState version.
You will need to manually update your PATH variable so that you have access to
the Perl interpreter on the command line. You can do this within Windows 95/98 by
modifying the AUTOEXEC.BAT file. You will need to add a line like
SET PATH=C:\PERL\BIN\;%PATH%
This will update your search path without replacing the preexisting contents. The
C:\PERL\BIN\ is the default install location; you should modify this to wherever
you have installed the binary.
On Windows NT/2000, you will need to update the PATH variable by using the
System control panel.
Executing Scripts
Once installed correctly, there are two basic ways of executing a Perl script. You can
either type
C:\> perl hello.pl
in a command window, or you can double-click on a script in Windows Explorer. The
former method allows you to specify command line arguments; the latter method will
require that you ask the user for any required information.
Under Windows NT, if you want a more Unix-like method of executing scripts,
you can modify the PATHEXT environment variable (in the System control panel)
to include .pl as a recognized extension. This allows you to call a script just like any
other command on the command line, but with limitations. The following will work:
C:\> hello readme.txt
However, redirection and pipes to the Perl script will not work. This means that the
following examples, although perfectly valid under Unix, will not work under Windows:C:\> hello C:\> hello readme.txt|more
The other alternative, which works on all Windows platforms, is to use the pl2bat
utility. This wraps the call to your Perl script within a Windows batch file. For
example, we could use it to convert our hello.pl utility:
C:\> pl2bat hello.pl
C:\> hello
The big advantage here is that because we are using batch file, it works on any Windows
platform, and we can even add command line options to the Perl interpreter within the
batch file to alter the behavior. Furthermore, pipes and redirection work correctly with
batch files, which therefore also means the options work with our Perl script.
If you want to specify any additional command line options, you can use the normal
“shebang” line (#!) to specify these options. Although Windows will ignore this line,
the Perl interpreter still has to read the file, and so it will extract any information from
the line that it needs. So, for example, to turn warnings on within a script, you might
use a line such as
#!perl -w
Note that you must still comment out the line using a leading hash character.
Installing Third-Party Modules
Although it’s possible to use the CPAN module to do the installation for you, it requires
access to the make command and often a C compiler in order for it to work properly.
Instead, ActivePerl comes with the Perl Package Manager (PPM). This works along the
same basic premise as the CPAN module, except that PPM modules are precompiled
and ready to be installed—all the PPM tool actually does is copy the files downloaded
in a given package into their required location.
Using PPM is very easy. You start PPM from the command line:
C:\> ppm
PPM interactive shell (1.1.1) - type 'help' for available commands.
PPM>
Once there, you use search to find a suitable package, and install to install it.
For example, to install the Tk interface module,
C:\> ppm
PPM interactive shell (1.1.1) - type 'help' for available commands.
PPM> install TkAnd you then let PPM install the files for you. The number of PPM files is smaller than
CPAN, largely because the modules on CPAN are uncompiled, and those for use under
PPM need to be precompiled. To add to the headaches for developers many of the
CPAN packages rely on libraries and functions only available under Unix.
PPM packages are stored in a number of repositories. The main repository is at
ActiveState, but others are available.
most of the restrictions and rules that you may be used to in other languages are
not so heavily enforced. For example, you don’t need to worry too much about
telling Perl what you are expecting to do with your program (or script or application),
or what variable’s subroutines or other elements you are either going to use or introduce.
This approach leads to what is called “There Is More Than One Way to Do It”
(TIMTOWTDI, or tim toady for short) syndrome—which refers to the fact that there
are different ways of achieving the same result, all of which are legally valid.
In fact, a Perl program is as easy as
print "Hello World\n";
Note that there’s nothing before that statement to tell Perl what it needs in order to
print out that message. Compare it to a similar C program:
#include
int main()
{
printf("Hello World\n");
}
In Perl there is no “main” (well, not in the same sense as there is in C)—execution
of a Perl script starts with the first statement in the list and continues until the end of
the file. We don’t need to tell Perl to explicitly exit the program, or give a return value
to the caller; Perl will handle all of that for us.
The rest of this chapter is given over to describing how to create Perl scripts and
use Perl to execute scripts, and to describing the basic components that make up a Perl
program. The rest of this section of the book is devoted to giving more detail on each of
these elements (and more) so that you will have a complete understanding of how to
write basic Perl programs. The rest of the book looks at more advanced topics, such as
object orientation, networking, and interface and web development.
Installing and Using Perl
Perl is available for a huge array of platforms, but the chances are that you are using Perl
on one of the main three—Unix, Windows, and Mac OS. The use of Perl under all these
platforms varies slightly, so we’ll look at the Perl implementation on each one in turn.
As a basic rule, however, Perl works the same way on every platform—you create
a text file (using your favorite editor: vi, emacs, kedit, Notepad, WordPad, SimpleText,
BBEdit, Pepper), and then use Perl to execute the statements within that file. You don’t
have to worry about compiling the file into another format first—Perl executes the
statements directly from the raw text.Writing a Perl Script
Ignoring the platform-specific issues for a moment, producing a Perl script is not as
difficult as it sounds. Perl scripts are just text files, so in order to actually “write” the
script, all you need to do is create a text file using your favorite text editor. Once you’ve
written the script, you tell Perl to execute the text file you created.
Under Unix, you would use
$ perl myscript.pl
and the same works under Windows:
C:\> perl myscript.pl
Under Mac OS, you need to drag and drop the file onto the MacPerl application.
In each case, Perl reads the contents of your text file, interpreting the file as Perl
statements and expressions.
The file-naming system is part convention and part rule. Generally, Perl scripts
have a .pl extension, even under Mac OS and Unix. This helps to identify what the file
is by name, although it’s actually a requirement. Other extensions you may come
across include .pm for a Perl module, .ph for a Perl header file, and .pod for a Perl
documentation file (POD stands for Plain Old Documentation).
Perl Under Unix
Because much of Perl’s early development originated on a Unix platform, it is not
surprising that this is still one of the most strongly supported Perl environments.
Perl under Unix is available in a number of formats and distributions. The main,
or “core,” distribution comes from the main Perl developers and is available in
precompiled binary and source format. There is also a distribution of Perl that comes
from ActiveState—the original developers of the Windows port of Perl—that comes
as a binary bundled with some additional extensions and the Perl Package Manager
(PPM), an alternative to the CPAN module distributed with the core release. Currently
the ActiveState release is available only for Linux (x86), Solaris, and, of course, the
original Windows.
Installation
Perl is available for Unix in both precompiled binary and source format. Precompiled
binaries can be downloaded, extracted, and then installed without any need to compile
the source code. They are available both as compressed tar archives, RPM (RedHat
Package Manager) packages, and Solaris packages. The best place to get Perl is from the
main Perl website, www. perl.com. You should find links to most binaries on that site.
If you want to compile from the sources (useful if you want to enable certain extensions
and options), then you need to download the source and then configure and compile it.You’ll need a C compiler installed on your system—both commercial environments such as
Sun Microsystem’s Forte for C/C++ and free systems such as GNU CC should work fine.
Once you’ve downloaded the source from www.perl.com, do the following:
1. Extract the source code from the archive using tar and gunzip, for example:
$ $ gunzip -c perl.tar.gz | tar xvf -
2. Change to the newly created directory. It’s worth checking the README and
INSTALL files, which contain general Perl information and specific details on
the installation process, respectively.
3. Run the configuration script:
$ ./configure.gnu
This is, in fact, a GNU-style execution of the real Configure script. The standard
Perl Configure script is interactive, requiring input from you on a number of
questions. The GNU-compatible execution answers the questions automatically
for you, making a number of assumptions about your system, though it still
shows the process going on behind the scenes.
The former GNU style-configuration script will probably install Perl into
/usr/local, with the interpreter and support scripts ending up in /usr/local/
bin and the Perl library being placed into /usr/local/lib/perl5. This obviously
requires suitable access privileges to install the files to this location. You can
change the install directory by using the --prefix command line option:
$ ./configure.gnu --prefix=/home/mc/local
4. Run make to build the application:
$ make
The application and support files have now been compiled. It’s a good idea
at this point to run make test, which will run a standard selection of tests to
ensure that Perl has compiled properly. If there are any problems, you want
to check the build process to see if anything failed. On the mainstream systems,
such as Linux and Solaris, it’s unlikely that you will notice any test failures.
5. Once the build has completed, install the application, scripts, and modules
using make:
$ make install
Remember that the installation prefix will by default be /usr/local/, although
the exact setting will depend on your OS. Providing you didn’t specify
different directories, the usual directory specification will install Perl into the
/usr/local/bin and /usr/local/lib/perl5 directories. You will need to add
/usr/local/bin or the installation directory you chose (specified by the
$installation_prefix/bin variable in the makefile) to your $PATH environment
variable, if it is not already in it.Executing Scripts
There are two ways of executing a Perl script under Unix. You can run the Perl application,
supplying the script’s name on the command line, as in the first example, or you can
place the second example on the first line of the file (called the shebang line),
$ perl myscript.pl #!/usr/local/bin/perl
where the path given is the path to the Perl application. You must then change the file
mode of the script to be executable (usually 0755). You can change the mode using the
chmod command:
$ chmod 755 myscript.pl
Note that it is common to have different versions of Perl on your system. In this
case, the latest version will always have been installed as /user/local/bin/perl, which
is linked to the version-specific file, for example, /user/local/bin/perl5.6.0.
The Perl libraries and support files are installed in $prefix/lib/perl5. Since version
5.004, each version of Perl has installed its own subdirectory such that the actual location
becomes /user/local/lib/perl5/5.6.0/, or whatever the version number is. User-installed
(site-specific) scripts should be placed into /user/local/lib/perl5/site-perl/5.6.0.
Whenever a script is run, unless it has been redirected, standard input, output, and
errors are sent via the terminal or window, the same as in the shell environment, except
in the case of CGI scripts, where standard input is taken from the web server, standard
output is sent back to the browser, and standard error is sent to the web server’s log file.
Installing Third-Party Modules
For most modules (particularly those from CPAN), the installation process is fairly
straightforward:
1. Download the module, and extract it using tar and gunzip, for example:
$ gunzip -c module.tar.gz | tar xf -
This should create a new directory with the module contents.
2. Change to the module directory.
3. Type
$ perl5 Makefile.PL
This will check that the module contents are complete and that the necessary
prerequisite modules are already installed. It will also create a makefile that
will compile (if necessary) and install the module.
As in the original installation process, a make test will verify that the
compilation and configuration of the package works before you come to install
it. You should report any problems to the package’s author.4. To install the module, type
$ make install
This will copy the modules and any required support files into the appropriate
directories.
A better, and less interactive, solution is to use the CPAN module to do
the downloading, building, and installation for you. See Web Appendix B at
www.osborne.com for information on how to use the CPAN module.
Perl Under Windows
Perl has been supported under Windows for some time. Originally, development
concentrated on providing a Windows-compatible version from the core libraries,
and then the development split as it became apparent that providing a lot of the
built-in support for certain functions (notably fork) was unattainable. This lead to
a “core” port and a separate development handled by a company called ActiveWare.
ActiveWare worked on providing not only Perl, but also a suite of extensions that
allowed you to perform most operations normally handled by the built-in functions
that were only supported under Unix.
ActiveWare later became ActiveState, and their changes were rolled back into the
core release. Now there is only one version of the Perl interpreter that is valid on both
platforms, but there are now two distributions. The “core” distribution is identical to
that under Unix, so it comes with the standard Perl library but not the extension set
originally developed under the original ActiveWare development.
ActiveState still provides a prepackaged version of Perl for Windows that includes
the core Perl interpreter and an extended set of modules that include the Perl Package
Manager, a number of Win32-specific modules (see Table 2-1), and some general
extensions like Graham Barr’s libnet bundle and Gisle Aas’s LWP (libwww-perl)
bundle. The main ActiveState Perl distribution is called ActivePerl (and is now also
available under Solaris and Linux x86), but they also supply a number of extras, such
as the Perl Development Kit, which provides a visual package installer and debugger,
and PerlEx, which speeds up execution of Perl scripts when used under Microsoft’s
Internet Information Server.Module Description
Archive::Tar A toolkit for opening and using Unix tar files.
Compress::Zlib An interface for decompressing information entirely
within Perl.LWP Gisle Aas’s Lib WWW Perl (LWP) toolkit. This includes
modules for processing HTML, URLs, and MIMEencoded
information, and the necessary code for
downloading files by HTTP and FTP.
Win32::ChangeNotify Interface to the NT Change/Notify system for monitoring
the status of files and directories transparently.
Win32::Clipboard Access to the global system clipboard. You can add and
remove objects from the clipboard directory.
Win32::Console Terminal control of an MSDOS or Windows NT
command console.
Win32::Event Interface to the Win32 event system for IPC.
Win32::EventLog Interface to reading from and writing to the Windows
NT event log system.
Win32::File Allows you to access and set the attributes of a file.
Win32::FileSecurity Interface to the extended file security options under
Windows NT.
Win32::Internet Interface to Win32’s built-in Internet access system for
downloading files. For a cross-platform solution see
Net::FTP, Net:HTTP or the LWP modules elsewhere
in this appendix.
Win32::IPC Base methods for the different IPC techniques
supported under Win32.
Win32::Mutex Interface to the Mutex (Mutual/Exclusive) locking and
access mechanism.
Win32::NetAdmin Network administration functions for individual
machines and entire domains.
Win32::NetResource Provides a suite of Perl functions for accessing and
controlling the individual Net resources.
Win32::ODBC ODBC interface for accessing databases. See also the
DBI and DBD toolkits.
Win32::OLE Interface to OLE automation.
Win32::PerfLib Supports an interface to the Windows NT
performance system.Module Description
Win32::Pipe Named pipes and assorted functions.
Win32::Process Allows you to create manageable Win32 processes
within Perl.
Win32::Registry Provides an interface to the Windows registry. See the
Win32API::Registry module and the Win32::TieRegistry
module for a tied interface.
Win32::Semaphore Interface to the Win32 semaphores.
Win32::Service Allows the control and access of Windows NT services.
Win32::Shortcut Access (and modification) of Win32 shortcuts.
Win32::Sound Allows you to play .WAV and other file formats within
a Perl script.
Win32::TieRegistry A tied interface to the Win32 registry system.
Win32::WinError Access to the Win32 error system.
Win32API::Net Provides a complete interface to the underlying
C++ functions for managing accounts with the
NT LanManager.
Win32API::Registry Provides a low-level interface to the core API used for
manipulating the registry.Installation
There are two ways of installing Perl—the best and recommended way is to download
the ActivePerl installer from www.activestate.com, run the installer, and then reboot your
machine. This will do everything required to get Perl working on your system, including
installing the Perl binary, its libraries and modules, and modifying your PATH so that
you can find Perl in a DOS window or at the command prompt. If you are running Perl
under Windows NT or Windows 2000, or are using Microsoft’s Personal Web Server for
Windows 95/98/Me, then the installer will also set up the web server to support Perl as
a scripting host for web development. Finally, under Windows NT and Windows 2000,
the ActivePerl installer will also modify the configuration of your machine to allow Perl
scripts ending in .pl to be executed directly—that is, without the need to pass the script
names to Perl beforehand.The alternative method is to compile Perl from the core distribution. Although
some people prefer this version, it’s important to note that core distribution does not
come with any of the Win32-specific modules. You will need to download and install
those modules separately.
If you want to install a version of the Perl binary based on the latest source code,
you will need to find a C compiler capable of compiling the application. It’s then a case
of following the instructions relevant to your C and development environment. The
supported C compilers are described here. Other versions and C compilers may work,
but it’s not guaranteed.
Borland C++, version 5.02 or later: With the Borland C++ compiler, you
will need to use a different make command, since the one supplied does
not work very well and certainly doesn’t support MakeMaker extensions.
The documentation recommends the dmake application, available from
http://www-personal.umich.edu/~gsar/dmake-4.1-win32.zip.
Microsoft Visual C++, version 4.2 or later: You can use the nmake that comes
with Visual C++ to build the distribution correctly.
Mingw32 with EGCS, versions 1.0.2 and 1.1, or Mingw32 with GCC, version
2.8.1: Both EGCS and GCC supply their own make command. You can
download a copy of the EGCS version (preferred) from ftp://ftp.xraylith.
wisc.edu/pub/khan/gnu-win32/mingw32/. The GCC version is available
from http://agnes.dida.physik.uni-essen.de/~janjaap/mingw32/.
Also, be aware that Windows 95/98 as a compilation platform is not supported.
This is because the command shell available under Windows 95/98 is not capable of
working properly with the scripts and make commands required during the building
process. The best platforms for building from the core source code are Windows NT or
Windows 2000 using the standard cmd shell.
In all cases, ignore the Configure utility that you would normally use when compiling
under Unix and Unix-like operating systems. Instead, change to the win32 directory
and run the make command for your installation. For example:
c:\perl\win32> dmake
For Microsoft’s Visual C++, you will need to execute the VCVARS32.BAT batch file,
which sets up the environment for using Visual C++ on the command line; for example:
c:\perlsrc\win32>c:\progra~1\micros~1\vc98\bin\vcvars32.bat
You may need to increase the environment memory on your command.com for
this batch file to work properly—you can do this by modifying the properties for theMS-DOS Prompt shortcut. Select the shortcut within the Start menu, and then choose
the Program tab. You should modify the “Cmd Line” field to read
C:\WINDOWS\COMMAND.COM /E:4096
This boosts the environment memory for the command prompt up to 4K—more than
enough for all the variables you should need.
Remember that compiling and installing Perl from the source distribution does not
give you the integration facilities or modules that are included as standard within the
ActiveState version.
You will need to manually update your PATH variable so that you have access to
the Perl interpreter on the command line. You can do this within Windows 95/98 by
modifying the AUTOEXEC.BAT file. You will need to add a line like
SET PATH=C:\PERL\BIN\;%PATH%
This will update your search path without replacing the preexisting contents. The
C:\PERL\BIN\ is the default install location; you should modify this to wherever
you have installed the binary.
On Windows NT/2000, you will need to update the PATH variable by using the
System control panel.
Executing Scripts
Once installed correctly, there are two basic ways of executing a Perl script. You can
either type
C:\> perl hello.pl
in a command window, or you can double-click on a script in Windows Explorer. The
former method allows you to specify command line arguments; the latter method will
require that you ask the user for any required information.
Under Windows NT, if you want a more Unix-like method of executing scripts,
you can modify the PATHEXT environment variable (in the System control panel)
to include .pl as a recognized extension. This allows you to call a script just like any
other command on the command line, but with limitations. The following will work:
C:\> hello readme.txt
However, redirection and pipes to the Perl script will not work. This means that the
following examples, although perfectly valid under Unix, will not work under Windows:C:\> hello
The other alternative, which works on all Windows platforms, is to use the pl2bat
utility. This wraps the call to your Perl script within a Windows batch file. For
example, we could use it to convert our hello.pl utility:
C:\> pl2bat hello.pl
C:\> hello
The big advantage here is that because we are using batch file, it works on any Windows
platform, and we can even add command line options to the Perl interpreter within the
batch file to alter the behavior. Furthermore, pipes and redirection work correctly with
batch files, which therefore also means the options work with our Perl script.
If you want to specify any additional command line options, you can use the normal
“shebang” line (#!) to specify these options. Although Windows will ignore this line,
the Perl interpreter still has to read the file, and so it will extract any information from
the line that it needs. So, for example, to turn warnings on within a script, you might
use a line such as
#!perl -w
Note that you must still comment out the line using a leading hash character.
Installing Third-Party Modules
Although it’s possible to use the CPAN module to do the installation for you, it requires
access to the make command and often a C compiler in order for it to work properly.
Instead, ActivePerl comes with the Perl Package Manager (PPM). This works along the
same basic premise as the CPAN module, except that PPM modules are precompiled
and ready to be installed—all the PPM tool actually does is copy the files downloaded
in a given package into their required location.
Using PPM is very easy. You start PPM from the command line:
C:\> ppm
PPM interactive shell (1.1.1) - type 'help' for available commands.
PPM>
Once there, you use search to find a suitable package, and install to install it.
For example, to install the Tk interface module,
C:\> ppm
PPM interactive shell (1.1.1) - type 'help' for available commands.
PPM> install TkAnd you then let PPM install the files for you. The number of PPM files is smaller than
CPAN, largely because the modules on CPAN are uncompiled, and those for use under
PPM need to be precompiled. To add to the headaches for developers many of the
CPAN packages rely on libraries and functions only available under Unix.
PPM packages are stored in a number of repositories. The main repository is at
ActiveState, but others are available.
introduction
Perl is many different things to many different people. The most fundamental
aspect of Perl is that it’s a high-level programming language written originally
by Larry Wall and now supported and developed by a cast of thousands. The
Perl language semantics are largely based on the C programming language, while also
inheriting many of the best features of sed, awk, the Unix shell, and at least a dozen
other tools and languages.
Although it is a bad idea to pigeonhole any language and assign it to a specific list
of tasks, Perl is particularly strong at process, file, and text manipulation. This makes
it especially useful for system utilities, software tools, systems management tasks,
database access, graphical programming, networking, and web programming. These
strengths make it particularly attractive to CGI script authors, systems administrators,
mathematicians, journalists, and just about anybody who needs to write applications
and utilities very quickly.
Perl has its roots firmly planted in the Unix environment, but it has since become
a cross-platform development tool. Perl runs on IBM mainframes; AS/400s; Windows
NT, 95, and 98; OS/2; Novell Netware; Cray supercomputers; Digital’s VMS; Tandem
Guardian; HP MPE/ix; Mac OS; and all flavors of Unix, including Linux. In addition,
Perl has been ported to dozens of smaller operating systems, including BeOS, Acorn’s
RISCOS, and even machines such as the Amiga.
Larry Wall is a strong proponent of free software, and Perl is no exception. Perl,
including the source code, the standard Perl library, the optional modules, and all of
the documentation, is provided free and is supported entirely by its user community.
Before we get into the details of how to program in Perl, it’s worth taking the
time to familiarize yourself with where Perl has come from, what it can be used
for, and how it stacks up against other languages. We’ll also look at some popular
“mythconceptions” about what Perl is and at some success stories of how Perl has
helped a variety of organizations solve an equally varied range of problems.Versions and Naming Conventions
The current version of Perl (at the time of writing—Nov 2000) was Perl 5.6, with a
develop version, v5.7, already in production. Some sites are migrating to v5.6, others
seem to be dragging their heels, although there are no major compatibility problems.
Up until March 2000, the situation concerning the available versions of Perl was
quite complex, but we’ll start with the “current” version first. From the release of Perl
5.6 there are two very simple strands. Even version numbers, such as 5.6 and 5.8 are
considered to be “stable” releases of the language. Odd version numbers, such as 5.7
and 5.9, are development releases.
Perl 5.6 was a long time coming—over two years since the last major release—
but it also set a landmark for Perl’s development. It was the first version that really
reunited the core and Win32 versions of Perl, as well as providing some compatibility
enhancements. For example, the Windows ports now support fork, something not
natively provided by the Windows operating system. Also updated were the Perl compiler
and the threading system (which actually supports the Windows fork function),
and the addition of a new keyword, our, which handles global variables in the same
way as my.
Discussions have already started for Perl 6. Unlike Perl 5, which was a complete
rewrite of Perl 4 and was developed and coded almost entirely by Larry, Perl 6 will have
its feature set determined by the people that use it, through a series of RFCs (Requests
for Comments). The language’s core code will be developed by a team of programmers
with input and assistance from Larry, and with features agreed upon by committees,
rather than solely by Larry. This will make Perl 6 a language designed by the people
that use it, rather than by the person who invented it.
Perl, perl or PeRl?
There is also a certain amount of confusion regarding the capitalization of Perl.
Should it be written Perl or perl? Larry Wall now uses “Perl” to signify the language
proper and “perl” to signify the implementation of the language. Therefore, perl can
parse Perl. In essence, however, it really doesn’t make a huge amount of difference.
That said, you will find that the executable version of perl is installed with its name
in lowercase!
Life Before Perl 5.6
Before Perl 5.6, version numbers were far more confusing. Before version 5 came
version 4, the highest incarnation of which was 4.036, released in 1993. Version 5 is
still in development, with version 5.005_03 being the last stable release before thecurrent 5.6. However, many sites were using Perl 5.005_56—this was a developmental
release, but stable enough that some sites used it in preference to 5.005_02. Although
there were changes between these versions, they were bug fixes rather than the
significant improvements in Perl 5.6.
As to naming, you will see references to perl4 and perl5, and more recently, perl5.6.
Since most people will be using at least perl5, it’s probably safe to refer to Perl simply
as Perl!Perl History
Perl is a relatively old language, with the first version having been released in 1988.
The basic history is shown in Table 1-1.
If you want a more detailed history of Perl, check out the perlhist documentation
installed with Perl, or visit CPAST, the Comprehensive Perl Arcana Society Tapestry at
history.perl.org.Version Date Version Details
Perl 0 Introduced Perl to Larry Wall’s office associates
Perl 1 Jan 1988 Introduced Perl to the world
Perl 2 Jun 1988 Introduced Harry Spencer’s regular expression
package
Perl 3 Oct 1989 Introduced the ability to handle binary data
Perl 4 Mar 1991 Introduced the first “Camel” book (Programming
Perl, by Larry Wall, Tom Christiansen, and
Randal L Schwartz; O’Reilly & Associates). The
book drove the name change, just so it could refer
to Perl 4, instead of Perl 3.
Perl 4.036 Feb 1993 The last stable release of Perl 4
Perl 5 Oct 1994 The first stable release of Perl 5, which introduced
a number of new features and a complete rewrite.
Perl 5.005_02 Aug 1998 The next major stable release
Perl 5.005_03 Mar 1999 The last stable release before 5.6
Perl 5.6 Mar 2000 Introduced unified fork support, better threading,
an updated Perl compiler, and the our keywordMain Perl Features
Perl contains many features that most Perl programmers do not even know about, let
alone use. Some of the most basic features are described here.
Perl Is Free
It may not seem like a major feature, but, in fact, being free is very important. Some
languages, such as C (which is free with compilers such as GNU’s gcc), have been
commercialized by Metrowerks, Microsoft, and other companies. Other languages,
such as Visual Basic, are entirely commercial. Perl’s source code is open and free—
anybody can download the C source that constitutes a Perl interpreter. Furthermore,
you can easily extend the core functionality of Perl both within the realms of the
interpreted language and by modifying the Perl source code.
Perl Is Simple to Learn, Concise, and Easy to Read
Because of its history and roots, most people with any programming experience will
be able to program with Perl. It has a syntax similar to C and shell script, among others,
but with a less restrictive format. Most programs are quicker to write in Perl because
of its use of built-in functions and a huge standard and contributed library. Most programs
are also quicker to execute than other languages because of Perl’s internal architecture
(see the section, “Perl is Fast” that follows). Perl can be easy to read, because the code
can be written in a clear and concise format that almost reads like an English sentence.
Unfortunately, Perl also has a bad habit of looking a bit like line noise to uninitiated.
Whether or not your Perl looks good and clean really depends on how you format
it—good Perl is easy read. It is also worth reading the Perl style guidelines (in the Perl
style manual page that comes with Perl) to see how Larry Wall, Perl’s creator, likes
things done.
Perl Is Fast
As we will see shortly, Perl is not an interpreter in the strictest sense—when you execute
a Perl program it is actually compiled into a highly optimized language before it is
executed. Compared to most scripting languages, this makes execution almost as fast
as compiled C code. But, because the code is still interpreted, there is no compilation
process, and applications can be written and edited much faster than with other
languages, without any of the performance problems normally associated with an
interpreted language.
Perl Is Extensible
You can write Perl-based packages and modules that extend the functionality of the
language. You can also call external C code directly from Perl to extend the functionalityfurther. The reverse is also true: the Perl interpreter can be incorporated directly into
many languages, including C. This allows your C programs to use the functionality of
the Perl interpreter without calling an external program.
Perl Has Flexible Data Types
You can create simple variables that contain text or numbers, and Perl will treat the
variable data accordingly at the time it is used. This means that unlike C, you don’t
have to worry about converting text and numbers, and you can embed and merge strings
without requiring external functions to concatenate or combine the results. You can
also handle arrays of values as simple lists, as typical indexed arrays, and even as stacks
of information. You can also create associative arrays (otherwise known as hashes)
which allow you to refer to the items in the array by a unique string, rather than a
simple number. Finally, Perl also supports references, and through references objects.
References allow you to create complex data structures made up of a combination
of hashes, lists and scalars.
Perl Is Object Oriented
Perl supports all of the object-oriented features—inheritance, polymorphism, and
encapsulation. There are no restrictions on when or where you make use of objectoriented
features. There is no boundary as there is with C and C++.
Perl Is Collaborative
There is a huge network of Perl programmers worldwide. Most programmers supply,
and use, the modules and scripts available via CPAN, the Comprehensive Perl Archive
Network (see Web Appendix B at www.osborne.com). This is a repository of the
best modules and scripts available. Using an existing prewritten module can save you
hundreds, perhaps even thousands, of hours of development time.
Compiler or Interpreter
Different languages work in different ways; they are either compiled or interpreted.
A program in a compiled language is translated from the original source into a platformspecific
machine code. This machine code is referred to as an executable. There is no
direct relation between the machine code and the original source: it is not possible to
reverse the compilation process and produce the source code. This means that the
compiled executable is safe from intellectual property piracy.
With an interpreted language, on the other hand, the interpreter reads the original
source code and interprets each of the statements in order to perform the different
operations. The source code is therefore executed at run time. This has some advantages:
Because there is no compilation process, the development of interpreted code should
be significantly quicker. Interpreted code also tends to be smaller and easier to
distribute. The disadvantages are that the original source must be supplied in order
to execute the program, and an interpreted program is generally slower than a
compiled executable because of the way the code is executed.
Perl fits neither of these descriptions in the real sense. The internals of Perl are
such that at the time of executing a Perl script, the individual elements of the script are
compiled into a tree of opcodes. Opcodes are similar in concept to machine code—the
binary format required by the processor in your machine. However, whereas machine
code is executed directly by hardware, opcodes are executed by a Perl virtual machine.
The opcodes are highly optimized objects designed to perform a specific function.
When the script is executed you are essentially executing compiled C code, translated
from the Perl source. This enables Perl to provide all the advantages of a scripting
language while offering the fast execution of a compiled program. This mode of
operation—translation and then execution by a virtual machine is actually how most
modern scripting languages work, including Java (using Just In Time technology)
and Python.
Keeping all of that in mind, however, there have been some advances in the most
recent versions of a Perl compiler that takes native Perl scripts and converts them into
directly executable machine code. We’ll cover the compiler and Perl internals later in
this book.
Similar Programming Languages
We already know that Perl has its history in a number of different languages. It shares
several features and abilities with many of the standard tools supplied with any Unix
workstation. It also shares some features and abilities with many related languages,
even if it doesn’t necessarily share the same heritage.
With regard to specific features, abilities, and performance, Perl compares favorably
against some languages and less favorably against others. A lot of the advantages and
disadvantages are a matter of personal preference. For example, for text handling, there
is very little to choose between awk and Perl. However, personally I prefer Perl for
those tasks that involve file handling directly within the code, and awk when using it
as a filter as part of a shell script.
Unix Shells
Any of the Unix shells—sh, csh, ksh, or even bash—share the same basic set of
facilities. They are particularly good at running external programs and at most forms
of file management where the shell’s ability to work directly with many of the
standard Unix utilities enables rapid development of systems management tools.
However, where most shells fail is in their variable- and data-handling routines. In
nearly all cases you need to use the facilities provided by shell tools such as cut, paste,
and sort to achieve the same level of functionality as that provided natively by Perl.Tcl
Tcl (Tool Command Language) was developed as an embeddable scripting language.
A lot of the original design centered around a macro-like language for helping with
shell-based applications. Tcl was never really developed as a general-purpose scripting
language, although many people use it as such. In fact, Tcl was designed with the
philosophy that you should actually use two or more languages when developing large
software systems.
Tcl’s variables are very different from those in Perl. Because it was designed with
the typical shell-based string handling in mind, strings are null terminated (as they are
in C). This means that Tcl cannot be used for handling binary data. Compared to Perl,
Tcl is also generally slower on iterative operations over strings. You cannot pass arrays
by value or by reference; they can only be passed by name. This makes programming
more complex, although not impossible.
Lists in Tcl are actually stored as a single string, and arrays are stored within what
Perl would treat as a hash. Accessing a true Tcl array is therefore slightly slower, as it has
to look up associative entries in order to decipher the true values. The data-handling
problems also extend to numbers, which Tcl stores as strings and converts to numbers
only when a calculation is required. This slows mathematical operations significantly.
Unlike Perl, which parses the script first before optimizing and then executing,
Tcl is a true interpreter, and each line is interpreted and optimized individually at
execution time. This reduces the optimization options available to Tcl. Perl, on the other
hand, can optimize source lines, code blocks, and even entire functions if the compilation
process allows. The same Tcl interpretation technique also means that the only way
to debug Tcl code and search for syntactic errors is to actually execute the code. Because
Perl goes through the precompilation stage, it can check for syntactic and other
possible or probable errors without actually executing the code.
Finally, the code base of the standard Tcl package does not include many of the
functions and abilities of the Perl language. This is especially important if you are
trying to write a cross-platform POSIX-compliant application. Perl supports the entire
POSIX function set, but Tcl supports a much smaller subset of the POSIX function
set, even using external packages.
It should be clear from this description that Perl is a better alternative to Tcl in
situations where you want easy access to the rest of the OS. Most significantly, Tcl will
never be a general-purpose scripting language. Tcl will, on the other hand, be a good
solution if you want to embed a scripting language inside another language.
Python
Python was developed as an object-oriented language and is well thought out. It is an
interpreted, byte-compiled, extensible, and largely procedural programming language.
Like Perl, it’s good at text processing and even general-purpose programming. Python
also has a good history in the realm of GUI-based application development. Compared
to Perl, Python has fewer users, but it is gaining acceptance as a practical rapid
application development tool.
Unlike Perl, Python does not resemble C, and it doesn’t resemble Unix-style tools
like awk either. Python was designed from scratch to be object oriented and has clear
module semantics. This can make it confusing to use, as the name spaces get complex
to resolve. On the other hand, this makes it much more structured, which can ease
development for those with structured minds.
I’m not aware of anything that is better in Python than in Perl. They both share
object features, and the two are almost identical in execution speed. However, the
reverse is not true: Perl has better regular expression features, and the level of integration
between Perl and the Unix environment is hard to beat (although it can probably be
solved within Python using a suitably written external module).
In general, there is not a lot to tip the scales in favor of one of the two languages.
Perl will appeal to those people who already know C or Unix shell utilities. Perl is
also older and more widespread, and there is a much larger library of contributed
modules and scripts. Python, on the other hand, may appeal to those people who
have experience with more object-oriented languages, such as Java or Modula-2.
Both languages provide easy control and access when it comes to the external
environment in which they work. Perl arguably fills the role better, though, because
many of the standard system functions you are used to are supported natively by
the language, without requiring external modules. The technical support for the two
languages is also very similar, with both using websites and newsgroups to help
users program in the new language.
Finally, it’s worth mentioning that of all the scripting languages available, Perl and
Python are two of the most stable platforms for development. There are, however,
some minor differences. First, Perl provides quite advanced functions and mechanisms
for tracking errors and faults in the scripts. Making extensive use of these facilities can
still cause problems, however. For example, calling the system truncate() function
within Perl will cause the whole interpreter to crash. Python, on the other hand, uses
a system of error trapping that will immediately identify a problem like this before
it occurs, allowing you to account for it in your applications. This is largely due to the
application-development nature of the language.
Java
At first viewing, Java seems to be a friendlier, interpreted version of C++. Depending
on your point of view, this can either be an advantage or a disadvantage. Java probably
inherits less than a third of the complexity of C++, but it retains much of the complexity
of its brethren.Java was designed primarily as an implementation-independent language, originally
with web-based intentions, but now as a more general-purpose solution to a variety of
problems. Like Perl, Java is byte compiled, but unlike Perl, programs are supplied in
byte-compiled format and then executed via a Java virtual machine at execution time.
Because of its roots and its complexity, Java cannot really be considered as a direct
competitor to Perl. It is difficult to use Java as a rapid application development tool
and virtually impossible to use it for most of the simple text-processing and systemadministration
tasks that Perl is best known for.
C/C++
Perl itself is written in C. (You can download and view the Perl source code if you so
wish, but it’s not for the faint-hearted!) Many of the structures and semantics of Perl
and C are very similar. For example, both use semicolons as end-of-line terminators.
They also share the same code block and indentation features. However, Perl tends to
be stricter when it comes to code block definitions—it always requires curly brackets,
for example—but most C programmers will be comfortable with the Perl environment.
Perl can be object oriented like C++. Both share the same abilities of inheritance,
polymorphism, and encapsulation. However, object orientation in Perl is easier to
use, compared to the complexities of constructors and inheritance found in C++. In
addition to all this, there is no distinction between the standard and object-oriented
implementations of Perl as there is with C and C++. This means you can mix and
match different variables, objects, and other data types within a single Perl application—
something that would be difficult to achieve easily with C and C++.
Because Perl is basically an interpreted language (as mentioned earlier), development
is generally quicker than is writing in native C. Perl also has many more built-in
facilities and abilities that would otherwise need to be handwritten in C/C++. For example,
regular expressions and many of the data-handling features would require a significant
amount of programming to reproduce in C with the same ease of use available in Perl.
Because of Perl’s roots in C, it is also possible to extend Perl with C source code and
vice versa: you can embed Perl programs in C source code.
awk/gawk
Although a lot of syntax is different, awk, and gawk (the GNU projects version) are
functionally subsets of Perl. It’s also clear from the history of Perl that many of the
features have been inherited directly from those of awk. Indeed, awk was designed
as a reporting language with the emphasis on making the process of reporting via
the shell significantly easier. Without awk, you would have to employ a number of
external utilities, such as cut, expr, and sort, and the solution would be neither quick
nor elegant.
There are some things that Perl has built-in support for that awk does not. For
example, awk has no network socket class, and it is largely ignorant of external files,when compared to the file manipulation and management functions found in Perl.
However, some advantages awk has over Perl are summarized here:
awk is simpler, and the syntax is more structured and regular.
Although it is gaining acceptance, Perl has yet to be included as standard with
many operating systems. Awk has been supplied with Unix almost since it was
first released.
awk can be smaller and therefore much quicker to execute for small programs.
awk supports more advanced regular expressions. You can use a regular
expression for replacement, and you can search text in substitutions.Popular “Mythconceptions”
Despite its history and wide use in many different areas, there are still a number of
myths about what Perl is, where it should be used, and even why it was invented. Here’s
a quick list of the popular mythconceptions of the Perl language.
It’s Only for the Web
Probably the most famous of the myths is that Perl is a language used, designed, and
created exclusively for developing web-based applications. In fact, this could not be
more wrong. Version 1.0 of Perl, the first released to the world, shipped in 1988—
several years before the web and HTML as we know it today were in general use. In
fact, Perl was inherited as a good design tool for web server applications based on
its ease of use and flexibility. The text-handling features are especially useful when
working within the web environment. There are libraries of database interfaces,
client-server modules, networking features, and even GUI toolkits to enable you to
write entire applications directly within Perl.
It’s Not Maintenance Friendly
Any good (or bad) programmer will tell you that anybody can write unmaintainable
code in any language. Many companies and individuals write maintainable programs
using Perl. A lot of people would argue that Perl’s structured style, easily readable
source code, and modular format make it more maintainable than languages such as
C, C++, and Java.
It’s Only for Hackers
Perl is used by a variety of companies, organizations, and individuals. Everybody from
programming beginners through “hackers” up to multinational corporations use Perl
to solve their problems. It can hardly be classed as a hackers-only language. Moreover,it is maintained by the same range of people, which means you get the best of both
worlds—real-world features, with top-class behind-the-scenes algorithms.
It’s a Scripting Language
In Perl, there is no difference between a script and program. Many large programs
and projects have been written entirely in Perl. A good example is Majordomo, the
main mailing-list manager used on the Internet. It’s written entirely in Perl. See the
upcoming section “Perl Success Stories” for more examples of where Perl has made
a difference, despite its scripting label.
There’s No Support
The Perl community is one of the largest on the Internet, and you should be able to find
someone, somewhere, who can answer your questions or help you with your problems.
The Perl Clinic (see Appendix C) offers free advice and support to Perl programmers.
All Perl Programs Are Free
Although you generally write and use Perl programs in their native source form, this
does not mean that everything you write is free. Perl programs are your own intellectual
property and can be bought, sold, and licensed just like any other program. If you are
worried about somebody stealing your code, source filters and bytecode compilers will
render your code useful only for execution and unreadable by the casual software pirate.
There’s No Development Environment
Development environments are only really required when you need to compile source
code into object files. Because Perl scripts are written in normal text, you can use any
editor to write and use Perl programs. Under Unix, the favorites are emacs and vi, and
both have Perl modes to make syntax checking and formatting easier. Under Windows
NT, you can also use emacs, or you can use Solutionsoft’s Perl Builder, which is an
interactive environment for Perl programs. Alternatively, you can use the ActiveState
debugger, which will provide you with a direct environment for executing and editing
Perl statements. There are also many improvements being made in the ActiveState
distribution that will allow Perl to be used as part of Microsoft’s Visual Studio product
under a project called VisualPerl. On the Mac, the BBEdit and Pepper editors have a
Perl mode that colors the syntax of the Perl source to make it easier to read.
Additionally, because Perl programs are text based, you can use any source-code
revision-control system. The most popular solution is CVS, or Concurrent Versioning
System, which is now supported under Unix, MacOS and Windows.
aspect of Perl is that it’s a high-level programming language written originally
by Larry Wall and now supported and developed by a cast of thousands. The
Perl language semantics are largely based on the C programming language, while also
inheriting many of the best features of sed, awk, the Unix shell, and at least a dozen
other tools and languages.
Although it is a bad idea to pigeonhole any language and assign it to a specific list
of tasks, Perl is particularly strong at process, file, and text manipulation. This makes
it especially useful for system utilities, software tools, systems management tasks,
database access, graphical programming, networking, and web programming. These
strengths make it particularly attractive to CGI script authors, systems administrators,
mathematicians, journalists, and just about anybody who needs to write applications
and utilities very quickly.
Perl has its roots firmly planted in the Unix environment, but it has since become
a cross-platform development tool. Perl runs on IBM mainframes; AS/400s; Windows
NT, 95, and 98; OS/2; Novell Netware; Cray supercomputers; Digital’s VMS; Tandem
Guardian; HP MPE/ix; Mac OS; and all flavors of Unix, including Linux. In addition,
Perl has been ported to dozens of smaller operating systems, including BeOS, Acorn’s
RISCOS, and even machines such as the Amiga.
Larry Wall is a strong proponent of free software, and Perl is no exception. Perl,
including the source code, the standard Perl library, the optional modules, and all of
the documentation, is provided free and is supported entirely by its user community.
Before we get into the details of how to program in Perl, it’s worth taking the
time to familiarize yourself with where Perl has come from, what it can be used
for, and how it stacks up against other languages. We’ll also look at some popular
“mythconceptions” about what Perl is and at some success stories of how Perl has
helped a variety of organizations solve an equally varied range of problems.Versions and Naming Conventions
The current version of Perl (at the time of writing—Nov 2000) was Perl 5.6, with a
develop version, v5.7, already in production. Some sites are migrating to v5.6, others
seem to be dragging their heels, although there are no major compatibility problems.
Up until March 2000, the situation concerning the available versions of Perl was
quite complex, but we’ll start with the “current” version first. From the release of Perl
5.6 there are two very simple strands. Even version numbers, such as 5.6 and 5.8 are
considered to be “stable” releases of the language. Odd version numbers, such as 5.7
and 5.9, are development releases.
Perl 5.6 was a long time coming—over two years since the last major release—
but it also set a landmark for Perl’s development. It was the first version that really
reunited the core and Win32 versions of Perl, as well as providing some compatibility
enhancements. For example, the Windows ports now support fork, something not
natively provided by the Windows operating system. Also updated were the Perl compiler
and the threading system (which actually supports the Windows fork function),
and the addition of a new keyword, our, which handles global variables in the same
way as my.
Discussions have already started for Perl 6. Unlike Perl 5, which was a complete
rewrite of Perl 4 and was developed and coded almost entirely by Larry, Perl 6 will have
its feature set determined by the people that use it, through a series of RFCs (Requests
for Comments). The language’s core code will be developed by a team of programmers
with input and assistance from Larry, and with features agreed upon by committees,
rather than solely by Larry. This will make Perl 6 a language designed by the people
that use it, rather than by the person who invented it.
Perl, perl or PeRl?
There is also a certain amount of confusion regarding the capitalization of Perl.
Should it be written Perl or perl? Larry Wall now uses “Perl” to signify the language
proper and “perl” to signify the implementation of the language. Therefore, perl can
parse Perl. In essence, however, it really doesn’t make a huge amount of difference.
That said, you will find that the executable version of perl is installed with its name
in lowercase!
Life Before Perl 5.6
Before Perl 5.6, version numbers were far more confusing. Before version 5 came
version 4, the highest incarnation of which was 4.036, released in 1993. Version 5 is
still in development, with version 5.005_03 being the last stable release before thecurrent 5.6. However, many sites were using Perl 5.005_56—this was a developmental
release, but stable enough that some sites used it in preference to 5.005_02. Although
there were changes between these versions, they were bug fixes rather than the
significant improvements in Perl 5.6.
As to naming, you will see references to perl4 and perl5, and more recently, perl5.6.
Since most people will be using at least perl5, it’s probably safe to refer to Perl simply
as Perl!Perl History
Perl is a relatively old language, with the first version having been released in 1988.
The basic history is shown in Table 1-1.
If you want a more detailed history of Perl, check out the perlhist documentation
installed with Perl, or visit CPAST, the Comprehensive Perl Arcana Society Tapestry at
history.perl.org.Version Date Version Details
Perl 0 Introduced Perl to Larry Wall’s office associates
Perl 1 Jan 1988 Introduced Perl to the world
Perl 2 Jun 1988 Introduced Harry Spencer’s regular expression
package
Perl 3 Oct 1989 Introduced the ability to handle binary data
Perl 4 Mar 1991 Introduced the first “Camel” book (Programming
Perl, by Larry Wall, Tom Christiansen, and
Randal L Schwartz; O’Reilly & Associates). The
book drove the name change, just so it could refer
to Perl 4, instead of Perl 3.
Perl 4.036 Feb 1993 The last stable release of Perl 4
Perl 5 Oct 1994 The first stable release of Perl 5, which introduced
a number of new features and a complete rewrite.
Perl 5.005_02 Aug 1998 The next major stable release
Perl 5.005_03 Mar 1999 The last stable release before 5.6
Perl 5.6 Mar 2000 Introduced unified fork support, better threading,
an updated Perl compiler, and the our keywordMain Perl Features
Perl contains many features that most Perl programmers do not even know about, let
alone use. Some of the most basic features are described here.
Perl Is Free
It may not seem like a major feature, but, in fact, being free is very important. Some
languages, such as C (which is free with compilers such as GNU’s gcc), have been
commercialized by Metrowerks, Microsoft, and other companies. Other languages,
such as Visual Basic, are entirely commercial. Perl’s source code is open and free—
anybody can download the C source that constitutes a Perl interpreter. Furthermore,
you can easily extend the core functionality of Perl both within the realms of the
interpreted language and by modifying the Perl source code.
Perl Is Simple to Learn, Concise, and Easy to Read
Because of its history and roots, most people with any programming experience will
be able to program with Perl. It has a syntax similar to C and shell script, among others,
but with a less restrictive format. Most programs are quicker to write in Perl because
of its use of built-in functions and a huge standard and contributed library. Most programs
are also quicker to execute than other languages because of Perl’s internal architecture
(see the section, “Perl is Fast” that follows). Perl can be easy to read, because the code
can be written in a clear and concise format that almost reads like an English sentence.
Unfortunately, Perl also has a bad habit of looking a bit like line noise to uninitiated.
Whether or not your Perl looks good and clean really depends on how you format
it—good Perl is easy read. It is also worth reading the Perl style guidelines (in the Perl
style manual page that comes with Perl) to see how Larry Wall, Perl’s creator, likes
things done.
Perl Is Fast
As we will see shortly, Perl is not an interpreter in the strictest sense—when you execute
a Perl program it is actually compiled into a highly optimized language before it is
executed. Compared to most scripting languages, this makes execution almost as fast
as compiled C code. But, because the code is still interpreted, there is no compilation
process, and applications can be written and edited much faster than with other
languages, without any of the performance problems normally associated with an
interpreted language.
Perl Is Extensible
You can write Perl-based packages and modules that extend the functionality of the
language. You can also call external C code directly from Perl to extend the functionalityfurther. The reverse is also true: the Perl interpreter can be incorporated directly into
many languages, including C. This allows your C programs to use the functionality of
the Perl interpreter without calling an external program.
Perl Has Flexible Data Types
You can create simple variables that contain text or numbers, and Perl will treat the
variable data accordingly at the time it is used. This means that unlike C, you don’t
have to worry about converting text and numbers, and you can embed and merge strings
without requiring external functions to concatenate or combine the results. You can
also handle arrays of values as simple lists, as typical indexed arrays, and even as stacks
of information. You can also create associative arrays (otherwise known as hashes)
which allow you to refer to the items in the array by a unique string, rather than a
simple number. Finally, Perl also supports references, and through references objects.
References allow you to create complex data structures made up of a combination
of hashes, lists and scalars.
Perl Is Object Oriented
Perl supports all of the object-oriented features—inheritance, polymorphism, and
encapsulation. There are no restrictions on when or where you make use of objectoriented
features. There is no boundary as there is with C and C++.
Perl Is Collaborative
There is a huge network of Perl programmers worldwide. Most programmers supply,
and use, the modules and scripts available via CPAN, the Comprehensive Perl Archive
Network (see Web Appendix B at www.osborne.com). This is a repository of the
best modules and scripts available. Using an existing prewritten module can save you
hundreds, perhaps even thousands, of hours of development time.
Compiler or Interpreter
Different languages work in different ways; they are either compiled or interpreted.
A program in a compiled language is translated from the original source into a platformspecific
machine code. This machine code is referred to as an executable. There is no
direct relation between the machine code and the original source: it is not possible to
reverse the compilation process and produce the source code. This means that the
compiled executable is safe from intellectual property piracy.
With an interpreted language, on the other hand, the interpreter reads the original
source code and interprets each of the statements in order to perform the different
operations. The source code is therefore executed at run time. This has some advantages:
Because there is no compilation process, the development of interpreted code should
be significantly quicker. Interpreted code also tends to be smaller and easier to
distribute. The disadvantages are that the original source must be supplied in order
to execute the program, and an interpreted program is generally slower than a
compiled executable because of the way the code is executed.
Perl fits neither of these descriptions in the real sense. The internals of Perl are
such that at the time of executing a Perl script, the individual elements of the script are
compiled into a tree of opcodes. Opcodes are similar in concept to machine code—the
binary format required by the processor in your machine. However, whereas machine
code is executed directly by hardware, opcodes are executed by a Perl virtual machine.
The opcodes are highly optimized objects designed to perform a specific function.
When the script is executed you are essentially executing compiled C code, translated
from the Perl source. This enables Perl to provide all the advantages of a scripting
language while offering the fast execution of a compiled program. This mode of
operation—translation and then execution by a virtual machine is actually how most
modern scripting languages work, including Java (using Just In Time technology)
and Python.
Keeping all of that in mind, however, there have been some advances in the most
recent versions of a Perl compiler that takes native Perl scripts and converts them into
directly executable machine code. We’ll cover the compiler and Perl internals later in
this book.
Similar Programming Languages
We already know that Perl has its history in a number of different languages. It shares
several features and abilities with many of the standard tools supplied with any Unix
workstation. It also shares some features and abilities with many related languages,
even if it doesn’t necessarily share the same heritage.
With regard to specific features, abilities, and performance, Perl compares favorably
against some languages and less favorably against others. A lot of the advantages and
disadvantages are a matter of personal preference. For example, for text handling, there
is very little to choose between awk and Perl. However, personally I prefer Perl for
those tasks that involve file handling directly within the code, and awk when using it
as a filter as part of a shell script.
Unix Shells
Any of the Unix shells—sh, csh, ksh, or even bash—share the same basic set of
facilities. They are particularly good at running external programs and at most forms
of file management where the shell’s ability to work directly with many of the
standard Unix utilities enables rapid development of systems management tools.
However, where most shells fail is in their variable- and data-handling routines. In
nearly all cases you need to use the facilities provided by shell tools such as cut, paste,
and sort to achieve the same level of functionality as that provided natively by Perl.Tcl
Tcl (Tool Command Language) was developed as an embeddable scripting language.
A lot of the original design centered around a macro-like language for helping with
shell-based applications. Tcl was never really developed as a general-purpose scripting
language, although many people use it as such. In fact, Tcl was designed with the
philosophy that you should actually use two or more languages when developing large
software systems.
Tcl’s variables are very different from those in Perl. Because it was designed with
the typical shell-based string handling in mind, strings are null terminated (as they are
in C). This means that Tcl cannot be used for handling binary data. Compared to Perl,
Tcl is also generally slower on iterative operations over strings. You cannot pass arrays
by value or by reference; they can only be passed by name. This makes programming
more complex, although not impossible.
Lists in Tcl are actually stored as a single string, and arrays are stored within what
Perl would treat as a hash. Accessing a true Tcl array is therefore slightly slower, as it has
to look up associative entries in order to decipher the true values. The data-handling
problems also extend to numbers, which Tcl stores as strings and converts to numbers
only when a calculation is required. This slows mathematical operations significantly.
Unlike Perl, which parses the script first before optimizing and then executing,
Tcl is a true interpreter, and each line is interpreted and optimized individually at
execution time. This reduces the optimization options available to Tcl. Perl, on the other
hand, can optimize source lines, code blocks, and even entire functions if the compilation
process allows. The same Tcl interpretation technique also means that the only way
to debug Tcl code and search for syntactic errors is to actually execute the code. Because
Perl goes through the precompilation stage, it can check for syntactic and other
possible or probable errors without actually executing the code.
Finally, the code base of the standard Tcl package does not include many of the
functions and abilities of the Perl language. This is especially important if you are
trying to write a cross-platform POSIX-compliant application. Perl supports the entire
POSIX function set, but Tcl supports a much smaller subset of the POSIX function
set, even using external packages.
It should be clear from this description that Perl is a better alternative to Tcl in
situations where you want easy access to the rest of the OS. Most significantly, Tcl will
never be a general-purpose scripting language. Tcl will, on the other hand, be a good
solution if you want to embed a scripting language inside another language.
Python
Python was developed as an object-oriented language and is well thought out. It is an
interpreted, byte-compiled, extensible, and largely procedural programming language.
Like Perl, it’s good at text processing and even general-purpose programming. Python
also has a good history in the realm of GUI-based application development. Compared
to Perl, Python has fewer users, but it is gaining acceptance as a practical rapid
application development tool.
Unlike Perl, Python does not resemble C, and it doesn’t resemble Unix-style tools
like awk either. Python was designed from scratch to be object oriented and has clear
module semantics. This can make it confusing to use, as the name spaces get complex
to resolve. On the other hand, this makes it much more structured, which can ease
development for those with structured minds.
I’m not aware of anything that is better in Python than in Perl. They both share
object features, and the two are almost identical in execution speed. However, the
reverse is not true: Perl has better regular expression features, and the level of integration
between Perl and the Unix environment is hard to beat (although it can probably be
solved within Python using a suitably written external module).
In general, there is not a lot to tip the scales in favor of one of the two languages.
Perl will appeal to those people who already know C or Unix shell utilities. Perl is
also older and more widespread, and there is a much larger library of contributed
modules and scripts. Python, on the other hand, may appeal to those people who
have experience with more object-oriented languages, such as Java or Modula-2.
Both languages provide easy control and access when it comes to the external
environment in which they work. Perl arguably fills the role better, though, because
many of the standard system functions you are used to are supported natively by
the language, without requiring external modules. The technical support for the two
languages is also very similar, with both using websites and newsgroups to help
users program in the new language.
Finally, it’s worth mentioning that of all the scripting languages available, Perl and
Python are two of the most stable platforms for development. There are, however,
some minor differences. First, Perl provides quite advanced functions and mechanisms
for tracking errors and faults in the scripts. Making extensive use of these facilities can
still cause problems, however. For example, calling the system truncate() function
within Perl will cause the whole interpreter to crash. Python, on the other hand, uses
a system of error trapping that will immediately identify a problem like this before
it occurs, allowing you to account for it in your applications. This is largely due to the
application-development nature of the language.
Java
At first viewing, Java seems to be a friendlier, interpreted version of C++. Depending
on your point of view, this can either be an advantage or a disadvantage. Java probably
inherits less than a third of the complexity of C++, but it retains much of the complexity
of its brethren.Java was designed primarily as an implementation-independent language, originally
with web-based intentions, but now as a more general-purpose solution to a variety of
problems. Like Perl, Java is byte compiled, but unlike Perl, programs are supplied in
byte-compiled format and then executed via a Java virtual machine at execution time.
Because of its roots and its complexity, Java cannot really be considered as a direct
competitor to Perl. It is difficult to use Java as a rapid application development tool
and virtually impossible to use it for most of the simple text-processing and systemadministration
tasks that Perl is best known for.
C/C++
Perl itself is written in C. (You can download and view the Perl source code if you so
wish, but it’s not for the faint-hearted!) Many of the structures and semantics of Perl
and C are very similar. For example, both use semicolons as end-of-line terminators.
They also share the same code block and indentation features. However, Perl tends to
be stricter when it comes to code block definitions—it always requires curly brackets,
for example—but most C programmers will be comfortable with the Perl environment.
Perl can be object oriented like C++. Both share the same abilities of inheritance,
polymorphism, and encapsulation. However, object orientation in Perl is easier to
use, compared to the complexities of constructors and inheritance found in C++. In
addition to all this, there is no distinction between the standard and object-oriented
implementations of Perl as there is with C and C++. This means you can mix and
match different variables, objects, and other data types within a single Perl application—
something that would be difficult to achieve easily with C and C++.
Because Perl is basically an interpreted language (as mentioned earlier), development
is generally quicker than is writing in native C. Perl also has many more built-in
facilities and abilities that would otherwise need to be handwritten in C/C++. For example,
regular expressions and many of the data-handling features would require a significant
amount of programming to reproduce in C with the same ease of use available in Perl.
Because of Perl’s roots in C, it is also possible to extend Perl with C source code and
vice versa: you can embed Perl programs in C source code.
awk/gawk
Although a lot of syntax is different, awk, and gawk (the GNU projects version) are
functionally subsets of Perl. It’s also clear from the history of Perl that many of the
features have been inherited directly from those of awk. Indeed, awk was designed
as a reporting language with the emphasis on making the process of reporting via
the shell significantly easier. Without awk, you would have to employ a number of
external utilities, such as cut, expr, and sort, and the solution would be neither quick
nor elegant.
There are some things that Perl has built-in support for that awk does not. For
example, awk has no network socket class, and it is largely ignorant of external files,when compared to the file manipulation and management functions found in Perl.
However, some advantages awk has over Perl are summarized here:
awk is simpler, and the syntax is more structured and regular.
Although it is gaining acceptance, Perl has yet to be included as standard with
many operating systems. Awk has been supplied with Unix almost since it was
first released.
awk can be smaller and therefore much quicker to execute for small programs.
awk supports more advanced regular expressions. You can use a regular
expression for replacement, and you can search text in substitutions.Popular “Mythconceptions”
Despite its history and wide use in many different areas, there are still a number of
myths about what Perl is, where it should be used, and even why it was invented. Here’s
a quick list of the popular mythconceptions of the Perl language.
It’s Only for the Web
Probably the most famous of the myths is that Perl is a language used, designed, and
created exclusively for developing web-based applications. In fact, this could not be
more wrong. Version 1.0 of Perl, the first released to the world, shipped in 1988—
several years before the web and HTML as we know it today were in general use. In
fact, Perl was inherited as a good design tool for web server applications based on
its ease of use and flexibility. The text-handling features are especially useful when
working within the web environment. There are libraries of database interfaces,
client-server modules, networking features, and even GUI toolkits to enable you to
write entire applications directly within Perl.
It’s Not Maintenance Friendly
Any good (or bad) programmer will tell you that anybody can write unmaintainable
code in any language. Many companies and individuals write maintainable programs
using Perl. A lot of people would argue that Perl’s structured style, easily readable
source code, and modular format make it more maintainable than languages such as
C, C++, and Java.
It’s Only for Hackers
Perl is used by a variety of companies, organizations, and individuals. Everybody from
programming beginners through “hackers” up to multinational corporations use Perl
to solve their problems. It can hardly be classed as a hackers-only language. Moreover,it is maintained by the same range of people, which means you get the best of both
worlds—real-world features, with top-class behind-the-scenes algorithms.
It’s a Scripting Language
In Perl, there is no difference between a script and program. Many large programs
and projects have been written entirely in Perl. A good example is Majordomo, the
main mailing-list manager used on the Internet. It’s written entirely in Perl. See the
upcoming section “Perl Success Stories” for more examples of where Perl has made
a difference, despite its scripting label.
There’s No Support
The Perl community is one of the largest on the Internet, and you should be able to find
someone, somewhere, who can answer your questions or help you with your problems.
The Perl Clinic (see Appendix C) offers free advice and support to Perl programmers.
All Perl Programs Are Free
Although you generally write and use Perl programs in their native source form, this
does not mean that everything you write is free. Perl programs are your own intellectual
property and can be bought, sold, and licensed just like any other program. If you are
worried about somebody stealing your code, source filters and bytecode compilers will
render your code useful only for execution and unreadable by the casual software pirate.
There’s No Development Environment
Development environments are only really required when you need to compile source
code into object files. Because Perl scripts are written in normal text, you can use any
editor to write and use Perl programs. Under Unix, the favorites are emacs and vi, and
both have Perl modes to make syntax checking and formatting easier. Under Windows
NT, you can also use emacs, or you can use Solutionsoft’s Perl Builder, which is an
interactive environment for Perl programs. Alternatively, you can use the ActiveState
debugger, which will provide you with a direct environment for executing and editing
Perl statements. There are also many improvements being made in the ActiveState
distribution that will allow Perl to be used as part of Microsoft’s Visual Studio product
under a project called VisualPerl. On the Mac, the BBEdit and Pepper editors have a
Perl mode that colors the syntax of the Perl source to make it easier to read.
Additionally, because Perl programs are text based, you can use any source-code
revision-control system. The most popular solution is CVS, or Concurrent Versioning
System, which is now supported under Unix, MacOS and Windows.
Subscribe to:
Posts (Atom)