Please find below my highly synthetic notes on “Learning Perl” (7th ed., 2017, O’Reilly Media) by Randal L. Schwartz, brian d foy, and Tom Phoenix, the famous Llama book [1].

  1. Introduction
  2. Scalar Data
  3. Lists and Arrays
  4. Subroutines
  5. Input and Output
  6. Hashes
  7. Regular Expressions
  8. Matching with Regular Expressions
  9. Processing Text with Regular Expressions
  10. More Control Structures
  11. Perl Modules
  12. File Tests
  13. Directory Operations
  14. Strings and Sorting
  15. Process Management
  16. Some Advanced Perl Techniques

1. Introduction

2. Scalar Data

A scalar is a single item.

A scalar is the simplest kind of data Perl manipulates. A scalar is usually a number (255 or 3.25e20) or a string of characters (hello or the Gettysburg address).

Numbers

All Numbers Have the Same Format Internally

Internally Perl relies on C libraries and uses double-precision floating-point values to store the numbers.

In Perl you can specify both integers (255 or 2,001) and floating-point numbers (3.14159 or 1.35 × 10^25). But internally Perl computes with double-precision floating-point values.

In other words, an integer constant in a Perl program is treated as an equivalent floating-point value.

Integer Literals

A literal is how a value is represented in source code; it is the data typed directly into a program.

0
2001
-40
137
61298040283768

Use embedded underscores for reader-friendliness:

61_298_040_283_768

Nondecimal Integer Literals

Octal (base 8) literals start with a leading 0 and use the digits from 0 to 7:

0377  # same as 255 decimal

Hexadecimal (base 16) literals start with a leading 0x and use the digits from 0 to 9 and the letters A through F (or a through f) to represent values from 0 to 15:

0xff  # FF hex, also 255 decimal

Binary (base 2) literals start with a leading 0b and use only the digits 0 and 1:

0b11111111  # also 255 decimal

 The “leading zero” indicator works only for literals—not for automatic string-to-number conversions.
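The difference between a literal and a string conversion can be checked with a short sketch. The variable names are illustrative only; oct() and hex() are the Perl built-ins that interpret such strings explicitly.

```perl
use strict;
use warnings;

my $literal   = 0377;               # octal literal, 255 decimal
my $converted = '0377' + 0;         # string-to-number conversion is decimal: 377
my $from_oct  = oct('0377');        # oct() reads the string as octal: 255
my $from_hex  = hex('ff');          # hex() reads the string as hexadecimal: 255
my $from_bin  = oct('0b11111111');  # oct() also understands 0b and 0x prefixes: 255

print "$literal $converted $from_oct $from_hex $from_bin\n";  # 255 377 255 255 255
```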

Use underscores to improve readability of long nondecimal literals:

0x1377_0B77
0x50_65_72_7C

Floating-Point Literals

Perl follows C-like notation for floating-point literals:

  1.25
255.000
255.0
  7.25e45  # 7.25 times 10 to the power of 45
 -6.5e24   # negative 6.5 times 10 to the 24th
-12e-24    # negative 12 times 10 to the -24th
 -1.2E-23  # another way to say that the E may be uppercase

Starting with v5.22, Perl supports hexadecimal floating-point literals. Instead of an e to mark the exponent, use a p for the power of 2 exponent:

0x1f.0p3

Numeric Operators

Operators are Perl’s verbs:

2 + 3  # 2 plus 3, or 5
5.1 - 2.4  # 5.1 minus 2.4, or 2.7
3 * 12  # 3 times 12, or 36
14 / 2  # 14 divided by 2, or 7
10.2 / 0.3  # 10.2 divided by 0.3, or 34
10 / 3  # always a floating-point division, i.e. 3.3333...

Perl’s numeric operators return what you would expect from doing the same operation on a calculator.

The modulus operator (%) returns the remainder of a division. Both operands are truncated to integers first:

10.5 % 3.2  # -> 10 % 3 -> 1
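A small sketch of how % behaves, including a negative operand. The fmod function from the core POSIX module is not part of the notes above; it is shown only as one way to get a floating-point remainder when truncation is not wanted.

```perl
use strict;
use warnings;
use POSIX qw(fmod);

my $truncated = 10.5 % 3.2;       # operands become 10 and 3, result is 1
my $negative  = -10 % 3;          # result takes the sign of the right operand: 2
my $float_rem = fmod(10.5, 3.2);  # floating-point remainder, roughly 0.9

printf "%d %d %.1f\n", $truncated, $negative, $float_rem;  # 1 2 0.9
```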

Fortran-like exponentiation operator is represented by a double asterisk:

2**3  # two to the third power, or eight

Strings

A Perl string can contain any character. This makes it possible to create, scan and manipulate raw binary data as strings.

Add the utf8 pragma to use literal Unicode characters in the source code of a program:

use utf8;

Make sure to save your files with the UTF-8 encoding.

 A pragma is something that tells the Perl compiler how to act.

There are two flavours of literal strings in Perl: single-quoted and double-quoted string literals.

Single-Quoted String Literals
Double-Quoted String Literals
String Operators
Automatic Conversion Between Numbers and Strings

Perl’s Built-In Warnings

Interpreting Nondecimal Numerals

Scalar Variables

Choosing Good Variable Names
Scalar Assignment
Compound Assignment Operators

Output with print

Interpolation of Scalar Variables into Strings
Creating Characters by Code Point
Operator Precedence and Associativity
Comparison Operators

The if Control Structure

Boolean Values

Getting User Input

The chomp Operator

The while Control Structure

The undef Value

The defined Function

3. Lists and Arrays

4. Subroutines

5. Input and Output

Input from Standard Input

Obtain next line of input in scalar context:

$line = <STDIN>;  # read next line
chomp($line);     # chomp it (remove newline character)

# idiomatic Perl
chomp($line = <STDIN>);

Line-input operator <STDIN> returns undef when it reaches end-of-file. Use this to drop out of loops:

while (defined($line = <STDIN>)) {
    print "Line: $line";
}

# idiomatic Perl
# works only if there is nothing but line-input operator in conditional
while (<STDIN>) {
    print "Line: $_";
}

Line-input operator in list context returns all remaining lines of input as a list:

foreach (<STDIN>) {
    print "Line: $_";
}

 It is best to use line-input operator in a scalar context to read input line-by-line. In list context Perl fetches all input at once.
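The two contexts can be compared without typing any input by reading from an in-memory filehandle (open on a reference to a scalar). This is only a sketch; the sample text and variable names are made up for illustration.

```perl
use strict;
use warnings;

my $text = "one\ntwo\nthree\n";
open my $in, '<', \$text or die "Cannot open in-memory file: $!";

my $first = <$in>;  # scalar context: exactly one line
my @rest  = <$in>;  # list context: all remaining lines at once

close $in;
chomp($first, @rest);
print "first: $first; rest: @rest\n";  # first: one; rest: two three
```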

Input from the Diamond Operator

Diamond operator is a special kind of line-input operator. Input comes from the user’s choice: from the files named on the command line or, if there are none, from standard input.

while (defined($line = <>)) {
    chomp($line);
    print "Line: $line\n";
}

# idiomatic Perl
while (<>) {
    chomp;  # by default chomp works on $_
    print "Line: $_\n";
}

 Current file name is kept in Perl’s special variable $ARGV. This name is “-” if input comes from the standard input stream STDIN.
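A sketch of $ARGV in action. To keep it self-contained it writes two temporary files with the core File::Temp module and then sets @ARGV by hand; in a real program the names would come from the command line. The %lines_in hash is a made-up name.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# create two small input files
my @names;
for my $content ("a\nb\n", "c\n") {
    my ($fh, $name) = tempfile(UNLINK => 1);
    print {$fh} $content;
    close $fh or die "Cannot close $name: $!";
    push @names, $name;
}

@ARGV = @names;  # pretend these were given on the command line

my %lines_in;
while (my $line = <>) {
    $lines_in{$ARGV}++;  # $ARGV holds the name of the file being read
}

print "$_: $lines_in{$_} line(s)\n" for @names;
```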

The Double Diamond

Double diamond operator <<>> (v5.22 and later) treats every command-line argument strictly as a filename, even if it contains special characters, e.g. |. Using double diamond avoids performing a “pipe open” and running an external program.

use v5.22;

while (<<>>) {
    chomp;
    print "Line: $_\n";
}

The Invocation Arguments

Perl stores invocation arguments in a special array @ARGV.

Use it as any other array to:

  • shift items,
  • iterate over it with foreach,
  • check if any arguments start with a hyphen.

 Use modules Getopt::Long and Getopt::Std to process options in a standard way.

Tinker with the array @ARGV after the program start and before the diamond <> invocation:

@ARGV = qw# larry moe curly #;  # force these three files to be read
while (<>) {
    chomp;
    print "Line: $_\n";
}

Output to Standard Output

Printing array results in a list of items, with no spaces in between:

my @array = qw/ fred barney betty /;
print @array;  # -> fredbarneybetty

Interpolating array prints contents of an array separated by spaces:

print "@array";  # -> fred barney betty

Default separator is a space character. The separator is stored in a special variable called $".
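A minimal sketch of changing the list separator for one block only. Using local keeps the change from leaking into the rest of the program; the variable names are illustrative.

```perl
use strict;
use warnings;

my @array = qw/ fred barney betty /;

my $default = "@array";  # separated by a single space
my $custom;
{
    local $" = ', ';     # temporary separator, restored at the end of the block
    $custom = "@array";
}

print "$default\n";  # fred barney betty
print "$custom\n";   # fred, barney, betty
```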

Other examples include:

# implementation of /bin/cat
print <>;

# implementation of /bin/sort
print sort <>;

 The Perl Power Tools project implements the classic Unix utilities in Perl, which makes these standard utilities available on non-Unix systems [2].

Formatted Output with printf

Template string in printf holds multiple conversions (percent sign and letter):

printf "Hello, %s! Your password expires in %d days!\n",
    $username, $days_to_exp;

Common conversions are:

%g number in floating point, integer or exponential notation (chosen automatically)
%d decimal integer
%x hexadecimal
%o octal
%s string
%f floating-point with round off
%% literal percent sign

Automatic choice of floating point, integer and exponential notation:

printf "%g %g %g\n", 5/2, 51/17, 51**17;  # 2.5 3 1.0683e+29

Decimal integer conversion truncates the number:

printf "Pass expires in %d days!\n", 17.85;  # ...in 17 days!

Hexadecimal and octal conversions are:

printf "Hex: %#x\n", 17;  # Hex: 0x11 (plain %x prints 11)
printf "Oct: %#o\n", 17;  # Oct: 021  (plain %o prints 21)

String conversion with width specification is:

# positive width = right-justified string
printf "Surname: %10s\n", $surname;  # Surname: ```Nabokov;

# negative width = left-justified string
printf "Surname: %-10s\n", $surname;  # Surname: Nabokov```;

Floating-point conversion with round off is:

# positive width = right-justified number
printf "%12f\n",   6*7 + 2/3;  # ```42.666667
printf "%12.3f\n", 6*7 + 2/3;  # ``````42.667
printf "%12.0f\n", 6*7 + 2/3;  # ``````````43

Asterisk * inside format string uses its argument as width:

printf "%*s\n", 10, 'Nabokov';  # ```Nabokov

Use two asterisks to specify total width and number of decimal places in a float:

printf "%*.*f\n", 6, 2, 3.1415926535;  # ``3.14
printf "%*.*f\n", 6, 3, 3.1415926535;  # `3.142

 See sprintf documentation for more options.
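sprintf takes the same format strings as printf but returns the result instead of printing it. A short sketch combining several of the conversions above (the | separators are only there to make the padding visible):

```perl
use strict;
use warnings;

# left-justified string, rounded float with width, hex with 0x prefix
my $row = sprintf "%-10s|%6.2f|%#x", 'Nabokov', 6*7 + 2/3, 255;

print "$row\n";  # Nabokov   | 42.67|0xff
```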

Arrays and printf

To generate a format string on the fly store it in a variable first:

my @items = qw( wilma dino pebbles );

# use x operator to replicate given string a number of times
# @items in scalar context returns array length
my $fmt   = "The items are:\n" . ("%10s\n" x @items);

print "(debug) Format: $fmt";
printf $fmt, @items;

Filehandles

Filehandle names an I/O connection between a Perl process and the outside world.

 Filehandle is not necessarily a filename.

By convention, the name of a bareword filehandle is written in all uppercase letters.

Six special filehandle names reserved by Perl are:

  • STDIN
  • STDOUT
  • STDERR
  • DATA
  • ARGV
  • ARGVOUT

If a user calls a Perl script as:

$ ./process.pl <in.txt >out.txt 2>err.txt

Inside the Perl script the files ‘in.txt’, ‘out.txt’ and ‘err.txt’ will be available through the STDIN, STDOUT and STDERR filehandles.

Opening a Filehandle

Use Perl’s open operator to create a custom connection:

open CONFIG,  '.ssh/config';
open CONFIG,  '<.git/config';
open RESULTS, '>results.txt';  # wipes out file contents
open LOG,     '>>make.log';    # opens file for appending

Three-argument version of open is a safer option:

open CONFIG,  '<', '.git/config';
open RESULTS, '>',  $results_file;  # third argument is a filename
open LOG,     '>>', &logfile_name();

Specify an encoding along with the mode:

open CONFIG, '<:encoding(UTF-8)',      '.git/config';     # UTF 8
open LOG,    '>>:encoding(iso-8859-1)', &logfile_name();  # Latin-1

Binmoding Filehandles

Turn off processing of line ending:

# don't translate line endings
binmode STDOUT;
binmode STDERR;

Specify a layer to ensure filehandles know about intended encodings:

binmode STDOUT, ':encoding(UTF-8)';
binmode STDIN,  ':encoding(KOI8-R)';

Bad Filehandles

Operator open returns true if it succeeded and false otherwise:

# capture return value
my $status = open LOG, '>>', 'configure.log';

if (! $status ) {
    # open failed, clean up and recover
    ...
}

Closing a Filehandle

close LOG;

Fatal Errors with die

Operator die terminates the program:

if (! open LOG, '>>', 'logfile') {
    die "Cannot create logfile: $!";
}

Special variable $! contains a human-readable error message.

If the message does not end with a newline, die automatically appends the program name and the line number where it failed.

 Always check the status of open => The rest of the program relies upon it.
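To see what die produces without terminating the program, the failing open can be wrapped in eval. This is a sketch only; the path below is hypothetical and assumed not to exist.

```perl
use strict;
use warnings;

my $missing = '/nonexistent-demo-dir/config.txt';  # hypothetical path
my $error   = '';

eval {
    open my $fh, '<', $missing
        or die "Cannot open $missing: $!";  # no trailing newline: location is appended
    close $fh;
    1;
} or $error = $@;

print "Caught: $error";
```

The caught message ends with "at <program name> line <number>." because the string passed to die has no trailing newline.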

Warning Messages with warn

Use warn function to issue a warning and proceed with the code.

Automatically die-ing

Pragma autodie automatically calls die, if a file open fails:

use autodie;

open LOG, '>>', 'logfile';

# Potential error message:
#   Can't open ('>>', 'logfile'): No such file or directory at test line 3

Using Filehandles

Open a filehandle and read from it:

if ( ! open PASSWD, "/etc/passwd") {
    die "Unable to open passwords file: $!";
}

# line-input operator <> around filehandle
while (<PASSWD>) {
    chomp;
    ...
}

Use filehandle open for writing or appending:

# output into LOG
print LOG "Captain's log, stardate 3.14159\n";

# output into STDERR
printf STDERR "%d percent complete.\n", $done/$total * 100;

 There is no comma between a filehandle and values to print.

Changing the Default Output Filehandle

Functions print and printf output into STDOUT. Operator select changes the default behaviour:

select BEDROCK;

print "I hope Mr. Slate doesn't find out about this.\n";
print "Wilma!\n";

Gentlemen set it back to STDOUT when they are done. Setting $| to 1 while a filehandle is selected disables output buffering for that filehandle:

select LOG;
$| = 1;  # disable output buffering
select STDOUT;
# ... time passes, babies are born, Universe expands, entropy grows and then...
print LOG "This gets written to the LOG at once!\n";
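select returns the previously selected filehandle, so the old default can be restored without hard-coding STDOUT. A sketch using an in-memory filehandle so that the result can be inspected; the names $log and $buffer are illustrative.

```perl
use strict;
use warnings;

my $buffer = '';
open my $log, '>', \$buffer or die "Cannot open in-memory log: $!";

my $previous = select $log;  # remember the old default output handle
$| = 1;                      # unbuffer the currently selected handle ($log)
print "written to the log\n";
select $previous;            # restore the old default (normally STDOUT)

print "written to the default output\n";
close $log;
```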

Reopening a Standard Filehandle

Send errors to a custom error log:

if ( ! open STDERR, ">>/home/mabalenk/.error_log" ) {
    die "Can't open error log for append: $!";
}

Output with say

Built-in say is the same as print, but it puts a newline at the end:

use v5.10;

# all these forms produce the same output
print "Hello!\n";      # -> "Hello!\n"
print "Hello!", "\n";  # -/-
say   "Hello!";        # -/-

To interpolate an array quote it:

use v5.10;

my   @arr = qw( a b c d );
say  @arr;   # -> "abcd\n"
say "@arr";  # -> "a b c d\n"

Specify a filehandle with say:

use v5.10;

# note: no comma after filehandle name
say ROBOTLOG "Klatu barada nikto!";

Filehandles in a Scalar

Use scalar in place of a bareword:

my $fh_params;
open $fh_params, '<', 'sim_params.txt'
    or die "Could not open sim_params.txt: $!";

Combine two statements (declaring and opening):

open my $fh_params, '<', 'sim_params.txt'
    or die "Could not open sim_params.txt: $!";

Once filehandle is in a scalar, use it similarly to a bareword:

open my $fh_params, '>', 'sim_params.txt';

# note: no comma after filehandle name
say     $fh_params  "C_0: " . (2.99*10**8);
close   $fh_params;

open    $fh_params, '<', 'sim_params.txt';
while (<$fh_params>) {
    chomp;
    ...
}
close   $fh_params;

6. Hashes

What Is a Hash?

Hash is a data structure. It lets you look up values by name. Hash indices are called keys. They aren’t numbers, but arbitrary, unique strings.

There is neither a fixed order in a hash, nor a first element.

Hash is a collection of key-value pairs.

Hash keys are always unique. They are converted to strings. Same value can be stored more than once.

Why Use a Hash?

Use hash when one set of data “is related” to another set of data.

Hash Element Access

To access an element of a hash use this syntax:

$hash{$some_key}

Use curly braces instead of square brackets around the key. Key expression is now a string, not a number:

$family_name{'fred'}   = 'flintstone';
$family_name{'barney'} = 'rubble';

# iterate over a list of keys
foreach my $person (qw< barney fred >) {
    print "I've heard of $person $family_name{$person}.\n";
}

When choosing a hash name, think of the word “for” between the name of the hash and the key.

Hash elements spring into existence when you first assign to them:

$family_name{'wilma'}  = 'flintstone';            # adds new key-value pair
$family_name{'betty'} .= $family_name{'barney'};  # creates an element if needed

This feature is called autovivification.

Accessing a key that does not exist in the hash returns undef.

$granite = $family_name{'larry'};  # no larry here => undef

The Hash as a Whole

Refer to the entire hash with a percent sign (%) as a prefix.

It is possible to convert a hash into a list and back again.

Assigning to a hash is a list-context assignment (list is key-value pairs):

%some_hash = ('foo', 35, 'bar', 12.4, 2.5, 'hello',
    'wilma', 1.72e30, 'betty', "bye\n");

Value of hash in a list context is a simple list of key-value pairs:

@any_array = %some_hash;

Turning the hash back into a list of key-value pairs is called unwinding in Perl.

 Use a hash either when the order of items is not important or when you have an easy way to control their order.

The order of the unwound list is not predictable, but key-value pairs in a hash stay together, i.e. each value follows its key.

Hash Assignment

To copy a hash simply assign one hash to another:

my %new_hash = %old_hash;

This is a computationally expensive operation in Perl: first an %old_hash is unwound into a list, then it is assigned to a %new_hash one key-value pair at a time.

To invert a hash write:

# swap keys with values
my %ip_address = reverse %host_name;

Perl uses the rule: “last one wins”. Later items in the list overwrite earlier ones, so an inverted hash is only reliable when the original values are unique.
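A sketch of what happens when values are not unique. Which first name ends up under 'flintstone' depends on the internal hash order, so the example only relies on the number of keys and on the unique value.

```perl
use strict;
use warnings;

my %last_name = (
    fred   => 'flintstone',
    wilma  => 'flintstone',  # same value as fred
    barney => 'rubble',
);

my %first_name = reverse %last_name;
my $count      = keys %first_name;  # 2, not 3: one flintstone was overwritten

print "$count keys; rubble => $first_name{rubble}\n";  # 2 keys; rubble => barney
```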

The Big Arrow

In Perl grammar any comma , can be written as a big arrow => (fat comma).

Alternative way to set up a hash of last names is:

# easier to see keys and corresponding values
my %last_name = (
    'fred'   => 'flintstone',
    'dino'   =>  undef,
    'barney' => 'rubble',
    'betty'  => 'rubble',  # extra comma is harmless, but convenient
);

Perl shortcut: it’s possible to omit the quote marks on some hash keys, when you use a fat comma:

# omit quotes on _bareword_ keys
my %last_name = (
    fred   => 'flintstone',
    dino   =>  undef,
    barney => 'rubble',
    betty  => 'rubble',
);

Use this shortcut in curly braces of a hash element reference:

$score{fred}  # instead of $score{'fred'}

Hash Functions

The keys and values Functions

The keys function yields a list of all keys in a hash; the values function returns the corresponding values in the same order:

my %hash = (
    'a' => 1,
    'b' => 2,
    'c' => 3,
);
my @k = keys   %hash;
my @v = values %hash;

Use these functions in a scalar context to retrieve the number of elements in a hash:

my $count = keys %hash;  # returns 3 => hash contains three key-value pairs

Use hash in a Boolean context to find out, if a hash is not empty:

# if hash is not empty
if (%hash) {
    print "True!\n";
}

The each function

The each function returns a key-value pair as a two element list. It is a common way to iterate over a hash. Use this function in a while loop:

while ( ($key, $value) = each %hash ) {
    print "$key => $value\n";
}

To go through the hash in order, sort the keys:

foreach $key (sort keys %hash) {
    $value = $hash{$key};
    print "$key => $value\n";

    # shorter alternative
    print "$key => $hash{$key}\n";
}

Typical Use of a Hash

A library database that keeps track of how many books each person has checked out is a good example of hash use:

$books{'fred'}  = 3;
$books{'wilma'} = 1;

Check whether an element of the hash is true or false:

if ($books{$someone}) {
    print "$someone has at least one book checked out.\n";
}

$books{'barney'}  = 0;      # no books currently checked out
$books{'pebbles'} = undef;  # no books ever checked out; new library card

The exists Function

The exists function returns true, iff the given key exists in the hash:

if (exists $books{'dino'}) {
    print "There is a library card for dino!\n";
}
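The difference between a true value, a false value, undef and a missing key can be made explicit by testing in that order. A sketch based on the library example; the %status hash and its messages are made up for illustration.

```perl
use strict;
use warnings;

my %books = (fred => 3, barney => 0, pebbles => undef);

my %status;
for my $person (qw(fred barney pebbles dino)) {
    $status{$person} =
          !exists($books{$person})  ? 'no library card'
        : !defined($books{$person}) ? 'card, never used'
        : $books{$person}           ? 'has books checked out'
        :                             'card, nothing checked out';
}

print "$_: $status{$_}\n" for sort keys %status;
```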

The delete Function

The delete function removes the given key-value pair from the hash:

my $person = 'betty';
delete $books{$person};  # revoke library card for $person

Hash Element Interpolation

Single hash element can be interpolated into a double-quoted string:

# for each patron, in order
foreach $person (sort keys %books) {
    if ($books{$person}) {
        print "$person has $books{$person} items\n";  # fred has 3 items
    }
}

There is no support for entire hash interpolation, e.g. %books. Beware of the magical characters ($, @, ", \, ', [, {, ->, ::) that need back-slashing.
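Since "%books" is not interpolated, a whole hash has to be turned into a string by hand. One possible sketch uses map and join over the sorted keys; the output format is an arbitrary choice.

```perl
use strict;
use warnings;

my %books = (fred => 3, wilma => 1);

my $literal = "Hash: %books";  # no interpolation: the text stays as typed
my $listing = join ', ', map { "$_ => $books{$_}" } sort keys %books;

print "$literal\n";  # Hash: %books
print "$listing\n";  # fred => 3, wilma => 1
```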

The %ENV Hash

Perl program environment is stored in the %ENV hash:

print "PATH is $ENV{'PATH'}\n";

 In Perl, dollar sign ‘$’ means there is one of something, at sign ‘@’ means there is a list of something and percent sign ‘%’ means there is an entire hash.

7. Regular Expressions

 Mastering Regular Expressions (3rd, 2006, O’Reilly Media) by Jeffrey Friedl [3]
 Watch regexes with Regexp::Debugger
 Know your character classes under different semantics

Table 7-1. Regular expression quantifiers and their generalised forms

Number to match            Metacharacter   Generalised form
Optional                   ?               {0,1}
Zero or more               *               {0,}
One or more                +               {1,}
Minimum with no maximum                    {3,}
Minimum with maximum                       {3,5}
Exactly                                    {3}

Table 7-2. ASCII character class shortcuts

Shortcut   Matches                     Note
\d         decimal digit
\D         not a decimal digit
\s         whitespace
\S         not whitespace
\h         horizontal whitespace       (v5.10 and later)
\H         not horizontal whitespace   (v5.10 and later)
\v         vertical whitespace         (v5.10 and later)
\V         not vertical whitespace     (v5.10 and later)
\R         generalised line ending     (v5.10 and later)
\w         “word” character
\W         not a “word” character
\n         newline                     (not really a shortcut)
\N         non-newline                 (stable in v5.18)
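A few of the quantifiers and shortcuts from the two tables in use. This is a sketch with made-up sample data; the patterns are anchored with \A and \z so that the whole string must match.

```perl
use strict;
use warnings;

# generalised quantifier: between three and five letters a
my @words         = qw(aa aaa aaaaa aaaaaa);
my @three_to_five = grep { /\Aa{3,5}\z/ } @words;  # aaa aaaaa

# \d with exact counts: a date in YYYY-MM-DD layout
my $date    = '2017-10-05';
my $is_date = $date =~ /\A\d{4}-\d{2}-\d{2}\z/ ? 1 : 0;

# \s+ matches any run of whitespace
my @fields = split /\s+/, "fred  barney\twilma";

print "@three_to_five | $is_date | @fields\n";  # aaa aaaaa | 1 | fred barney wilma
```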

8. Matching with Regular Expressions

 Unicode’s case folding rules

Table 8-0. Matching regular expressions

Expression                       Note
m/pattern/s                      . matches any character, even newline
m/pattern/i                      case insensitive matching
m/pattern/x                      make whitespace inside pattern insignificant
m/pattern/m                      multiline matching (^ and $ match at line boundaries)
m/\Apattern/                     beginning of string
m/pattern\Z/                     end of string (or just before a final newline)
m/\b{wb}/                        word boundary
m/\b{sb}/                        sentence boundary
m/\b{lb}/                        line break boundary
m/(pat)tern/                     capture group between (), available as $1
m/(pat)tern \1/                  capture group, reuse capture group in matching expression
m/(?:pat)tern/                   non-capturing parentheses, do not save result into $1
m/(?<LABEL>pattern)/             named capture, stored in %+, available as $+{LABEL}
m/(?<LABEL>pattern) \g{LABEL}/   named capture, reuse named capture group in matching expression
'Brave new world!' =~ m/new/     automatic match variables: $` is 'Brave ', $& is 'new', $' is ' world!'

Table 8-1. Regular expression precedence (highest to lowest)

Regular expression feature           Example
Parentheses (grouping | capturing)   (...), (?:...), (?<LABEL>...)
Quantifiers                          a* a+ a? a{n,m}
Anchors and sequence                 abc ^ $ \A \b \z \Z
Alternation                          a|b|c
Atoms                                a [abc] \d \1 \g{2}
Pattern test program

while (<>) {
  chomp;
  if (/YOUR_PATTERN_GOES_HERE/) {
    print "Matched: |$`<$&>$'|\n";
  }
  else {
    print "No match: |$_|\n";
  }
}

9. Processing Text with Regular Expressions

Table 9-0. Processing regular expressions

Expression                                 Note
s/before/after/                            substitution
s/before/after/g                           global replacement
s/(pattern)/\U$1/                          turn capture to upper case with \U             # PATTERN
s/(pattern)/\L$1/                          turn capture to lower case with \L             # pattern
s/(pattern)/\u\L$1/                        turn capture to title case with \u\L           # Pattern
s/(pattern)/\l\U$1/                        turn capture to inverse title case with \l\U   # pATTERN
lc, uc, fc, lcfirst, ucfirst               case shifting functions
my @fields = split /separator/, $string;   break up a string according to a pattern
my $result = join $glue, @pieces;          glues together a bunch of pieces to make a single string
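A sketch of the case-shifting escapes together with split and join. The (my $copy = $original) =~ s/.../.../ idiom copies the string first, so the original stays untouched; the sample strings are made up.

```perl
use strict;
use warnings;

my $title = 'learning perl';

(my $upper = $title) =~ s/(\w+)/\U$1/;     # first word only: LEARNING perl
(my $caps  = $title) =~ s/(\w+)/\u\L$1/g;  # every word: Learning Perl

my @fields = split /:/, 'root:x:0:0';
my $joined = join '-', @fields;            # root-x-0-0

print "$upper | $caps | $joined\n";
```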

 Tom Christiansen on parsing HTML with regular expressions

Table 9-1. Regular expression quantifiers with the non-greedy modifier

Metacharacter   Number to match
??              Zero or one, preferring zero
*?              Zero or more, as few as possible
+?              One or more, as few as possible
{3,}?           At least three, but as few as possible
{3,5}?          At least three, as many as five, but as few as possible
{3}?            Exactly three
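The effect of the non-greedy modifier is easiest to see on text with repeated delimiters. A sketch with a made-up HTML-like string (for real HTML a parser module is the safer choice):

```perl
use strict;
use warnings;

my $html = '<b>one</b> and <b>two</b>';

my ($greedy) = $html =~ /<b>(.*)<\/b>/;   # as much as possible
my ($lazy)   = $html =~ /<b>(.*?)<\/b>/;  # as little as possible

print "$greedy\n";  # one</b> and <b>two
print "$lazy\n";    # one
```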

 Levels of regular expression compliance in Unicode Technical Report #18

Updating many files

#!/usr/bin/perl -w

use strict;

$^I = ".bak";

my $date = localtime;

while (<>) {
  # no chomp here: the Phone substitution below must see the trailing newline
  s/\AAuthor:.*/Author: Randal L. Schwartz/;
  s/\APhone:.*\n//;
  s/\ADate:.*/Date: $date/;
  print;
}

A similar one-liner (Author substitution only):

perl -p -i.bak -w -e 's/\AAuthor:.*/Author: Randal L. Schwartz/g' fred*.dat

10. More Control Structures

Control structures

  unless (...)  # equivalent to if (! ...)
  until  (...)  # equivalent to while (! ...)

Statement modifiers

  ... if ...
  ... unless ...
  ... while ...
  ... foreach ...

Examples

print "$n is a negative number" if $n < 0;
print " ", ($n += 2) while $n < 10;

Naked block provides scope for temporary lexical variables:

{
    body;
    body;
    body;
}

elsif clause

if ($n < 5) {
    ...
}
elsif ($n > 5) {
    ...
}
else {
    ...
}

Autoincrement and autodecrement

my $i = 1;
$i++;
my @people = qw(fred barney fred wilma dino barney fred pebbles);
my %count;                     # empty hash
$count{$_}++ foreach @people;  # create new keys and values as needed

Value of autoincrement

my $i =    1;
my $j = ++$i;  #  pre-increment, increment first, assign    second
my $k = $i++;  # post-increment, assign    first, increment second
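Working through the values makes the difference between the two forms visible. The sketch also repeats the counting idiom from the previous section and prints the resulting counts.

```perl
use strict;
use warnings;

my $i = 1;
my $j = ++$i;  # $i becomes 2, then $j gets 2
my $k = $i++;  # $k gets 2, then $i becomes 3
print "$i $j $k\n";  # 3 2 2

my @people = qw(fred barney fred wilma dino barney fred pebbles);
my %count;
$count{$_}++ foreach @people;  # undef is treated as 0 on first increment
print "fred: $count{fred}, barney: $count{barney}\n";  # fred: 3, barney: 2
```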

Infinite loop

for (;;) {
    ...
}

while (1) {
    ...
}

Loop controls

last  # equivalent to C's break
next  # proceed to next iteration of loop
redo  # redo current iteration, do not test for any condition

Conditional (ternary) operator

my $location = &is_weekend($day) ? "home" : "work";

Logical operators

&&  # and
||  # or

Defined-or (since v5.10)

//  # left operand if it is defined, right operand otherwise (//= is similar to make's ?=)
my $last_name = $last_name{$someone} // '(No last name)';
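The difference between || and // shows up with values that are defined but false, such as 0. A sketch with a made-up %score hash:

```perl
use strict;
use warnings;
use v5.10;

my %score = (fred => 0);

my $with_or  = $score{fred} || 'n/a';  # 0 is false, so the default is used
my $with_dor = $score{fred} // 'n/a';  # 0 is defined, so it is kept
my $missing  = $score{dino} // 'n/a';  # no such key: undef, default is used

say "$with_or $with_dor $missing";  # n/a 0 n/a
```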

11. Perl Modules

Finding modules

 MetaCPAN

Check if module is installed by trying to read its documentation

perldoc Digest::SHA

Get details on a module with the cpan command

cpan -D Digest::SHA

Installing modules

If the module uses ExtUtils::MakeMaker install new modules with

perl Makefile.PL INSTALL_BASE=<prefix>
make install

Modules based on Module::Build should be built and installed with

perl Build.PL
./Build install --install_base=<prefix>

To install a module and all of its dependencies with CPAN.pm, start its shell with

perl -MCPAN -e shell

Another (user-friendly) alternative is to install modules with the cpan script (it comes with Perl)

cpan Module::CoreList LWP CGI::Prototype

Finally, there is cpanm (cpanminus) that is designed as a zero-configuration, lightweight CPAN client.

cpanm DBI WWW::Mechanize

Using custom installation directories

Use local::lib to set environment variables affecting CPAN module installation. View default settings with

perl -Mlocal::lib

Call cpan with -I switch to respect local::lib settings

cpan -I Set::CrossProduct

To make your Perl program aware of modules in custom locations issue

use local::lib;

Using Simple Modules

use File::Basename;

my $name = "/usr/local/bin/perl";
my $base = basename $name;  # returns 'perl'

Using Only Some Functions from a Module

use File::Basename qw(basename);

Call functions by their full names

use File::Basename qw();  # import no functions

my $dirname = File::Basename::dirname $name;

File::Spec Module

use File::Spec;  # object oriented module

my $path = File::Spec->catfile($dirname, $basename);
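A sketch that takes a path apart with File::Basename and puts it back together with File::Spec. The comments show the results on a Unix-like system; on other platforms File::Spec uses that platform's separators.

```perl
use strict;
use warnings;
use File::Basename qw(basename dirname);
use File::Spec;

my $name = '/usr/local/bin/perl';
my $base = basename($name);  # perl
my $dir  = dirname($name);   # /usr/local/bin

my $path = File::Spec->catfile($dir, $base);  # /usr/local/bin/perl on Unix
print "$path\n";
```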

Path::Class Module

use Path::Class;

my $dir    =  dir( qw(Users mabalenk repo));
my $subdir = $dir->subdir('git');        # Users/mabalenk/repo/git
my $parent = $dir->parent;               # Users/mabalenk
my $windir = $dir->as_foreign('Win32');  # Users\mabalenk\repo

Databases and DBI

 Programming the Perl DBI (2000, O’Reilly Media) by Tim Bunce and Alligator Descartes [4]  DBI website

Example of using the DBI module:

use DBI;

# connect to database
my $dbh = DBI->connect($data_source, $username, $password);

# prepare, execute, read query
my $sth = $dbh->prepare("SELECT * FROM persons WHERE surname = 'Smith'");
$sth->execute;
my @row = $sth->fetchrow_array;
$sth->finish;

# disconnect from database
$dbh->disconnect();

Dates and Times

Example of using the Time::Moment module:

use Time::Moment;

# convert time from system to human readable
my $dt = Time::Moment->from_epoch( time );

# obtain current time
my $dt = Time::Moment->now;

# access various parts of date
printf "%4d%02d%02d", $dt->year, $dt->month, $dt->day_of_month;

# arithmetic operations with two time moment objects
my $dt1 = Time::Moment->new(
    year  => 1987,
    month => 12,
    day   => 18,
);

my $dt2 = Time::Moment->now;

my $years  = $dt1->delta_years(  $dt2 );
my $months = $dt1->delta_months( $dt2 ) % 12;

printf "%d years and %d months\n", $years, $months;

 Use postfix dereferencing

12. File Tests

File test operators

To get the list of all the file test operators type:

perldoc -f -X

Processing a list of files:

my @original_files = qw( fred barney betty wilma pebbles dino bamm-bamm );
my @outdated_files;  # to be put on backup tapes
foreach my $filename (@original_files) {
    push @outdated_files, $filename
        if -s $filename > 100_000 and -A $filename > 90;
}

Table 12-1. File tests and their meaning
File test Meaning
-r File or directory is readable by this (effective) user or group
-w File or directory is writable by this (effective) user or group
-x File or directory is executable by this (effective) user or group
-o File or directory is owned by this (effective) user or group
-R File or directory is readable by this real user or group
-W File or directory is writable by this real user or group
-X File or directory is executable by this real user or group
-O File or directory is owned by this real user or group
-e File or directory name exists
-z File exists and has zero size (always false for directories)
-s File exists and has nonzero size (the value is the size in bytes)
-f Entry is a plain file
-d Entry is a directory
-l Entry is a symbolic link
-S Entry is a socket
-p Entry is a named pipe (a “fifo”)
-b Entry is a block-special file (like a mountable disk)
-c Entry is a character-special file (like an I/O device)
-u File or directory is setuid
-g File or directory is setgid
-k File or directory has a sticky bit set
-t The filehandle is a TTY (as reported by the isatty() system function; filenames can't be tested by this test)
-T File looks like a “text” file
-B File looks like a “binary” file
-M Modification age (measured in days)
-A Access age (measured in days)
-C Inode-modification age (measured in days)

Using default filename stored in $_

foreach (@filenames) {
    print "$_ is readable\n" if -r;  # same as -r $_
}

Testing Several Attributes of the Same File

Calling system’s stat each time:

# expensive file tests
if (-r $filename and -w $filename) {
    ...
}

Re-using information from last file lookup:

# elegant file tests
# _ is a virtual filehandle
if (-r $filename and -w _) {
    ...
}

Stacked File Test Operators (Starting with Perl 5.10)

use v5.10;

if (-w -r $filename) {
    print "The file is both readable and writable!\n";
}
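The three styles can be tried on a temporary file created with the core File::Temp module. This is a sketch; the byte count assumes a Unix-like system where "\n" is written as a single byte.

```perl
use strict;
use warnings;
use v5.10;
use File::Temp qw(tempfile);

my ($fh, $filename) = tempfile(UNLINK => 1);
print {$fh} "hello\n";
close $fh or die "Cannot close $filename: $!";

my $size       = -s $filename;                           # size in bytes
my $read_write = (-r $filename and -w _) ? 'yes' : 'no';  # _ reuses the last lookup
my $stacked    = (-w -r $filename)       ? 'yes' : 'no';  # stacked tests, v5.10+

say "size: $size, read/write: $read_write, stacked: $stacked";
```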

The stat and lstat Functions

Return value of a call to stat is a 13-element list:

my ($dev, $ino, $mode, $nlink, $uid, $gid, $rdev,
    $size, $atime, $mtime, $ctime, $blksize, $blocks)
        = stat($filename);

$dev and $ino
    device number and inode number of the file
$mode
    set of permission bits for the file and some other bits
$nlink
    number of (hard) links to the file or directory
$uid and $gid
    numeric user-ID and group-ID showing file’s ownership
$size
    size in bytes as returned by the -s file test
$atime, $mtime, $ctime
    three timestamps (access, modification, inode change), given as the number of seconds since the epoch

Use lstat function to obtain information on symbolic links.
File::stat module provides a friendlier interface to stat.
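With File::stat loaded, stat returns an object with named accessors instead of a 13-element list. A minimal sketch on a temporary file; the permission mask 07777 keeps only the permission bits of $mode.

```perl
use strict;
use warnings;
use File::stat;
use File::Temp qw(tempfile);

my ($fh, $filename) = tempfile(UNLINK => 1);
print {$fh} "12345";
close $fh or die "Cannot close $filename: $!";

my $st = stat($filename) or die "Cannot stat $filename: $!";

printf "size: %d bytes, links: %d, permissions: %04o\n",
    $st->size, $st->nlink, $st->mode & 07777;
```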

The localtime function

Function localtime in a scalar context converts a timestamp (number of seconds since the epoch) into a human-readable date-time string:

my $timestamp = 1454133253;
my $date      = localtime $timestamp;

localtime in a list context:

my ($sec, $min, $hour, $mday, $mon, $year, $wday, $yday, $isdst)
    = localtime $timestamp;

$mon
    a month number [0, 11]
$year
    number of years since 1900
$wday
    weekday from Sunday to Saturday [0, 6]
$yday
    day-of-the-year from Jan 1 through Dec 31 [0, 364(5)]

Function gmtime returns time in Universal Time.
Function time returns current timestamp from system clock.
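The offsets in $mon and $year are easy to forget, so here is a sketch that formats the timestamp from above. It uses gmtime rather than localtime so that the result does not depend on the local time zone.

```perl
use strict;
use warnings;

my $timestamp = 1454133253;
my ($sec, $min, $hour, $mday, $mon, $year, $wday, $yday) = gmtime $timestamp;

# $mon counts from 0 and $year counts from 1900
my $iso = sprintf '%04d-%02d-%02d %02d:%02d:%02d',
    $year + 1900, $mon + 1, $mday, $hour, $min, $sec;

print "$iso UTC\n";  # 2016-01-30 05:54:13 UTC
```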

Bitwise Operators

Table 12-2. Examples of bitwise operations

Expression   Meaning
10 & 12      Bitwise-and: which bits are true in both operands (this gives 8)
10 | 12      Bitwise-or: which bits are true in one operand or the other (this gives 14)
10 ^ 12      Bitwise-xor: which bits are true in one operand or the other but not in both (this gives 6)
6 << 2       Bitwise shift left: shift the left operand the number of bits shown by the right operand, adding zero-bits at the least-significant places (this gives 24)
25 >> 2      Bitwise shift right: shift the left operand the number of bits shown by the right operand, discarding the least-significant bits (this gives 6)
~10          Bitwise negation, also called unary bit complement: returns the number with the opposite bit for each bit in the operand (this gives 0xFFFFFFF5 with 32-bit integers)
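The values in Table 12-2 can be reproduced directly. Because the width of ~10 depends on how Perl was built, the sketch masks the result to 32 bits; the sample $mode value is made up to show the typical use of & with permission bits.

```perl
use strict;
use warnings;

printf "%d %d %d %d %d\n", 10 & 12, 10 | 12, 10 ^ 12, 6 << 2, 25 >> 2;  # 8 14 6 24 6

my $low32 = ~10 & 0xFFFFFFFF;  # keep only the lowest 32 bits
printf "0x%08X\n", $low32;     # 0xFFFFFFF5

my $mode        = 0100644;     # sample mode: plain file, rw-r--r--
my $permissions = $mode & 0777;
printf "%04o\n", $permissions; # 0644
```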

Using Bitstrings

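Since Perl 5.22 the bitwise feature separates the two kinds of operation: &, |, ^ and ~ always treat their operands as numbers, while the dotted forms &., |., ^. and ~. always treat them as strings: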

use v5.22;
use feature qw(bitwise);
no warnings qw(experimental::bitwise);

my $number_str = '137';
my $str        = 'Amelia';

# treat both operands as numbers
say "number_str &  str: ", $number_str & $str;

# treat both operands as strings
say "number_str &. str: ", $number_str &. $str;

Results in

number_str &  str: 0
number_str &. str: ¿!%

13. Directory Operations

The current working directory

Obtain current working directory using the Cwd module:

use Cwd;
say "current working directory: ", getcwd();

Use the File::Spec module to convert between relative and absolute paths.
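
A minimal sketch of the two conversions with File::Spec. The path lib/Llama.pm is a made-up example and does not need to exist; both calls use the current working directory as the base when no second argument is given:

```perl
use strict;
use warnings;
use File::Spec;

# relative -> absolute, resolved against the current working directory
my $abs = File::Spec->rel2abs('lib/Llama.pm');

# absolute -> relative, again measured from the current working directory
my $rel = File::Spec->abs2rel($abs);

print "absolute: $abs\n";
print "relative: $rel\n";
```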

Changing the directory

Change directory with chdir:

chdir '/etc' or die "Cannot change to /etc: $!";

Module File::HomeDir (from CPAN) locates a user’s home directory portably. This helps because chdir does not expand a shell-style tilde prefix such as ~fred.

Globbing

Use the glob operator to expand a pattern into the matching filenames:

my @all_files = glob '*';
my @pm_files  = glob '*.pm';

An alternate syntax for globbing

Legacy Perl code uses angle-bracket syntax:

my @all_files = <*>;

# fetch dot and non-dot files
my $dir = '/etc';
my @dir_files = <$dir/* $dir/.*>;

Directory handles

Obtain list of filenames from a directory using a directory handle:

my $dname = '/etc';

opendir my $dhandle, $dname or die "Cannot open $dname: $!";

foreach my $file (readdir $dhandle) {
    print "Processing file $file\n";
}
closedir $dhandle;

Alternatively use a bareword directory handle:

opendir DIR, $dname or die "Cannot open $dname: $!";

foreach my $file (readdir DIR) {
    print "Processing file $file\n";
}
closedir DIR;

Compared to globbing this is a lower-level operation with more manual work. The list includes every name in the directory, not just those matching a given pattern. Skip the unwanted names inside the loop, picking whichever of the filters below applies:

while (my $name = readdir $dhandle) {

    # files ending with .pm
    next unless $name =~ /\.pm\z/;

    # nondot files
    next if $name =~ /\A\./;

    # everything but the dot and dotdot files
    next if $name eq '.' or $name eq '..';

    # ... more processing ..
}

 The filename returned by the readdir operator has no pathname component! It is only the name within the directory. Patch up the name to get the full path:

opendir my $dir, $dname or die "Cannot open $dname: $!";

while (my $name = readdir $dir) {

    next if $name =~ /\A\./;            # skip over dot files
    $name = "$dname/$name";             # patch up the path
    next unless -f $name and -r $name;  # obtain only readable files

    # ... more processing ...
}

To improve portability use File::Spec::Functions module to construct the path:

use File::Spec::Functions;

opendir my $dir, $dname or die "Cannot open $dname: $!";

while (my $name = readdir $dir) {

    next if $name =~ /\A\./;            # skip over dot files
    $name = catfile($dname, $name);     # patch up the path
    next unless -f $name and -r $name;  # obtain only readable files

    # ... more processing ...
}
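
The loop above can be tried end to end with a throwaway directory. This is only a sketch: the file names and the use of File::Temp to create a scratch directory are assumptions made for the demonstration:

```perl
use strict;
use warnings;
use File::Spec::Functions qw(catfile);
use File::Temp qw(tempdir);

# scratch directory that is removed when the program exits
my $dname = tempdir(CLEANUP => 1);

# create a few empty files to list
for my $name (qw(a.pm b.pm notes.txt .hidden)) {
    open my $fh, '>', catfile($dname, $name)
        or die "Cannot create $name: $!";
    close $fh;
}

opendir my $dh, $dname or die "Cannot open $dname: $!";

my @modules;
while (my $name = readdir $dh) {
    next if $name =~ /\A\./;            # skip over dot files
    next unless $name =~ /\.pm\z/;      # keep only .pm files
    my $path = catfile($dname, $name);  # patch up the path
    push @modules, $name if -f $path;   # plain files only
}
closedir $dh;

# readdir returns names in no particular order
@modules = sort @modules;
print "@modules\n";  # a.pm b.pm
```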

Removing files

Use the unlink operator to remove files:

unlink 'slate', 'bedrock', 'lava';

unlink qw(slate bedrock lava);

Combine unlink with glob to delete multiple files:

unlink glob '*.o';

The return value of unlink is the number of files successfully deleted:

my $nfiles = unlink glob '*.o';
print "$nfiles files were deleted.\n";

To know which unlink operation failed, process files in a loop:

foreach my $file ( qw(slate bedrock lava) ) {
    unlink $file or warn "failed on $file: $!\n";
}

 Removing a file requires write permission on the directory that contains it, not on the file itself. It is therefore possible to remove files that you can’t read, can’t write, can’t execute and don’t even own (chmod 0).

Renaming files

Give a new name or move a file with the rename function:

rename 'old', 'new';
rename 'over_there/some/place/some_file',   'some_file';

# using a fat arrow as a separator
rename 'over_there/some/place/some_file' => 'some_file';

 Moving files to another disk partition with rename is not possible. Use the move function from the File::Copy module in that case.

To batch rename a list of files:

foreach my $file (glob "*.old") {
    my $newfile = $file;
    
    # left  part of substitution is a regex
    # right part is a string, no backslash required
    $newfile =~ s/\.old$/.new/;

    if (-e $newfile) {
        warn "can't rename $file to $newfile: $newfile exists\n";
    }
    elsif (rename $file => $newfile) {
        # success, do nothing
    }
    else {
        warn "rename $file to $newfile failed: $!\n";
    }
}

Links and files

Create hard and soft links to files with:

link 'chicken', 'egg';  # hard

symlink 'dodgson', 'carroll';  # soft, symbolic link

To test whether a file is a symbolic link use:

print "$file is a symbolic link" if (-l $file);

To find out where a symbolic link is pointing, use the readlink function:

my $perl = readlink '/usr/local/bin/perl';

Making directories

Make a directory:

mkdir 'fred', 0755 or warn "Cannot make fred directory: $!";

Second parameter is the initial permission setting in octal. Make sure to write it with a leading zero or use the oct function:

mkdir $dname, oct($dpermissions);

Use extra call to oct function when the value comes from user input:

my ($dname, $permissions) = @ARGV;
mkdir $dname, oct($permissions) or die "Cannot create $dname: $!";

Removing directories

Use the rmdir function to remove empty directories. It removes one directory per call:

foreach my $dir ( qw(fred barney betty) ) {
    rmdir $dir or warn "cannot rmdir $dir: $!\n";
}

Example of writing many temporary files during the execution of a program:

# create directory based on process ID
my $temp_dir = "/tmp/scratch_$$";
mkdir $temp_dir, 0700 or die "cannot create $temp_dir: $!";
# ...
# use $temp_dir as location of all temporary files
# ...
unlink glob "$temp_dir/* $temp_dir/.*";  # delete $temp_dir contents
rmdir $temp_dir;                         # delete now-empty directory

Check out File::Temp module for creating temporary directories or files and remove_tree function provided by the File::Path module.
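
A minimal sketch of that combination, assuming a scratch directory with a nested subtree. The template scratch_XXXXXX and the names a, b and data.tmp are illustrative only:

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use File::Path qw(make_path remove_tree);
use File::Spec::Functions qw(catdir catfile);

# uniquely named directory under the system temp location,
# deleted automatically at program exit
my $temp_dir = tempdir('scratch_XXXXXX', TMPDIR => 1, CLEANUP => 1);

# make_path creates intermediate directories as needed
my $nested = catdir($temp_dir, 'a', 'b');
make_path($nested);

open my $fh, '>', catfile($nested, 'data.tmp')
    or die "Cannot create data.tmp: $!";
close $fh;

my $existed = -d $nested ? 1 : 0;

# remove_tree deletes a directory together with its contents
remove_tree(catdir($temp_dir, 'a'));
my $gone = -e catdir($temp_dir, 'a') ? 0 : 1;

print "existed: $existed, gone: $gone\n";
```

Unlike the unlink and rmdir pair above, remove_tree also handles non-empty subdirectories.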

Modifying permissions

Use chmod function to change permissions on a file or directory:

chmod 0700, 'fred', 'barney';

 Symbolic permissions with ugoa and rwxXst are not supported by the chmod function. Use File::chmod module to enable symbolic mode values in chmod.

Changing ownership

Change the ownership and group membership of a list of files:

my $user  = 1004;
my $group =  100;
chown $user, $group, glob '*.cxx';

Apply helper functions getpwnam and getgrnam to convert user and group names into numbers:

defined(my $user  = getpwnam 'jsmith') or die "bad user\n";
defined(my $group = getgrnam 'users')  or die "bad group\n";
chown $user, $group, glob '/home/mabalenk/*';

Changing timestamps

Use utime function to update the access and modification time of a list of files:

my $atime = time;
my $mtime = $atime - 24 * 60 * 60;   # seconds per day
utime $atime, $mtime, glob '*.tex';  # set access to now, mod to a day ago
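
To see the effect, the modification time can be read back with stat. This sketch works on a temporary file created with File::Temp, which is an assumption for the demonstration rather than part of the original example:

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# temporary file, unlinked automatically at program exit
my ($fh, $filename) = tempfile(UNLINK => 1);
close $fh;

my $atime = time;
my $mtime = $atime - 24 * 60 * 60;   # one day ago

# utime returns the number of files it changed
utime($atime, $mtime, $filename) == 1
    or die "Cannot change times on $filename: $!";

# element 9 of the stat list is the modification time
my $read_back = (stat $filename)[9];
print "mtime is now ", scalar localtime($read_back), "\n";
```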

14. Strings and Sorting

Finding a substring with index

To locate first occurrence of substring in a string use the index operator:

my $idx = index($string, $substring);

 The character position returned by index is a zero-based value. If the substring was not found index returns -1.

The third parameter to index specifies, where to start searching for a given substring. By default index searches from the beginning of the string:

use v5.10;

my $string = "Welcome to Warsaw!";

my @indices = ();
my $idx     = -1;

while (1) {
    $idx = index($string, 'W', $idx+1);
    last if $idx == -1;
    push @indices, $idx;
}

say "Indices are @indices";

Operator rindex finds the last occurrence of the substring (i.e. scans from the end of the string).

my $last_slash = rindex("/etc/passwd", '/');  # value is 4

 The position returned by rindex is still counted from the left, from the beginning of the string.

Manipulating a substring with substr

Operator substr extracts a part of the string:

my $sub = substr($str, $start, $length);

The $length parameter may be omitted to extract everything up to the end of the string. The initial position $start can be negative, counting from the end of the string. In this case -1 denotes the last character of the string.

Functions index and substr work well together:

my $str   = "London is truly lovely!";
my $right = substr($str, index($str, "t"));  # -> truly lovely!
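
The same pairing works with rindex. A small sketch that splits a path at its last slash; the variable names $dir and $file are illustrative, and real code would rather use File::Basename:

```perl
use strict;
use warnings;

my $path = "/etc/passwd";

# position of the last slash, counted from the left
my $last_slash = rindex($path, '/');

# everything before the slash, and everything after it
my $dir  = substr($path, 0, $last_slash);
my $file = substr($path, $last_slash + 1);

print "dir: $dir, file: $file\n";  # dir: /etc, file: passwd
```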

Change a given portion of the string with substr and assignment:

my $str = "London is truly lovely!";
substr($str, index($str, "t")) = "absolutely magical!";  # -> London is absolutely magical!

A length of 0 inserts text without removing anything:

my $str = "London is absolutely magical!";
substr($str, 0, 0) = "Old ";  # -> Old London is absolutely magical!

Use the binding operator (=~) to restrict an operation to work with a substring:

# replace Fred with Barney within last 20 characters of string
substr($str, -20) =~ s/Fred/Barney/g;

Alternatively use a four argument version of substr, where the fourth argument is the replacement string:

my $str = "Hello, world!";
my $previous_value = substr($str, 0, 5, "Goodbye");
# $previous_value -> Hello
# $str            -> Goodbye, world!

Formatting data with sprintf

Function sprintf returns the requested string instead of printing it:

my $date_tag = sprintf "%4d/%02d/%02d %2d:%02d:%02d", $yr, $mo, $da, $h, $m, $s;
# $date_tag -> 2038/01/19  3:00:08 (%2d pads the hour with a space)

Using sprintf with “money numbers”

Format a number with two digits after the decimal point:

my $money = sprintf "%.2f", 2.49997;  # -> 2.50

To insert commas separating thousands in a number use the following subroutine:

sub insert_commas {
    my $number = sprintf "%.2f", shift @_;

    # add one comma each time through the do-nothing loop
    # while true insert one comma, then search again
    1 while $number =~ s/^(-?\d+)(\d\d\d)/$1,$2/;

    # put dollar sign in the right place
    $number =~ s/^(-?)/$1\$/;
    $number;
}

Use modules Number::Format and CLDR::Number for pre-defined operations with numbers.
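
As a quick check of the insert_commas subroutine above, the sketch below repeats it and calls it with two sample amounts. The sample values are arbitrary:

```perl
use strict;
use warnings;

sub insert_commas {
    my $number = sprintf "%.2f", shift @_;

    # each pass inserts one comma before the last unseparated group of three digits
    1 while $number =~ s/^(-?\d+)(\d\d\d)/$1,$2/;

    # put the dollar sign after an optional minus sign
    $number =~ s/^(-?)/$1\$/;
    $number;
}

my $large    = insert_commas(12345678.9);
my $negative = insert_commas(-1234.5);

print "$large\n";     # $12,345,678.90
print "$negative\n";  # -$1,234.50
```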

Advanced sorting

Write a custom comparison statement to specify the sorting order:

# sorting subroutine, expects two variables $a, $b
sub by_number {
    if ($a < $b) { -1 } elsif ($a > $b) { 1 } else { 0 }
    # -1: a must appear before b
    #  1: b must appear before a
    #  0: order does not matter
}

To apply custom sorting routine:

my @result = sort by_number @numbers;

Three-way comparison for numbers is used frequently. Perl’s spaceship operator provides a shortcut for it:

sub by_number { $a <=> $b }

Similarly the cmp operator defines a three-way comparison for strings:

# convert both parameters to lower case before comparing
sub case_insensitive { "\L$a" cmp "\L$b" }

my @strings = sort case_insensitive @input_strings;

To sort Unicode strings apply:

use Unicode::Normalize;

sub equivalents { NFKD($a) cmp NFKD($b) }

When the sorting routines are simple, use them “inline”:

my @sorted_numbers = sort { $a <=> $b } @input_numbers;

The reversed order sorting may be obtained either with the reverse keyword:

my @desc_numbers = reverse sort { $a <=> $b } @numbers;

or by swapping the operands:

my @desc_numbers = sort { $b <=> $a } @numbers;

Sorting a hash by value

Imagine bowling scores of three characters are stored in a hash:

my %score = ("barney" => 195, "fred" => 205, "dino" => 30);

# sort the winners by their scores in descending order
my @winners = sort by_score keys %score;

Enable numeric comparison on the scores, rather than the names:

# sort by scores in descending order
sub by_score { $score{$b} <=> $score{$a} }

Sorting by multiple keys

Consider a fourth entry in the scores hash:

my %score = (
    "barney"    => 195,
    "fred"      => 205,
    "dino"      => 30,
    "bamm-bamm" => 195
);

If the players have the same score, sort their entries by name:

my @winners = sort by_score_and_name keys %score;

sub by_score_and_name {
    $score{$b} <=> $score{$a}
        or
    # break the tie with a string-order comparison
    $a cmp $b
}
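
Put together as a runnable sketch, the two-level sort gives the following order for the four players:

```perl
use strict;
use warnings;

my %score = (
    "barney"    => 195,
    "fred"      => 205,
    "dino"      => 30,
    "bamm-bamm" => 195,
);

sub by_score_and_name {
    $score{$b} <=> $score{$a}   # higher score first
        or
    $a cmp $b                   # tie: alphabetical by name
}

my @winners = sort by_score_and_name keys %score;

print "$_: $score{$_}\n" for @winners;
# fred: 205
# bamm-bamm: 195
# barney: 195
# dino: 30
```

The or operator short-circuits: the string comparison runs only when the numeric comparison returns 0.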

Example of a library program using a five-level sort:

@patron_IDs = sort {
    &fines($b) <=> &fines($a) or
    $items{$b} <=> $items{$a} or
    $family_name{$a} cmp $family_name{$b} or
    $personal_name{$a} cmp $personal_name{$b} or
    $a <=> $b   # patron's ID number
} @patron_IDs;

15. Process Management

The system function

Launch a child process with:

system 'date';

Use single quotes, if Perl interpolation is not needed:

# $HOME is not a Perl variable
system 'ls -l $HOME';

Use shell’s facility to launch a background process (Perl will not wait for it to finish):

system './compute &';

Avoiding the shell

Invoke the system operator with multiple arguments to avoid the shell:

system 'tar', 'cvf', $tarfile, @dirs;

 For security reasons choose a multi-argument call to system (vs a single-argument call).

The system operator returns 0 on success, the reverse of the usual “true is good, false is bad” convention. Negate the result to keep the or die idiom:

!system('rm', '-frv', @files) or die 'Error: unable to delete files!';
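
A minimal sketch of inspecting the return value in more detail. The child command is the running Perl interpreter itself ($^X), used here only as a portable stand-in for a real external program:

```perl
use strict;
use warnings;

# multi-argument form: no shell involved
my $status = system $^X, '-e', 'exit 3';

# the exit code of the child lives in the high byte of the status
my $exit_code = $status >> 8;
print "child exit code: $exit_code\n";  # 3

# a successful command yields 0, so compare with == 0
my $ok = system($^X, '-e', 'exit 0') == 0 ? 1 : 0;
print "second command succeeded\n" if $ok;
```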

The environment variables

Modify environment variables to be inherited by child processes:

$ENV{'PATH'} = "/root/bin:$ENV{'PATH'}";
delete $ENV{'MKLROOT'};
my $result = system 'make';

Rely on the Config module to use a path separator native for the operating system:

use Config;
$ENV{'PATH'} = join $Config{'path_sep'}, '/root/bin', $ENV{'PATH'};
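
The inheritance can be observed with a child process that prints one variable back. This is a sketch under assumptions: the variable name LLAMA_GREETING is made up, and the child is the running Perl interpreter ($^X) opened as a pipe in list form:

```perl
use strict;
use warnings;

# set a variable for this process and its children
local $ENV{LLAMA_GREETING} = 'hello from the parent';

# the child is another perl that prints what it inherited
open my $child, '-|', $^X, '-e', 'print $ENV{LLAMA_GREETING}'
    or die "Cannot start child: $!";

my $seen = <$child>;
close $child or die "Child failed: $?";

print "child saw: $seen\n";
```

Changes made to %ENV in the child never travel back to the parent.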

The exec function

The exec function replaces the running Perl process with the requested command:

exec 'date';
die "date couldn't run: $!";

 There is no main Perl process to return to after a successful exec. Code placed after exec runs only if the command could not be started.

Using backquotes to capture output

Place a command between backquotes to save its output:

my $dt = `date`;
print "Current time is $dt";

Example: invoking the perldoc command repeatedly for a set of functions:

my @func = qw{ int rand sleep length hex eof not exit sqrt umask };
my %doc;

foreach (@func) {

    $doc{$_} = `perldoc -t -f $_`;

    # alternatively: use the generalised quoting operator qx()
    # $doc{$_} = qx(perldoc -t -f $_);
}

 Avoid using backquotes, when output capture is not needed.

Using backquotes in a list context

Get the data automatically broken up by lines:

my @lines = `who`;

Use the result of a system command in a list context:

my %ttys;

# each line from 'who' is placed into the default variable $_
foreach (`who`) {
    my ($usr, $tty, $date) = /(\S+)\s+(\S+)\s+(.*)/;
    $ttys{$usr} .= "$tty at $date\n";
}

External processes with IPC::System::Simple

This module provides a simpler, more robust interface than Perl’s built-in system function:

use IPC::System::Simple qw(system systemx capture capturex);

# more robust version of system
system  'tar', 'cvf', $tarfile, @dirs;

# systemx never calls the shell
systemx 'tar', 'cvf', $tarfile, @dirs;

To capture output replace system with capture, or systemx with capturex:

my @output = capturex 'tar', 'cvf', $tarfile, @dirs;

Processes as filehandles

Using processes as filehandles provides the only easy way to write to a process based on the results of computation.

Launch a concurrent (parallel) child process with the piped open command:

open DATE, 'date|' or die "Cannot pipe from date: $!";
open MAIL, '|mail' or die "Cannot pipe to mail: $!";

The three-argument form is:

# pipe from 'date', '-' shows placement of system command, e.g. date, mail
open my $date_fh, '-|', 'date' or die "Cannot pipe from date: $!";

# pipe to 'mail'
open my $mail_fh, '|-', 'mail' or die "Cannot pipe to mail: $!";

Read filehandle normally to obtain data:

my $now = <$date_fh>;

Print with filehandle to send data:

print $mail_fh "Current time: $now";

Close filehandle to finish sending data:

close $mail_fh;

# @note: exit status is saved into special variable $?
die "mail: nonzero exit of $?" if $?;

 A read from a process filehandle blocks while no data is available: the reading program is suspended until the sending program speaks again.

For reading, backquotes are easier to manage, unless you want the results as they come in.

Example of find command printing results as they are found:

open my $find_fh, '-|', 'find', qw( / -atime +90 -size +1000 -print )
    or die "fork: $!";

while (<$find_fh>) {
    chomp;
    printf "%s size %dK last accessed %.2f days ago\n",
        $_, (1023 + -s $_)/1024, -A $_;
}

Getting down and dirty with fork

It is possible to call the low-level process management system calls (fork, exec, waitpid) directly.

Re-implementation of system 'date':

defined(my $pid = fork) or die "Cannot fork: $!";

unless ($pid) {
    # switch to child process
    exec 'date';
    die "Cannot exec date: $!";
}

# back to parent process
waitpid($pid, 0);
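
A variation of the same pattern that can be run without any external command: the child simply exits with a chosen code and the parent reads it from $?. The exit code 7 is arbitrary, and the sketch assumes a system where fork is available:

```perl
use strict;
use warnings;

defined(my $pid = fork) or die "Cannot fork: $!";

unless ($pid) {
    # child process: fork returned 0 here
    exit 7;
}

# parent process: fork returned the child's process ID
waitpid($pid, 0);

# after waitpid, $? holds the child's status; the exit code is the high byte
my $child_exit = $? >> 8;
print "child $pid exited with $child_exit\n";
```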

Sending and receiving signals

Send “interrupt signal” SIGINT to process with known ID:

my $pid = 4201;

kill 'INT', $pid or die "Cannot signal $pid with SIGINT: $!";

# OR alternatively
kill INT => $pid or die "Cannot signal $pid with SIGINT: $!";

Check, if process is still alive:

# use special signal 0
unless (kill 0, $pid) {
    warn "$pid has gone away!";
}

Assign into the special %SIG hash to activate the signal handler:

sub sig_int_handler {
    die "Caught interrupt signal, exiting\n";
}

$SIG{'INT'} = 'sig_int_handler';
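
A handler can be exercised without a second terminal by letting the process signal itself. This sketch assumes a Unix-like system with SIGUSR1; the counter $caught and the anonymous handler are illustrative choices:

```perl
use strict;
use warnings;

my $caught = 0;

# a code reference works as a handler just like a subroutine name
$SIG{USR1} = sub { $caught++ };

# $$ is the ID of the current process
kill 'USR1', $$ or die "Cannot signal myself: $!";

# Perl runs handlers between operations; the pause is only a safety margin
sleep 1 unless $caught;

print "Caught USR1 $caught time(s).\n";
```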

Example of a custom signal handler [1] (chapter 15, exercise 4, pp. 273, 324, 325):

use v5.34;
use strict;
use warnings;

sub hup_handler {
    state $n;
    printf("Caught HUP %d times.\n", ++$n);
}

sub usr1_handler {
    state $n;
    printf("Caught USR1 %d times.\n", ++$n);
}

sub usr2_handler {
    state $n;
    printf("Caught USR2 %d times.\n", ++$n);
}

sub int_handler {
    printf("\nCaught INT. Exiting.\n");
    exit;
}

foreach my $sig ( qw(int hup usr1 usr2) ) {
    $SIG{ uc $sig } = "${sig}_handler";
}

printf("My process no: $$.\n");

while (1) {
    sleep(1);
}

16. Some Advanced Perl Techniques

Slices

Imagine the following list:

fred flintstone:2168:301 Cobblestone Way:555-1212:555-2121:3
barney rubble:709918:299 Cobblestone Way:555-3333:555-3438:0

Extract a few elements using conventional arrays:

while (<$fh>) {
    chomp;
    my @items = split /:/;
    my ($card_num, $count) = ($items[1], $items[5]);
    ... # work with those variables
}

Assign the result of split to a list of scalars:

# avoid using array @items
my ($name, $card_no, $addr, $home, $work, $count) = split /:/;

Use undef to ignore corresponding elements of the source list:

my (undef, $card_no, undef, undef, undef, $count) = split /:/;

Extreme use case to extract mtime value from stat:

my (undef, undef, undef, undef, undef, undef, undef, undef, undef, $mtime) = 
    stat $filename;

Instead, index into a list as if it were an array (with a list slice):

my $mtime = (stat $filename)[9];

Use list slices to pull out items from the initial example:

my $card_no = (split /:/)[1];
my $count   = (split /:/)[5];

List slice in a list context (merge two operations together):

my ($card_no, $count) = (split /:/)[1, 5];

Pull the first and last items from a list:

my ($first, $last) = (sort @names)[0, -1];

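 Sorting a list only to find its extreme elements is not the most efficient approach. The minstr and maxstr functions of the List::Util module find them without sorting.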

Slice subscripts may be in any order and may repeat indices:

my @names = qw{ zero one two three four five six seven eight nine };
my @numbers = ( @names )[ 9, 0, 2, 1, 0 ];
print "Bedrock @numbers\n";  # says Bedrock nine zero two one zero

Array Slice

Parentheses may be omitted, when slicing elements from an array:

my @numbers = @names[ 9, 0, 2, 1, 0 ];

Interpolate a slice directly into a string:

my @names = qw{ zero one two three four five six seven eight nine };
print "Bedrock @names[ 9, 0, 2, 1, 0 ]";

Update selected elements of the array:

my $new_phone   = "555-6090";
my $new_address = "99380 Red Rock West";
@items[2, 3] = ($new_address, $new_phone);

Hash Slice

Pull values with a list of hash keys or with a slice:

my @selected_scores = ( $score{"barney"}, $score{"fred"}, $score{"dino"} );

my @selected_scores = ( @score{ qw/ barney fred dino / } );

 Slice is always a list. Hence, the hash slice notation uses an at sign.

Elegant assignment using hash slices:

my @players          = qw/ barney fred dino /;
my @scores           = (195, 205, 30);
my %results;
@results{ @players } = @scores;

Key-Value Slices

Since v5.20 it is possible to extract key-value pairs with a key-value slice:

use v5.20;

my %top_results = %results{ @top_players };

 Sigils do not denote the variable type, they communicate what you do with the variable. Extracting key-value pairs is a hashy sort of operation, hence the % in front of it.
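
A self-contained sketch of the hash slice and the key-value slice side by side, reusing the bowling scores from above. The list in @top_players is an assumption for the demonstration:

```perl
use v5.20;
use warnings;

# fill the hash with a hash slice
my %results;
@results{ qw/ barney fred dino / } = (195, 205, 30);

my @top_players = qw/ fred barney /;

# @ sigil: values only
my @top_scores = @results{ @top_players };

# % sigil: keys together with their values
my %top_results = %results{ @top_players };

say "scores: @top_scores";  # scores: 205 195
say join ', ', map { "$_=$top_results{$_}" } sort keys %top_results;
# barney=195, fred=205
```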

Trapping Errors

Using eval

Wrap code in an eval block to trap fatal errors:

my $barney = eval { $fred / $dino } // 'NaN';

In case of an error the eval block stops running, but the program doesn’t crash.
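
A minimal sketch of this behaviour with a division by zero. The sample values are arbitrary; the error text is copied from $@ right after the eval, before anything else can overwrite it:

```perl
use strict;
use warnings;

my ($fred, $dino) = (10, 0);

# eval returns undef when the block dies, so // supplies the fallback
my $barney = eval { $fred / $dino } // 'NaN';

# $@ holds the error message of the most recent failed eval
my $error = $@;

print "result: $barney\n";  # result: NaN
print "error:  $error";     # Illegal division by zero at ...
print "still running\n";
```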

More Advanced Error Handling

In basic Perl, you may throw an exception with die and catch it with eval:

eval {
    ...;  # code that may fail
    die "Bad denominator" if $deno == 0;
};
if ( $@ =~ /unexpected/ ) {
    ...
}
elsif ( $@ =~ /denominator/ ) {
    ...
}

Inspect value of $@ to figure out what went wrong.

Dynamic scope of $@ may cause problems. Use module Try::Tiny from CPAN for better error handling.

use Try::Tiny;

try {
    ... # code that may throw an error
}
catch {
    ... # code to handle the error
}
finally {
    ... # execute in any case
};  # trailing semicolon is required: try is a function call, not a statement

Try::Tiny puts the error message into $_ to prevent abuse of $@.

Picking Items from a List with grep

Perl’s grep operator acts as a filter:

# select odd numbers from range
my @odd_numbers    = grep { $_ % 2 } 1..1000;

# pick matching lines from file
my @matching_lines = grep { /\bfred\b/i } <$fh>;

# simpler syntax with comma
my @matching_lines = grep   /\bfred\b/i,  <$fh>;

In scalar context grep tells the number of items selected:

my $line_count = grep /\bfred\b/i, <$fh>;

Transforming Items from a List with map

Use map operator to change every item in a list:

my @data = (4.75, 1.5, 2, 1234, 6.9456, 12345678.9, 29.95);

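# insert_commas is the formatting subroutine from chapter 14
my @formatted_data = map { insert_commas($_) } @data;

The grep block returns a Boolean that selects items. The map block instead returns the values that make up the resulting list.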

Simpler syntax of map:

print "Powers of two are:\n",
    map "\t" . ( 2 ** $_ ) . "\n", 0..16;
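
A short sketch that chains the two operators; the input range and variable names are arbitrary:

```perl
use strict;
use warnings;

my @numbers = 1 .. 10;

# grep keeps the items for which the block is true
my @odd = grep { $_ % 2 } @numbers;

# map replaces every item with the value of the block
my @squares = map { $_ ** 2 } @odd;

# grep in scalar context counts the matches
my $big = grep { $_ > 20 } @squares;

print "odd:     @odd\n";      # odd:     1 3 5 7 9
print "squares: @squares\n";  # squares: 1 9 25 49 81
print "over 20: $big\n";      # over 20: 3
```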

Fancier List Utilities

The List::Util module from the Standard Library provides high-performance list processing utilities:

First occurrence:

use List::Util qw(first);

my $first_match = first { /\bPebbles\b/i } @characters;

Sum:

use List::Util qw(sum);

my $total = sum( 1..1000 );

Maximum numeric and textual:

use List::Util qw(max maxstr);

my $maxn = max( 3, 5, 10, 4, 6 );
my $maxt = maxstr( @strings );

Use shuffle to randomise order of elements in a list:

use List::Util qw(shuffle);

my @numbers = shuffle(1..1000);
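
The utilities above combined in one runnable sketch. The character names and numbers are sample data only:

```perl
use strict;
use warnings;
use List::Util qw(first sum max maxstr shuffle);

my @characters = qw(Fred Wilma Pebbles Dino);

my $first_match = first { /\bPebbles\b/i } @characters;
my $total       = sum( 1 .. 1000 );
my $maxn        = max( 3, 5, 10, 4, 6 );
my $maxt        = maxstr( @characters );

# shuffle changes the order, never the contents
my @shuffled = shuffle( 1 .. 10 );
my @restored = sort { $a <=> $b } @shuffled;

print "first: $first_match, sum: $total, max: $maxn, maxstr: $maxt\n";
# first: Pebbles, sum: 500500, max: 10, maxstr: Wilma
```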

Use List::MoreUtils module for more advanced subroutines.

Match a condition with none, any, all:

use List::MoreUtils qw(none any all);

if (none { $_ < 0 } @numbers) {
    print "No elements less than 0\n";
} elsif (any { $_ > 50 } @numbers) {
    print "Some elements over 50\n";
} elsif (all { $_ < 10 } @numbers) {
    print "All elements are less than 10\n";
}

Process n items at a time with natatime:

use List::MoreUtils qw(natatime);

my $iterator = natatime 3, @array;

while ( my @triad = $iterator->() ) {
    print "Processing triad: @triad\n";
}

Combine two or more lists interweaving the elements with mesh:

use List::MoreUtils qw(mesh);

my @abc = 'a' .. 'z';
my @num = 1 .. 20;
my @din = qw( dino );

my @arr = mesh @abc, @num, @din;

# arr = ( a 1 dino b 2 undef c 3 undef ... ), shorter lists are padded with undef

References

  1. R. L. Schwartz, brian d foy, and T. Phoenix, Learning Perl, 7th ed. O’Reilly Media, 2017.
  2. “Perl power tools project, official website.” [Online]. Available at: https://perlpowertools.com
  3. J. Friedl, Mastering Regular Expressions, 3rd ed. O’Reilly Media, 2006.
  4. T. Bunce and A. Descartes, Programming with Perl DBI, 1st ed. O’Reilly Media, 2000.