NAME
Regexp::Common::debian - regexps for Debian specific strings
SYNOPSIS
use Regexp::Common qw(debian);
#TODO:
DESCRIPTION
#TODO:
- $RE{debian}{package}
-
    'the-very.strange.package+name' =~ $RE{debian}{package}{-keep};
    print "package is $1";
This is a Debian package name. The rules are described in Section 5.6.7 of Debian policy.
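As a rough illustration of the kind of rule involved, here is a minimal sketch that does not use this module at all. It assumes the policy rule reads: lowercase letters, digits, plus, minus, and period only; at least two characters; first character alphanumeric. The names `$package_re` and `is_package_name` are hypothetical and the pattern is not necessarily the one $RE{debian}{package} uses.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Regexp::Common::debian.
# Assumed rule: [a-z0-9+.-] only, at least two characters,
# and the first character must be a lowercase letter or a digit.
my $package_re = qr/[a-z0-9][a-z0-9+.-]+/;

sub is_package_name {
    my ($name) = @_;
    return $name =~ /\A$package_re\z/ ? 1 : 0;
}

print is_package_name('the-very.strange.package+name')
    ? "looks like a package name\n"
    : "does not look like a package name\n";
```

Anchoring with \A and \z is a choice of this sketch; whether and how to anchor is left to the caller when using the module's own pattern.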
- $RE{debian}{version}
-
    '10:1+abc~rc.2-ALPHA-rc25+w~t.f' =~ $RE{debian}{version}{-keep};
    $2 eq '10' && $3 eq '1+abc~rc.2-ALPHA' && $4 eq 'rc25+w~t.f' or die;
This is a Debian version. The rules are described in Section 5.6.12 of Debian policy.
- $1 is a debian_version
- $2 is an epoch
-
if any. Otherwise, undef.
- $3 is an upstream_version
-
(caveat) A string like 0--1 will end up with $3 set to the weird 0- (hopefully, Debian won't degrade to such versions; though YMMV).
- $4 is a debian_revision
-
(bug) 0-1- will end up with $3 set to 0 and $4 set to 1 (such trailing hyphens will be missing in $1). 0- will end up with $4 undefined.
(bug) Either I don't understand perlre or I didn't try hard enough. Anyway, I didn't find a way to parse a Debian version the way R::C requires with perl5.8.8 (the perl in stable, soon to be oldstable). The branch-reset construct qr/(?|)/ saves the day on perl5.10.0 (but see "R_C_d_version").
(caveat) The debian_revision is allowed to start with a non-digit. This is solely my reading of Debian Policy.
- R_C_d_version
-
    use Regexp::Common qw(debian);
    # though that works too
    # use Regexp::Common::debian;

    my $re = Regexp::Common::debian::R_C_d_version;
    $version =~ /^$re$/;
    $2 and print "has epoch\n";
    $3 || $5 || $6 || $8 and print "has upstream_version\n";
    $4 || $7 and print "has debian_revision\n";
    $3 && !$4 || !$3 && $4 or die;
    $6 && !$7 || !$6 && $7 or die;
    $3 && !$5 && !$6 && !$8 or die;
    $5 && !$6 && !$8 or die;
    $6 && !$8 or die;
That's a workaround for perl5.8.8 (read "$RE{debian}{version}" (look for (bug))). Look for (caveat) in "$RE{debian}{version}" -- those apply here too.
- $1 is debian_version again
- $2 is epoch always
- Either $3, or $5, or $6, or $8 is upstream_version
- Either $4 or $7 is debian_revision
That's the best that can be done with a RE (in the real world it's done the functional way). Sorry.
(bug) It always captures (that should be configurable with a setting like -keep). OTOH, within 2 years or so (as soon as perl5.10.0 is in oldstable) that dirty piece will be dropped anyway.
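For comparison, the "functional way" mentioned above might look roughly like the following sketch. It assumes the usual reading of the version format: the epoch is everything before the first colon and the debian_revision is everything after the last hyphen. The sub `split_debian_version` is a hypothetical name, not something this module provides, and it only splits; it does not check which characters each part contains.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Regexp::Common::debian.
# Splits on the first colon (epoch) and on the last hyphen (revision).
# Missing parts are returned as undef. No validation is performed.
sub split_debian_version {
    my ($version) = @_;
    my ($epoch, $revision);

    my $colon = index $version, ':';
    if ($colon >= 0) {
        $epoch   = substr $version, 0, $colon;
        $version = substr $version, $colon + 1;
    }

    my $hyphen = rindex $version, '-';
    if ($hyphen >= 0) {
        $revision = substr $version, $hyphen + 1;
        $version  = substr $version, 0, $hyphen;
    }

    return ($epoch, $version, $revision);
}

my ($epoch, $upstream, $revision) =
    split_debian_version('10:1+abc~rc.2-ALPHA-rc25+w~t.f');
print "epoch=$epoch upstream=$upstream revision=$revision\n";
```

Because the parts are found with index and rindex rather than with capture groups, there is no need for alternative group numbers as in R_C_d_version.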
- $RE{debian}{architecture}
-
    $arch =~ $RE{debian}{architecture}{-keep};
    $2 && ($3 || $4) and die;
    $3 && !$4 and die;
    $3 && $4 eq 'armel' and die;
    $2 and print "that's special: $2";
    $3 and print "OS is: $3";
    $4 and print "arch is: $4";
This is a Debian architecture. The rules are described in Section 5.6.8 of Debian policy.
- $1 is one of Debian's architectures
- $2 is any special
-
Distinguishing special architectures (all, any, and source) from os-arch pairs is arguable. But I've decided that it would be good to separate all from e.g. i386 (which in turn is actually linux-i386).
- $3 is os
-
When !$3 && $4 is true then the undefined $3 actually means linux. Since $digits are read-only, yielding anything but undef here is impossible. More on that in Section 11.1 of Debian policy.
- $4 is arch
-
Please note that there are architectures which are present only for the linux os (namely armel and lpia, at time of writing).
(caveat) Debian policy by itself doesn't specify which os-arch pairs are valid (only specials are mentioned). Instead it relies on qx/dpkg-architecture -L/. In effect R::C::d can get out of sync; hopefully, that won't stay unnoticed for too long.
- $RE{debian}{archive}{binary}
-
    'abc_1.2.3-512_all.deb' =~ $RE{debian}{archive}{binary}{-keep};
    print " package is -> $2";
    print " version is -> $3";
    print "architecture is -> $4";
This is a Debian binary archive (it's called "binary" even if there's no binary file, in the -B sense, inside). The naming convention isn't described in Debian policy; instead policy refers to the format understood by dpkg (Preface of Chapter 3). (Hopefully, some day there will be references here to the code inside the dpkg and dpkg-deb codebase that does those nasty things of composing package, version, and arch into filenames and decomposing them back.)
- $1 is deb-filename
-
That's the whole archive filename with the .deb suffix included.
- $2 is package
- $3 is version
-
There's a great deal of WTF. Filename: fields in *_Packages lack the epoch entirely. Archives in pool/ lack it too. Archives in /var/cache/apt/archives ... That seems to be apt-get specific (I don't have a reference to the code though). As a feature $RE{d}{a}{binary} provides an epoch hack in filenames.
- $4 is architecture
-
That would match a surprising source or any. Sorry. That'll improve in the future. Actually it's even worse: an OS can prepend any arch or special.
For the sake of symmetry $RE{d}{a}{binary} has a trailing anchor: a negative look-ahead for any character that can be found in a version string.
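To make the package_version_architecture layout concrete, here is a minimal sketch that takes such a filename apart without this module. It assumes the three fields are separated by underscores and that none of them contains an underscore; it knows nothing about epochs or about which architectures exist. The sub `split_deb_filename` is a hypothetical name.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Regexp::Common::debian.
# Assumes: package_version_architecture.deb, with no underscore
# inside any of the three fields. Returns an empty list on mismatch.
sub split_deb_filename {
    my ($filename) = @_;
    my ($package, $version, $arch) =
        $filename =~ /\A([^_]+)_([^_]+)_([^_]+)\.deb\z/
        or return;
    return ($package, $version, $arch);
}

my ($package, $version, $arch) = split_deb_filename('abc_1.2.3-512_all.deb');
print "package=$package version=$version architecture=$arch\n";
```

Unlike $RE{d}{a}{binary}, this sketch anchors both ends of the string, so it is only suitable for a bare filename, not for finding filenames inside longer text.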
- $RE{debian}{archive}{source}
-
    'xyz_1-ab.25~6.orig.tar.gz' =~ $RE{debian}{archive}{source}{-keep};
    print "package is $2";
    # a Debian-native tarball must not have a hyphen in its version
    -1 != index($3, '-') && $4 eq 'tar' and die;
    $4 eq 'orig.tar' and print "there should be patch";
This is a Debian upstream (or Debian-native) source tarball. Naming source archives is outside Debian policy, although:
Section 5.6.21 mentions that "the exact forms of the filenames are described in" Section C.3.
Section C.3 states that a source archive must be of the form package_upstream-version.orig.tar.gz.
Naming of Debian-native packages is left out completely.
dpkg-source(1) (1.14.23), in the section SOURCE PACKAGE FORMATS, mentions some bits of naming (Debian-native packages are left out too).
Welcome to real life. $RE{d}{a}{source} knows only Format: 1.0 naming.
- $1 is tarball-filename
-
Since there's no suffix other than .gz, it's present only in $1.
- $2 is package
- $3 is version
- $4 is type
-
This can hold one of 2 strings: orig.tar (regular package) or tar (Debian-native package).
Since a dot (.) is used as a separator and can also appear in a version, the whole thing is implicitly anchored (a negative look-ahead for a version-forming character; the idea is that 0.orig.tar.gz can be a very strange version) and the version itself is forced to be as short as possible.
- $RE{debian}{archive}{patch}
-
    'abc_0cba-12.diff.gz' =~ $RE{debian}{archive}{patch}{-keep};
    print "package is $2";
    -1 == index $3, '-' and die;
    print "debian revision is ", (split /-/, $3)[-1];
This is a "debianization diff" (Section C.3 of Debian policy). Naming patches is outside Debian policy, so we're back to guessing. There are rumors (or maybe trends) that Format 1.0 will be deprecated (or maybe obsoleted).
- $1 is patch-filename
-
Since there's no suffix other than .diff.gz, it's present only in $1.
- $2 is package
- $3 is version
-
(caveat) Consider this. A Debian-native package has neither a patch nor a hyphen in its version. A regular package has a patch and must have a hyphen in its version. $RE{d}{a}{patch} is absolutely ignorant of that (we are about matching, not verifying, after all).
The very same considerations covered in the discussion closing the $RE{d}{a}{source} entry apply to $RE{d}{a}{patch} as well (consider: 0.diff.gz can be a version).
- $RE{debian}{archive}{dsc}
-
    'abc_0cba-12.dsc' =~ $RE{debian}{archive}{dsc}{-policy=real};
    print "package is $2";
    print "version is $3";
This is a "Debian source control" file (Section 5.4 describes its contents but not its naming). Statistically based guessing, you know (some day I'll point to the exact lines in the dpkg-dev bundle where it's used, for both creating and parsing).
- $1 is dsc-filename
-
As usual, since the only suffix can be .dsc it's present in $1 only.
- $2 is package
- $3 is version
The considerations discussed for $RE{d}{a}{source} apply here too (consider: 0.dsc can be a version).
- $RE{debian}{archive}{changes}
-
    'abc_0cba-12.changes' =~ $RE{debian}{archive}{changes}{-policy=real};
    print "package is $2";
    print "version is $3";
This is a "Debian changes file" (Section 5.5 describes its contents but not its naming). Statistically based guessing, you know (some day I'll point to the exact lines in the dpkg-dev bundle where it's used, for both creating and parsing) (should be a template).
- $1 is changes-filename
-
As usual, since the only suffix can be .changes it's present in $1 only.
- $2 is package
- $3 is version
The considerations discussed for $RE{d}{a}{source} apply here too (consider: 0.changes can be a version).
BUGS AND CAVEATS
Grep this pod for (bug) and/or (caveat). They are all placed in the appropriate sections.
AUTHOR
Eric Pozharski, <whynot@cpan.org>
COPYRIGHT AND LICENSE
Copyright 2008 by Eric Pozharski
This library is free in the sense: AS-IS, NO-WARRANTY, HOPE-TO-BE-USEFUL. This library is released under LGPLv3.