NAME
PDF::Make::Linearization - PDF Linearization (Fast Web View) support
SYNOPSIS
use feature 'say';    # the examples below use say
use PDF::Make;
use PDF::Make::Linearization;
# Check if a PDF is linearized
my $doc = PDF::Make->open('document.pdf');
if ($doc->is_linearized) {
    my $params = $doc->linear_params;
    say "Fast Web View: Yes";
    say "Pages: $params->{page_count}";
    say "First page ends at byte: $params->{first_page_end}";
}
# Create a linearized PDF
my $pdf = PDF::Make->new;
$pdf->page->text(100, 700, "Page 1");
$pdf->page->text(100, 700, "Page 2");
$pdf->finalize;
$pdf->write_linearized('optimized.pdf');
# Streaming reader for HTTP byte-range requests
my $reader = PDF::Make::StreamReader->new(
    fetch => sub {
        my ($offset, $length) = @_;
        # http_range_request() and $url are placeholders supplied by the caller
        return http_range_request($url, $offset, $length);
    }
);
$reader->read_header;
say "Pages: ", $reader->page_count;
# Load pages on demand
$reader->read_page(0); # First page (usually pre-loaded)
$reader->read_page(5); # Triggers fetch for page 6
DESCRIPTION
This module provides PDF linearization support, enabling "Fast Web View" functionality per Annex F of ISO 32000-2:2020.
Linearization reorganizes a PDF file so that:
The first page can display before the entire file downloads
Subsequent pages load on demand via HTTP byte-range requests
Hint tables enable efficient page offset calculation
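To illustrate the last point: once the hint data yields the offset of the first page in a contiguous run and the byte length of each page, every page's range follows from a running sum. The following is a minimal sketch under that assumption. page_ranges is a hypothetical helper, not part of PDF::Make, and the real hint table encoding in Annex F is bit-packed and more involved than a plain list of lengths.

```perl
use strict;
use warnings;

# Hypothetical helper: given the byte offset of the first page in a
# contiguous run and the byte length of each page (already decoded),
# return one [offset, length] pair per page.
sub page_ranges {
    my ($first_offset, @lengths) = @_;
    my @ranges;
    my $offset = $first_offset;
    for my $len (@lengths) {
        push @ranges, [ $offset, $len ];
        $offset += $len;    # the next page starts where this one ends
    }
    return @ranges;
}
```

Because each offset is derived from the lengths alone, a reader can locate any page without scanning the file.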
METHODS ADDED TO PDF::Make
is_linearized
my $bool = $doc->is_linearized;
Returns true if the document is linearized (has Fast Web View).
linear_params
my $params = $doc->linear_params;
Returns a hashref with linearization parameters:
{
    version          => 1,      # Linearized version
    file_length      => 123456, # Total file size
    hint_offset      => 1234,   # Hint stream offset
    hint_length      => 567,    # Hint stream length
    first_page_obj   => 7,      # First page object number
    first_page_end   => 12345,  # End of first page section
    page_count       => 10,     # Number of pages
    main_xref_offset => 98765,  # Main xref table offset
}
Returns undef if the document is not linearized.
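These fields appear to correspond to the entries of the linearization parameter dictionary defined in Annex F (/Linearized, /L, /H, /O, /E, /N, /T). The sketch below shows how such a dictionary could be read from the first bytes of a file with plain regular expressions. parse_linear_params is a hypothetical helper, and the mapping of dictionary keys to the hash keys above is an assumption; the module's own parser may work differently.

```perl
use strict;
use warnings;

# Hypothetical helper: extract linearization parameters from the first
# bytes of a PDF. Returns a hashref, or undef if no linearization
# dictionary is found. Assumes the first dictionary in the file is the
# linearization dictionary and that it contains no nested dictionaries.
sub parse_linear_params {
    my ($head) = @_;

    my ($dict) = $head =~ m{<<(.*?)>>}s or return;
    my ($version) = $dict =~ m{/Linearized\s+([\d.]+)} or return;

    # /H holds [offset length] of the primary hint stream.
    my ($h_off, $h_len) = $dict =~ m{/H\s*\[\s*(\d+)\s+(\d+)} or return;

    # Single-integer entries.
    my %num;
    for my $key (qw(L O E N T)) {
        ($num{$key}) = $dict =~ m{/$key\s+(\d+)} or return;
    }

    return {
        version          => $version + 0,
        file_length      => $num{L},
        hint_offset      => $h_off,
        hint_length      => $h_len,
        first_page_obj   => $num{O},
        first_page_end   => $num{E},
        page_count       => $num{N},
        main_xref_offset => $num{T},
    };
}
```

A caller would typically pass the first 1024 bytes of the file, since the linearization dictionary is required to sit within that prefix.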
linearize
$doc->linearize;
Prepares the document for linearized output. This analyzes page dependencies and computes the optimal object ordering.
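One simplified way to picture the ordering step: everything the first page needs goes first, objects used by exactly one later page stay with that page, and objects used by several later pages are grouped as shared. The sketch below implements only that classification. plan_object_order and its input shape (one list of object numbers per page) are hypothetical, and this is not necessarily the algorithm PDF::Make uses.

```perl
use strict;
use warnings;

# Hypothetical helper. $page_objs is an arrayref with one entry per page
# (0-based), each an arrayref of the object numbers that page references.
sub plan_object_order {
    my ($page_objs) = @_;
    my @pages = @$page_objs;
    return { first_page => [], pages => [], shared => [] } unless @pages;

    # Deduplicate each page's list while keeping its original order.
    my @uniq;
    for my $objs (@pages) {
        my %seen;
        push @uniq, [ grep { !$seen{$_}++ } @$objs ];
    }

    # Everything the first page needs belongs to the first-page section.
    my %in_first = map { $_ => 1 } @{ $uniq[0] };

    # Count how many later pages use each remaining object.
    my %uses;
    for my $i (1 .. $#uniq) {
        $uses{$_}++ for grep { !$in_first{$_} } @{ $uniq[$i] };
    }

    # Objects used by exactly one later page stay with that page.
    my @private;
    for my $i (1 .. $#uniq) {
        push @private,
            [ grep { !$in_first{$_} && $uses{$_} == 1 } @{ $uniq[$i] } ];
    }

    # Objects used by more than one later page are shared.
    my @shared = sort { $a <=> $b } grep { $uses{$_} > 1 } keys %uses;

    return { first_page => $uniq[0], pages => \@private, shared => \@shared };
}
```

The three groups line up with the "First page objects", "Remaining pages" and "Shared objects" sections of a linearized file.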
write_linearized
$doc->write_linearized($path);
my $bytes = $doc->write_linearized;
Writes the document in linearized format. If a path is provided, writes to that file. Otherwise returns the PDF bytes.
PDF::Make::StreamReader
Streaming reader for linearized PDFs, enabling page-on-demand loading.
new
my $reader = PDF::Make::StreamReader->new(
    fetch => sub {
        my ($offset, $length) = @_;
        # Return $length bytes starting at $offset
        return $data;
    }
);
Creates a new streaming reader with the given fetch callback.
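For local testing, the callback can be backed by a file on disk instead of a network request. The sketch below relies only on the contract stated above (return $length bytes starting at $offset). make_file_fetch is a hypothetical helper, not part of PDF::Make.

```perl
use strict;
use warnings;

# Hypothetical helper: build a fetch callback that reads byte ranges
# from a local file.
sub make_file_fetch {
    my ($path) = @_;
    open my $fh, '<:raw', $path or die "Cannot open $path: $!\n";

    return sub {
        my ($offset, $length) = @_;
        seek $fh, $offset, 0 or die "Seek to $offset failed: $!\n";
        my $got = read $fh, my $data, $length;
        die "Read failed: $!\n" unless defined $got;
        return $data;    # may be shorter than $length near end of file
    };
}
```

The returned code reference can then be passed as the fetch argument, for example fetch => make_file_fetch('optimized.pdf').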
read_header
$reader->read_header;
Reads and parses the PDF header and linearization dictionary. Call this before any other reader method.
is_linearized
if ($reader->is_linearized) { ... }
Returns true if the PDF is linearized.
page_count
my $count = $reader->page_count;
Returns the total number of pages. Available after read_header.
page_available
if ($reader->page_available($page_num)) { ... }
Returns true if the given page (0-based) is loaded.
read_page
$reader->read_page($page_num);
Fetches and parses the given page's data. May trigger an HTTP range request.
page_range
my ($offset, $length) = $reader->page_range($page_num);
Returns the byte offset and length for the given page. Useful for HTTP Range header construction.
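HTTP byte ranges are inclusive at both ends, so an (offset, length) pair has to be converted to first and last byte positions. The sketch below shows that conversion, plus one possible implementation of the http_range_request placeholder used in the SYNOPSIS, built on the core HTTP::Tiny module. Both function bodies are illustrative assumptions and are not provided by PDF::Make.

```perl
use strict;
use warnings;
use HTTP::Tiny;

# Build the value of a Range header for $length bytes starting at $offset.
# The last-byte position is inclusive, hence the "- 1".
sub range_header {
    my ($offset, $length) = @_;
    die "length must be positive\n" unless $length > 0;
    return sprintf 'bytes=%d-%d', $offset, $offset + $length - 1;
}

# Hypothetical implementation of the SYNOPSIS placeholder.
sub http_range_request {
    my ($url, $offset, $length) = @_;
    my $res = HTTP::Tiny->new->get($url, {
        headers => { Range => range_header($offset, $length) },
    });

    # 206 Partial Content: the server honoured the range.
    return $res->{content} if $res->{status} == 206;

    # 200 OK: the server ignored the range and sent the whole file.
    return substr($res->{content}, $offset, $length)
        if $res->{status} == 200;

    die "Range request failed: $res->{status} $res->{reason}\n";
}
```

With these in place, the pair returned by page_range can be passed straight to range_header or http_range_request.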
LINEARIZATION STRUCTURE
A linearized PDF has this structure:
┌─────────────────────────────────────┐
│ Header (%PDF-2.0) │
├─────────────────────────────────────┤
│ Linearization dictionary (obj 1) │
├─────────────────────────────────────┤
│ First page xref (partial) │
├─────────────────────────────────────┤
│ Document catalog, pages tree root │
├─────────────────────────────────────┤
│ First page objects │
├─────────────────────────────────────┤
│ Hint stream │
├─────────────────────────────────────┤
│ Remaining pages (2..N) │
├─────────────────────────────────────┤
│ Shared objects │
├─────────────────────────────────────┤
│ Main xref + trailer │
└─────────────────────────────────────┘
SEE ALSO
PDF::Make, ISO 32000-2:2020 Annex F (Linearized PDF)