PDF::Reader::PageState

Encapsulates logic for tracking graphics state as the instructions for a single page are processed. Most of the public methods correspond directly to PDF operators.

Public Class Methods

new(page)

Starts a new page.

# File lib/pdf/reader/page_state.rb, line 23
def initialize(page)
  @page          = page
  @cache         = page.cache
  @objects       = page.objects
  @font_stack    = [build_fonts(page.fonts)]
  @xobject_stack = [page.xobjects]
  @cs_stack      = [page.color_spaces]
  @stack         = [DEFAULT_GRAPHICS_STATE.dup]
  state[:ctm]    = identity_matrix
end

Public Instance Methods

begin_text_object()

Text Object Operators

# File lib/pdf/reader/page_state.rb, line 81
def begin_text_object
  @text_matrix      = identity_matrix
  @text_line_matrix = identity_matrix
  @font_size = nil
end

clone_state()

This returns a deep clone of the current state, ensuring changes are kept separate from earlier states.

Marshal is used to round-trip the state through a string to easily perform the deep clone. Kinda hacky, but effective.

# File lib/pdf/reader/page_state.rb, line 282
def clone_state
  if @stack.empty?
    {}
  else
    Marshal.load Marshal.dump(@stack.last)
  end
end
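
A minimal sketch of why the Marshal round trip matters, using a hypothetical state hash rather than the real DEFAULT_GRAPHICS_STATE. A plain dup shares nested objects with the original, while the round trip copies them. The helper names deep_clone and clone_demo, and the hash contents, are assumptions made for illustration.

```ruby
# Deep-clone a state hash by round-tripping it through Marshal.
def deep_clone(state)
  Marshal.load(Marshal.dump(state))
end

# Returns [original, deep] after mutating nested data through two copies.
def clone_demo
  # Hypothetical state; the nested array stands in for a matrix object.
  original = { :char_spacing => 0, :ctm => [1, 0, 0, 1, 0, 0] }
  deep     = deep_clone(original)
  shallow  = original.dup

  shallow[:ctm][4] = 50 # the array is shared, so original[:ctm] changes too
  deep[:ctm][5]    = 99 # the deep clone owns its array, original is untouched

  [original, deep]
end
```

One known limit of this approach: Marshal.dump raises a TypeError for objects it cannot serialise, such as Procs and IO objects, so it only suits state made of plain data.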

concatenate_matrix(a, b, c, d, e, f)

Updates the current transformation matrix.

If the CTM is currently undefined, just store the new values.

If there’s an existing CTM, then multiply the existing matrix with the new matrix to form the updated matrix.

# File lib/pdf/reader/page_state.rb, line 63
def concatenate_matrix(a, b, c, d, e, f)
  if state[:ctm]
    ctm = state[:ctm]
    state[:ctm] = TransformationMatrix.new(a,b,c,d,e,f).multiply!(
      ctm.a, ctm.b,
      ctm.c, ctm.d,
      ctm.e, ctm.f
    )
  else
    state[:ctm] = TransformationMatrix.new(a,b,c,d,e,f)
  end
  @text_rendering_matrix = nil # invalidate cached value
end
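
The order of the multiplication can be sketched with plain arrays. This is not the TransformationMatrix class itself: the helpers multiply_affine and concatenate are hypothetical, and the sketch assumes that multiply! computes receiver × argument, which matches the PDF rule that the new CTM is the operand matrix multiplied by the existing CTM. Each matrix is the six numbers [a, b, c, d, e, f] of the 3x3 matrix with rows [a b 0], [c d 0], [e f 1].

```ruby
# Multiply two affine matrices given as [a, b, c, d, e, f] (m1 x m2).
def multiply_affine(m1, m2)
  a1, b1, c1, d1, e1, f1 = m1
  a2, b2, c2, d2, e2, f2 = m2
  [
    a1 * a2 + b1 * c2,
    a1 * b2 + b1 * d2,
    c1 * a2 + d1 * c2,
    c1 * b2 + d1 * d2,
    e1 * a2 + f1 * c2 + e2,
    e1 * b2 + f1 * d2 + f2
  ]
end

# Store the new matrix if there is no CTM yet, otherwise compute new x CTM.
def concatenate(ctm, new_matrix)
  ctm ? multiply_affine(new_matrix, ctm) : new_matrix
end
```

For example, concatenating a translation by (10, 20) onto a CTM that scales by 2 gives [2, 0, 0, 2, 20, 40]: the translation is expressed in the scaled space, so its offsets are doubled.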

ctm_transform(x, y)

Transforms x and y co-ordinates from the current user space to the underlying device space.

# File lib/pdf/reader/page_state.rb, line 218
def ctm_transform(x, y)
  [
    (ctm.a * x) + (ctm.c * y) + (ctm.e),
    (ctm.b * x) + (ctm.d * y) + (ctm.f)
  ]
end
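
The same arithmetic can be tried outside the class. In this sketch the matrix is a plain six-element array [a, b, c, d, e, f] and transform_point is a hypothetical helper, not part of the library.

```ruby
# Apply an affine matrix [a, b, c, d, e, f] to the point (x, y).
def transform_point(matrix, x, y)
  a, b, c, d, e, f = matrix
  [
    (a * x) + (c * y) + e,
    (b * x) + (d * y) + f
  ]
end
```

With a matrix that scales by 2 and then translates by (20, 40), transform_point([2, 0, 0, 2, 20, 40], 1, 1) returns [22, 42].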

current_font()
# File lib/pdf/reader/page_state.rb, line 243
def current_font
  find_font(state[:text_font])
end

end_text_object()
# File lib/pdf/reader/page_state.rb, line 87
def end_text_object
  # don't need to do anything
end

find_color_space(label)
# File lib/pdf/reader/page_state.rb, line 254
def find_color_space(label)
  dict = @cs_stack.detect { |colorspaces|
    colorspaces.has_key?(label)
  }
  dict ? dict[label] : nil
end

find_font(label)
# File lib/pdf/reader/page_state.rb, line 247
def find_font(label)
  dict = @font_stack.detect { |fonts|
    fonts.has_key?(label)
  }
  dict ? dict[label] : nil
end

find_xobject(label)
# File lib/pdf/reader/page_state.rb, line 261
def find_xobject(label)
  dict = @xobject_stack.detect { |xobjects|
    xobjects.has_key?(label)
  }
  dict ? dict[label] : nil
end
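
The three find_* methods share one lookup pattern: walk a stack of resource dictionaries from the front and return the value from the first dictionary that has the label. A minimal sketch of that pattern follows, with a hypothetical helper name and made-up dictionary contents.

```ruby
# Return the value for label from the first dictionary in the stack that
# contains it, or nil if no dictionary does.
def find_in_stack(stack, label)
  dict = stack.detect { |resources| resources.has_key?(label) }
  dict ? dict[label] : nil
end

# Hypothetical stack: a form's resources sit in front of the page's.
def example_font_stack
  [
    { :F1 => "form font" },
    { :F1 => "page font", :F2 => "second page font" }
  ]
end
```

Because detect stops at the first match, find_in_stack(example_font_stack, :F1) returns "form font": a dictionary nearer the front of the stack shadows later ones. A label found only further back, such as :F2, is still resolved, and an unknown label gives nil.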

font_size()
# File lib/pdf/reader/page_state.rb, line 108
def font_size
  @font_size ||= begin
                   _, zero = trm_transform(0,0)
                   _, one  = trm_transform(1,1)
                   (zero - one).abs
                 end
end
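
The calculation above measures how far apart the points (0,0) and (1,1) land vertically once transformed, which gives the rendered font size when the text is not rotated or skewed. A sketch with a plain six-element array standing in for the text rendering matrix follows. The helper names are hypothetical.

```ruby
# Apply an affine matrix [a, b, c, d, e, f] to the point (x, y).
def transform_point(matrix, x, y)
  a, b, c, d, e, f = matrix
  [(a * x) + (c * y) + e, (b * x) + (d * y) + f]
end

# Vertical distance between the transformed (0,0) and (1,1).
def effective_font_size(trm)
  _, zero = transform_point(trm, 0, 0)
  _, one  = transform_point(trm, 1, 1)
  (zero - one).abs
end
```

For a matrix of [12, 0, 0, 12, 72, 700] the two y values are 700 and 712, so the result is 12, regardless of the translation components.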

invoke_xobject(label)

XObjects

# File lib/pdf/reader/page_state.rb, line 189
def invoke_xobject(label)
  save_graphics_state
  xobject = find_xobject(label)

  raise MalformedPDFError, "XObject #{label} not found" if xobject.nil?
  matrix = xobject.hash[:Matrix]
  concatenate_matrix(*matrix) if matrix

  if xobject.hash[:Subtype] == :Form
    form = PDF::Reader::FormXObject.new(@page, xobject, :cache => @cache)
    @font_stack.unshift(form.font_objects)
    @xobject_stack.unshift(form.xobjects)
    yield form if block_given?
    @font_stack.shift
    @xobject_stack.shift
  else
    yield xobject if block_given?
  end

  restore_graphics_state
end
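
The unshift, yield, shift sequence makes a form's resources visible only while the block runs. A standalone sketch of that scoping follows. The helper with_scoped_resources is hypothetical, and its use of ensure is an addition that the method above does not have.

```ruby
# Hypothetical helper: put resources at the front of the stack for the
# duration of the block, then remove them again.
def with_scoped_resources(stack, resources)
  stack.unshift(resources)
  yield
ensure
  stack.shift
end

# Look up :F1 inside and then outside the scoped block.
def scoped_lookup_demo
  font_stack = [{ :F1 => "page font" }]
  seen = []
  with_scoped_resources(font_stack, { :F1 => "form font" }) do
    seen << font_stack.detect { |fonts| fonts.has_key?(:F1) }[:F1]
  end
  seen << font_stack.detect { |fonts| fonts.has_key?(:F1) }[:F1]
  seen
end
```

scoped_lookup_demo returns ["form font", "page font"]: the form's font is found inside the block and the page's font is found again afterwards.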

move_text_position(x, y)

Text Positioning Operators

# File lib/pdf/reader/page_state.rb, line 136
def move_text_position(x, y) # Td
  temp = TransformationMatrix.new(1, 0,
                                  0, 1,
                                  x, y)
  @text_line_matrix = temp.multiply!(
    @text_line_matrix.a, @text_line_matrix.b,
    @text_line_matrix.c, @text_line_matrix.d,
    @text_line_matrix.e, @text_line_matrix.f
  )
  @text_matrix = @text_line_matrix.dup
  @font_size = @text_rendering_matrix = nil # invalidate cached value
end

move_text_position_and_set_leading(x, y)
# File lib/pdf/reader/page_state.rb, line 149
def move_text_position_and_set_leading(x, y) # TD
  set_text_leading(-1 * y)
  move_text_position(x, y)
end

move_to_next_line_and_show_text(str)
# File lib/pdf/reader/page_state.rb, line 176
def move_to_next_line_and_show_text(str) # '
  move_to_start_of_next_line
end

move_to_start_of_next_line()
# File lib/pdf/reader/page_state.rb, line 164
def move_to_start_of_next_line # T*
  move_text_position(0, -state[:text_leading])
end
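
How Td, TD and T* interact can be sketched with a simplified tracker. It assumes the text line matrix is a pure translation (no scaling or rotation), so only an x and y offset are stored. The class TextLineTracker is hypothetical and only mirrors the method names above.

```ruby
# Simplified model of the text positioning operators, assuming the text
# line matrix only ever translates.
class TextLineTracker
  attr_reader :x, :y, :leading

  def initialize
    @x = 0
    @y = 0
    @leading = 0
  end

  def move_text_position(tx, ty) # Td
    @x += tx
    @y += ty
  end

  def move_text_position_and_set_leading(tx, ty) # TD
    @leading = -1 * ty
    move_text_position(tx, ty)
  end

  def move_to_start_of_next_line # T*
    move_text_position(0, -@leading)
  end
end
```

After a TD of (0, -14) the leading is 14 and y is -14. A following T* moves down by the same amount again, leaving y at -28.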

process_glyph_displacement(w0, tj, word_boundary)

After each glyph is painted onto the page, the text matrix must be modified. There’s no defined operator for this, but depending on the use case some receivers may need to mutate the state with this method while walking a page.

NOTE: some of the variable names in this method are obscure because they mirror variable names from the PDF spec.

NOTE: see Section 9.4.4, PDF 32000-1:2008, p. 252.

Arguments:

w0 - the glyph width in *text space*. This generally means the width in glyph space should be divided by 1000 before being passed to this function.

tj - any kerning that should be applied to the text matrix before the following glyph is painted. This is usually the numeric arguments in the array passed to a TJ operator.

word_boundary - a boolean indicating if a word boundary was just reached. Depending on the current state, extra space may need to be added.

# File lib/pdf/reader/page_state.rb, line 312
def process_glyph_displacement(w0, tj, word_boundary)
  fs = font_size # font size
  tc = state[:char_spacing]
  if word_boundary
    tw = state[:word_spacing]
  else
    tw = 0
  end
  th = state[:h_scaling]
  # optimise the common path to reduce Float allocations
  if th == 1 && tj == 0 && tc == 0 && tw == 0
    glyph_width = w0 * fs
    tx = glyph_width
  else
    glyph_width = ((w0 - (tj/1000.0)) * fs) * th
    tx = glyph_width + ((tc + tw) * th)
  end
  ty = 0

  # TODO: I'm pretty sure that tx shouldn't need to be divided by
  #       ctm[0] here, but this gets my tests green and I'm out of
  #       ideas for now
  # TODO: support ty > 0
  if ctm.a == 1 || ctm.a == 0
    @text_matrix.horizontal_displacement_multiply!(tx)
  else
    @text_matrix.horizontal_displacement_multiply!(tx/ctm.a)
  end
  @font_size = @text_rendering_matrix = nil # invalidate cached value
end
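
The horizontal displacement computed in the general branch can be isolated as a pure function. This sketch covers only the tx arithmetic, not the ctm.a adjustment or the text matrix update. The function name and its explicit parameters are assumptions made for illustration.

```ruby
# Horizontal displacement for one glyph, following the general branch above.
#   w0           - glyph width in text space
#   tj           - TJ adjustment in thousandths of a text space unit
#   font_size    - fs
#   char_spacing - tc
#   word_spacing - tw (pass 0 when not at a word boundary)
#   h_scaling    - th, already divided by 100
def horizontal_displacement(w0, tj, font_size, char_spacing, word_spacing, h_scaling)
  glyph_width = ((w0 - (tj / 1000.0)) * font_size) * h_scaling
  glyph_width + ((char_spacing + word_spacing) * h_scaling)
end
```

With w0 = 0.5, no adjustment or spacing, a font size of 12 and h_scaling of 1, the displacement is 6.0, which matches the optimised common path (w0 * fs). A negative tj widens the advance: tj = -250 raises the result to 9.0.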

restore_graphics_state()

Restore the state to the previous value on the stack.

# File lib/pdf/reader/page_state.rb, line 48
def restore_graphics_state
  @stack.pop
end

save_graphics_state()

Clones the current graphics state and pushes it onto the top of the stack. Any changes that are subsequently made to the state can then be reversed by calling restore_graphics_state.

# File lib/pdf/reader/page_state.rb, line 42
def save_graphics_state
  @stack.push clone_state
end
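
The save and restore pair can be sketched as a small stack of hashes. The class StateStack is hypothetical, and it assumes the current state is simply the last entry on the stack. The state accessor itself is not shown in this listing.

```ruby
# Minimal model of a graphics state stack.
class StateStack
  def initialize(defaults)
    @stack = [defaults.dup]
  end

  # The current state is assumed to be the top of the stack.
  def state
    @stack.last
  end

  # Push a deep clone of the current state.
  def save
    @stack.push(Marshal.load(Marshal.dump(@stack.last)))
  end

  # Discard the current state, exposing the previous one.
  def restore
    @stack.pop
  end

  def depth
    @stack.size
  end
end
```

Changes made after save only affect the clone on top of the stack, so a later restore brings back the earlier values.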

set_character_spacing(char_spacing)

Text State Operators

# File lib/pdf/reader/page_state.rb, line 95
def set_character_spacing(char_spacing)
  state[:char_spacing] = char_spacing
end

set_horizontal_text_scaling(h_scaling)
# File lib/pdf/reader/page_state.rb, line 99
def set_horizontal_text_scaling(h_scaling)
  state[:h_scaling] = h_scaling / 100.0
end

set_spacing_next_line_show_text(aw, ac, string)
# File lib/pdf/reader/page_state.rb, line 180
def set_spacing_next_line_show_text(aw, ac, string) # "
  set_word_spacing(aw)
  set_character_spacing(ac)
  move_to_next_line_and_show_text(string)
end

set_text_font_and_size(label, size)
# File lib/pdf/reader/page_state.rb, line 103
def set_text_font_and_size(label, size)
  state[:text_font]      = label
  state[:text_font_size] = size
end

set_text_leading(leading)
# File lib/pdf/reader/page_state.rb, line 116
def set_text_leading(leading)
  state[:text_leading] = leading
end

set_text_matrix_and_text_line_matrix(a, b, c, d, e, f)
# File lib/pdf/reader/page_state.rb, line 154
def set_text_matrix_and_text_line_matrix(a, b, c, d, e, f) # Tm
  @text_matrix = TransformationMatrix.new(
    a, b,
    c, d,
    e, f
  )
  @text_line_matrix = @text_matrix.dup
  @font_size = @text_rendering_matrix = nil # invalidate cached value
end

set_text_rendering_mode(mode)
# File lib/pdf/reader/page_state.rb, line 120
def set_text_rendering_mode(mode)
  state[:text_mode] = mode
end

set_text_rise(rise)
# File lib/pdf/reader/page_state.rb, line 124
def set_text_rise(rise)
  state[:text_rise] = rise
end

set_word_spacing(word_spacing)
# File lib/pdf/reader/page_state.rb, line 128
def set_word_spacing(word_spacing)
  state[:word_spacing] = word_spacing
end

show_text_with_positioning(params)

Text Showing Operators

# File lib/pdf/reader/page_state.rb, line 172
def show_text_with_positioning(params) # TJ
  # TODO record position changes in state here
end

stack_depth()

Returns the number of graphics states currently on the stack. Each call to save_graphics_state pushes a new copy of the current state onto the stack, so that any modifications to the state are undone once restore_graphics_state is called.

# File lib/pdf/reader/page_state.rb, line 272
def stack_depth
  @stack.size
end

trm_transform(x, y)

Transforms x and y co-ordinates from the current text space to the underlying device space.

Transforming (0,0) is a really common case, so it is optimised to avoid unnecessary object allocations.

# File lib/pdf/reader/page_state.rb, line 231
def trm_transform(x, y)
  trm = text_rendering_matrix
  if x == 0 && y == 0
    [trm.e, trm.f]
  else
    [
      (trm.a * x) + (trm.c * y) + (trm.e),
      (trm.b * x) + (trm.d * y) + (trm.f)
    ]
  end
end
