Announcing Tectonic 0.1.8!

I’m happy to announce the release of Tectonic version 0.1.8! Among other changes, there is now a file in the repository that logs changes. Here’s what it lists for this release:

User-facing improvements:

  • A prominent warning is now emitted when missing characters are encountered
    in a font. The hope is that this will help un-confuse users who include
    Unicode characters in their input files without loading a Unicode-capable
    font. Before this change, such characters would just not appear in the
    output document.
  • Fix the implementation of the DVI “POP” operator, which was broken due to a
    typo. This should fix various corner-case failures to generate output.
  • The .toc and .snm output files emitted by Beamer are now treated as
    intermediate files, and therefore not saved to disk by default (contributed
    by Norbert Pozar).
  • Various hardcoded bibtex buffer sizes are increased, allowing larger
    bibliographies to be handled.
  • Format files are now stored uncompressed. The compression did not save a ton
    of disk space, but it did slow down debug builds significantly (contributed
    by @Mrmaxmeier).
  • The C code has been synchronized with XeTeX as of its Subversion revision
    46289. The chief difference from before is the use of newer Harfbuzz
    features for rendering OpenType math fonts, which should substantially
    improve “Unicode math” output.
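
For readers unfamiliar with the DVI “POP” operator mentioned above: in the DVI
format, “push” (opcode 141) saves the six position registers (h, v, w, x, y, z)
on a stack, and “pop” (opcode 142) restores them. The following is a minimal
sketch of that behavior, not Tectonic’s actual code; the DviState and Interp
names are made up for illustration.

```rust
/// The six position registers that DVI's `push` and `pop` save and restore.
#[derive(Clone, Copy, Debug, Default, PartialEq)]
pub struct DviState {
    pub h: i32,
    pub v: i32,
    pub w: i32,
    pub x: i32,
    pub y: i32,
    pub z: i32,
}

/// Hypothetical interpreter core: the current registers plus a save stack.
#[derive(Debug, Default)]
pub struct Interp {
    pub cur: DviState,
    stack: Vec<DviState>,
}

/// DVI opcode numbers for the two stack operators.
pub const PUSH: u8 = 141;
pub const POP: u8 = 142;

impl Interp {
    /// Apply a `push` or `pop` opcode. Every other opcode is ignored here,
    /// since this sketch only illustrates the stack discipline.
    pub fn step(&mut self, opcode: u8) -> Result<(), String> {
        match opcode {
            PUSH => {
                // Save a copy of the current registers.
                self.stack.push(self.cur);
                Ok(())
            }
            POP => {
                // Restore the most recently saved registers; popping an
                // empty stack means the input file is malformed.
                self.cur = self
                    .stack
                    .pop()
                    .ok_or_else(|| "pop with empty stack".to_string())?;
                Ok(())
            }
            _ => Ok(()),
        }
    }
}
```

A “pop” that restores the wrong state silently misplaces everything typeset
after it, which is why a small typo there can produce the kind of corner-case
failures described above.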

Work towards HTML output:

  • The first steps have been taken! In particular, the engine now has an
    internal flag to enable output to a new “SPX” format instead of XDV. SPX
    stands for Semantically Paginated XDV — based on my (PKGW’s) research, to
    achieve the best HTML output, the engine will have to emit intermediate data
    that are incompatible with XDV. At the moment, SPX is the same as XDV except
    with different identifying bytes, but this will change as the work towards
    excellent HTML output continues. The command-line tool does not provide
    access to this output format yet, so this work is currently purely internal.
  • In addition, there is a stub engine called spx2html that will translate
    SPX to HTML. At the moment it is a barely-functional proof-of-concept hook,
    and it is not exposed to users.
  • A new internal crate, tectonic_xdv, is added. It can parse XDV and SPX
    files, and is used by the spx2html engine.
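
To illustrate what “the same as XDV except with different identifying bytes”
can mean in practice, here is a small sketch of telling the two formats apart
from the start of a file. It is not the tectonic_xdv API. DVI-family files
begin with the “pre” opcode (247) followed by an identification byte; the
XDV_ID and SPX_ID values below are assumptions chosen for illustration, and
FileKind and identify are hypothetical names.

```rust
/// Which flavor of file a byte stream claims to be.
#[derive(Debug, PartialEq)]
pub enum FileKind {
    Xdv,
    Spx,
}

/// The DVI "pre" (preamble) opcode that opens the file.
pub const PRE: u8 = 247;
/// Assumed XDV identification byte (illustrative; check the real format).
pub const XDV_ID: u8 = 7;
/// Hypothetical SPX identification byte (placeholder for illustration).
pub const SPX_ID: u8 = 100;

/// Inspect the first two bytes and report the file flavor, if recognized.
pub fn identify(data: &[u8]) -> Option<FileKind> {
    match data {
        [PRE, XDV_ID, ..] => Some(FileKind::Xdv),
        [PRE, SPX_ID, ..] => Some(FileKind::Spx),
        // Too short, wrong opcode, or an unknown identification byte.
        _ => None,
    }
}
```

Because only the identification byte differs in this sketch, a parser can
share all of its remaining logic between the two formats until SPX starts to
diverge.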

Test suite improvements:

  • The test suite now supports reliable byte-for-byte validation of PDF output
    files, through the following improvements:
    • It is now possible for the engine to disable PDF compression (contributed
      by @Mrmaxmeier).
    • xdvipdfmx gained a mode to reproducibly generate the “unique tags”
      associated with fonts.
  • The testing support code is now centralized in a single crate (contributed
    by @Mrmaxmeier).
  • Continuous integration (CI) coverage now includes Linux and a big-endian
    platform.
  • The CI coverage now includes code coverage monitoring.
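
As a rough illustration of the byte-for-byte PDF validation mentioned above,
the check boils down to comparing two byte buffers and reporting where they
first differ. This is a sketch only, not the actual test-suite code, and
first_mismatch is a hypothetical helper name.

```rust
/// Return the offset of the first byte at which `expected` and `actual`
/// differ, or `None` if the two buffers are identical.
///
/// If one buffer is a strict prefix of the other, the mismatch is reported
/// at the length of the shorter buffer.
pub fn first_mismatch(expected: &[u8], actual: &[u8]) -> Option<usize> {
    expected
        .iter()
        .zip(actual)
        .position(|(a, b)| a != b)
        .or_else(|| {
            if expected.len() != actual.len() {
                Some(expected.len().min(actual.len()))
            } else {
                None
            }
        })
}
```

A comparison this strict is only useful when the output is deterministic,
which is why disabling compression and making the font “unique tags”
reproducible matter: without them, two correct runs could still produce
different bytes.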

Internal improvements:

  • Much of the command-line rebuild code has been moved inside the tectonic
    crate so that it can be reused in a library context (contributed by @jneem).

Improvements to the C code. As usual, there has been a great deal of tidying
that aims to make the code more readable and hackable without altering the
semantics. Many such changes are omitted below.

  • Tectonic’s synchronization with XeTeX is now tracked in version control
    formally, by referencing the repository as a Git submodule. It is not
    actually necessary to check out this submodule to build Tectonic, however.
  • The C code now requires, and takes advantage of, features in the C11
    revision of the language standard.
  • All remaining pieces of C code that needed porting to the Rust I/O backend
    have been ported or removed.
  • Virtually all hardcoded strings in the string pool have been removed
    (contributed by @Mrmaxmeier).
  • The C code has been split into a few more files. Some subsystems, like the
    “shipout” code, use a lot of global variables that have been made static
    thanks to the splitting.
  • A big effort to clarify the pervasive and unintuitive memory_word
    structure.
  • Effort to tidy the line_break() function and significantly increase its
    readability. This is in support of the goal of producing HTML output, for
    which I believe it is going to be necessary to essentially defuse this
    function.

Happy typesetting,