106833 – Some letters are omitted when printing

Summary: Some letters are omitted when printing

Status:	CLOSED FIXED

Alias:	None

Product:	gsl
Classification:	Code
Component:	www
Version:	OOO310m19
Hardware:	PC Linux, all

Importance:	P2 Trivial
Target Milestone:	OOo 3.2
Assignee:	h.ilter
QA Contact:	issues@gsl

URL:
Keywords:

Duplicates (2):	104050 105631
Depends on:
Blocks:	99999

Reported:	2009-11-11 20:07 UTC by alexps
Modified:	2017-05-20 10:28 UTC
CC List:	3 users

See Also:
Issue Type:	DEFECT
Latest Confirmation in:	---
Developer Difficulty:	---

Attachments
Files noted in description (834.57 KB, application/x-compressed) 2009-11-11 20:09 UTC, alexps
Another sample, more simple, only 1 letter involved (14.44 KB, text/plain) 2009-11-12 10:53 UTC, alexps

Description alexps 2009-11-11 20:07:21 UTC

When printing documents some letters for certain fonts are omitted. The issue
was experienced when printing from Writer and Calc.

Russian capital letter “Short I” (U+0419) is omitted when it occurs in the
following fonts: Arial, Courier New, and Times New Roman. Russian small letter
“Be” (U+0431) is omitted when it occurs in DejaVu Sans.
The issue leads to a corrupted paper printout. The result depends on the version
of CUPS in use: either a space appears in place of the omitted letter, or the
letter is dropped without any space and extra space is inserted further along in
the text, shifting a symbol so that it partially overlaps the next one.
The issue remains when printing to PDF via cups-pdf. A PDF document generated
this way looks the same as the paper printout described above.
The issue doesn't appear when the document is exported to PDF first and the PDF
document is then printed. This is the only workaround found so far.

Attached zip contains:
mistyped-fonts.odt	initial document that contains samples with the letters and
fonts in question. All paragraphs (lines) contain the same text, with U+0431 at
the 4th position (counting from 1) and U+0419 at the 20th position. There are 4
paragraphs per font, with different font styles: regular, italic, bold, and
bold italic. The last 4 paragraphs are in the Microsoft Verdana font, which shows
no signs of the issue; they are given as a reference sample;
mistyped-fonts.pdf	PDF “printout”, generated by cups-pdf;
mistyped-fonts.ps	“Print into file” output;
mistyped-fonts-exported.pdf	export-to-PDF output, free of the issue, as
noted above.

I tried to analyze the PostScript file that is saved when “Print into file” is
checked. As far as I can see, no glyphs are defined in the file for the letters
that get omitted when printing. For example, for ArialMTFID33HGSet2 there
is no glyph set for “Encoding 14” (which corresponds to U+0419, 0x0E in the
output later in the file). VerdanaFID59HGSet2 has the “Encoding 14” glyph
defined, and Verdana is printed correctly.
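
The manual check described above (comparing the byte codes used in the text with the encoding slots the embedded font subset defines) could be sketched roughly as follows. This is only an illustration: findUndefinedCodes and its parameters are hypothetical names, and the sketch assumes the defined slots and the shown bytes have already been extracted from the PostScript file by some other means.

```cpp
#include <cstdint>
#include <set>
#include <string>
#include <vector>

// Hypothetical helper (not part of OOo): returns, in ascending order, every
// byte code used in a shown string for which the font subset defines no
// encoding slot. Extracting both inputs from the .ps file is out of scope.
std::vector<std::uint8_t> findUndefinedCodes(
    const std::set<std::uint8_t>& definedCodes,
    const std::string& shownBytes)
{
    std::set<std::uint8_t> missing;
    for (unsigned char code : shownBytes) {
        if (definedCodes.count(code) == 0)
            missing.insert(code);  // used in the text, but never defined
    }
    return std::vector<std::uint8_t>(missing.begin(), missing.end());
}
```

With a subset that defines slots 13 and 15 but not 14, a string containing the bytes 0x0D 0x0E 0x0F would report 14 as undefined, which matches the symptom seen for ArialMTFID33HGSet2.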

Environment:
OS: Kubuntu 9.04
OOo: OOO310m19. Build 9420.
DejaVu Sans font from OS distribution, also latest version from dejavu-
The Microsoft TTF fonts concerned were installed by the “msttcorefonts” package,
which downloads them from sourceforge.net.

The issue is very serious for Russian-speaking users, especially for office
users with little IT experience in Belarus and Russia, where the majority of
state institutions as well as many private companies use Microsoft document
formats and their fonts as standards, issuing such documents and expecting or
even requiring them in return. Having received such a document, a user may
print it out and use it officially without even knowing that the printout
has letters missing.

Comment 1 alexps 2009-11-11 20:09:49 UTC

Created attachment 66069 [details]
Files noted in description

Comment 2 Regina Henschel 2009-11-11 20:43:30 UTC

Both printing on an HP LaserJet and exporting to PDF with an English OOo on German
WinXP show _no_ errors.

Comment 3 hdu@apache.org 2009-11-12 07:08:55 UTC

The problem is probably related to issue 104050 and issue 105631#desc14

Comment 4 alexps 2009-11-12 10:53:14 UTC

Created attachment 66073 [details]
Another sample, more simple, only 1 letter involved

Comment 5 alexps 2009-11-12 11:09:38 UTC

The sample zip posted above includes an odt file with only Cyrillic “Short I”
(U+0419) in Arial, plus the “Print into file” PostScript output.
The PostScript file clearly shows that the only letter to be printed
has no encoding defined in the embedded font subset. There is a strange “Encoding
0” bound only to glyph3 in the file. Glyph3 looks like an empty rectangle.
Glyph1 and glyph2, which are pulled into the embedded font subset, are parts of
Cyrillic “Short I” (U+0419), the letter that should be printed. “Short I”
(http://en.wikipedia.org/wiki/Й) consists of two graphical parts: the base, which
looks the same as the other Russian letter “I”, and the diacritical mark, a breve.
Glyph1 is the base and glyph2 is the breve.

Comment 6 ivanov1965 2009-11-12 20:00:48 UTC

HIGH IMPORTANCE FOR RUSSIAN USERS!

This is very important for Russian users, because they cannot print documents
in MS Word format from OpenOffice, and that format is the de facto standard for
government agencies and corporations in Russia.

Comment 7 philipp.lohmann 2009-12-03 15:02:07 UTC

I can confirm that the second sample doc creates only one glyph encoded as "0"
in the produced font. However ghostscript seems to show this PostScript file
just fine (including the sample postscript output attached). A PDF file produced
with cups-pdf shows just fine in acroread; only ghostscript shows a problem with
that PDF file.

Anyway, let's try to avoid glyph 0 and use 1 instead.

Comment 8 philipp.lohmann 2009-12-03 16:44:26 UTC

Ok, what is wrong is the Encoding vector (and it is plain wrong; the actual encoded
value for the glyph is already '1' as it should be). The Encoding vector, however,
seems not to be the first place most programs look for the glyph; that seems to be
the CharStrings array.

The entries of the encoding vector are originally created in
vcl/unx/source/printergfx/glyphset.cxx in GlyphSet::PSUploadFont. The encoded
entries come from a hash_map, which is not sorted; but that should not be
necessary anyway, since the description comes as a glyph array and an encoding
array. This then goes into FontSubsetInfo, which uses CreateT42FromTTFont (and
friends) to create the subsetted font file.

Now the latter expect the notdef glyph to be encoded '0' (which is reasonable),
but do not allow for the encoding to be unsorted.

So there are three places where the encoding -> glyph mapping could be repaired by
sorting: PSUploadFont (which has the original unsorted data), FontSubsetInfo
(which uses CreateT???FromTTFont wrongly) or the CreateT???FromTTFont functions,
which perhaps should be able to catch this.
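
As a rough illustration of the "repair by sorting" idea, and not the actual change committed to the CWS, the two parallel arrays could be reordered by encoding value before they are handed to a subsetter that assumes ascending encodings with notdef at '0'. The function name sortByEncoding and the element types are assumptions made for this sketch only.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical helper, not the real OOo code: reorders two parallel arrays
// (encoded byte values and source glyph ids) so that the encodings ascend.
// Each glyph id stays paired with its original encoding value.
void sortByEncoding(std::vector<std::uint8_t>& encodings,
                    std::vector<std::uint16_t>& glyphIds)
{
    assert(encodings.size() == glyphIds.size());

    typedef std::pair<std::uint8_t, std::uint16_t> Entry;
    std::vector<Entry> entries;
    entries.reserve(encodings.size());
    for (std::size_t i = 0; i < encodings.size(); ++i)
        entries.push_back(Entry(encodings[i], glyphIds[i]));

    // Sort on the encoding value only; the glyph id travels with it.
    std::sort(entries.begin(), entries.end(),
              [](const Entry& a, const Entry& b) { return a.first < b.first; });

    for (std::size_t i = 0; i < entries.size(); ++i) {
        encodings[i] = entries[i].first;
        glyphIds[i] = entries[i].second;
    }
}
```

Sorting pairs rather than the two arrays separately is what keeps the encoding -> glyph association intact; an entry that arrives from the unsorted hash_map ahead of the notdef entry simply moves behind it.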

@hdu: do you have an opinion on where best to fix this? I chose to use
CreatePSUploadableFont in glyphset.cxx for this since it is central and allows
the GlyphSet class to keep using the unsorted hash_map (which is probably a
performance gain). Anyway I'd like you to review the change.

fixed in CWS vcl108

Comment 9 philipp.lohmann 2009-12-03 17:10:53 UTC

*** Issue 104050 has been marked as a duplicate of this issue. ***

Comment 10 philipp.lohmann 2009-12-03 17:19:26 UTC

*** Issue 105631 has been marked as a duplicate of this issue. ***

Comment 11 hdu@apache.org 2009-12-04 08:20:50 UTC

Doing it in CreatePSUploadableFont() is the solution that will solve this problem reliably and without the
risk of impacting other parts of the code.

Putting the Notdef glyph at glyphid zero is required and coded all over the place, though, so in the
medium term I'd love to have all this consolidated into one place like FontSubsetInfo::CreateFontSubset().
I'm not sure, though, what all the callers are expecting, since they provide a const parameter to request the
id of each subsetted glyph. Are these just friendly suggestions that can be completely ignored, or can they
be reshuffled at will by the callee? As of now the subsetter treats them as gospel and does just as
requested.

Comment 12 philipp.lohmann 2009-12-04 12:17:18 UTC

will also commit to CWS ooo32gsl09 for 3.2 due to the overall severity for
printing on Linux.

Comment 13 philipp.lohmann 2009-12-08 09:57:22 UTC

please verify in CWS ooo32gsl09

Comment 14 h.ilter 2009-12-09 15:25:45 UTC

Bug was not reproducible on my suse linux. Fix verified on PL's machine.
@alexps: Please close this issue if it is still OK in OOo 3.2 final.