Use of the information contained in this unapproved document is at your own risk.
Last update: 20 April, 2001
1003.2-92 #71
_____________________________________________________________________________
Interpretation Number: XXXX
Topic: diff -c
Relevant Sections: 4.17.6.1.4
Interpretation Request: (Defect Report)
-----------------------
(from Alex White, MKS)
Subject: IEEE Std 1003.2-1992 Interpretation Request: (Defect Report) diff
I would like an interpretation for diff -c output as outlined in
IEEE Std 1003.2-1992 section 4.17.6.1.4.
The format specified for diff -c output by IEEE Std 1003.2-1992 (section
4.17.6.1.4) is not compatible with historical diff -c output. This
difference both breaks existing applications that process context diff
output and makes it impossible for a conforming patch utility to process
diff -c output in either form. Historically a line consisting of 15
asterisks ("***************") was output at the start of the output of
each set of differences. The standard requires instead that the line
consisting of 15 asterisks be output only once immediately after the
file name identification.
For example, historical context diff output (with one line of context)
looks like this:
*** t1 Thu May 5 11:15:51 1994
--- t2 Thu May 5 11:15:51 1994
***************
*** 4,6 ****
  abacus
- Abaddon
  abaff
--- 4,5 ----
***************
*** 8,10 ****
  abandon
! abandonned
  abase
--- 7,9 ----
  abandon
! abandoned
  abase
***************
*** 13 ****
--- 12,14 ----
  abatis
+ abattoir
+ abaxial
but the standard requires that it look like this:
*** t1 Thu May 5 11:15:51 1994
--- t2 Thu May 5 11:15:51 1994
***************
*** 4,6 ****
  abacus
- Abaddon
  abaff
--- 4,5 ----
*** 8,10 ****
  abandon
! abandonned
  abase
--- 7,9 ----
  abandon
! abandoned
  abase
*** 13 ****
--- 12,14 ----
  abatis
+ abattoir
+ abaxial
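The difference between the two layouts shown above can be made concrete with a
small sketch. It is only an illustration, not part of the standard or of any
diff implementation: the names to_historical, HUNK_HEADER and SEPARATOR are
hypothetical, and the regular expression is an assumed reading of the
"*** %d,%d ****" range line (it also accepts the single-number form
"*** 13 ****"). The function inserts a line of 15 asterisks before every file1
range line that is not already preceded by one.

```python
import re

# A line of exactly 15 asterisks, as described in the request.
SEPARATOR = "*" * 15

# Assumed pattern for the file1 range line: "*** 4,6 ****" or "*** 13 ****".
HUNK_HEADER = re.compile(r"^\*\*\* \d+(,\d+)? \*\*\*\*$")


def to_historical(lines):
    """Return a copy of lines with a separator before every file1 range line.

    lines is a list of strings without trailing newlines. A separator is
    added only when the previous output line is not already a separator.
    """
    out = []
    for line in lines:
        if HUNK_HEADER.match(line) and (not out or out[-1] != SEPARATOR):
            out.append(SEPARATOR)
        out.append(line)
    return out
```

Applied to output that is already in the historical layout, the function
returns the lines unchanged, since every range line is already preceded by the
separator.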
This also affects the patch utility because its input is defined in
terms of diff output. It makes context diff output, in either the
historical format or the format the standard requires, impossible for
patch to process. The line with 15 asterisks is key to disambiguating
whether a line in the format "*** %d,%d ****\n" is part of a "hunk"
or is part of the filename identification, matching the expression "***
filename timestamp" (see 5.22.7.1). Traditional patch considered a line
in the format "*** %d,%d ****\n" to be file identification unless it was
preceded by a line of 15 asterisks, in which case it considered the line
the start of a new hunk.
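The traditional rule described above can be sketched as a line classifier.
This is a simplified illustration under stated assumptions, not the code of
any actual patch implementation: the names classify, RANGE_LINE and SEPARATOR
are hypothetical, and any line beginning with "*** " that is not accepted as a
hunk start is simply labelled as file identification.

```python
import re

# A line of exactly 15 asterisks.
SEPARATOR = "*" * 15

# Assumed pattern for the range line: "*** 4,6 ****" or "*** 13 ****".
RANGE_LINE = re.compile(r"^\*\*\* \d+(,\d+)? \*\*\*\*$")


def classify(lines):
    """Label each line as 'separator', 'hunk', 'file-id' or 'other'.

    A "*** ..." line counts as the start of a hunk only when it has the
    range format and the line before it is the 15-asterisk separator;
    every other "*** ..." line is treated as file identification.
    """
    labels = []
    previous = None
    for line in lines:
        if line == SEPARATOR:
            labels.append("separator")
        elif line.startswith("*** "):
            if previous == SEPARATOR and RANGE_LINE.match(line):
                labels.append("hunk")
            else:
                labels.append("file-id")
        else:
            labels.append("other")
        previous = line
    return labels
```

Under this rule, output in the historical layout yields one hunk start per set
of differences, while output in the layout the standard requires yields a hunk
start only for the first set; the later range lines are labelled as file
identification, which is the ambiguity the request describes.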
Note that patch does not interpret filename identification in terms of
diff output; only hunks are interpreted that way. It has its own rules,
which require that two expressions be recognized as filename
identification, neither of which includes a line consisting of 15
asterisks. Also, the timestamp cannot be used to disambiguate between
the two lines because it is required that the timestamp not affect the
processing of a patchfile (see PASC Interpretation reference 1003-92.2 #14).
Would it be possible to change 4.17.6.1.4 by:
a) deleting lines 3352-3353.
b) deleting line 3362 and inserting in its place:
First, a string of 15 asterisks shall be output:
"***************\n"
Next, the range of lines in file1 shall be written in the
following format:
Interpretation Response:
------------------------
The standard states behavior for the diff -c option and conforming
implementations must conform to this. However, concerns have been
raised about this which are being referred to the sponsor.
Rationale:
----------
None.
Forwarded to Interpretations group: 13 Oct 1994
Proposed resolution sent for review: 19 Nov 1994
Resolved: 10 Dec 1994
_____________________________________________________________________________