X-Git-Url: http://erislabs.net/gitweb/?a=blobdiff_plain;f=doc%2Fstandards.texi;h=a42cf71f9a06c68864c88b78036c0149ca4e6046;hb=7926b523071cfc66fa2252b51a8d51ffd9df7219;hp=9f1cfbf3bfcbe6e805897627eab656d10202e512;hpb=ce0aff17cefaab160cbe8b283ce1ccff760c9ce0;p=gnulib.git

diff --git a/doc/standards.texi b/doc/standards.texi
index 9f1cfbf3b..a42cf71f9 100644
--- a/doc/standards.texi
+++ b/doc/standards.texi
@@ -3,7 +3,7 @@
 @setfilename standards.info
 @settitle GNU Coding Standards
 @c This date is automagically updated when you save this file:
-@set lastupdate January 1, 2005
+@set lastupdate December 25, 2005
 @c %**end of header
 
 @dircategory GNU organization
@@ -33,7 +33,7 @@
 The GNU coding standards, last updated @value{lastupdate}.
 
 Copyright (C) 1992, 1993, 1994, 1995, 1996, 1997, 1998, 1999, 2000,
-2001, 2002, 2003, 2004 Free Software Foundation, Inc.
+2001, 2002, 2003, 2004, 2005 Free Software Foundation, Inc.
 
 Permission is granted to copy, distribute and/or modify this document
 under the terms of the GNU Free Documentation License, Version 1.1
@@ -239,9 +239,9 @@ C'' as a label for the compiler rather than for the language.
 
 Please don't use ``win'' as an abbreviation for Microsoft Windows in
 GNU software or documentation. In hacker terminology, calling
-something a "win" is a form of praise. If you wish to praise
+something a ``win'' is a form of praise. If you wish to praise
 Microsoft Windows when speaking on your own, by all means do so, but
-not in GNU software. Usually we write the word ``windows'' in full,
+not in GNU software. Usually we write the name ``Windows'' in full,
 but when brevity is very important (as in file names and sometimes
 symbol names), we abbreviate it to ``w''. For instance, the files and
 functions in Emacs that deal with Windows start with @samp{w32}.
@@ -850,7 +850,7 @@ All programs should support two standard options: @samp{--version}
 and @samp{--help}. CGI programs should accept these as
 command-line options, and also if given as the @env{PATH_INFO}; for
 instance, visiting @url{http://example.org/p.cgi/--help} in a browser should
-output the same information as inokving @samp{p.cgi --help} from the
+output the same information as invoking @samp{p.cgi --help} from the
 command line.
 
 @table @code
@@ -2142,6 +2142,8 @@ when writing GNU software.
 * CPU Portability:: Supporting the range of CPU types
 * System Functions:: Portability and ``standard'' library functions
 * Internationalization:: Techniques for internationalization
+* Character Set:: Use ASCII by default.
+* Quote Characters:: Use `...' in the C locale.
 * Mmap:: How you can safely use @code{mmap}.
 @end menu
 
@@ -2152,21 +2154,20 @@ when writing GNU software.
 @cindex open brace
 @cindex braces, in C source
 It is important to put the open-brace that starts the body of a C
-function in column zero, and avoid putting any other open-brace or
-open-parenthesis or open-bracket in column zero. Several tools look
-for open-braces in column zero to find the beginnings of C functions.
+function in column one, and avoid putting any other open-brace or
+open-parenthesis or open-bracket in column one. Several tools look
+for open-braces in column one to find the beginnings of C functions.
 These tools will not work on code not formatted that way.
 
 It is also important for function definitions to start the name of the
-function in column zero. This helps people to search for function
+function in column one. This helps people to search for function
 definitions, and may also help certain tools recognize them. Thus,
-the proper format is this:
+using Standard C syntax, the format is this:
 
 @example
 static char *
-concat (s1, s2)        /* Name starts in column zero here */
-     char *s1, *s2;
-@{                     /* Open brace in column zero here */
+concat (char *s1, char *s2)
+@{
   @dots{}
 @}
 @end example
@@ -2177,8 +2178,9 @@ this:
 
 @example
 static char *
-concat (char *s1, char *s2)
-@{
+concat (s1, s2)        /* Name starts in column one here */
+     char *s1, *s2;
+@{                     /* Open brace in column one here */
   @dots{}
 @}
 @end example
@@ -2296,7 +2298,13 @@ page. The formfeeds should appear alone on lines by themselves.
 @cindex commenting
 
 Every program should start with a comment saying briefly what it is for.
-Example: @samp{fmt - filter for simple filling of text}.
+Example: @samp{fmt - filter for simple filling of text}. This comment
+should be at the top of the source file containing the @samp{main}
+function of the program.
+
+Also, please write a brief comment at the start of each source file,
+with the file name and a line or two about the overall purpose of the
+file.
 
 Please write the comments in a GNU program in English, because English
 is the one language that nearly all programmers in all countries can
@@ -2574,7 +2582,7 @@ constants.
 @cindex file-name limitations
 @pindex doschk
 You might want to make sure that none of the file names would conflict
-the files were loaded onto an MS-DOS file system which shortens the
+if the files were loaded onto an MS-DOS file system which shortens the
 names. You can use the program @code{doschk} to test for this.
 
 Some GNU programs were designed to limit themselves to file names of 14
@@ -2616,11 +2624,11 @@ Avoid using the format of semi-internal data bases (e.g., directories)
 when there is a higher-level alternative (@code{readdir}).
 
 @cindex non-@sc{posix} systems, and portability
-As for systems that are not like Unix, such as MSDOS, Windows, the
-Macintosh, VMS, and MVS, supporting them is often a lot of work. When
-that is the case, it is better to spend your time adding features that
-will be useful on GNU and GNU/Linux, rather than on supporting other
-incompatible systems.
+As for systems that are not like Unix, such as MSDOS, Windows, VMS, MVS,
+and older Macintosh systems, supporting them is often a lot of work.
+When that is the case, it is better to spend your time adding features
+that will be useful on GNU and GNU/Linux, rather than on supporting
+other incompatible systems.
 
 If you do support Windows, please do not abbreviate it as ``win''. In
 hacker terminology, calling something a ``win'' is a form of praise.
@@ -2665,7 +2673,7 @@ printf ("diff = %ld\n", (long) (pointer2 - pointer1));
 @end example
 
 1989 Standard C requires this to work, and we know of only one
-counterexample: 64-bit programs on Microsoft Windows IA-64. We will
+counterexample: 64-bit programs on Microsoft Windows. We will
 leave it to those who want to port GNU programs to that environment
 to figure out how to do it.
 
@@ -2969,6 +2977,63 @@ printf (f->tried_implicit
         : "# Implicit rule search has not been done.\n");
 @end example
 
+
+@node Character Set
+@section Character Set
+@cindex character set
+@cindex encodings
+@cindex ASCII characters
+@cindex non-ASCII characters
+
+Sticking to the ASCII character set (plain text, 7-bit characters) is
+preferred in GNU source code comments, text documents, and other
+contexts, unless there is good reason to do something else because of
+the application domain. For example, if source code deals with the
+French Revolutionary calendar, it is OK if its literal strings contain
+accented characters in month names like ``Flor@'eal''. Also, it is OK
+to use non-ASCII characters to represent proper names of contributors in
+change logs (@pxref{Change Logs}).
+
+If you need to use non-ASCII characters, you should normally stick with
+one encoding, as one cannot in general mix encodings reliably.
+
+
+@node Quote Characters
+@section Quote Characters
+@cindex quote characters
+@cindex locale-specific quote characters
+@cindex left quote
+@cindex grave accent
+
+In the C locale, GNU programs should stick to plain ASCII for quotation
+characters in messages to users: preferably 0x60 (@samp{`}) for left
+quotes and 0x27 (@samp{'}) for right quotes. It is ok, but not
+required, to use locale-specific quotes in other locales.
+
+The @uref{http://www.gnu.org/software/gnulib/, Gnulib} @code{quote} and
+@code{quotearg} modules provide a reasonably straightforward way to
+support locale-specific quote characters, as well as taking care of
+other issues, such as quoting a filename that itself contains a quote
+character. See the Gnulib documentation for usage details.
+
+In any case, the documentation for your program should clearly specify
+how it does quoting, if different than the preferred method of @samp{`}
+and @samp{'}. This is especially important if the output of your
+program is ever likely to be parsed by another program.
+
+Quotation characters are a difficult area in the computing world at
+this time: there are no true left or right quote characters in Latin1;
+the @samp{`} character we use was standardized there as a grave
+accent. Moreover, Latin1 is still not universally usable.
+
+Unicode contains the unambiguous quote characters required, and its
+common encoding UTF-8 is upward compatible with Latin1. However,
+Unicode and UTF-8 are not universally well-supported, either.
+
+This may change over the next few years, and then we will revisit
+this.
+
+
 @node Mmap
 @section Mmap
 @findex mmap
@@ -3087,9 +3152,9 @@ functions, variables, options, and important concepts that are part of
 the program. One combined Index should do for a short manual, but
 sometimes for a complex package it is better to use multiple indices.
 The Texinfo manual includes advice on preparing good index entries, see
-@ref{Index Entries, , Making Index Entries, texinfo, The GNU Texinfo
-Manual}, and see @ref{Indexing Commands, , Defining the Entries of an
-Index, texinfo, The GNU Texinfo manual}.
+@ref{Index Entries, , Making Index Entries, texinfo, GNU Texinfo}, and
+see @ref{Indexing Commands, , Defining the Entries of an
+Index, texinfo, GNU Texinfo}.
 
 Don't use Unix man pages as a model for how to write GNU documentation;
 most of them are terse, badly structured, and give inadequate
@@ -3598,20 +3663,21 @@ For example, an Athlon-based GNU/Linux system might be
 
 The @code{configure} script needs to be able to decode all plausible
 alternatives for how to describe a machine. Thus,
-@samp{athlon-pc-gnu/linux} would be a valid alias.
-There is a shell script called
-@uref{ftp://ftp.gnu.org/gnu/config/config.sub, @file{config.sub}}
-that you can use
-as a subroutine to validate system types and canonicalize aliases.
+@samp{athlon-pc-gnu/linux} would be a valid alias. There is a shell
+script called
+@uref{http://savannah.gnu.org/cgi-bin/viewcvs/*checkout*/config/config/config.sub,
+@file{config.sub}} that you can use as a subroutine to validate system
+types and canonicalize aliases.
 
 The @code{configure} script should also take the option
 @option{--build=@var{buildtype}}, which should be equivalent to a
 plain @var{buildtype} argument. For example, @samp{configure
 --build=i686-pc-linux-gnu} is equivalent to @samp{configure
 i686-pc-linux-gnu}. When the build type is not specified by an option
-or argument, the @code{configure} script should normally guess it
-using the shell script
-@uref{ftp://ftp.gnu.org/gnu/config/config.guess, @file{config.guess}}.
+or argument, the @code{configure} script should normally guess it using
+the shell script
+@uref{http://savannah.gnu.org/cgi-bin/viewcvs/*checkout*/config/config/config.guess,
+@file{config.guess}}.
 
 @cindex optional features, configure-time
 Other options are permitted to specify in more detail the software
@@ -3854,7 +3920,7 @@ scope of an operating system project.
 Referring to a web site that describes or recommends a non-free
 program is in effect promoting that software, so please do not make
 links (or mention by name) web sites that contain such material. This
-policy is relevant particulary for the web pages for a GNU package.
+policy is relevant particularly for the web pages for a GNU package.
 
 Following links from nearly any web site can lead to non-free
 software; this is an inescapable aspect of the nature of the web, and