File bom-1.0.1.obscpio of Package bom

07070100000000000081A4000003E8000000640000000161830F5E00000029000000000000000000000000000000000000001200000000bom-1.0.1/AUTHORSArchie L. Cobbs <archie.cobbs@gmail.com>
07070100000001000081A4000003E8000000640000000161830F5E000000B0000000000000000000000000000000000000001200000000bom-1.0.1/CHANGESVersion 1.0.1 Released November 3, 2021

    - Fixed bug when multi-byte sequence crossed input buffer boundary

Version 1.0.0 Released October 16, 2021

    - Initial release
07070100000002000081A4000003E8000000640000000161830F5E00000052000000000000000000000000000000000000001200000000bom-1.0.1/INSTALLSimplified instructions:

    1. ./configure
    2. make
    3. sudo make install
07070100000003000081A4000003E8000000640000000161830F5E00002C5D000000000000000000000000000000000000001200000000bom-1.0.1/LICENSE                                 Apache License
                           Version 2.0, January 2004
                        http://www.apache.org/licenses/

   TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION

   1. Definitions.

      "License" shall mean the terms and conditions for use, reproduction,
      and distribution as defined by Sections 1 through 9 of this document.

      "Licensor" shall mean the copyright owner or entity authorized by
      the copyright owner that is granting the License.

      "Legal Entity" shall mean the union of the acting entity and all
      other entities that control, are controlled by, or are under common
      control with that entity. For the purposes of this definition,
      "control" means (i) the power, direct or indirect, to cause the
      direction or management of such entity, whether by contract or
      otherwise, or (ii) ownership of fifty percent (50%) or more of the
      outstanding shares, or (iii) beneficial ownership of such entity.

      "You" (or "Your") shall mean an individual or Legal Entity
      exercising permissions granted by this License.

      "Source" form shall mean the preferred form for making modifications,
      including but not limited to software source code, documentation
      source, and configuration files.

      "Object" form shall mean any form resulting from mechanical
      transformation or translation of a Source form, including but
      not limited to compiled object code, generated documentation,
      and conversions to other media types.

      "Work" shall mean the work of authorship, whether in Source or
      Object form, made available under the License, as indicated by a
      copyright notice that is included in or attached to the work
      (an example is provided in the Appendix below).

      "Derivative Works" shall mean any work, whether in Source or Object
      form, that is based on (or derived from) the Work and for which the
      editorial revisions, annotations, elaborations, or other modifications
      represent, as a whole, an original work of authorship. For the purposes
      of this License, Derivative Works shall not include works that remain
      separable from, or merely link (or bind by name) to the interfaces of,
      the Work and Derivative Works thereof.

      "Contribution" shall mean any work of authorship, including
      the original version of the Work and any modifications or additions
      to that Work or Derivative Works thereof, that is intentionally
      submitted to Licensor for inclusion in the Work by the copyright owner
      or by an individual or Legal Entity authorized to submit on behalf of
      the copyright owner. For the purposes of this definition, "submitted"
      means any form of electronic, verbal, or written communication sent
      to the Licensor or its representatives, including but not limited to
      communication on electronic mailing lists, source code control systems,
      and issue tracking systems that are managed by, or on behalf of, the
      Licensor for the purpose of discussing and improving the Work, but
      excluding communication that is conspicuously marked or otherwise
      designated in writing by the copyright owner as "Not a Contribution."

      "Contributor" shall mean Licensor and any individual or Legal Entity
      on behalf of whom a Contribution has been received by Licensor and
      subsequently incorporated within the Work.

   2. Grant of Copyright License. Subject to the terms and conditions of
      this License, each Contributor hereby grants to You a perpetual,
      worldwide, non-exclusive, no-charge, royalty-free, irrevocable
      copyright license to reproduce, prepare Derivative Works of,
      publicly display, publicly perform, sublicense, and distribute the
      Work and such Derivative Works in Source or Object form.

   3. Grant of Patent License. Subject to the terms and conditions of
      this License, each Contributor hereby grants to You a perpetual,
      worldwide, non-exclusive, no-charge, royalty-free, irrevocable
      (except as stated in this section) patent license to make, have made,
      use, offer to sell, sell, import, and otherwise transfer the Work,
      where such license applies only to those patent claims licensable
      by such Contributor that are necessarily infringed by their
      Contribution(s) alone or by combination of their Contribution(s)
      with the Work to which such Contribution(s) was submitted. If You
      institute patent litigation against any entity (including a
      cross-claim or counterclaim in a lawsuit) alleging that the Work
      or a Contribution incorporated within the Work constitutes direct
      or contributory patent infringement, then any patent licenses
      granted to You under this License for that Work shall terminate
      as of the date such litigation is filed.

   4. Redistribution. You may reproduce and distribute copies of the
      Work or Derivative Works thereof in any medium, with or without
      modifications, and in Source or Object form, provided that You
      meet the following conditions:

      (a) You must give any other recipients of the Work or
          Derivative Works a copy of this License; and

      (b) You must cause any modified files to carry prominent notices
          stating that You changed the files; and

      (c) You must retain, in the Source form of any Derivative Works
          that You distribute, all copyright, patent, trademark, and
          attribution notices from the Source form of the Work,
          excluding those notices that do not pertain to any part of
          the Derivative Works; and

      (d) If the Work includes a "NOTICE" text file as part of its
          distribution, then any Derivative Works that You distribute must
          include a readable copy of the attribution notices contained
          within such NOTICE file, excluding those notices that do not
          pertain to any part of the Derivative Works, in at least one
          of the following places: within a NOTICE text file distributed
          as part of the Derivative Works; within the Source form or
          documentation, if provided along with the Derivative Works; or,
          within a display generated by the Derivative Works, if and
          wherever such third-party notices normally appear. The contents
          of the NOTICE file are for informational purposes only and
          do not modify the License. You may add Your own attribution
          notices within Derivative Works that You distribute, alongside
          or as an addendum to the NOTICE text from the Work, provided
          that such additional attribution notices cannot be construed
          as modifying the License.

      You may add Your own copyright statement to Your modifications and
      may provide additional or different license terms and conditions
      for use, reproduction, or distribution of Your modifications, or
      for any such Derivative Works as a whole, provided Your use,
      reproduction, and distribution of the Work otherwise complies with
      the conditions stated in this License.

   5. Submission of Contributions. Unless You explicitly state otherwise,
      any Contribution intentionally submitted for inclusion in the Work
      by You to the Licensor shall be under the terms and conditions of
      this License, without any additional terms or conditions.
      Notwithstanding the above, nothing herein shall supersede or modify
      the terms of any separate license agreement you may have executed
      with Licensor regarding such Contributions.

   6. Trademarks. This License does not grant permission to use the trade
      names, trademarks, service marks, or product names of the Licensor,
      except as required for reasonable and customary use in describing the
      origin of the Work and reproducing the content of the NOTICE file.

   7. Disclaimer of Warranty. Unless required by applicable law or
      agreed to in writing, Licensor provides the Work (and each
      Contributor provides its Contributions) on an "AS IS" BASIS,
      WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
      implied, including, without limitation, any warranties or conditions
      of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
      PARTICULAR PURPOSE. You are solely responsible for determining the
      appropriateness of using or redistributing the Work and assume any
      risks associated with Your exercise of permissions under this License.

   8. Limitation of Liability. In no event and under no legal theory,
      whether in tort (including negligence), contract, or otherwise,
      unless required by applicable law (such as deliberate and grossly
      negligent acts) or agreed to in writing, shall any Contributor be
      liable to You for damages, including any direct, indirect, special,
      incidental, or consequential damages of any character arising as a
      result of this License or out of the use or inability to use the
      Work (including but not limited to damages for loss of goodwill,
      work stoppage, computer failure or malfunction, or any and all
      other commercial damages or losses), even if such Contributor
      has been advised of the possibility of such damages.

   9. Accepting Warranty or Additional Liability. While redistributing
      the Work or Derivative Works thereof, You may choose to offer,
      and charge a fee for, acceptance of support, warranty, indemnity,
      or other liability obligations and/or rights consistent with this
      License. However, in accepting such obligations, You may act only
      on Your own behalf and on Your sole responsibility, not on behalf
      of any other Contributor, and only if You agree to indemnify,
      defend, and hold each Contributor harmless for any liability
      incurred by, or claims asserted against, such Contributor by reason
      of your accepting any such warranty or additional liability.

   END OF TERMS AND CONDITIONS

   APPENDIX: How to apply the Apache License to your work.

      To apply the Apache License to your work, attach the following
      boilerplate notice, with the fields enclosed by brackets "[]"
      replaced with your own identifying information. (Don't include
      the brackets!)  The text should be enclosed in the appropriate
      comment syntax for the file format. We also recommend that a
      file or class name and description of purpose be included on the
      same "printed page" as the copyright notice for easier
      identification within third-party archives.

   Copyright [yyyy] [name of copyright owner]

   Licensed under the Apache License, Version 2.0 (the "License");
   you may not use this file except in compliance with the License.
   You may obtain a copy of the License at

       http://www.apache.org/licenses/LICENSE-2.0

   Unless required by applicable law or agreed to in writing, software
   distributed under the License is distributed on an "AS IS" BASIS,
   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
   See the License for the specific language governing permissions and
   limitations under the License.
07070100000004000081A4000003E8000000640000000161830F5E000003F6000000000000000000000000000000000000001600000000bom-1.0.1/Makefile.am#
# bom - Deals with Unicode byte order marks
#
# Copyright (C) 2021 Archie L. Cobbs. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#

bin_PROGRAMS=		bom

man_MANS=		bom.1

docdir=			$(datadir)/doc/packages/$(PACKAGE)

doc_DATA=		CHANGES LICENSE README.md INSTALL AUTHORS

EXTRA_DIST=		CHANGES LICENSE README.md

bom_SOURCES=		main.c \
			gitrev.c

.PHONY:			tests
tests:			bom
			cd tests && ./run.sh

gitrev.c:
			printf 'const char *const bom_version = "%s";\n' "`git describe`" > gitrev.c
07070100000005000081A4000003E8000000640000000161830F5E0000142A000000000000000000000000000000000000001400000000bom-1.0.1/README.md**bom** is a simple UNIX command line utility for dealing with Unicode byte order marks (BOMs).

Unicode byte order marks are "magic number" byte sequences that sometimes appear at the beginning of a file to indicate the file's character encoding. They're sometimes helpful but usually they're just annoying.

You can read more about byte order marks [here](https://en.wikipedia.org/wiki/Byte_order_mark).

**bom** operates in one of the following modes:

  * `bom --detect` Detect which type of byte order mark is present (if any) and print to standard output
  * `bom --strip` Strip off the byte order mark (if any) and output the remainder of the file, optionally also converting to UTF-8
  * `bom --print` Output the byte sequence corresponding to a byte order mark (useful for adding them to files)
  * `bom --list` List the supported byte order mark types
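
As a rough illustration of what `--detect` has to decide, the sketch below matches the leading bytes of a buffer against the byte sequences listed in the `bom_types` table in `main.c`. The names `bom_sig` and `sketch_detect_bom` are hypothetical and are not part of **bom**; the real program reads its input incrementally, and the UTF-16LE/UTF-32LE tie-break shown here only follows the `--prefer32` description in the man page.

```c
#include <stddef.h>
#include <string.h>

/* BOM byte sequences, copied from the bom_types table in main.c. */
struct bom_sig {
    const char *name;
    const char *bytes;
    size_t len;
};

static const struct bom_sig bom_sigs[] = {
    { "UTF-7",    "\x2b\x2f\x76",     3 },
    { "UTF-8",    "\xef\xbb\xbf",     3 },
    { "UTF-16BE", "\xfe\xff",         2 },
    { "UTF-16LE", "\xff\xfe",         2 },
    { "UTF-32BE", "\x00\x00\xfe\xff", 4 },
    { "UTF-32LE", "\xff\xfe\x00\x00", 4 },
    { "GB18030",  "\x84\x31\x95\x33", 4 },
};

/*
 * Hypothetical helper (not from bom itself): return the name of the BOM
 * found at the start of buf, or "NONE" if no listed sequence matches.
 */
const char *
sketch_detect_bom(const void *buf, size_t len, int prefer32)
{
    const struct bom_sig *best = NULL;
    size_t i;

    for (i = 0; i < sizeof(bom_sigs) / sizeof(*bom_sigs); i++) {
        const struct bom_sig *sig = &bom_sigs[i];

        /* Skip signatures that are too long or do not match. */
        if (len < sig->len || memcmp(buf, sig->bytes, sig->len) != 0)
            continue;

        /*
         * FF FE 00 00 matches both UTF-16LE and UTF-32LE. Keep the
         * first (shorter) match unless the caller prefers UTF-32.
         */
        if (best == NULL || (prefer32 && sig->len > best->len))
            best = sig;
    }
    return best != NULL ? best->name : "NONE";
}
```

For example, `sketch_detect_bom("\xff\xfe\x00\x00", 4, 0)` returns `"UTF-16LE"`, while passing `1` for `prefer32` returns `"UTF-32LE"`.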

Here is the man page:
```
BOM(1)                      BSD General Commands Manual                      BOM(1)

NAME
     bom -- Decode Unicode byte order mark

SYNOPSIS
     bom --strip [--expect types] [--lenient] [--prefer32] [--utf8] [file]
     bom --detect [--expect types] [--prefer32] [file]
     bom --print type
     bom --list
     bom --help
     bom --version

DESCRIPTION
     bom decodes, verifies, reports, and/or strips the byte order mark (BOM) at the
     start of the specified file, if any.

     When no file is specified, or when file is -, read standard input.

OPTIONS
     -d, --detect
             Report the detected BOM type to standard output and then exit.

             See SUPPORTED BOM TYPES for possible values.

     -e, --expect types
             Expect to find one of the specified BOM types, otherwise exit with an
             error.

             Multiple types may be specified, separated by commas.

             Specifying NONE is acceptable and matches when the file has no (sup-
             ported) BOM.

     -h, --help
             Output command line usage help.

     -l, --lenient
             Silently ignore any illegal byte sequences encountered when converting
             the remainder of the file to UTF-8.

             Without this flag, bom will exit immediately with an error if an ille-
             gal byte sequence is encountered.

             This flag has no effect unless the --utf8 flag is given.

     --list  List the supported BOM types and exit.

     -p, --print type
             Output the byte sequence corresponding to the type byte order mark.

     --prefer32
             Used to disambiguate the byte sequence FF FE 00 00, which can be
             either a UTF-32LE BOM or a UTF-16LE BOM followed by a NUL character.

             Without this flag, UTF-16LE is assumed; with this flag, UTF-32LE is
             assumed.

     -s, --strip
             Strip the BOM, if any, from the beginning of the file and output the
             remainder of the file.

     -u, --utf8
             Convert the remainder of the file to UTF-8, assuming the character
             encoding implied by the detected BOM.

             For files with no (supported) BOM, this flag has no effect and the
             remainder of the file is copied unmodified.

             For files with a UTF-8 BOM, the identity transformation is still
             applied, so (for example) illegal byte sequences will be detected.

     -v, --version
             Output program version and exit.

SUPPORTED BOM TYPES
     The supported BOM types are:

     NONE    No supported BOM was detected.

     UTF-7   A UTF-7 BOM was detected.

     UTF-8   A UTF-8 BOM was detected.

     UTF-16BE
             A UTF-16 (Big Endian) BOM was detected.

     UTF-16LE
             A UTF-16 (Little Endian) BOM was detected.

     UTF-32BE
             A UTF-32 (Big Endian) BOM was detected.

     UTF-32LE
             A UTF-32 (Little Endian) BOM was detected.

     GB18030
             A GB18030 (Chinese National Standard) BOM was detected.

EXAMPLES
     To tell what kind of byte order mark a file has:

           $ bom --detect file

     To normalize files with byte order marks into UTF-8, and pass other files
     through unchanged:

           $ bom --strip --utf8 file

     Same as previous example, but discard illegal byte sequences instead of gener-
     ating an error:

           $ bom --strip --utf8 --lenient file

     To verify a properly encoded UTF-8 or UTF-16 file with a byte-order-mark and
     output it as UTF-8:

           $ bom --strip --utf8 --expect UTF-8,UTF-16LE,UTF-16BE file

     To just remove any byte order mark and get on with your life:

           $ bom --strip file

RETURN VALUES
     bom exits with one of the following values:

     0       Success.

     1       A general error occurred.

     2       The --expect flag was given but the detected BOM did not match.

     3       An illegal byte sequence was detected (and --lenient was not speci-
             fied).

SEE ALSO
     iconv(1)

     bom: Decode Unicode byte order mark, https://github.com/archiecobbs/bom.

AUTHOR
     Archie L. Cobbs <archie.cobbs@gmail.com>

BSD                               October 14, 2021                              BSD
```
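
The `--utf8` option converts the remainder of the file using the encoding implied by the detected BOM. `main.c` includes `<iconv.h>`, so the conversion presumably goes through iconv(3); the sketch below is an assumption about the general shape of such a conversion, not the program's actual code. The helper name `sketch_utf16le_to_utf8` is hypothetical, and the input is assumed to have had its BOM stripped already.

```c
#include <iconv.h>
#include <stddef.h>

/*
 * Hypothetical helper (not from bom itself): convert inlen bytes of
 * UTF-16LE text to UTF-8 in a single iconv() call.
 *
 * Returns the number of bytes written to out, or (size_t)-1 on error,
 * for example an illegal or truncated input sequence, or an output
 * buffer that is too small.
 */
size_t
sketch_utf16le_to_utf8(const char *in, size_t inlen, char *out, size_t outlen)
{
    iconv_t cd;
    char *inp = (char *)in;     /* iconv() takes a non-const pointer */
    char *outp = out;
    size_t inleft = inlen;
    size_t outleft = outlen;
    size_t r;

    /* iconv_open() takes the target encoding first, then the source. */
    cd = iconv_open("UTF-8", "UTF-16LE");
    if (cd == (iconv_t)-1)
        return (size_t)-1;

    r = iconv(cd, &inp, &inleft, &outp, &outleft);
    iconv_close(cd);
    if (r == (size_t)-1)
        return (size_t)-1;

    return outlen - outleft;
}
```

A streaming tool cannot rely on one call like this: when input arrives in fixed-size chunks, a multi-byte sequence can be split across two reads, and the incomplete tail has to be carried over to the next call. That is the kind of situation the 1.0.1 entry in CHANGES refers to.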
07070100000006000081ED000003E8000000640000000161830F5E0000039B000000000000000000000000000000000000001500000000bom-1.0.1/autogen.sh#!/bin/bash

#
# Script to regenerate all the GNU auto* gunk.
# Run this from the top directory of the source tree.
#
# If it looks like I don't know what I'm doing here, you're right.
#

set -e

echo "cleaning up"
rm -rf autom4te*.cache scripts aclocal.m4 configure config.log config.status .deps stamp-h1
rm -f config.h.in config.h.in~ config.h
rm -rf scripts
find . \( -name Makefile -o -name Makefile.in \) -print0 | xargs -0 rm -f
rm -f *.o bom bom.1 bom-*.tar.gz gitrev.c
rm -rf a.out.* tags
if [ "${1}" = '-C' ]; then
    exit 0
fi

ACLOCAL="aclocal"
AUTOHEADER="autoheader"
AUTOMAKE="automake"
AUTOCONF="autoconf"

echo "running aclocal"
mkdir scripts
${ACLOCAL} ${ACLOCAL_ARGS} -I scripts

echo "running autoheader"
${AUTOHEADER}

echo "running automake"
${AUTOMAKE} --add-missing -c --foreign

echo "running autoconf"
${AUTOCONF} -f -i

if [ "${1}" = '-c' ]; then
    echo "running configure"
    ./configure
fi

07070100000007000081A4000003E8000000640000000161830F5E000012B1000000000000000000000000000000000000001300000000bom-1.0.1/bom.1.in.\"  -*- nroff -*-
.\"
.\" bom - Deals with Unicode byte order marks
.\"
.\" Copyright (C) 2021 Archie L. Cobbs. All rights reserved.
.\"
.\" Licensed under the Apache License, Version 2.0 (the "License");
.\" you may not use this file except in compliance with the License.
.\" You may obtain a copy of the License at
.\"
.\"     http://www.apache.org/licenses/LICENSE-2.0
.\"
.\" Unless required by applicable law or agreed to in writing, software
.\" distributed under the License is distributed on an "AS IS" BASIS,
.\" WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
.\" See the License for the specific language governing permissions and
.\" limitations under the License.
.\"
.Dd October 14, 2021
.Dt BOM 1
.Os
.Sh NAME
.Nm bom
.Nd Decode Unicode byte order mark
.Sh SYNOPSIS
.Nm
.Fl \-strip
.Op Fl \-expect Ar types
.Op Fl \-lenient
.Op Fl \-prefer32
.Op Fl \-utf8
.Op Ar file
.Nm
.Fl \-detect
.Op Fl \-expect Ar types
.Op Fl \-prefer32
.Op Ar file
.Nm
.Fl \-print Ar type
.Nm
.Fl \-list
.Nm
.Fl \-help
.Nm
.Fl \-version
.Sh DESCRIPTION
.Nm
decodes, verifies, reports, and/or strips the byte order mark (BOM) at the start of the specified file, if any.
.Pp
When no
.Ar file
is specified, or when
.Ar file
is \-, read standard input.
.Sh OPTIONS
.Bl -tag -width Ds
.It Fl d , Fl \-detect
Report the detected BOM type to standard output and then exit.
.Pp
See
.Sx "SUPPORTED BOM TYPES"
for possible values.
.It Fl e , Fl \-expect Ar types
Expect to find one of the specified BOM types, otherwise exit with an error.
.Pp
Multiple types may be specified, separated by commas.
.Pp
Specifying
.Ar NONE
is acceptable and matches when the file has no (supported) BOM.
.It Fl h , Fl \-help
Output command line usage help.
.It Fl l , Fl \-lenient
Silently ignore any illegal byte sequences encountered when converting the remainder of the file to UTF-8.
.Pp
Without this flag,
.Nm
will exit immediately with an error if an illegal byte sequence is encountered.
.Pp
This flag has no effect unless the
.Fl \-utf8
flag is given.
.It Fl \-list
List the supported BOM types and exit.
.It Fl p , Fl \-print Ar type
Output the byte sequence corresponding to the
.Ar type
byte order mark.
.It Fl \-prefer32
Used to disambiguate the byte sequence
.Ar "FF FE 00 00" ,
which can be either a
.Ar UTF-32LE
BOM or a
.Ar UTF-16LE
BOM followed by a NUL character.
.Pp
Without this flag,
.Ar UTF-16LE
is assumed; with this flag,
.Ar UTF-32LE
is assumed.
.It Fl s , Fl \-strip
Strip the BOM, if any, from the beginning of the file and output the remainder of the file.
.It Fl u , Fl \-utf8
Convert the remainder of the file to UTF-8, assuming the character encoding implied by the detected BOM.
.Pp
For files with no (supported) BOM, this flag has no effect and the remainder of the file is copied unmodified.
.Pp
For files with a UTF-8 BOM, the identity transformation is still applied, so (for example) illegal byte sequences will be detected.
.It Fl v , Fl \-version
Output program version and exit.
.El
.Sh SUPPORTED BOM TYPES
The supported BOM types are:
.Bl -tag -width Ds
.It NONE
No supported BOM was detected.
.It UTF-7
A UTF-7 BOM was detected.
.It UTF-8
A UTF-8 BOM was detected.
.It UTF-16BE
A UTF-16 (Big Endian) BOM was detected.
.It UTF-16LE
A UTF-16 (Little Endian) BOM was detected.
.It UTF-32BE
A UTF-32 (Big Endian) BOM was detected.
.It UTF-32LE
A UTF-32 (Little Endian) BOM was detected.
.It GB18030
A GB18030 (Chinese National Standard) BOM was detected.
.El
.Sh EXAMPLES
.Pp
To tell what kind of byte order mark a file has:
.Bd -literal -offset indent
$ bom --detect file
.Ed
.Pp
To normalize files with byte order marks into UTF-8, and pass other files through unchanged:
.Bd -literal -offset indent
$ bom --strip --utf8 file
.Ed
.Pp
Same as previous example, but discard illegal byte sequences instead of generating an error:
.Bd -literal -offset indent
$ bom --strip --utf8 --lenient file
.Ed
.Pp
To verify a properly encoded UTF-8 or UTF-16 file with a byte-order-mark and output it as UTF-8:
.Bd -literal -offset indent
$ bom --strip --utf8 --expect UTF-8,UTF-16LE,UTF-16BE file
.Ed
.Pp
To just remove any byte order mark and get on with your life:
.Bd -literal -offset indent
$ bom --strip file
.Ed
.Sh RETURN VALUES
.Nm
exits with one of the following values:
.Bl -tag -width Ds
.It 0
Success.
.It 1
A general error occurred.
.It 2
The
.Fl \-expect
flag was given but the detected BOM did not match.
.It 3
An illegal byte sequence was detected (and
.Fl \-lenient
was not specified).
.El
.Sh SEE ALSO
.Xr iconv 1
.Rs
.%T "bom: Decode Unicode byte order mark"
.%O https://github.com/archiecobbs/bom
.Re
.Rs
.%T "Byte order mark (Wikipedia)"
.%O https://en.wikipedia.org/wiki/Byte_order_mark
.Re
.Sh AUTHOR
.An Archie L. Cobbs Aq archie.cobbs@gmail.com
07070100000008000081A4000003E8000000640000000161830F5E00000A07000000000000000000000000000000000000001700000000bom-1.0.1/configure.ac#
# bom - Deals with Unicode byte order marks
#
# Copyright (C) 2021 Archie L. Cobbs. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#

AC_INIT([bom Deals with Unicode byte order marks], [1.0.1], [https://github.com/archiecobbs/bom/], [bom])
AC_CONFIG_AUX_DIR(scripts)
AM_INIT_AUTOMAKE
dnl AM_MAINTAINER_MODE
AC_PREREQ(2.59)
AC_PREFIX_DEFAULT(/usr)
AC_PROG_MAKE_SET

[CFLAGS="-g -O3 -pipe -Wall -Waggregate-return -Wcast-align -Wchar-subscripts -Wcomment -Wformat -Wimplicit -Wmissing-declarations -Wmissing-prototypes -Wnested-externs -Wno-long-long -Wparentheses -Wpointer-arith -Wredundant-decls -Wreturn-type -Wswitch -Wtrigraphs -Wuninitialized -Wunused -Wwrite-strings -Wshadow -Wstrict-prototypes -Wcast-qual $CFLAGS"]
AC_SUBST(CFLAGS)

# Compile flags for Linux
AC_DEFINE(_DEFAULT_SOURCE, 1, Default functions)
AC_DEFINE(_GNU_SOURCE, 1, GNU functions)
AC_DEFINE(_BSD_SOURCE, 1, BSD functions)
AC_DEFINE(_XOPEN_SOURCE, 500, XOpen functions)

# Compile flags for Mac OS
AC_DEFINE(_DARWIN_C_SOURCE, 1, MacOS functions)

# Check for required programs
AC_PROG_INSTALL
AC_PROG_CC
AC_PATH_PROG([CAT], [cat], [], [])
if test "x${CAT}" = "x"; then
    AC_MSG_ERROR([cat not found])
fi
AC_PATH_PROG([SED], [sed], [], [])
if test "x${SED}" = "x"; then
    AC_MSG_ERROR([sed not found])
fi

# Check for required libc functions
AC_SEARCH_LIBS([iconv_open], [iconv],,
    [if test `uname -o` = 'Cygwin' -a -f /usr/lib/libiconv.a; then LIBS="-liconv ${LIBS}"; else AC_MSG_ERROR([required function iconv_open missing]); fi])

# Check for required header files
AC_HEADER_STDC
AC_CHECK_HEADERS(ctype.h errno.h stdio.h stdlib.h string.h unistd.h sys/stat.h sys/types.h, [],
        [AC_MSG_ERROR([required header file '$ac_header' missing])])

# Optional features
AC_ARG_ENABLE(Werror,
    AC_HELP_STRING([--enable-Werror],
        [enable compilation with -Werror flag (default NO)]),
    [test x"$enableval" = "xyes" && CFLAGS="${CFLAGS} -Werror"])

# Generated files
AC_CONFIG_FILES(Makefile)
AC_CONFIG_FILES(bom.1)
AC_CONFIG_HEADERS(config.h)

# Go
AC_OUTPUT
07070100000009000081A4000003E8000000640000000161830F5E000049E6000000000000000000000000000000000000001100000000bom-1.0.1/main.c/*
 * bom - Deals with Unicode byte order marks
 *
 * Copyright (C) 2021 Archie L. Cobbs. All rights reserved.
 *
 * Licensed under the Apache License, Version 2.0 (the "License");
 * you may not use this file except in compliance with the License.
 * You may obtain a copy of the License at
 *
 *     http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */

#include <assert.h>
#include <ctype.h>
#include <err.h>
#include <errno.h>
#include <getopt.h>
#include <iconv.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

// Copyright character
#define COPYRIGHT       "\xc2\xa9"

// Special exit values
#define EX_EXPECT_FAIL      2
#define EX_ILLEGAL_BYTES    3

// Version string
extern const char *const bom_version;

// Command line options that only have long versions
#define FLAG_LIST       (-2)
#define FLAG_PREFER_32  (-3)

#define OPT(_letter, _name, _arg)                                                               \
    {                                                                                           \
        .name=      _name,                                                                      \
        .has_arg=   _arg,                                                                       \
        .flag=      NULL,                                                                       \
        .val=       _letter                                                                     \
    }
static const struct option long_options[] = {
    OPT('d',            "detect",      no_argument),
    OPT('e',            "expect",      required_argument),
    OPT('h',            "help",        no_argument),
    OPT(FLAG_LIST,      "list",        no_argument),
    OPT('l',            "lenient",     no_argument),
    OPT('p',            "print",       required_argument),
    OPT(FLAG_PREFER_32, "prefer32",    no_argument),
    OPT('s',            "strip",       no_argument),
    OPT('u',            "utf8",        no_argument),
    OPT('v',            "version",     no_argument),
    OPT(0, NULL, 0)
};

// Execution modes
#define MODE_STRIP      1
#define MODE_DETECT     2
#define MODE_LIST       3
#define MODE_PRINT      4
#define MODE_HELP       5
#define MODE_VERSION    6

// BOM types
struct bom_type {
    const char      *name;
    const char      *encoding;
    const char      *bytes;
    const int       len;
};
#define BOM_TYPE(_name, _encoding, _bytes)                                                  \
    {                                                                                       \
        .name=      _name,                                                                  \
        .encoding=  _encoding,                                                              \
        .bytes=     _bytes,                                                                 \
        .len=       sizeof(_bytes) - 1                                                      \
    }
static const struct bom_type bom_types[] = {
    BOM_TYPE("NONE",        NULL,       ""),
    BOM_TYPE("UTF-7",       "UTF-7",    "\x2b\x2f\x76"),
    BOM_TYPE("UTF-8",       "UTF-8",    "\xef\xbb\xbf"),
    BOM_TYPE("UTF-16BE",    "UTF-16BE", "\xfe\xff"),
    BOM_TYPE("UTF-16LE",    "UTF-16LE", "\xff\xfe"),
    BOM_TYPE("UTF-32BE",    "UTF-32BE", "\x00\x00\xfe\xff"),
    BOM_TYPE("UTF-32LE",    "UTF-32LE", "\xff\xfe\x00\x00"),
    BOM_TYPE("GB18030",     "GB18030",  "\x84\x31\x95\x33"),
};
#define BOM_TYPE_NONE       0
#define BOM_TYPE_UTF_7      1
#define BOM_TYPE_UTF_8      2
#define BOM_TYPE_UTF_16BE   3
#define BOM_TYPE_UTF_16LE   4
#define BOM_TYPE_UTF_32BE   5
#define BOM_TYPE_UTF_32LE   6
#define BOM_TYPE_GB18030    7
#define BOM_TYPE_MAX        8

// Input buffer
#define BUFFER_SIZE         1024
struct bom_input {
    char    buf[BUFFER_SIZE];
    int     len;
    int     num_complete;
    int     num_finished;
    int     match_state[BOM_TYPE_MAX];
};
#define MATCH_PREFIX        0
#define MATCH_COMPLETE      1
#define MATCH_FAILED        2

// Mode of execution functions
static void bom_detect(FILE *fp, long expect_types, int prefer32);
static void bom_strip(FILE *fp, long expect_types, int lenient, int prefer32, int utf8);
static void bom_list(void);
static void bom_print(int bom_type);

// Helper functions
static int read_bom(FILE *fp, struct bom_input *const input, long expect_types, int prefer32);
static int read_byte(FILE *fp, struct bom_input *input);
static int bom_type_from_name(const char *name);
static void init_bom_input(struct bom_input *const input);
static void set_mode(int *modep, int mode);
static void usage(void);

int
main(int argc, char **argv)
{
    const struct option *opt;
    char optstring[32];
    long expect_types = 0;
    int option_index;
    int bom_type = -1;
    int prefer32 = 0;
    int lenient = 0;
    FILE *fp = NULL;
    int mode = 0;
    int utf8 = 0;
    char *s;
    int ch;

    // Build optstring dynamically
    s = optstring;
    for (opt = long_options; opt->name != NULL; opt++) {
        if (opt->val > 0) {
            *s++ = (char)opt->val;
            if (opt->has_arg)
                *s++ = ':';
        }
    }
    *s = '\0';

    // Parse command line
    while ((ch = getopt_long(argc, argv, optstring, long_options, &option_index)) != -1) {
        switch (ch) {
        case 'd':
            set_mode(&mode, MODE_DETECT);
            break;
        case 'e':
            while ((s = strsep(&optarg, ",")) != NULL) {
                if ((bom_type = bom_type_from_name(s)) >= sizeof(expect_types) * 8)
                    errx(1, "internal error: %s", "too many BOM types");
                expect_types |= (1L << bom_type);
            }
            break;
        case 'h':
            set_mode(&mode, MODE_HELP);
            break;
        case 'l':
            lenient = 1;
            break;
        case 'p':
            bom_type = bom_type_from_name(optarg);
            set_mode(&mode, MODE_PRINT);
            break;
        case 's':
            set_mode(&mode, MODE_STRIP);
            break;
        case 'u':
            utf8 = 1;
            break;
        case 'v':
            set_mode(&mode, MODE_VERSION);
            break;
        case FLAG_PREFER_32:
            prefer32 = 1;
            break;
        case FLAG_LIST:
            set_mode(&mode, MODE_LIST);
            break;
        case '?':
        default:
            usage();
            return 1;
        }
    }
    argv += optind;
    argc -= optind;

    // Parse remainder of command line
    switch (mode) {
    case MODE_STRIP:
    case MODE_DETECT:
        switch (argc) {
        case 0:
            fp = stdin;
            break;
        case 1:
            if (strcmp(argv[0], "-") == 0) {
                fp = stdin;
                break;
            }
            if ((fp = fopen(argv[0], "r")) == NULL)
                err(1, "%s", argv[0]);
            break;
        default:
            usage();
            return 1;
        }
        break;
    default:
        switch (argc) {
        case 0:
            break;
        default:
            usage();
            return 1;
        }
        break;
    }

    // Execute
    switch (mode) {
    case MODE_STRIP:
        bom_strip(fp, expect_types, lenient, prefer32, utf8);
        break;
    case MODE_DETECT:
        bom_detect(fp, expect_types, prefer32);
        break;
    case MODE_LIST:
        bom_list();
        break;
    case MODE_PRINT:
        bom_print(bom_type);
        break;
    case MODE_HELP:
        usage();
        break;
    case MODE_VERSION:
        fprintf(stderr, "bom %s\n", bom_version);
        fprintf(stderr, "Copyright %s Archie L. Cobbs. All rights reserved.\n", COPYRIGHT);
        break;
    default:
        usage();
        return 1;
    }

    // Done
    return 0;
}

static void
bom_detect(FILE *fp, long expect_types, int prefer32)
{
    const struct bom_type *bt;
    struct bom_input input;
    int bom_type;

    // Read BOM
    init_bom_input(&input);
    bom_type = read_bom(fp, &input, expect_types, prefer32);
    bt = &bom_types[bom_type];

    // Print its name
    printf("%s\n", bt->name);
}

#if DEBUG_ICONV_OPS

#define BYTES_PER_ROW   20
static void
debug_buffer(const size_t base, const void *data, size_t len)
{
    size_t offset;
    size_t i;

    if (data == NULL) {
        fprintf(stderr, "    NULL\n");
        return;
    }
    for (offset = 0; offset < len; offset += BYTES_PER_ROW) {
        fprintf(stderr, "%08u: ", (unsigned int)(base + offset));
        for (i = 0; i < BYTES_PER_ROW; i++) {
            const int val = offset + i < len ? *((const char *)data + offset + i) & 0xff : -1;
            if (i == BYTES_PER_ROW / 2)
                fprintf(stderr, " ");
            if (val != -1)
                fprintf(stderr, " %02x", val);
            else
                fprintf(stderr, "   ");
        }
        fprintf(stderr, "  ");
        for (i = 0; i < BYTES_PER_ROW; i++) {
            const int val = offset + i < len ? *((const char *)data + offset + i) & 0xff : -1;
            if (val != -1)
                fprintf(stderr, "%c", isprint(val) ? val : '.');
            else
                fprintf(stderr, " ");
        }
        fprintf(stderr, "\n");
    }
}

#endif  /* DEBUG_ICONV_OPS */

static void
bom_strip(FILE *fp, long expect_types, int lenient, int prefer32, int utf8)
{
    const struct bom_type *bt;
    struct bom_input input;
    char ibuf[BUFFER_SIZE];
    char obuf[BUFFER_SIZE];
    char tocode[32];
    size_t offset;
    iconv_t icd = 0;
    int done = 0;
    int bom_type;
    int ilen;

    // Read BOM
    init_bom_input(&input);
    bom_type = read_bom(fp, &input, expect_types, prefer32);
    bt = &bom_types[bom_type];

    // If BOM type is NONE, then obviously we can't convert to UTF-8
    if (bom_type == BOM_TYPE_NONE)
        utf8 = 0;

    // Initialize iconv conversion engine
    if (utf8) {
        snprintf(tocode, sizeof(tocode), "%s%s", bom_types[BOM_TYPE_UTF_8].encoding, lenient ? "//IGNORE" : "");
        if ((icd = iconv_open(tocode, bt->encoding)) == (iconv_t)-1)
            err(1, "iconv: \"%s\" -> \"%s\"", bt->encoding, tocode);
    }

    // Copy over any bytes we read after the BOM into our input buffer
    ilen = input.len - bt->len;
    memcpy(ibuf, input.buf + bt->len, ilen);
    offset = bt->len;

    // Convert remainder of file
    while (!done) {
        size_t nread;
        size_t nwrit;
        char *iptr;
        char *optr;
        size_t iremain;
        size_t oremain;
        int eof = 0;
        size_t r;

        // Fill the input buffer
        while (ilen < sizeof(ibuf)) {
            if ((nread = fread(ibuf + ilen, 1, sizeof(ibuf) - ilen, fp)) == 0) {
                if (ferror(fp))
                    err(1, "read error");
                eof = 1;
                break;
            }
            ilen += nread;
        }

        // When the input buffer is empty and we couldn't add anything more, this is the last round
        done = ilen == 0;

        // Set up input/output pointers and remaining byte counts for this round
        iptr = ibuf;
        optr = obuf;
        iremain = ilen;
        oremain = sizeof(obuf);

        // Convert to UTF-8 or just pass through
        if (utf8) {
#if DEBUG_ICONV_OPS
            fprintf(stderr, "->iconv@%d: ilen=%d\n", (int)offset, (int)ilen);
            debug_buffer(offset, iptr, ilen);
#endif
            r = iconv(icd, !done ? &iptr : NULL, &iremain, &optr, &oremain);
#if DEBUG_ICONV_OPS
            {
                const int errno_save = errno;

                fprintf(stderr, "<-iconv@%d: r=%d errno=%d iptr@%d optr@%d\n",
                  (int)offset, (int)r, errno, (int)(iptr - ibuf), (int)(optr - obuf));
                debug_buffer(offset, obuf, optr - obuf);
                errno = errno_save;
            }
#endif
            if (r == (size_t)-1) {
                switch (errno) {
                case EINVAL:                    // incomplete multi-byte sequence at the end of the input buffer
                    if (!done && !eof)
                        break;
                    // FALLTHROUGH
                case EILSEQ:                    // an invalid byte sequence was detected
                    if (lenient) {
                        iptr += iremain;        // avoid an infinite loop on trailing partial multi-byte sequence
                        iremain = 0;
                        break;
                    }
                    errx(EX_ILLEGAL_BYTES, "invalid %s byte sequence at file offset %lu", bt->name, (unsigned long)(offset + (iptr - ibuf)));
                default:
                    err(1, "iconv");
                }
            }
        } else {                                // behave like iconv() would but just copy the bytes
            memcpy(optr, iptr, ilen);
            if (!done)
                iptr += ilen;
            iremain = 0;
            optr += ilen;
            oremain -= ilen;
        }

        // Update file offset
        offset += ilen - iremain;

        // Shift unprocessed input for next time
        memmove(ibuf, iptr, iremain);
        ilen = iremain;

        // Write output
        oremain = optr - obuf;
        optr = obuf;
        while (oremain > 0 && (nwrit = fwrite(optr, 1, oremain, stdout)) > 0) {
            optr += nwrit;
            oremain -= nwrit;
        }
        if (ferror(stdout))
            err(1, "write error");
    }
    if (fflush(stdout) == EOF)
        err(1, "write error");

    // Close conversion
    if (utf8)
        (void)iconv_close(icd);
}

static void
bom_list(void)
{
    int bom_type;

    for (bom_type = 0; bom_type < BOM_TYPE_MAX; bom_type++) {
        const struct bom_type *const bt = &bom_types[bom_type];

        printf("%s\n", bt->name);
    }
}

static void
bom_print(int bom_type)
{
    const struct bom_type *const bt = &bom_types[bom_type];
    int i;

    for (i = 0; i < bt->len; i++) {
        if (putchar(bt->bytes[i] & 0xff) == EOF)
            err(1, "write error");
    }
}

static int
read_bom(FILE *fp, struct bom_input *const input, long expect_types, int prefer32)
{
    int bom_type;

    // Read bytes until all BOMs are either completely matched or have failed to match
    while (read_byte(fp, input)) {
        if (input->num_finished == BOM_TYPE_MAX)
            break;
    }

    // Handle the UTF-16LE vs. UTF-32LE ambiguity
    if (input->match_state[BOM_TYPE_UTF_16LE] == MATCH_COMPLETE
      && input->match_state[BOM_TYPE_UTF_32LE] == MATCH_COMPLETE) {
        input->match_state[prefer32 ? BOM_TYPE_UTF_16LE : BOM_TYPE_UTF_32LE] = MATCH_FAILED;
        input->num_complete--;
    }

    // At this point there should be BOM_TYPE_NONE and at most one other match
    assert(input->match_state[BOM_TYPE_NONE] == MATCH_COMPLETE);
    switch (input->num_complete) {
    case 1:
        bom_type = BOM_TYPE_NONE;
        break;
    case 2:
        for (bom_type = 0; bom_type < BOM_TYPE_MAX; bom_type++) {
            if (bom_type != BOM_TYPE_NONE && input->match_state[bom_type] == MATCH_COMPLETE)
                break;
        }
        if (bom_type < BOM_TYPE_MAX)
            break;
        // FALLTHROUGH
    default:
        errx(1, "internal error: %s", ">2 BOM type matches");
    }

    // Check expected BOM type
    if (expect_types != 0 && (expect_types & (1L << bom_type)) == 0)
        errx(EX_EXPECT_FAIL, "unexpected BOM type %s", bom_types[bom_type].name);

    // Done
    return bom_type;
}

static int
bom_type_from_name(const char *name)
{
    int bom_type;

    for (bom_type = 0; bom_type < BOM_TYPE_MAX; bom_type++) {
        if (strcmp(bom_types[bom_type].name, name) == 0)
            return bom_type;
    }
    errx(1, "unknown BOM type \"%s\"", name);
}

static int
read_byte(FILE *fp, struct bom_input *const input)
{
    int bom_type;
    int ch;

    // Read next byte
    if ((ch = getc(fp)) == EOF) {
        if (ferror(fp))
            err(1, "read error");
        return 0;
    }

    // Update state
    if (input->len >= sizeof(input->buf))
        errx(1, "internal error: %s", "input buffer overflow");
    for (bom_type = 0; bom_type < BOM_TYPE_MAX; bom_type++) {
        const struct bom_type *const bt = &bom_types[bom_type];

        switch (input->match_state[bom_type]) {
        case MATCH_PREFIX:
            if (bt->bytes[input->len] != (char)ch) {
                input->match_state[bom_type] = MATCH_FAILED;
                input->num_finished++;
            } else if (bt->len == input->len + 1) {
                input->match_state[bom_type] = MATCH_COMPLETE;
                input->num_finished++;
                input->num_complete++;
            }
            break;
        case MATCH_COMPLETE:
        case MATCH_FAILED:
            break;
        default:
            errx(1, "internal error: %s", "invalid match state");
        }
    }
    input->buf[input->len++] = (char)ch;
    return 1;
}

static void
init_bom_input(struct bom_input *const input)
{
    memset(input, 0, sizeof(*input));
    input->match_state[BOM_TYPE_NONE] = MATCH_COMPLETE;
    input->num_complete = 1;
    input->num_finished = 1;
}

static void
set_mode(int *modep, int mode)
{
    if (*modep != 0) {
        usage();
        exit(1);
    }
    *modep = mode;
}

static void
usage(void)
{
    fprintf(stderr, "Usage:\n");
    fprintf(stderr, "  bom --strip [--expect types] [--lenient] [--prefer32] [--utf8] [file]\n");
    fprintf(stderr, "  bom --detect [--expect types] [--prefer32] [file]\n");
    fprintf(stderr, "  bom --list\n");
    fprintf(stderr, "  bom --print type\n");
    fprintf(stderr, "  bom --help\n");
    fprintf(stderr, "  bom --version\n");
    fprintf(stderr, "Options:\n");
    fprintf(stderr, "  -d, --detect        Report the detected BOM type and exit\n");
    fprintf(stderr, "  -e, --expect types  Expect the specified BOM type(s) (separated by commas)\n");
    fprintf(stderr, "  -h, --help          Output command line usage summary\n");
    fprintf(stderr, "  -l, --lenient       Skip invalid input byte sequences instead of failing\n");
    fprintf(stderr, "      --list          List the supported BOM types\n");
    fprintf(stderr, "  -p, --print type    Output the byte sequence corresponding to \"type\"\n");
    fprintf(stderr, "      --prefer32      Prefer UTF-32LE instead of UTF-16LE followed by NUL\n");
    fprintf(stderr, "  -s, --strip         Strip the BOM and output the remainder of the file\n");
    fprintf(stderr, "  -u, --utf8          Convert the remainder of the file to UTF-8\n");
    fprintf(stderr, "  -v, --version       Output program version and exit\n");
}
0707010000000A000081ED000003E8000000640000000161830F5E00000156000000000000000000000000000000000000001500000000bom-1.0.1/manpage.sh#!/bin/bash

# Bail on error
set -e

NCOLS="83"
MANPAGE="bom.1.in"

sed '/man page/q' < README.md > README.md.NEW

printf '```\n' >> README.md.NEW

groff -r LL=${NCOLS}n -r LT=${NCOLS}n -Tlatin1 -man "${MANPAGE}" \
  | sed -r -e 's/.\x08(.)/\1/g' -e 's/\x1b[[0-9]+m//g' \
  >> README.md.NEW

printf '```\n' >> README.md.NEW

mv README.md{.NEW,}
0707010000000B000041ED000003E8000000640000000261830F5E00000000000000000000000000000000000000000000001000000000bom-1.0.1/tests0707010000000C000081ED000003E8000000640000000161830F5E00000CEA000000000000000000000000000000000000001700000000bom-1.0.1/tests/run.sh#!/bin/bash

# Bail on error
set -e

# Setup temporary files
TMP_STDOUT_EXPECTED='bom-test-out-expected.tmp'
TMP_STDERR_EXPECTED='bom-test-err-expected.tmp'
TMP_STDOUT_ACTUAL='bom-test-out-actual.tmp'
TMP_STDERR_ACTUAL='bom-test-err-actual.tmp'
TMP_SWAP_FILE='bom-test-hexdump.tmp'
trap "rm -f \
    ${TMP_STDOUT_EXPECTED} \
    ${TMP_STDERR_EXPECTED} \
    ${TMP_STDOUT_ACTUAL} \
    ${TMP_STDERR_ACTUAL} \
    ${TMP_SWAP_FILE}" 0 2 3 5 10 13 15

# Convert a file to hexdump version
hexdumpify()
{
    FILE="${1}"
    hexdump -C < "${FILE}" > "${TMP_SWAP_FILE}"
    mv "${TMP_SWAP_FILE}" "${FILE}"
}

# Compare files, on failure set ${DIFF_FAIL}
checkdiff()
{
    if [ "${1}" = '-h' ]; then
        HEXDUMPIFY='true'
        shift
    else
        HEXDUMPIFY='false'
    fi
    TESTFILE="${1}"
    WHAT="${2}"
    EXPECTED="${3}"
    ACTUAL="${4}"
    if diff -q "${EXPECTED}" "${ACTUAL}" >/dev/null; then
        return 0
    fi
    echo "test: ${TESTFILE}: ${WHAT} mismatch"
    echo '------------------------------------------------------'
    if [ "${HEXDUMPIFY}" = 'true' ]; then
        hexdumpify "${EXPECTED}"
        hexdumpify "${ACTUAL}"
    fi
    diff -u "${EXPECTED}" "${ACTUAL}" || true
    echo '------------------------------------------------------'
    DIFF_FAIL='true'
}

# Execute one test, on failure set ${TEST_FAIL}
runtest()
{
    # Read test data
    unset FLAGS
    unset STDIN
    unset STDOUT
    unset STDERR
    unset EXITVAL
    . "${TESTFILE}"
    if [ -z "${FLAGS+x}" \
      -o -z "${STDIN+x}" \
      -o -z "${STDOUT+x}" \
      -o -z "${STDERR+x}" \
      -o -z "${EXITVAL+x}" ]; then
        echo "test: ${TESTFILE}: invalid test file"
        exit 1
    fi

    # Set up files
    echo -en "${STDOUT}" > "${TMP_STDOUT_EXPECTED}"
    echo -en "${STDERR}" > "${TMP_STDERR_EXPECTED}"
    set +e
    echo -en "${STDIN}" | ../bom ${FLAGS} >"${TMP_STDOUT_ACTUAL}" 2>"${TMP_STDERR_ACTUAL}"
    ACTUAL_EXITVAL="$?"
    set -e

    # Special hacks
    if [ "${STDERR}" = '!USAGE!' ]; then
        ../bom --help 2>"${TMP_STDERR_EXPECTED}"
    fi

    # Check result
    DIFF_FAIL='false'
    checkdiff -h "${TESTFILE}" "standard output" "${TMP_STDOUT_EXPECTED}" "${TMP_STDOUT_ACTUAL}"
    checkdiff "${TESTFILE}" "standard error" "${TMP_STDERR_EXPECTED}" "${TMP_STDERR_ACTUAL}"
    if [ "${DIFF_FAIL}" != 'false' ]; then
        TEST_FAIL='true'
    fi
    if [ "${ACTUAL_EXITVAL}" -ne "${EXITVAL}" ]; then
        echo "test: ${TESTFILE}: exit value ${ACTUAL_EXITVAL} != ${EXITVAL}"
        TEST_FAIL='true'
    fi

    # Print success or if failure show params
    if [ "${TEST_FAIL}" = 'false' ]; then
        echo "test: ${TESTFILE}: success"
    else
        echo "******************************************************"
        echo "test: ${TESTFILE} FAILED with:"
        echo "  FLAGS='${FLAGS}'"
        echo "  STDIN='${STDIN}'"
        echo "******************************************************"
    fi
}

# Find all tests and run them
ANY_FAIL='false'
for TESTFILE in `find . -maxdepth 1 -type f -name 'test-*.tst' | sort | sed 's|^./||g'`; do
    TEST_FAIL='false'
    runtest "${TESTFILE}"
    if [ "${TEST_FAIL}" != 'false' ]; then
        ANY_FAIL='true'
    fi
done

# Exit with error if any test failed
if [ "${ANY_FAIL}" != 'false' ]; then
    exit 1
fi
0707010000000D000081A4000003E8000000640000000161830F5E00000040000000000000000000000000000000000000002600000000bom-1.0.1/tests/test-detect-empty.tstFLAGS='--detect'
STDIN=''
STDOUT='NONE\n'
STDERR=''
EXITVAL='0'
0707010000000E000081A4000003E8000000640000000161830F5E00000064000000000000000000000000000000000000002B00000000bom-1.0.1/tests/test-detect-expect-001.tstFLAGS='--detect --expect UTF-8'
STDIN='\xef\xbb\xbfblahblah'
STDOUT='UTF-8\n'
STDERR=''
EXITVAL='0'
0707010000000F000081A4000003E8000000640000000161830F5E00000080000000000000000000000000000000000000002B00000000bom-1.0.1/tests/test-detect-expect-002.tstFLAGS='--detect --expect UTF-16LE'
STDIN='\xef\xbb\xbfblahblah'
STDOUT=''
STDERR='bom: unexpected BOM type UTF-8\n'
EXITVAL='2'
07070100000010000081A4000003E8000000640000000161830F5E00000044000000000000000000000000000000000000002800000000bom-1.0.1/tests/test-detect-partial.tstFLAGS='--detect'
STDIN='\xff'
STDOUT='NONE\n'
STDERR=''
EXITVAL='0'
07070100000011000081A4000003E8000000640000000161830F5E0000007D000000000000000000000000000000000000002400000000bom-1.0.1/tests/test-list-types.tstFLAGS='--list'
STDIN=''
STDOUT='NONE\nUTF-7\nUTF-8\nUTF-16BE\nUTF-16LE\nUTF-32BE\nUTF-32LE\nGB18030\n'
STDERR=''
EXITVAL='0'
07070100000012000081A4000003E8000000640000000161830F5E0000004E000000000000000000000000000000000000002600000000bom-1.0.1/tests/test-prefer32-001.tstFLAGS='-d'
STDIN='\xff\xfe\x00\x00'
STDOUT='UTF-16LE\n'
STDERR=''
EXITVAL='0'
07070100000013000081A4000003E8000000640000000161830F5E00000059000000000000000000000000000000000000002600000000bom-1.0.1/tests/test-prefer32-002.tstFLAGS='-d --prefer32'
STDIN='\xff\xfe\x00\x00'
STDOUT='UTF-32LE\n'
STDERR=''
EXITVAL='0'
07070100000014000081A4000003E8000000640000000161830F5E00000051000000000000000000000000000000000000002700000000bom-1.0.1/tests/test-print-GB18030.tstFLAGS='--print GB18030'
STDIN=''
STDOUT='\x84\x31\x95\x33'
STDERR=''
EXITVAL='0'
07070100000015000081A4000003E8000000640000000161830F5E0000003E000000000000000000000000000000000000002400000000bom-1.0.1/tests/test-print-NONE.tstFLAGS='--print NONE'
STDIN=''
STDOUT=''
STDERR=''
EXITVAL='0'
07070100000016000081A4000003E8000000640000000161830F5E00000062000000000000000000000000000000000000002700000000bom-1.0.1/tests/test-print-UNKNOWN.tstFLAGS='--print UNKNOWN'
STDIN=''
STDOUT=''
STDERR='bom: unknown BOM type "UNKNOWN"\n'
EXITVAL='1'
07070100000017000081A4000003E8000000640000000161830F5E0000004A000000000000000000000000000000000000002800000000bom-1.0.1/tests/test-print-UTF-16BE.tstFLAGS='--print UTF-16BE'
STDIN=''
STDOUT='\xfe\xff'
STDERR=''
EXITVAL='0'
07070100000018000081A4000003E8000000640000000161830F5E0000004A000000000000000000000000000000000000002800000000bom-1.0.1/tests/test-print-UTF-16LE.tstFLAGS='--print UTF-16LE'
STDIN=''
STDOUT='\xff\xfe'
STDERR=''
EXITVAL='0'
07070100000019000081A4000003E8000000640000000161830F5E00000052000000000000000000000000000000000000002800000000bom-1.0.1/tests/test-print-UTF-32BE.tstFLAGS='--print UTF-32BE'
STDIN=''
STDOUT='\x00\x00\xfe\xff'
STDERR=''
EXITVAL='0'
0707010000001A000081A4000003E8000000640000000161830F5E00000052000000000000000000000000000000000000002800000000bom-1.0.1/tests/test-print-UTF-32LE.tstFLAGS='--print UTF-32LE'
STDIN=''
STDOUT='\xff\xfe\x00\x00'
STDERR=''
EXITVAL='0'
0707010000001B000081A4000003E8000000640000000161830F5E0000004B000000000000000000000000000000000000002500000000bom-1.0.1/tests/test-print-UTF-7.tstFLAGS='--print UTF-7'
STDIN=''
STDOUT='\x2b\x2f\x76'
STDERR=''
EXITVAL='0'
0707010000001C000081A4000003E8000000640000000161830F5E0000004B000000000000000000000000000000000000002500000000bom-1.0.1/tests/test-print-UTF-8.tstFLAGS='--print UTF-8'
STDIN=''
STDOUT='\xef\xbb\xbf'
STDERR=''
EXITVAL='0'
0707010000001D000081A4000003E8000000640000000161830F5E0000008E000000000000000000000000000000000000002300000000bom-1.0.1/tests/test-strip-001.tstFLAGS='--strip --utf8'
STDIN='\xef\xbb\xbftest123\xff456'
STDOUT=''
STDERR='bom: invalid UTF-8 byte sequence at file offset 10\n'
EXITVAL='3'
0707010000001E000081A4000003E8000000640000000161830F5E0000006E000000000000000000000000000000000000002300000000bom-1.0.1/tests/test-strip-002.tstFLAGS='--strip --lenient --utf8'
STDIN='\xef\xbb\xbftest123\xff456'
STDOUT='test123456'
STDERR=''
EXITVAL='0'
0707010000001F000081A4000003E8000000640000000161830F5E00000061000000000000000000000000000000000000002300000000bom-1.0.1/tests/test-strip-003.tstFLAGS='--strip'
STDIN='\xef\xbb\xbftest123\xff456'
STDOUT='test123\xff456'
STDERR=''
EXITVAL='0'
07070100000020000081A4000003E8000000640000000161830F5E000000F1000000000000000000000000000000000000002300000000bom-1.0.1/tests/test-strip-004.tst# The input is truncated after 2/3 of a rightwards arrow U2192 -> e2 86 92
FLAGS='--strip --expect UTF-8 --utf8'
STDIN='\xef\xbb\xbfpartial arrow: \xe2\x86'
STDOUT=''
STDERR='bom: invalid UTF-8 byte sequence at file offset 18\n'
EXITVAL='3'
07070100000021000081A4000003E8000000640000000161830F5E000000D6000000000000000000000000000000000000002300000000bom-1.0.1/tests/test-strip-005.tst# The input is truncated after 2/3 of a rightwards arrow U2192 -> e2 86 92
FLAGS='--strip --expect UTF-8 --utf8 --lenient'
STDIN='\xef\xbb\xbfpartial arrow: \xe2\x86'
STDOUT='partial arrow: '
STDERR=''
EXITVAL='0'
07070100000022000081A4000003E8000000640000000161830F5E0000014B000000000000000000000000000000000000002300000000bom-1.0.1/tests/test-strip-006.tst# This has a multi-byte sequence that crosses our input buffer boundary
FLAGS='--strip --expect UTF-8 --utf8'
STDIN_BOM='\xef\xbb\xbf'
STDIN_1023=`yes aaaaaaaaaaaaaaa | tr -d \\\\n | head -c 1023`
STDIN_ARROW='\xe2\x86\x92'
STDIN="${STDIN_BOM}${STDIN_1023}${STDIN_ARROW}"
STDOUT="${STDIN_1023}${STDIN_ARROW}"
STDERR=''
EXITVAL='0'
07070100000023000081A4000003E8000000640000000161830F5E00000039000000000000000000000000000000000000002300000000bom-1.0.1/tests/test-usage-001.tstFLAGS=''
STDIN=''
STDOUT=''
STDERR='!USAGE!'
EXITVAL='1'
07070100000024000081A4000003E8000000640000000161830F5E00000049000000000000000000000000000000000000002300000000bom-1.0.1/tests/test-usage-002.tstFLAGS='--strip --detect'
STDIN=''
STDOUT=''
STDERR='!USAGE!'
EXITVAL='1'
07070100000025000081A4000003E8000000640000000161830F5E00000048000000000000000000000000000000000000002300000000bom-1.0.1/tests/test-usage-003.tstFLAGS='--detect --list'
STDIN=''
STDOUT=''
STDERR='!USAGE!'
EXITVAL='1'
07070100000026000081A4000003E8000000640000000161830F5E0000004C000000000000000000000000000000000000002300000000bom-1.0.1/tests/test-usage-004.tstFLAGS='--list --print NONE'
STDIN=''
STDOUT=''
STDERR='!USAGE!'
EXITVAL='1'
07070100000027000081A4000003E8000000640000000161830F5E0000004C000000000000000000000000000000000000002300000000bom-1.0.1/tests/test-usage-005.tstFLAGS='--print NONE --help'
STDIN=''
STDOUT=''
STDERR='!USAGE!'
EXITVAL='1'
07070100000028000081A4000003E8000000640000000161830F5E00000042000000000000000000000000000000000000002300000000bom-1.0.1/tests/test-usage-006.tstFLAGS='-d --list'
STDIN=''
STDOUT=''
STDERR='!USAGE!'
EXITVAL='1'
07070100000029000081A4000003E8000000640000000161830F5E0000003D000000000000000000000000000000000000002300000000bom-1.0.1/tests/test-usage-007.tstFLAGS='-sdu'
STDIN=''
STDOUT=''
STDERR='!USAGE!'
EXITVAL='1'
0707010000002A000081A4000003E8000000640000000161830F5E00000049000000000000000000000000000000000000002300000000bom-1.0.1/tests/test-usage-008.tstFLAGS='--detect foo bar'
STDIN=''
STDOUT=''
STDERR='!USAGE!'
EXITVAL='1'
07070100000000000000000000000000000000000000010000000000000000000000000000000000000000000000000000000B00000000TRAILER!!!114 blocks
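
The detection rule implemented by read_bom() and read_byte() in main.c can be illustrated with a much smaller, whole-buffer sketch. This is not the tool's code: the names sketch_bom, sketch_boms, and sketch_detect_bom are hypothetical, UTF-7 and GB18030 are left out for brevity, and the input is assumed to be fully in memory rather than read one byte at a time. It shows only the tie-break between UTF-16LE and UTF-32LE that --prefer32 controls.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified BOM table for this sketch (not the tool's own table) */
struct sketch_bom {
    const char          *name;
    const unsigned char *bytes;
    size_t              len;
};

static const struct sketch_bom sketch_boms[] = {
    { "UTF-8",    (const unsigned char *)"\xef\xbb\xbf",     3 },
    { "UTF-16BE", (const unsigned char *)"\xfe\xff",         2 },
    { "UTF-16LE", (const unsigned char *)"\xff\xfe",         2 },
    { "UTF-32BE", (const unsigned char *)"\x00\x00\xfe\xff", 4 },
    { "UTF-32LE", (const unsigned char *)"\xff\xfe\x00\x00", 4 },
};

/*
 * Return the name of the BOM found at the start of buf, or "NONE".
 * If more than one BOM matches (UTF-16LE vs. UTF-32LE), the shorter
 * one wins unless prefer32 is nonzero, in which case the longer wins.
 */
const char *
sketch_detect_bom(const unsigned char *buf, size_t len, int prefer32)
{
    const size_t count = sizeof(sketch_boms) / sizeof(sketch_boms[0]);
    const char *best = "NONE";
    size_t best_len = 0;
    size_t i;

    assert(buf != NULL || len == 0);
    for (i = 0; i < count; i++) {
        const struct sketch_bom *const b = &sketch_boms[i];

        // Skip BOMs that are longer than the input or that do not match
        if (b->len > len || memcmp(buf, b->bytes, b->len) != 0)
            continue;

        // First match always counts; later matches only replace it per prefer32
        if (best_len == 0 || (prefer32 ? b->len > best_len : b->len < best_len)) {
            best = b->name;
            best_len = b->len;
        }
    }
    return best;
}
```

Given the four bytes ff fe 00 00, the sketch reports UTF-16LE by default and UTF-32LE when prefer32 is nonzero, which is the behavior that test-prefer32-001.tst and test-prefer32-002.tst expect from the real tool. A lone ff byte yields NONE, as in test-detect-partial.tst.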