Skip to content

Conversation

@octo
Copy link
Contributor

@octo octo commented Mar 20, 2018

This changes fix two cases in which printer.Format() produced output that was not accepted by parser.Parse():

  1. The input ends in \x00: the scanner accepted this, because nothing is following the null byte. The formatter, however, decides to add a newline at the end of the file, assuming that this would be a no-op. Upon re-parsing, the null byte is no longer the last character in the stream and produces an error.
  2. The input contains the U+E123 unicode codepoint: this codepoint is used internally by the formatter to do some postprocessing of the generated output. If the input contains that codepoint, the postprocessing removes semantically relevant information, for example the pound sign signifying the beginning of a comment. This leads to unparsable output.

I think that it's sane to refuse those inputs as "invalid" and I think it's very unlikely that this will break existing use-cases.

octo added 3 commits March 20, 2018 20:46
The formatter will append a newline at the end of file, causing the output
of printer.Format() to be invalid.
This (invalid) Unicode codepoint is used by the printer package to fix up
the indentation of generated files. If this codepoint is present in the
input, the package gets confused and removes more than it should,
producing unparsable output.
@mitchellh
Copy link
Contributor

Thanks, this looks good. Note on point #2 that technically we probably should disallow any of the private use code points. I chose U+E123 because its slightly hinted at being human-made (1-2-3), but there are ~18,000 private use code points (HCL only uses one).

Thanks!

@mitchellh mitchellh merged commit adef769 into hashicorp:master Mar 20, 2018
@octo octo deleted the scanner-null branch March 22, 2018 13:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants