@akirk akirk commented Dec 5, 2025

Potentially fixes #652, #2479, #2568, #3025, #3053

I noticed that running `git commit --help | bat -plhelp` displayed text with visible `^H` characters and doubled letters. This happens because man pages use overstriking: as on a typewriter, a character is printed, followed by a backspace, and then another character is printed on top of it. bat styled the page but passed the `^H` (backspace) characters through, which led to this garbled output.

Screenshot 2025-12-05 at 11 36 56

As a solution, I introduced a `strip_overstrike()` function that removes the backspace formatting so that syntax highlighters can style the output undisturbed, and applied this stripping in both `SimplePrinter` and `InteractivePrinter`. `Cow<str>` is used to avoid an allocation when no backspaces are present.
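For readers unfamiliar with the technique, here is a minimal sketch of how such a function could work. It is not the code from this PR: the body is an assumption based only on the description above (remove backspace sequences, return a borrowed `Cow` when there is nothing to strip), and it treats each backspace as erasing the character before it.

```rust
use std::borrow::Cow;

/// Illustrative sketch only (not bat's actual implementation).
/// Removes overstrike sequences such as `N\x08N` (bold) or `_\x08a` (underline),
/// keeping the character that was printed last.
fn strip_overstrike(line: &str) -> Cow<'_, str> {
    // Fast path: no backspace means nothing to strip and no allocation.
    let Some(first_bs) = line.find('\x08') else {
        return Cow::Borrowed(line);
    };

    let mut out = String::with_capacity(line.len());
    // Everything before the first backspace is copied verbatim.
    out.push_str(&line[..first_bs]);

    for c in line[first_bs..].chars() {
        if c == '\x08' {
            // A backspace erases the previously emitted character, if any.
            out.pop();
        } else {
            out.push(c);
        }
    }
    Cow::Owned(out)
}

fn main() {
    // "NAME" in bold followed by an underlined "a", as a man page might encode it.
    let raw = "N\x08NA\x08AM\x08ME\x08E _\x08a";
    println!("{}", strip_overstrike(raw)); // prints: NAME a
}
```

Returning `Cow<str>` lets the common case (a line without any backspace) hand back the original slice unchanged, so only overstruck lines pay for a new `String`.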

I am not an overly heavy user of bat but in my testing of this fix, I tried to verify that:

  • `MANPAGER="bat -p -l man" man ls` now displays cleanly.
  • `git commit --help | bat -plhelp` now works correctly.
  • `--binary=as-text` still preserves raw bytes.

I am not sure whether I missed other scenarios in which stripping the overstriking would be undesirable.

With this patch, the example above renders like this:

Screenshot 2025-12-05 at 14 25 40

As a side note: `git commit --help | bat -plhelp` does not display the text at full width, but this comes from `git commit --help` itself when it detects that its output is being piped. One workaround is to run `MANWIDTH=$(tput cols) git commit --help | bat -plhelp`.

Collaborator

@keith-hall keith-hall left a comment

Thanks for your contribution. If we do decide to merge this, we would probably no longer need the complicated `MANPAGER` advice in the README? CC @eth-p as our expert on man pages and the author of #2858

I think it could benefit from some integration tests for content containing overstrike, both in `--binary=as-text` mode and in normal mode.

@akirk force-pushed the strip-overstriking branch from d49d8d8 to 4f7c71a on December 10, 2025 04:38
@akirk changed the title from "Strip overstriking to better support man pages" to "Strip overstriking before applying syntax highlighting to better support man pages" on Dec 10, 2025
Author

akirk commented Dec 10, 2025

I have added some integration tests. They showed me that a safer approach is to strip the overstriking only when syntax highlighting will be applied, which is now implemented and actually matches my original idea more closely.

@akirk force-pushed the strip-overstriking branch from 4f7c71a to 27dfb40 on December 10, 2025 04:43
Collaborator

@keith-hall keith-hall left a comment

In general, the implementation looks clean to me. I have a couple of minor comments, and I still think we should remove the complicated `MANPAGER` awk script from the README now that `strip_overstrike` is enabled by default.

Could we also have a test to prove that --show-all still shows backspace characters as expected, or is it already covered by existing tests? 🙂

}
};

if self.strip_overstrike && line.contains('\x08') {
Collaborator

I wonder how necessary the line.contains check is, as the first thing strip_overstrike does is perform a find - essentially double work? Would it be worth benchmarking with vs without this check to see which is really better for performance? Or to consider doing a find here and passing the position of it into strip_overstrike? Maybe an unnecessary micro-optimization, but I'm a bit worried because this will run on every line of input passing through bat...
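To make the "find once and pass the position along" idea concrete, here is a rough sketch. The names `strip_overstrike_from` and `maybe_strip` are hypothetical and do not appear in the PR, and the stripping logic is an assumed stand-in for the real function; the point is only that the line is scanned for `\x08` a single time.

```rust
use std::borrow::Cow;

/// Hypothetical variant: the caller supplies the byte offset of the first
/// backspace, so this function does not need to search for it again.
fn strip_overstrike_from(line: &str, first_bs: usize) -> String {
    let mut out = String::with_capacity(line.len());
    // The prefix before the first backspace needs no processing.
    out.push_str(&line[..first_bs]);
    for c in line[first_bs..].chars() {
        if c == '\x08' {
            // Drop the character that the backspace overstrikes.
            out.pop();
        } else {
            out.push(c);
        }
    }
    out
}

/// Hypothetical call site: one `find` replaces `contains` plus a second `find`.
fn maybe_strip(line: &str, enabled: bool) -> Cow<'_, str> {
    if !enabled {
        return Cow::Borrowed(line);
    }
    match line.find('\x08') {
        Some(pos) => Cow::Owned(strip_overstrike_from(line, pos)),
        None => Cow::Borrowed(line),
    }
}

fn main() {
    // Underlined "foo" encoded as _\x08f _\x08o _\x08o.
    println!("{}", maybe_strip("_\x08f_\x08o_\x08o", true)); // prints: foo
}
```

Whether this is measurably faster than the `contains` check followed by a `find` would still need the benchmark suggested above; for lines without a backspace both variants scan the line exactly once.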

Comment on lines 5 to +7
## Bugfixes

- Strip overstriking before applying syntax highlighting to better support man pages, see #3517 (@akirk)
Collaborator

I would argue that it is more of a feature than a bugfix... Maybe doesn't matter too much, but I would be interested to hear other opinions 🙂

};

// Strip overstrike only when we have syntax highlighting (not plain text).
let strip_overstrike = !is_plain_text;
Collaborator

I wonder how necessary it is on other inputs apart from the manpages. I guess it shouldn't cause any confusing output when syntax highlighting is enabled, so maybe this approach is fine 👍

Development

Successfully merging this pull request may close these issues.

man syntax doesn't highlight bold functions correctly