ext: add markdown nroff view with table formatting in doc.sh #5053

Open
ilia-maslakov wants to merge 1 commit into MidnightCommander:master from ilia-maslakov:pr/markdown-doc-view-clean

@ilia-maslakov

Proposed changes

  • Add Markdown view support through doc.sh and switch markdown binding to %view{ascii,nroff}.

  • Extend markdown file matching to include both .md/.mkd and .markdown.

  • Implement lightweight markdown rendering for mcview using nroff overstrikes:

    • headers as bold,
    • inline code as underline.
  • Add markdown table rendering with:

    • fixed column widths across the whole table,
    • Unicode borders (│, ─, ┼),
    • separator line after each rendered row,
    • highlighted header row,
    • row wrapping for long rows/cells to avoid truncation.
  • Resolves: N/A
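
For readers unfamiliar with nroff overstrikes, here is a minimal Python sketch of the convention the description relies on: bold is a character, a backspace, and the same character again; underline is an underscore, a backspace, and the character. The helper names (`bold`, `underline`, `render_line`) and the choice to drop the leading `#` marks are assumptions for illustration only; the PR itself implements its rendering in shell/awk inside doc.sh and may differ in detail.

```python
import re

# Assumption: an ATX header is 1-6 '#' characters followed by whitespace.
HEADER = re.compile(r"^#{1,6}\s+(.*)$")
# Assumption: inline code is a single-backtick span without nested backticks.
INLINE_CODE = re.compile(r"`([^`]+)`")


def bold(text):
    """Overstrike every character with itself: X, backspace, X."""
    return "".join(f"{ch}\b{ch}" for ch in text)


def underline(text):
    """Overstrike an underscore with every character: _, backspace, X."""
    return "".join(f"_\b{ch}" for ch in text)


def render_line(line):
    """Render a header line as bold, otherwise underline inline code spans."""
    match = HEADER.match(line)
    if match:
        return bold(match.group(1))
    return INLINE_CODE.sub(lambda m: underline(m.group(1)), line)
```

A viewer in nroff mode, such as the `%view{ascii,nroff}` binding mentioned above, interprets these backspace sequences as bold and underline attributes instead of printing them literally.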

Checklist

  • I have referenced the issue(s) resolved by this PR (if any)
  • I have signed-off my contribution with git commit --amend -s
  • Lint and unit tests pass locally with my changes (make indent && make check)
  • I have added tests that prove my fix is effective or that my feature works
  • I have added the necessary documentation (if appropriate)
markdown

Signed-off-by: Ilia Maslakov <il.smind@gmail.com>
@github-actions bot added the "needs triage" (Needs triage by maintainers) label on Mar 7, 2026
@github-actions bot added this to the Future Releases milestone on Mar 7, 2026
@github-actions bot added the "prio: medium" (Has the potential to affect progress) label on Mar 7, 2026
@egmontkob
Contributor

egmontkob commented Mar 8, 2026

First thing:

For whatever reason, the rule doesn't work for me. F3 just displays the file in its raw form.

It works if I remove the Include=editor line.

I'm suspecting a generic mc bug here, to be investigated.


Second, and this is subjective but perhaps marginally relevant.

I hate markdown with all my heart, for two reasons.

One is that the idea of the source and the formatted versions both being easily human-readable is conceptually flawed.

The second is that it's executed extremely poorly.

Pretty much every special character has a special purpose in certain contexts, and their interaction is not specified, or in the rare case that it is, it's absolutely inconsistent. As soon as you want to use two things at once, you can't be sure what to do.

Ex. 1. How to have preformatted text? It's between backticks. How to have a backtick in such a context? Or a backtick and a space? You're out of luck.

Ex. 2. To start a second-level heading, you begin with '##'. To start with a literal '##' in the formatted version, you escape it as '\##'. Similarly for unnumbered lists: begin with '-', but if you want the dash character itself to appear, then it's '\-'. Numbered lists? Begin with '1.'. Want to have a literal '1.' by writing '\1.'? Of course not, the backslash needs to go in the middle: '1\.'. I could go on and on and on...

Hard line wraps without starting a new paragraph? Trailing whitespaces in the source. Multiple paragraphs within a bullet point??? Try and experiment and you might get lucky.

And everyone has their own flavors, their own ideas how the various features should interact with each other. All super incompatible with each other.

Tables aren't even part of official markdown, it's an unofficial extension. With further, even more unofficial extensions like alignment of columns.

Existing markdown viewers have been developed-maintained for years and they still kinda suck (because they try to implement a poor and underspecified format, so it is a pretty hopeless task).


And then here comes your 200-line shell script, mostly awk, as a new markdown parser/formatter that we are supposed to trust to do the right thing in decent quality, and to be able to understand and maintain in the future.

You randomly pick two or three aspects of markdown: headers, tables, backticked text; and highlight/format them. And you leave the rest unformatted, i.e. the remaining dozens and dozens of markdown features unaddressed.

Markdown says to automatically renumber numbered lists, e.g. if the source has '1. 1. 1.' then it should appear as '1. 2. 3.'. Not done. Markdown says '1\.' should appear as '1.'; not done either.
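
To make the two behaviors above concrete, here is a rough Python sketch. It assumes flat (non-nested) lists, starts counting from the first item's number, and resets the counter at any non-item line, which is simpler than what real Markdown implementations do. The function name `renumber` is hypothetical and nothing here is taken from the PR.

```python
import re

# Assumption: an ordered list item is digits, a dot, whitespace, then text.
ITEM = re.compile(r"^(\d+)\.\s+(.*)$")
# An escaped list marker at line start, e.g. '1\.' (digits, backslash, dot).
ESCAPED = re.compile(r"^(\d+)\\\.")


def renumber(lines):
    """Renumber consecutive ordered-list items and unescape '1\\.' markers."""
    out = []
    counter = None
    for line in lines:
        match = ITEM.match(line)
        if match:
            # The first item sets the start number; later numbers are ignored.
            counter = int(match.group(1)) if counter is None else counter + 1
            out.append(f"{counter}. {match.group(2)}")
        else:
            # Any other line ends the current list in this simplified model.
            counter = None
            out.append(ESCAPED.sub(r"\1.", line))
    return out
```

Even this small case needs state carried across lines, which illustrates why a line-by-line highlighter does not get it for free.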

Markdown says a single newline character in the source file shouldn't begin a new line; it continues the paragraph, which then needs to be reflowed for display. (See also here.) Not done.

Column alignments of tables? Not done.

I'm sure this list is pretty endless.

You take a specification which aims to use an already more-or-less human-friendly and readable source format as input and loosely describes what its output should look like. You come up with an ad-hoc script that performs this conversion for an arbitrary tiny fraction of its features and leaves the source format unchanged for the rest, giving something that's maybe a 1/10th of the way there.

I find this a very wrong direction.


IMO what you should rather be doing:

Find the best markdown -> ASCII converter, or a list of a few. Hook them up to mc.

Maybe work together with them to add a roff output format: a slightly beefed-up ASCII but still way less powerful than their HTML output (if they have one).

@egmontkob
Contributor

I don't want to bash it any further, but just one more data point, which I happened to notice thanks to your screenshot:

Your code does recognize and format backticked text outside of tables. Not inside tables though, there they remain literal backticks.

Here on GitHub backtick has the same meaning inside tables:

Source:

`backticked text`

| col1 | col2 |
| ---- | ---- |
| foobar | `backticked text` |

Rendering:

backticked text

col1 col2
foobar backticked text
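
As an illustration of what handling backticks inside table cells would involve, here is a small Python sketch. It assumes a simple pipe-delimited row with no pipes inside code spans, and it reuses the nroff underline convention (underscore, backspace, character). All function names are hypothetical; this is not the code from the PR.

```python
import re

INLINE_CODE = re.compile(r"`([^`]+)`")
# One overstrike pair: any character followed by a backspace.
OVERSTRIKE = re.compile(r".\x08")


def underline(text):
    """nroff-style underline: _, backspace, character."""
    return "".join("_\b" + ch for ch in text)


def split_row(row):
    """Split '| a | b |' into ['a', 'b'] (assumes no '|' inside cells)."""
    return [cell.strip() for cell in row.strip().strip("|").split("|")]


def render_cells(row):
    """Apply inline-code underlining inside every cell of a table row."""
    return [INLINE_CODE.sub(lambda m: underline(m.group(1)), cell)
            for cell in split_row(row)]


def visible_width(cell):
    """Width as displayed: overstrike pairs collapse to one character."""
    return len(OVERSTRIKE.sub("", cell))
```

The catch is column sizing: once cells contain overstrike sequences, widths have to be computed from `visible_width` rather than the raw string length, or the borders drift out of alignment.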

@egmontkob
Contributor

For whatever reason, the rule doesn't work for me. F3 just displays the file in its raw form.

The Include key is kind of counterintuitive. If it's present, by design, both the Open and View keywords are taken from the referred section. The Include key is checked for first, and if found then the other keys are ignored, even if they precede Include (the order doesn't matter).

No other rule mixes Include with View or Open, since it's not supposed to work. I'm pretty sure the current version in your PR doesn't work for you either.
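
The precedence described above can be modelled in a few lines of Python. This is only a model of the behavior as explained in this comment, not mc's actual lookup code, and the section names and command strings are made up for illustration.

```python
def resolve_action(sections, name, action):
    """Return the command for `action` ('Open' or 'View') in section `name`.

    If the section has an Include key, the action is taken from the
    referred section and the section's own Open/View keys are ignored,
    regardless of where they appear relative to Include.
    """
    section = sections[name]
    target = section.get("Include")
    if target is not None:
        return sections.get(target, {}).get(action)
    return section.get(action)


# Hypothetical configuration mirroring the situation in this PR.
sections = {
    "markdown": {"Include": "editor", "View": "render-markdown"},
    "editor": {"Open": "open-in-editor"},
}
```

In this model `resolve_action(sections, "markdown", "View")` yields nothing because the included section defines no View, which matches the observation that F3 falls back to showing the raw file until the Include line is removed.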

@ilia-maslakov
Author

ilia-maslakov commented Mar 8, 2026

Column alignments of tables? Not done.
[Screenshots: raw; formatted/rendered]

@ilia-maslakov
Author

I intentionally implemented formatting only for headers, tables and inline code.

The goal was simply to make reading Markdown content in mcview more convenient, not to implement a full Markdown renderer.

Most Markdown constructs (paragraphs, lists, emphasis, etc.) are already easily readable in their raw form, so I left them unchanged on purpose. The script only highlights a few elements that significantly improve readability in a terminal viewer.

This was a deliberate design choice: keep the implementation small and lightweight instead of attempting to support the full Markdown specification, which would inevitably lead to a much more complex and heavier parser.

In other words, the approach here is minimal enhancement of the source text, not full Markdown rendering.

@ilia-maslakov
Author

Regarding external converters:

Pandoc is unfortunately far too heavy for this use case. It is a very large dependency and is unlikely to be available on minimal systems where Midnight Commander is commonly used (servers, containers, embedded systems). Requiring pandoc just to preview a Markdown file in mcview would be disproportionate.

Lowdown was also evaluated, but it does not produce output that integrates well with mcview. In particular, its terminal output is designed for standalone rendering and does not map cleanly to mcview's nroff/overstrike style formatting. As a result the output quality in mcview is not satisfactory.

Because of this, the goal of the script was not to implement a full Markdown renderer, but to provide a small, dependency-free improvement for the most visually disruptive elements:

headers

tables

inline code

These elements significantly improve readability in the viewer, while the rest of Markdown remains readable in its raw form.

The script intentionally stays small and lightweight instead of trying to implement the full Markdown specification.

@zyv added the "area: mcview" (mcview, the built-in text editor) label and removed the "needs triage" (Needs triage by maintainers) label on Mar 8, 2026
@egmontkob
Contributor

Column alignments of tables? Not done.
[Screenshot: raw]

Sorry, I wasn't clear. What I meant is left-aligned, center-aligned, right-aligned columns.
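
For reference, in the common table extension the alignment comes from colons in the delimiter row: `:---` is left, `:---:` is center, `---:` is right. Below is a minimal Python sketch of reading that row and padding cells accordingly. The function names are hypothetical, and treating an unmarked column as left-aligned is an assumption about the default.

```python
def parse_alignments(delimiter_row):
    """Map each cell of a delimiter row like '| :--- | :---: | ---: |'
    to 'left', 'center', or 'right'."""
    cells = [c.strip() for c in delimiter_row.strip().strip("|").split("|")]
    alignments = []
    for cell in cells:
        starts, ends = cell.startswith(":"), cell.endswith(":")
        if starts and ends:
            alignments.append("center")
        elif ends:
            alignments.append("right")
        else:
            # ':---' and a bare '---' are both treated as left here.
            alignments.append("left")
    return alignments


def pad(cell, width, alignment):
    """Pad a cell to `width` columns according to its alignment."""
    if alignment == "right":
        return cell.rjust(width)
    if alignment == "center":
        return cell.center(width)
    return cell.ljust(width)
```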

@ilia-maslakov
Author

For whatever reason, the rule doesn't work for me. F3 just displays the file in its raw form.

The Include key is kind of counterintuitive. If it's present, by design, both the Open and View keywords are taken from the referred section. The Include key is checked for first, and if found then the other keys are ignored, even if they precede Include (the order doesn't matter).

No other rule mixes Include with View or Open, since it's not supposed to work. I'm pretty sure the current version in your PR doesn't work for you either.

I need to check this.

@egmontkob
Contributor

I firmly disagree with your goal.

You pick an arbitrary tiny subset of markdown's features and format them. The list is surely subjective, but even if agreed that these are the most prominent ones, addressing them and only them is IMO more confusing than useful.

The use case you have in mind is tiny, minimal or embedded systems where for some reason someone would want to view a markdown file and for some reason they would want to have these particular aspects of markdown's rendering improved. Yet, you want to ship this "improvement" for all mc users out there, even those (presumably the vast majority) who are just an apt/yum/whatever install pandoc away from a much better markdown formatter and don't care about the amount of dependencies it brings in. I believe that most users would prefer having a great rendering rather than a half-baked (or rather: much-less-than-half-baked) one, much rather than saving on its cost.

I believe that no decent project is the right place to embed your ad-hoc script. It's something that should live in a repo of yours, along with documentation of who the target audience is and how to hook it up to mc. mc's config files are editable by the users for a good reason; anyone who wants to can place this snippet there.

By the way, it should really reside in its own file, with mc.ext.ini only containing the minimal gluing.

to provide a small, dependency-free improvement for the most visually disruptive elements:

headers

All levels (from 1 to 6 # characters) are rendered the same, precious information is lost.

tables

Center-aligned or right-aligned columns not supported. The line at the bottom looks ugly.

inline code

Does not work inside tables.

Even just properly addressing these aspects of markdown's rendering is more complex than you think it is. It belongs to external projects with proper maintenance.

I don't think mc is the right place to host this minimal hack, nor that we would want to carry and maintain it.

@ilia-maslakov
Author

I firmly disagree with your goal.

OK. I'll try to take into account the remarks that can be addressed without introducing significant complexity or performance cost.

In particular, I can look into improving the areas you pointed out (such as header levels, table alignment and inline code handling) where the fixes are relatively small and do not require turning this into a full Markdown renderer.

However, the goal will still be to keep the implementation lightweight and simple, avoiding features that would significantly increase complexity or effectively require implementing a full Markdown parser.

@zyv
Member

zyv commented Mar 8, 2026

I firmly disagree with your goal.

OK. I'll try to take into account the remarks that can be addressed without introducing significant complexity or performance cost.

In particular, I can look into improving the areas you pointed out (such as header levels, table alignment and inline code handling) where the fixes are relatively small and do not require turning this into a full Markdown renderer.

However, the goal will still be to keep the implementation lightweight and simple, avoiding features that would significantly increase complexity or effectively require implementing a full Markdown parser.

I don't think that a restricted Markdown parser implemented as a monstrous AWK script without tests is a suitable addition to mc codebase in the first place. If you need/want something like that, you can put it in your user directory.

@egmontkob
Contributor

I'm not one of the two main developers who will decide, I'm just writing my opinion. (Although one of the main developers has just chimed in and seems to be on my side.)

I'm afraid I didn't clearly articulate my problems. As for me, it's not the "tiny" bugs that you may fix. It's the overall concept. mc is not the proper place for such hacks fulfilling the particular needs of the author, in the vague hope that supporting this particular tiny subset might be useful for others as well.

Mind you, even if you could turn this into a full-blown Markdown parser-formatter, in beautiful clean code, being the best Markdown parser-formatter out there, mc still wouldn't be the right place for it. It should live as an external tool, mc only doing the gluing.

@ilia-maslakov
Author

I'm not one of the two main developers who will decide, I'm just writing my opinion. (Although one of the main developers has just chimed in and seems to be on my side.)

I’m still in the top 10 contributors by number of commits to the mc project, even though I haven’t committed for about 7 years. :)

I can certainly add more tests for the parser, but it seems to me that the issue here is not really about the number of tests. Still, if the patch is not needed - no problem. This functionality is useful for me, but of course the decision is yours.

@egmontkob
Contributor

I’m still in the top 10 contributors

Not sure how that's relevant.

This functionality is useful for me

"for me" is the part that bothers me. We're supposedly developing this piece of software for a hopefully wide audience, in hopefully high quality. The bar should be much higher than "useful for me".

@ilia-maslakov
Author

I’m still in the top 10 contributors

Not sure how that's relevant.

you are right

This functionality is useful for me

"for me" is the part that bothers me. We're supposedly developing this piece of software for a hopefully wide audience, in hopefully high quality. The bar should be much higher than "useful for me".

Most of the functionality I’ve contributed to mc over the years was written based on the principle of “useful for me”. That’s fairly natural: I use the software myself.

At the same time, this is not an unusual situation in open source. I’m a fairly typical file manager user, so things that are convenient for me tend to be convenient for many others as well.

PS: I’m also starting to feel a bit uncomfortable that I’m taking up your time and pulling you into a discussion about functionality that may not even belong in upstream.

@egmontkob
Contributor

egmontkob commented Mar 8, 2026

Most of the functionality I’ve contributed to mc over the years was written based on the principle of “useful for me”. That’s fairly natural: I use the software myself.

There might be some subtle terminology differences here. Of course it's okay if you work on things that are useful for you, and prefer not to spend your time on features / fixes in areas you personally don't care about. That's one thing. The feature you're proposing here is useful for you.

But the other possible way to look at it is: Out of a much-much larger feature set of markdown, you've just implemented the bare minimum that results in an obvious improvement for you. In the name of supporting markdown, you maybe support 5% of it because that's what you're looking for. It's useful for you, but I'm not buying that others would also be similarly happy with this feature.

I am not familiar with your earlier contributions, I just happened to learn in another thread just the other day that you wrote the Learn keys dialog, and I'm sure you've seen the discussions about whether to keep or remove that feature. My impression with the Learn keys dialog is similar: It looks to me like it implements the bare minimum that you found useful. It's full of shortcomings; bugs if you will. It cannot override escape sequences that are already defined in terminfo, it does not recognize and cannot use urxvt's Ctrl-F1 ^[[11^ (maybe due to the trailing ^?), its UI is unconventional and counterintuitive to use, it does not work at all on ncurses... I'm sure you've seen these. But if the stars all align perfectly then it's a nice convenient feature that works fine.

I'm much more of an "all or nothing" guy. IMO let's do it properly, or let's not do it at all.

PS: I’m also starting to feel a bit uncomfortable

Don't feel uncomfortable and don't worry about it: I can at any time choose to ignore this (or any) ticket.

@ilia-maslakov
Author

But the other possible way to look at it is: Out of a much-much larger feature set of markdown, you've just implemented the bare minimum that results in an obvious improvement for you. In the name of supporting markdown, you maybe support 5% of it because that's what you're looking for. It's useful for you, but I'm not buying that others would also be similarly happy with this feature.

The main issue is that tables are particularly hard to read in raw Markdown. That’s exactly why I addressed them specifically. Most other Markdown constructs are already reasonably readable in their raw form, but tables tend to turn into a dense block of text that is difficult to scan with the eyes.

So the change was intentionally focused on that specific pain point. The rest of Markdown usually remains readable enough without additional formatting, while tables benefit the most from structured rendering.

I'm sure you've seen the discussions about whether to keep or remove that feature.

The idea of removing the Learn Keys dialog is honestly quite discouraging to me. In my view, removing it would make dealing with terminal compatibility problems significantly harder.

Despite its limitations, it provides a practical way for users to adapt mc to terminals that behave differently or send unexpected escape sequences. Without such a mechanism, resolving these issues becomes much more difficult for users.

I am not familiar with your earlier contributions, I just happened to learn in another thread just the other day that you wrote the Learn keys dialog

No, I didn’t write the Learn Keys dialog. What I implemented was the basic support for remapping hotkeys.

I worked on vertical block support, block indent/unindent with Tab / Shift+Tab, the transition from single-byte handling to UTF-8, global clipboard support, the implementation of mcdiff, undo/redo in mcedit, line number display, allowing the cursor to move beyond the end of line, editor macros, text autocompletion in mcedit, and spell checking via aspell.

You may want to look there — perhaps you’ll find plenty of bad code in those parts.

@egmontkob
Contributor

No, I didn’t write the Learn Keys dialog.

I must have misunderstood something then, my bad.

Despite its limitations, it provides a practical way for users [...]

For those lucky users who don't run into any of its problems. For some other users it's literally unusable. Why does it not do the same for everyone, without the limitations? :) If my assumptions are true that its issues have existed ever since this feature was added, it shouldn't have passed code review, unfortunately it did.

Anyway, I shouldn't have digressed, let's get back to our topic.


The main issue is that tables are particularly hard to read in raw Markdown. That’s exactly why I addressed them specifically.

I understand this. (I understand less so why you continued with those two other randomly chosen features that aren't hard to read.)

What you don't understand is that production quality software, shipped to millions of people, is not the right place for such limited "quick fixes". Let alone as the default behavior.

The right place for your code is your personal github repo, blog, forum posts etc.

What belongs to mc is to invoke a proper markdown formatter.

@ossilator
Contributor

i actually kind of agree with ilia's goal - more isn't really required for a good user experience in this context. but the hackishness and already starting feature creep of the implementation shows that this isn't a viable approach.

super resource constrained systems arguably aren't the target for browsing "random shit" (you would be doing that on the host system you control the small device from), so the dependency argument has little pull. a real argument would be the full renderer taking more than half a second for a small readme file from hot caches.

@egmontkob
Contributor

Even then, I'd much rather wait half a second to get the job done properly.

@ossilator
Contributor

i wouldn't. i expect semi-instant reaction from mc, and usually actually getting it is often enough the reason why i'm satisfied with a sub-par presentation compared to launching a more fitting graphical application. when i explore a new project, it's often with the left hand on f3 and the right hand on the down key. in this "quick scan mode" i spend less than a second on many files. major hiccups always totally ruin the experience.

@egmontkob
Contributor

in this "quick scan mode" i spend less than a second on many files.

So, why do you need nicely formatted tables in this mode? Isn't plain text view sufficient?

more than half a second for a small readme file

Is it that slow, though?

The extensive markdown test file that I happened to find here, with the command pandoc -f markdown_phpextra -t plain (which aligns the columns of tables, as opposed to other markdown variants as input formats), takes about 0.15 seconds for me from hot cache. And this even involves loading the atrocious 200MB pandoc binary.
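
For anyone who wants to repeat this kind of measurement, here is a small Python timing sketch. It reports the best wall-clock time over a few runs so that the first (cold cache) run does not dominate. The function name `time_command` and the file name `README.md` in the comment are placeholders; the pandoc flags are the ones quoted above.

```python
import subprocess
import time


def time_command(argv, runs=5):
    """Return the best wall-clock time in seconds over `runs` executions."""
    best = float("inf")
    for _ in range(runs):
        start = time.perf_counter()
        # Discard output so terminal speed does not affect the measurement.
        subprocess.run(argv, stdout=subprocess.DEVNULL,
                       stderr=subprocess.DEVNULL, check=True)
        best = min(best, time.perf_counter() - start)
    return best


# Example (requires pandoc; README.md is a placeholder file name):
# time_command(["pandoc", "-f", "markdown_phpextra", "-t", "plain", "README.md"])
```

Taking the minimum rather than the mean approximates the hot-cache case discussed here; a cold start can be noticeably slower.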

Isn't there a markdown -> plain text converter out there that doesn't suck? One that supports tables (preferably without having to go through 7 different markdown flavors to find the only one), is fast, and has a reasonable size / dependencies? I'm sure there is one, I just haven't found it yet.

No, I'm still not buying the "let's quickly do the most important 5% of the job, via an ad-hoc 200 line awk script that someone once wrote and no one dares to touch, just to shave off a tiny fraction of a second relative to properly doing it all via an externally and actively maintained project" approach.

Tables aren't even part of official markdown, they're an extension, so our markdown parser/formatter would focus on basically one thing: an extension.

I think this is the situation where the best is for us to do nothing. There are so many flavors anyway, we can't know in which one the source file is. Users can easily hook up whatever viewer filter suits them the best.

@ossilator
Contributor

So, why do you need nicely formatted tables in this mode? Isn't plain text view sufficient?

the premise is mc starting to do the rendering automatically on f3.

it is correct that this would be rarely useful.
i did encounter cases where rendered tables would have been nice, but the case i remember was actually during code review on gerrit.

Is it that slow, though?

it's more than the threshold of what does not feel instant (~300 ms).

I think this is the situation where the best is for us to do nothing.

maybe.

There are so many flavors anyway, we can't know in which one the source file is.
Users can easily hook up whatever viewer filter suits them the best.

no, they can't, as they are likely to encounter different flavors on a single system.

an ideal system would try to identify the flavor automatically (probably by correlating other files in the same repo, and examining the repo's upstream remote), and fall back to displaying a selection prompt when it encounters a construct that is ambiguous.
one could reasonably call that insane.

Labels

area: mcview (mcview, the built-in text editor)
prio: medium (Has the potential to affect progress)

4 participants