ext: add markdown nroff view with table formatting in doc.sh#5053
ilia-maslakov wants to merge 1 commit into MidnightCommander:master
Signed-off-by: Ilia Maslakov <il.smind@gmail.com>
|
First thing: For whatever reason, the rule doesn't work for me. F3 just displays the file in its raw form. It works if I remove the I'm suspecting a generic mc bug here, to be investigated.

Second, and this is subjective but perhaps marginally relevant. I hate markdown with all my heart, for two reasons. One is that the idea of the source and the formatted versions both being easily human-readable is conceptually flawed. The second is that it's executed extremely poorly. Pretty much every special character has a special purpose in certain contexts, and their interaction is not specified, or in the rare case that it is, it's absolutely inconsistent. As soon as you want to use two things at once, you can't be sure what to do.

Ex.1. How to have preformatted text? It's between backticks. How to have a backtick in such a context? Or a backtick and a space? You're out of luck.

Ex.2. To start second level heading, you begin with Hard line wraps without starting a new paragraph? Trailing whitespaces in the source. Multiple paragraphs within a bullet point??? Try and experiment and you might get lucky.

And everyone has their own flavors, their own ideas how the various features should interact with each other. All super incompatible with each other. Tables aren't even part of official markdown, it's an unofficial extension. With further, even more unofficial extensions like alignment of columns. Existing markdown viewers have been developed and maintained for years and they still kinda suck (because they try to implement a poor and underspecified format, so it is a pretty hopeless task).

And then here comes your 200-line shell script, mostly awk, as a new markdown parser/formatter which we should trust to do the right thing in decent quality, and which we should be able to understand and maintain in the future. You randomly pick two or three aspects of markdown: headers, tables, backticked text; and highlight/format them. And you leave the rest unformatted, i.e. the remaining dozens and dozens of markdown features unaddressed.

Markdown says to automatically renumber numbered lists, e.g. if the source has Markdown says a single newline character in the source file shouldn't begin a new line, it continues the paragraph, which then needs to be reformatted for display. (See also here.) Not done. Column alignments of tables? Not done. I'm sure this list is pretty endless.

You take a specification which aims to use an already more-or-less human-friendly and readable source format as input and loosely describes what its output should look like. You come up with an ad-hoc script that performs this conversion for an arbitrary tiny fraction of its features and leaves the source format unchanged for the rest, giving something that's maybe a 1/10th of the way there. I find this a very wrong direction.

IMO what you should rather be doing: Find the best markdown -> ASCII converter, or a list of a few. Hook them up to mc. Maybe work together with them to add a roff output format: a slightly beefed-up ASCII but still way less powerful than their HTML output (if they have one). |
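To make the two "not done" items above concrete, here is a minimal Python sketch of paragraph reflow and ordered-list renumbering. It is only an illustration and not the PR's awk code: the function names are hypothetical, the renumbering assumes the CommonMark-style rule that the first item's number sets the start and later numbers are ignored, and it deliberately ignores nested lists, code blocks and hard line breaks, which is exactly where a real formatter gets complicated.

```python
import re
import textwrap

# Matches a simple ordered-list item such as "3. text" (no nesting handled).
ITEM = re.compile(r"^(\d+)\.\s+(.*)$")


def reflow_paragraphs(source: str, width: int = 40) -> str:
    """Join soft-wrapped lines in each blank-line-separated paragraph, then re-wrap."""
    wrapped = []
    for para in source.split("\n\n"):
        # A single newline inside a paragraph is treated as a space.
        text = " ".join(line.strip() for line in para.splitlines())
        wrapped.append(textwrap.fill(text, width=width))
    return "\n\n".join(wrapped)


def renumber_ordered_list(lines):
    """Renumber consecutive list items, starting from the first item's number."""
    out = []
    next_number = None  # None means "not currently inside a list"
    for line in lines:
        match = ITEM.match(line)
        if match:
            if next_number is None:
                next_number = int(match.group(1))
            out.append(f"{next_number}. {match.group(2)}")
            next_number += 1
        else:
            next_number = None  # any non-item line ends the list
            out.append(line)
    return out
```

Even this toy version has to decide what ends a list and what counts as a paragraph, which hints at how quickly the feature set grows.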
|
I don't want to bash it any further, but just one more data point, which I happened to notice thanks to your screenshot: Your code does recognize and format backticked text outside of tables. Not inside tables though, there they remain literal backticks. Here on GitHub backtick has the same meaning inside tables: Source: Rendering:
|
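The gap described above (inline code recognized in running text but not inside table cells) can be sketched as follows. This is a hypothetical Python illustration, not the script from the PR: it assumes a simple pipe-delimited row with no escaped pipes, and it assumes the highlighted form is nroff-style bold overstrike (character, backspace, character). The point is only that the same inline pass is applied to every cell.

```python
import re

# A single-backtick code span; nested or multi-backtick spans are not handled.
CODE_SPAN = re.compile(r"`([^`]+)`")


def overstrike_bold(text: str) -> str:
    """Encode text as bold overstrike: each character, a backspace, the character again."""
    return "".join(f"{ch}\b{ch}" for ch in text)


def format_inline(text: str) -> str:
    """Replace every `code` span with its bold overstrike form."""
    return CODE_SPAN.sub(lambda m: overstrike_bold(m.group(1)), text)


def format_row(row: str):
    """Split a simple table row into cells and run the inline pass on each cell."""
    cells = [cell.strip() for cell in row.strip().strip("|").split("|")]
    return [format_inline(cell) for cell in cells]
```

For example, `format_row("| `ls` | list files |")` yields one highlighted cell and one plain cell.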
The No other rule mixes |
|
I intentionally implemented formatting only for headers, tables and inline code. The goal was simply to make reading Markdown content in mcview more convenient, not to implement a full Markdown renderer. Most Markdown constructs (paragraphs, lists, emphasis, etc.) are already easily readable in their raw form, so I left them unchanged on purpose. The script only highlights a few elements that significantly improve readability in a terminal viewer. This was a deliberate design choice: keep the implementation small and lightweight instead of attempting to support the full Markdown specification, which would inevitably lead to a much more complex and heavier parser. In other words, the approach here is minimal enhancement of the source text, not full Markdown rendering. |
|
Regarding external converters: Pandoc is unfortunately far too heavy for this use case. It is a very large dependency and is unlikely to be available on minimal systems where Midnight Commander is commonly used (servers, containers, embedded systems). Requiring pandoc just to preview a Markdown file in mcview would be disproportionate. Lowdown was also evaluated, but it does not produce output that integrates well with mcview. In particular, its terminal output is designed for standalone rendering and does not map cleanly to mcview's nroff/overstrike style formatting. As a result the output quality in mcview is not satisfactory. Because of this, the goal of the script was not to implement a full Markdown renderer, but to provide a small, dependency-free improvement for the most visually disruptive elements: headers, tables, and inline code. These elements significantly improve readability in the viewer, while the rest of Markdown remains readable in its raw form. The script intentionally stays small and lightweight instead of trying to implement the full Markdown specification. |
Sorry, I wasn't clear. What I meant is left-aligned, center-aligned, right-aligned columns. |
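For reference, column alignment in the common table extension (as used on GitHub) is declared in the delimiter row: `:---` for left, `:---:` for center, `---:` for right. A minimal Python sketch of reading that row and padding a cell accordingly is shown here. The function names are hypothetical, columns without colons are treated as left-aligned as an assumption, and this is an illustration rather than the PR's awk implementation.

```python
import re

# One delimiter cell: optional leading colon, one or more dashes, optional trailing colon.
DELIM_CELL = re.compile(r"^:?-+:?$")


def parse_alignments(delimiter_row: str):
    """Return 'left', 'center' or 'right' for each column of a delimiter row."""
    cells = [cell.strip() for cell in delimiter_row.strip().strip("|").split("|")]
    alignments = []
    for cell in cells:
        if not DELIM_CELL.match(cell):
            raise ValueError(f"not a delimiter cell: {cell!r}")
        starts, ends = cell.startswith(":"), cell.endswith(":")
        if starts and ends:
            alignments.append("center")
        elif ends:
            alignments.append("right")
        else:
            alignments.append("left")  # assumption: unspecified means left
    return alignments


def pad_cell(text: str, width: int, alignment: str) -> str:
    """Pad a cell to the column width according to its alignment."""
    if alignment == "right":
        return text.rjust(width)
    if alignment == "center":
        return text.center(width)
    return text.ljust(width)
```

Note that `width` has to be the visible width, so cells containing overstrike sequences would need their backspace pairs discounted first.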
I need to check it.
|
I firmly disagree with your goal. You pick an arbitrary tiny subset of markdown's features and format them. The list is surely subjective, but even if agreed that these are the most prominent ones, addressing them and only them is IMO more confusing than useful. The use case you have in mind is tiny, minimal or embedded systems where for some reason someone would want to view a markdown file and for some reason they would want to have these particular aspects of markdown's rendering improved. Yet, you want to ship this "improvement" for all mc users out there, even those (presumably the vast majority) who are just an I believe that no decent project is the right place to embed your ad-hoc script. It's something that should live in a repo of yours, along with documentation of who the target audience is and how to hook it up to mc. mc's config files are editable by the users for a good reason, anyone who wants to can place this snippet there. By the way, it should really reside in its own file, with
All levels (from 1 to 6
Center-aligned or right-aligned columns not supported. The line at the bottom looks ugly.
Does not work inside tables. Even just properly addressing these aspects of markdown's rendering is more complex than you think it is. It belongs to external projects with proper maintenance. I don't think mc is the right place to host this minimal hack, nor that we would want to carry and maintain it. |
OK. I'll try to take into account the remarks that can be addressed without introducing significant complexity or performance cost. In particular, I can look into improving the areas you pointed out (such as header levels, table alignment and inline code handling) where the fixes are relatively small and do not require turning this into a full Markdown renderer. However, the goal will still be to keep the implementation lightweight and simple, avoiding features that would significantly increase complexity or effectively require implementing a full Markdown parser. |
I don't think that a restricted Markdown parser implemented as a monstrous AWK script without tests is a suitable addition to mc codebase in the first place. If you need/want something like that, you can put it in your user directory. |
|
I'm not one of the two main developers who will decide, I'm just writing my opinion. (Although one of the main developers has just chimed in and seems to be on my side.) I'm afraid I didn't clearly articulate my problems. As for me, it's not the "tiny" bugs that you may fix. It's the overall concept. mc is not the proper place for such hacks fulfilling the particular needs of the author, in the vague hope that supporting this particular tiny subset might be useful for others as well. Mind you, even if you could turn this into a full-blown Markdown parser-formatter, in beautiful clean code, being the best Markdown parser-formatter out there, mc still wouldn't be the right place for it. It should live as an external tool, mc only doing the gluing. |
I’m still in the top 10 contributors by number of commits to the mc project, even though I haven’t committed for about 7 years. :) I can certainly add more tests for the parser, but it seems to me that the issue here is not really about the number of tests. Still, if the patch is not needed - no problem. This functionality is useful for me, but of course the decision is yours. |
Not sure how that's relevant.
"for me" is the part that bothers me. We're supposedly developing this piece of software for a hopefully wide audience, in hopefully high quality. The bar should be much higher than "useful for me". |
you are right
Most of the functionality I’ve contributed to mc over the years was written based on the principle of “useful for me”. That’s fairly natural: I use the software myself. At the same time, this is not an unusual situation in open source. I’m a fairly typical file manager user, so things that are convenient for me tend to be convenient for many others as well. PS: I’m also starting to feel a bit uncomfortable that I’m taking up your time and pulling you into a discussion about functionality that may not even belong in upstream. |
There might be some subtle terminology differences here. Of course it's okay if you work on things that are useful for you, and prefer not to spend your time on features / fixes at areas you personally don't care about. That's one thing. The feature you're proposing here is useful for you. But the other possible way to look at it is: Out of a much-much larger feature set of markdown, you've just implemented the bare minimum that results in an obvious improvement for you. In the name of supporting markdown, you maybe support 5% of it because that's what you're looking for. It's useful for you, but I'm not buying that others would also be similarly happy with this feature. I am not familiar with your earlier contributions, I just happened to learn in another thread just the other day that you wrote the Learn keys dialog, and I'm sure you've seen the discussions about whether to keep or remove that feature. My impression with the Learn keys dialog is similar: It looks to me like it implements the bare minimum that you found useful. It's full of shortcomings; bugs if you will. It cannot override escape sequences that are already defined in terminfo, it does not recognize and cannot use urxvt's Ctrl-F1 I'm much more of an "all or nothing" guy. IMO let's do it properly, or let's not do it at all.
Don't feel uncomfortable and don't worry about it: I can at any time choose to ignore this (or any) ticket. |
The main issue is that tables are particularly hard to read in raw Markdown. That’s exactly why I addressed them specifically. Most other Markdown constructs are already reasonably readable in their raw form, but tables tend to turn into a dense block of text that is difficult to scan with the eyes. So the change was intentionally focused on that specific pain point. The rest of Markdown usually remains readable enough without additional formatting, while tables benefit the most from structured rendering.
The idea of removing the Learn Keys dialog is honestly quite discouraging to me. In my view, removing it would make dealing with terminal compatibility problems significantly harder. Despite its limitations, it provides a practical way for users to adapt mc to terminals that behave differently or send unexpected escape sequences. Without such a mechanism, resolving these issues becomes much more difficult for users.
No, I didn’t write the Learn Keys dialog. What I implemented was the basic support for remapping hotkeys.
I worked on vertical block support, block indent/unindent with Tab / Shift+Tab, the transition from single-byte handling to UTF-8, global clipboard support, the implementation of mcdiff, undo/redo in mcedit, line number display, allowing the cursor to move beyond the end of line, editor macros, text autocompletion in mcedit, and spell checking via aspell. You may want to look there — perhaps you’ll find plenty of bad code in those parts. |
I must have misunderstood something then, my bad.
For those lucky users who don't run into any of its problems. For some other users it's literally unusable. Why does it not do the same for everyone, without the limitations? :) If my assumptions are true that its issues have existed ever since this feature was added, it shouldn't have passed code review, unfortunately it did. Anyway, I shouldn't have digressed, let's get back to our topic.
I understand this. (I understand less so why you continued with those two other randomly chosen features that aren't hard to read.) What you don't understand is that production quality software, shipped to millions of people, is not the right place for such limited "quick fixes". Let alone as the default behavior. The right place for your code is your personal github repo, blog, forum posts etc. What belongs to mc is to invoke a proper markdown formatter. |
|
i actually kind of agree with ilia's goal - more isn't really required for a good user experience in this context. but the hackishness and already starting feature creep of the implementation shows that this isn't a viable approach. super resource constrained systems arguably aren't the target for browsing "random shit" (you would be doing that on the host system you control the small device from), so the dependency argument has little pull. a real argument would be the full renderer taking more than half a second for a small readme file from hot caches. |
|
Even then, I'd much rather wait half a second to get the job done properly. |
|
i wouldn't. i expect semi-instant reaction from mc, and usually actually getting it is often enough the reason why i'm satisfied with a sub-par presentation compared to launching a more fitting graphical application. when i explore a new project, it's often with the left hand on f3 and the right hand on the down key. in this "quick scan mode" i spend less than a second on many files. major hiccups always totally ruin the experience. |
So, why do you need nicely formatted tables in this mode? Isn't plain text view sufficient?
Is it that slow, though? The extensive markdown test file that I happened to find here, with the command Isn't there a markdown -> plain text converter out there that doesn't suck? One that supports tables (preferably without having to go through 7 different markdown flavors to find the only one), is fast, and has a reasonable size / dependencies? I'm sure there is one, I just haven't found it yet. No, I'm still not buying the "let's quickly do the most important 5% of the job, via an ad-hoc 200 line awk script that someone once wrote and no one dares to touch, just to shave off a tiny fraction of a second relative to properly doing it all via an externally and actively maintained project" approach. Tables aren't even part of official markdown, they're an extension, so our markdown parser/formatter would focus on basically one thing: an extension. I think this is the situation where the best thing for us is to do nothing. There are so many flavors anyway, we can't know which one the source file is in. Users can easily hook up whatever viewer filter suits them best. |
the premise is mc starting to do the rendering automatically on f3. it is correct that this would be rarely useful.
it's more than the threshold of what does not feel instant (~300 ms).
maybe.
no, they can't, as they are likely to encounter different flavors on a single system. an ideal system would try to identify the flavor automatically (probably by correlating other files in the same repo, and examining the repo's upstream remote), and fall back to displaying a selection prompt when it encounters a construct that is ambiguous. |


## Proposed changes
Add Markdown view support through doc.sh and switch markdown binding to %view{ascii,nroff}.
Extend markdown file matching to include both .md/.mkd and .markdown.
Implement lightweight markdown rendering for mcview using nroff overstrikes:
Add markdown table rendering with:
Resolves: N/A
Checklist
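
As background for the "nroff overstrikes" mentioned in the proposed changes: the convention is that a character followed by a backspace and the same character is shown as bold, and an underscore followed by a backspace and a character is shown as underlined. The Python sketch below illustrates that encoding and how to recover the visible width of such text, which matters when computing table column widths. It is a hedged illustration with hypothetical function names, not the awk code from this PR.

```python
import re

BACKSPACE = "\b"

# Any character immediately followed by a backspace is the hidden half of a pair.
OVERSTRIKE = re.compile(r".\x08")


def bold(text: str) -> str:
    """Bold: each character is struck over itself."""
    return "".join(f"{ch}{BACKSPACE}{ch}" for ch in text)


def underline(text: str) -> str:
    """Underline: each character is struck over an underscore."""
    return "".join(f"_{BACKSPACE}{ch}" for ch in text)


def strip_overstrikes(text: str) -> str:
    """Drop the hidden half of every overstrike pair, leaving the visible text."""
    return OVERSTRIKE.sub("", text)


def visible_width(text: str) -> int:
    """Number of characters that occupy a column once overstrikes are resolved."""
    return len(strip_overstrikes(text))
```

For example, `bold("ab")` is six characters long in the byte stream but has a visible width of two.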