Pragmas#
| Item | Description |
|---|---|
| Extension ID | linter-pragmas |
| GFM Extension Status | Unofficial |
| Configuration Item | extensions.linter-pragmas.enabled |
| Default Value | True |
Configuration#
| Prefixes |
|---|
| extensions.linter-pragmas. |

| Value Name | Type | Default | Description |
|---|---|---|---|
| enabled | boolean | True | Whether the extension is enabled. |
Summary#
This extension allows the PyMarkdown parser to look for "Pragmas" that provide metadata about a Markdown document. This information is then used by the Rule Engine to alter how Rule Failures are processed.
The most common use case for Pragmas is to disable Rule Failures for a specific Rule Plugin on the line that follows the Pragma. As a logical extension of this, there is also a form of the Pragmas that disables Rule Failures for a specified number of lines after the Pragma.
Examples#
The following Markdown text:
```Markdown
some paragraph

#  My Bad Atx Heading

some other paragraph
```
causes PyMarkdown to report a failure of no-multiple-space-atx on line 3.
With this extension enabled, the following Markdown text:
```Markdown
some paragraph

<!-- pyml disable-next-line no-multiple-space-atx-->
#  My Bad Atx Heading

some other paragraph
```
will not cause PyMarkdown to report a failure.
Specifics#
The GitHub Flavored Markdown specification focuses on the parsing of Markdown and the uniform generation of HTML based on that parsing. As such, the authors of that document did not provide any guidance in the specification that addresses the needs of GFM compliant linters. The Pragmas extension was created to specifically solve the problem of being able to suppress a specific instance of a Rule Failure being reported.
Nomenclature#
The word "pragma" is a term used to specify an instruction that tells a language compiler or interpreter how to interpret the object it is processing.
History#
Existing linters have proven mechanisms to deal with the suppression of notifications that they generate. Whether those notifications are called issues, failures, or Rule Failures, the need is the same. Each of these applications needs a mechanism embedded within the object being scanned that tells the application that the notification is not required as it has been manually verified by the user.
Consider the case of a text document and a word processor. Once you load the document into the word processor, red and blue lines appear in your view of the document. Based on age-old editing standards, the red lines typically show spelling errors while the blue lines typically show grammar and other errors. When you click on those lines in most word processors, a context menu comes up and gives you options: different ways to correct the error, or an option to ignore the error.
Linting applications are no different. There is a set of rules that the user has asked to be applied to the object being scanned. When an error is found, users want either to fix the error or to ignore that failure without turning that rule off. Relating this to the word processor example above, if you have a spelling error in your document, you do not want to turn off all spell checking because you disagree with one error. You want options to deal with it on a more granular level.
When deciding how to present failure suppressions to the user, we looked at popular linting applications for inspiration:
- PMD - Java Code Analyzer: `@SuppressWarnings("PMD.issue")`
- PyLint - Python Code Linter: `# pylint: disable=issue`
- Markdown Lint - Node.js Markdown Linter: `<!-- markdownlint-disable issue -->`
We quickly noted that in each case, the suppression mechanism is either a code annotation or a specific form of comment that the application can process. Another important feature is that the suppression object contains all required information needed to identify the suppression.
We also noted two drawbacks. The first is that the PMD suppression attaches to an object, such as a class or a method, and applies to anything within the scope of that object. The second is that a PyLint suppression takes effect from the point of the suppression until the end of the file or until the opposite enable/disable is found. As the PyMarkdown linter is written in Python and we use PyLint on our code, we can relate many stories of disabling an issue early in a module and then missing a real problem later in that same module. We ran into this issue so often that we wrote a simple parser to help us detect these cases in our Python code, allowing us to address them.
Based on those experiences, our team decided that we prefer to have the suppression specified exactly where it is needed without the need for checking scope and without the need for looking for matching disable/enable statements.
Implementation#
After looking at examples from other applications, our team decided that
for Markdown documents, a specialized HTML comment block (multiline text that starts
with <!-- or <!--- and ends with --> or --->) was the best option. In addition,
we decided to separate the identification of a Pragma statement from the
interpretation of the Pragma command that it contains.
Pragma Statements#
Pragmas must occur at the start of the line. No initial whitespace is allowed.
The Pragma is a normal HTML comment, starting with the character
sequence <!-- or <!---, and ending with the character
sequence --> or --->.
As some editors do not clearly show trailing whitespace
characters, any number of whitespace characters may follow
the closing character sequence. Within the bounds of the HTML comment, the
Pragma data is preceded by zero or more whitespace characters, the character
sequence pyml and a single space character. The remaining text within the
comment is referred to as the Pragma command.
To put this into practical terms, a valid Pragma line matches the following regular expression:
```regex
^<!--[\t\s]*pyml\s(.*)[\t\s]*-->[\t\s]*$
```
where the group (.*) is the Pragma command.
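As a minimal sketch of how that pattern can be applied, the following Python compiles the expression exactly as quoted and pulls out the Pragma command. The function name `extract_pragma_command` is hypothetical and is not part of PyMarkdown. Note that the quoted pattern only covers the `<!--` and `-->` pair.

```python
import re

# The pattern quoted above, compiled as-is.
PRAGMA_LINE = re.compile(r"^<!--[\t\s]*pyml\s(.*)[\t\s]*-->[\t\s]*$")


def extract_pragma_command(line):
    """Return the Pragma command on `line`, or None if the line is not a Pragma.

    Hypothetical helper for illustration only.
    """
    match = PRAGMA_LINE.match(line)
    if match is None:
        return None
    # The greedy (.*) group also captures any spaces before "-->", so trim them.
    return match.group(1).strip()
```

Because the pattern is anchored with `^`, a comment with leading whitespace is not treated as a Pragma, while whitespace after the closing sequence is accepted.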
Available: Version 0.9.32
To avoid any confusion, if the character sequence <!-- is used to start the Pragma,
the --> sequence must be used to close the Pragma. Likewise for the sequences
<!--- and --->. This is a breaking change from before where only the --> sequence
was used to close the Pragma, which was confusing to some users.
Removal From Document's Token Stream#
Regardless of whether the extracted Pragma command is valid or not, once a Pragma statement has been identified, it is completely removed from the parser's purview. The Pragma command is then stored in a separate list along with its location metadata.
The removal of the Pragma statement from the parser's Markdown token stream takes some getting used to, but it is a logical action. Each Pragma provides instructions to the parser on how to handle that part of the document; it does not provide content for the document. As Pragmas do not provide content, the processing would get complicated if the Pragma statement were not removed from the parser's view.
Take this Markdown as an example:
```Markdown
some paragraph

#  My Bad Atx Heading

some other paragraph
```
With PyMarkdown's default settings, the Atx Heading element on line 3 raises a Rule
Failure for Rule Plugin no-multiple-space-atx because more than one space follows
the opening # character of the heading.
To suppress that Rule Failure using a Pragma, the appropriate Pragma command
to add is disable-next-line no-multiple-space-atx, as follows:
```Markdown
some paragraph

<!-- pyml disable-next-line no-multiple-space-atx-->
#  My Bad Atx Heading

some other paragraph
```
If that Pragma were not "invisible" to the parser, adding the Pragma statement
would have a cascading effect, causing Rule Plugin MD022 to trigger. That is because
Rule Plugin MD022 mandates that Heading elements are surrounded by blank lines.
You would then need another Pragma to suppress that failure, which we believe is
inefficient and messy.
By removing the Pragma (and therefore the Pragma line) from the parser's viewpoint, everything is simplified. The Pragma data is stored in separate storage so it can be acted on properly, while not interfering with the parser's work of processing the Markdown text.
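The separation described above can be sketched as a pre-pass over the document's lines. This is an illustration under stated assumptions, not PyMarkdown's actual implementation: the name `split_pragmas` and the `(line_number, command)` tuple layout are hypothetical, and line numbers are assumed to be 1-based positions in the original document.

```python
import re

PRAGMA_LINE = re.compile(r"^<!--[\t\s]*pyml\s(.*)[\t\s]*-->[\t\s]*$")


def split_pragmas(markdown_text):
    """Separate Pragma lines from the lines handed to the Markdown parser.

    Returns (content_lines, pragmas), where pragmas is a list of
    (line_number, command) tuples. Hypothetical helper for illustration.
    """
    content_lines = []
    pragmas = []
    for line_number, line in enumerate(markdown_text.splitlines(), start=1):
        match = PRAGMA_LINE.match(line)
        if match:
            # Stored even if the command is invalid; validation happens later.
            pragmas.append((line_number, match.group(1).strip()))
        else:
            content_lines.append(line)
    return content_lines, pragmas
```

With the Pragma line filtered out, the heading in the earlier example is still preceded by a blank line in the content that the parser sees, which is why no additional blank-line failure is introduced.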
Pragma Commands#
As mentioned above, Pragma commands are stored in a list for post-parsing processing. Any errors processing the Pragma commands are handled in the same manner as with Rule Failures.
When a valid Pragma command is processed, the extension does not emit any information.
If there are any errors, an INLINE error is generated with a clear indication of
the error that was raised. For example, given the following three invalid Pragma
commands:
```Markdown
<!-- pyml -->
<!-- pyml bad -->
<!-- pyml disable-num-lines a no-multiple-space-atx-->
```
the following Pragma command failures are generated:
```text
{filename,row,col}: INLINE: Inline configuration specified without command.
{filename,row,col}: INLINE: Inline configuration command 'bad' not understood.
{filename,row,col}: INLINE: Inline configuration command 'disable-num-lines' specified a count 'a' that is not a valid positive integer.
```
Note that as with Rule Failures, Pragma command failures do not stop the parsing of the Markdown document.
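A rough sketch of how the three failures above could be produced is shown below. The function `check_pragma_command` is hypothetical. It models only those three messages, it skips identifier checks, and its handling of a missing count is an assumption rather than documented behavior.

```python
KNOWN_COMMANDS = {"disable-next-line", "disable-num-lines", "disable", "enable"}


def check_pragma_command(command):
    """Return an error message for a malformed command, or None if none applies.

    Hypothetical helper: only the three failures quoted above are modelled.
    """
    parts = command.split()
    if not parts:
        return "Inline configuration specified without command."
    name = parts[0]
    if name not in KNOWN_COMMANDS:
        return f"Inline configuration command '{name}' not understood."
    if name == "disable-num-lines":
        # Assumption: a missing count is reported like an invalid count.
        count = parts[1] if len(parts) > 1 else ""
        try:
            value = int(count)
        except ValueError:
            value = 0
        if value <= 0:
            return (
                f"Inline configuration command '{name}' specified a count "
                f"'{count}' that is not a valid positive integer."
            )
    return None
```

Returning a message instead of raising an exception mirrors the note above: a bad Pragma command is reported, but processing of the document continues.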
Available Commands#
Originally, the only two commands were the disable-next-line command
and the disable-num-lines command. By keeping things simple, our team
hopes to keep Pragmas understandable and their implementation simple. However,
because of user requests, version 0.9.30 added support for the disable
and enable commands.
Disable-next-line Command#
The disable-next-line command is followed by at least one whitespace character
and a comma-separated list of identifiers. Those identifiers specify one or more
Rule Plugins that will have their ability to generate Rule Failures suppressed for
the line after the Pragma command.
Pragma command failures are issued if:
- an identifier (Rule ID or alias) is not provided
- the identifier is not an existing Rule ID or alias for one of the Rule Plugins
Therefore, a proper command to suppress Rule ID MD031 on the next line
is disable-next-line MD031 or disable-next-line blanks-around-fences.
```Markdown
some paragraph

<!-- pyml disable-next-line no-multiple-space-atx-->
#  My Bad Atx Heading

some other paragraph
```
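The rules for the disable-next-line command can be sketched as follows. Everything here is illustrative: `parse_disable_next_line` and `RULE_IDS_BY_IDENTIFIER` are hypothetical names, the identifier table holds only the pair named in the text above, and the line number of the "next line" is assumed to be counted in the original document.

```python
# Illustrative identifier table; the real set comes from the loaded Rule Plugins.
RULE_IDS_BY_IDENTIFIER = {
    "MD031": "MD031",
    "blanks-around-fences": "MD031",
}


def parse_disable_next_line(command, pragma_line):
    """Return {line_number: rule_ids} for a disable-next-line command.

    Raises ValueError in the two cases where the text above says a Pragma
    command failure is issued. Hypothetical helper for illustration.
    """
    parts = command.split(None, 1)
    if not parts or parts[0] != "disable-next-line":
        raise ValueError("not a disable-next-line command")
    remainder = parts[1] if len(parts) > 1 else ""
    # Tolerating spaces around the commas is a choice made by this sketch.
    identifiers = [item.strip() for item in remainder.split(",") if item.strip()]
    if not identifiers:
        raise ValueError("no identifier provided")
    rule_ids = set()
    for identifier in identifiers:
        rule_id = RULE_IDS_BY_IDENTIFIER.get(identifier)
        if rule_id is None:
            raise ValueError(f"unknown identifier: {identifier}")
        rule_ids.add(rule_id)
    return {pragma_line + 1: rule_ids}
```

Resolving both the Rule ID and the alias to the same Rule ID means the later suppression check does not need to care which identifier the author wrote.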
Disable-num-lines Command#
The disable-num-lines command is followed by at least one whitespace character,
a positive integer, at least one whitespace character, and a comma-separated list
of identifiers. Those identifiers specify one or more
Rule Plugins that will have any generation of Rule Failures suppressed for the
specified number of lines after the Pragma command.
Pragma command failures are issued if:
- a count is not specified (in which case no identifiers are specified either)
- the count is not a positive integer
- an identifier (Rule ID or alias) is not provided
- the identifier is not an existing Rule ID or alias for one of the Rule Plugins
Therefore, a proper command to suppress Rule ID MD031 on the next three lines
is disable-num-lines 3 MD031 or disable-num-lines 3 blanks-around-fences.
```Markdown
<!-- pyml disable-num-lines 3 no-multiple-space-atx-->
some paragraph

#  My Bad Atx Heading

some other paragraph
```
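To make the line counting concrete, here is a small sketch of which lines a disable-num-lines command covers. The helpers `suppressed_lines` and `is_suppressed` are hypothetical, and the sketch assumes that lines are numbered by their position in the original document, with the Pragma line itself not counted.

```python
def suppressed_lines(pragma_line, count):
    """Return the 1-based line numbers covered by `disable-num-lines <count>`."""
    if count <= 0:
        raise ValueError("count must be a positive integer")
    return range(pragma_line + 1, pragma_line + count + 1)


def is_suppressed(failure_line, rule_id, pragma_line, count, rule_ids):
    """Report whether a failure for `rule_id` on `failure_line` is suppressed."""
    return rule_id in rule_ids and failure_line in suppressed_lines(pragma_line, count)
```

Under those assumptions, a Pragma on line 1 with a count of 3 covers lines 2 through 4 and nothing after them.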
Disable Command and Enable Command#
Available: Version 0.9.30
Introduced due to user requests, the disable and enable commands are used to
define a range within which a Rule Plugin is disabled. Typically, the commands
are paired in the following fashion:
```Markdown
<!-- pyml disable line-length-->
| Column 1 | Column 2 | Column 3 | Column 4 | Column 5 | Column 6 | Column 7 |
|---|---|---|---|---|---|---|
| data | data | data | data | data | data | data |
<!-- pyml enable line-length-->
```
As with the disable-next-line and disable-num-lines commands, a comma-separated
list of identifiers can be specified for these commands. A command to enable
a Rule Plugin that has not been disabled will be ignored, as will a command to disable
a Rule Plugin that has already been disabled.
In addition, the enabling or disabling of Rule Plugins does not have to be done together. Consider this example:
```markdown
<!-- pyml disable MD019,line-length-->
#  Header with double spaces

This is a simple document with a table, which is not yet supported.

<!-- pyml disable line-length-->
| Column 1 | Column 2 | Column 3 | Column 4 | Column 5 | Column 6 | Column 7 |
|---|---|---|---|---|---|---|
| data | data | data | data | data | data | data |
<!-- pyml enable line-length-->
```
In this example, Rule Plugin MD019 and the line-length Rule Plugin are both
disabled on line 1. Note that the Rule ID is used for one Rule Plugin whereas
the Rule Plugin's alias is used for the other. As both are identifiers, both are
valid for enabling and disabling a Rule Plugin. Also note that the line-length
Rule Plugin is disabled on line 1 and again on line 6. In this case, the disable
on line 6 was probably added as part of a pattern to surround possible tables with
a disable/enable block. Since the disable is only repeated, there is no negative
effect. Finally, note that only the line-length Rule Plugin is enabled on line 10.
While this closes the line-length disable block, it does not affect the disable
block for Rule Plugin MD019. This is allowed, as a disable block does not have to
be terminated. The effect is that Rule Plugin MD019 is disabled from line 1 to the
end of the document.
Important Notes#
There are three concepts that are important to stress when
using these commands. The first concept is that these commands are applied after
the disable-next-line and disable-num-lines commands are applied.
The second concept is that these commands can only disable a Rule Plugin that is
enabled through configuration; they cannot enable one that is not.
The third concept is that the disable commands and enable commands do not stack.
The disable-next-line and disable-num-lines commands are specific about which
Markdown lines they apply to, either the next line or the specified number of lines.
To honor this sense of specificity, the disable and enable commands are applied
after those two commands. This ensures that the more specific disables have priority
and presents a predictable pattern to the user.
Similarly, the disable-next-line and disable-num-lines commands were
originally created to suppress an enabled Rule Plugin for a small portion of the
Markdown document. The driving factor for that decision was to keep the algorithms
for deciding whether a Rule Plugin is enabled simple. A Rule Plugin is enabled or
disabled through configuration, with the disable-next-line and disable-num-lines
commands providing temporary relief from the constraints of a specific Rule Plugin.
To keep the system from getting complex, the disable and enable commands retain
that philosophy by only allowing Rule Plugins to be disabled in blocks, not enabled.
Finally, the disable and enable commands do not stack. If two disable commands
are present in a Markdown document for the same Rule Plugin, the first command is
applied and the second command is ignored. Similarly, if an enable command is present
and there is no "active" disable command for that Rule Plugin, the enable command
is ignored. These commands are not meant to be complex, and this is one way in which
their simplicity can be maintained.
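The non-stacking behavior can be sketched as a small state machine over the disable and enable commands for a single Rule Plugin. This is an illustration only: `disabled_ranges` is a hypothetical name, the event tuples are an assumed representation, and the returned ranges include the Pragma lines themselves for simplicity.

```python
def disabled_ranges(events, last_line):
    """Compute the line ranges in which one Rule Plugin is disabled.

    `events` is a list of (line_number, action) tuples in document order,
    where action is "disable" or "enable". Returns inclusive
    (first_line, last_line) tuples. Hypothetical helper for illustration.
    """
    ranges = []
    open_start = None
    for line_number, action in events:
        if action == "disable":
            if open_start is None:
                open_start = line_number
            # Otherwise a block is already open, so the repeat is ignored.
        elif action == "enable":
            if open_start is not None:
                ranges.append((open_start, line_number))
                open_start = None
            # Otherwise there is no active disable, so the enable is ignored.
    if open_start is not None:
        # An unterminated block runs to the end of the document.
        ranges.append((open_start, last_line))
    return ranges
```

For the ten-line example described earlier, the line-length events on lines 1, 6, and 10 collapse into one range covering lines 1 to 10, and the single MD019 disable on line 1 also yields lines 1 to 10 because it is never closed.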