Rule - MD033#
| Property | Value |
|---|---|
| Aliases | md033, no-inline-html |
| Autofix Available | No* |
| Enabled By Default | Yes |
Summary#
Inline HTML.
Reasoning#
Portability#
The primary reason for enabling this rule is to prevent document writers from using HTML elements in their documents, or to restrict them to a small subset of HTML elements. The two main motivations for this are:
- for specific reasons, such as security, where allowing HTML to be added to a document is not desirable
- for use with Markdown parsers that render their output as something other than HTML
Examples#
Failure Scenarios#
This rule triggers if either an HTML block or an inline Raw HTML element is detected.
```Markdown
This is <i>Raw</i> HTML.
```
Correct Scenarios#
This rule only triggers on HTML elements that are present and not on the
allowed_elements list. Therefore, one solution to correct the above
example is to express both concepts as Markdown elements instead of HTML
elements:
```Markdown
This is *Raw* HTML.
```
Depending on the reasons for enabling this rule, the other approach is
to add any required elements to the allowed_elements list. Setting
allowed_elements to image,i through configuration will
also suppress the triggering of this rule.
However, as this rule is generally enabled for a strong and deliberate
reason to not allow inline HTML, using this approach for any
tags other than the default !-- (HTML comment) is strongly discouraged.
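As a rough illustration of how such a setting could be expressed, the Python sketch below expands the dotted name plugins.md033.allowed_elements (the prefix and value name are listed in the Configuration section) into a nested dictionary and prints it as JSON. That the linter reads a JSON file nested in this way is an assumption, and nest_setting is a hypothetical helper, not part of any tool.

```python
import json


def nest_setting(dotted_key, value):
    """Expand 'a.b.c' into {'a': {'b': {'c': value}}} (hypothetical helper)."""
    nested = value
    for part in reversed(dotted_key.split(".")):
        nested = {part: nested}
    return nested


# The value "image,i" is the one quoted in the text above.
config = nest_setting("plugins.md033.allowed_elements", "image,i")

# Assumption: a JSON configuration file mirrors the dotted prefix as nesting.
print(json.dumps(config, indent=2))
```

Note that replacing the value in this way drops the default entries, so HTML comments would need to be listed again if they should stay allowed.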
Fix Description#
This rule has no auto-fix because a fix would duplicate behavior that is already available elsewhere. There are two primary use cases for this rule:
- Preventing any HTML from being present in the document.
- Preventing specific raw HTML elements from being present in the document.
To address the first use case, the Disallow HTML
extension is available. When enabled, this extension prevents the use of any HTML
elements that change the context of nested elements. This protection
occurs by replacing the < character with the &lt; sequence.
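The replacement described above can be sketched in a few lines of Python. This is only an illustration: the tag list is the one given by the GitHub Flavored Markdown specification for its disallowed raw HTML extension, and it is an assumption that the extension mentioned here filters the same set. The names DISALLOWED_TAGS and neutralize are hypothetical.

```python
import re

# Assumption: tag names taken from the GFM "Disallowed Raw HTML" extension.
DISALLOWED_TAGS = (
    "title", "textarea", "style", "xmp", "iframe",
    "noembed", "noframes", "script", "plaintext",
)

# Match a '<' only when it opens (or closes) one of the listed tags.
_DISALLOWED_START = re.compile(
    r"<(?=/?(?:%s)(?:[\s/>]|$))" % "|".join(DISALLOWED_TAGS),
    re.IGNORECASE,
)


def neutralize(html):
    """Replace the leading '<' of each listed tag with '&lt;'."""
    return _DISALLOWED_START.sub("&lt;", html)


print(neutralize("<b>bold</b> <script>alert(1)</script>"))
```

Tags outside the list, such as the b element in the sample, pass through unchanged.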
In the second use case, users want to scan for specific HTML tags to prevent their use outright. In most scenarios, those users want to report the error information back to their authors, giving them a chance to correct the text before resubmitting it.
As the first case is addressed by the Disallow HTML extension and, in the second case, most users want to handle the reported errors themselves, there is currently no compelling argument to provide a fix for this rule.
Configuration#
| Prefixes |
|---|
| plugins.md033. |
| plugins.no-inline-html. |
| Value Name | Type | Default | Description |
|---|---|---|---|
| enabled | boolean | True | Whether the Rule Plugin is enabled. |
| allowed_elements | string | !--,![CDATA[,!DOCTYPE | Comma-separated list of tag starts that are allowable.** |
| allow_first_image_element | boolean | True | Whether to allow an image HTML block. |
** The comma-separated list of items is a string with a format of {item},...,{item}.
Any leading or trailing space characters surrounding the {item} are trimmed during
processing. Empty {item} values after trimming has been applied will generate
a configuration error.
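The list format described above can be sketched as follows. This is a minimal illustration, not the plugin's actual code: the function name parse_allowed_elements is hypothetical, and a ValueError stands in for whatever configuration error the tool reports.

```python
from typing import List


def parse_allowed_elements(value: str) -> List[str]:
    """Split '{item},...,{item}', trimming spaces around each item."""
    # Only space characters are trimmed, following the description above.
    items = [item.strip(" ") for item in value.split(",")]
    for item in items:
        if not item:
            # Assumption: an item that is empty after trimming is an error.
            raise ValueError("empty item in allowed_elements: %r" % value)
    return items


print(parse_allowed_elements("!--, ![CDATA[ ,!DOCTYPE"))
```

In this sketch a completely empty string is also rejected, because splitting it yields a single empty item; whether the real plugin treats that case the same way is not stated here.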
The element names in the list are derived by taking the start of the tag and skipping
over the start character <. From that point, the parser collects the contents
of the tag up to one of the following:
- the first whitespace character
- the close HTML tag character (/)
- the end HTML tag character (>)
As every tag name must be followed by a whitespace character, the closing
characters, or the end character, this supplies a straightforward way to represent each HTML
tag. The only exceptions to this are when the rule encounters the
CDATA character sequence ![CDATA[ or the HTML comment sequence !-- right after
the start HTML tag character (<).
Because those sequences do not require any whitespace to follow them, they are managed
separately.
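The name-collection steps described above might look like the following Python sketch. It is an illustration under stated assumptions rather than the rule's real implementation: element_name and SPECIAL_SEQUENCES are hypothetical names, and the sketch does nothing special for closing tags.

```python
# Sequences that need no trailing whitespace, so they are matched first.
SPECIAL_SEQUENCES = ("![CDATA[", "!--")


def element_name(raw_tag: str) -> str:
    """Return the name to compare against allowed_elements."""
    if not raw_tag.startswith("<"):
        raise ValueError("expected text starting with '<'")
    body = raw_tag[1:]  # skip over the start character
    for sequence in SPECIAL_SEQUENCES:
        if body.startswith(sequence):
            return sequence
    collected = []
    for character in body:
        # Stop at whitespace, the close character, or the end character.
        if character.isspace() or character in "/>":
            break
        collected.append(character)
    return "".join(collected)


for tag in ('<img src="a.png"/>', "<br/>", "<!--note-->", "<!DOCTYPE html>"):
    print(tag, "->", element_name(tag))
```

A tag is then allowed when the returned name appears in the configured allowed_elements list.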
Allowing For Image Headings#
Looking at numerous GitHub project pages, a significant number of well-established
projects use an HTML image for their initial heading. This is
already supported through Rule Md041, which allows an h1 tag at the very start
of the document to satisfy the requirement that the document start with a level
1 Heading element:
```Markdown
```
To round out that support in this rule, the allow_first_image_element configuration
value was added to provide an exception to the normal allowed_elements configuration value. This
exception applies only to the very first element in the document, and only
if that HTML Block element starts and ends with an h1 tag, with only an
img tag between them.
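The shape of that exception can be approximated with a regular expression. The sketch below is an assumption-laden illustration and not the plugin's implementation: is_image_heading is a hypothetical name, the sample file name logo.png is invented, and the pattern does not cope with a > character inside an attribute value.

```python
import re

# An h1 start tag, a single img tag, and the h1 end tag, with optional
# whitespace between them and nothing else in the block.
_IMAGE_HEADING = re.compile(
    r"\A<h1(\s[^>]*)?>\s*<img(\s[^>]*)?/?>\s*</h1>\Z",
    re.IGNORECASE,
)


def is_image_heading(html_block: str) -> bool:
    """Return True if the block is an h1 wrapping only an img tag."""
    return _IMAGE_HEADING.match(html_block.strip()) is not None


sample = '<h1 align="center"><img src="logo.png" alt="Logo"/></h1>'
print(is_image_heading(sample))
print(is_image_heading("<h1>Plain text heading</h1>"))
```

The check would only be relevant for the first element of the document; later blocks of the same shape still go through the normal allowed_elements test.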
Special Sequences - Processing Instructions and the DOCTYPE Directive#
Earlier versions of the Rule Plugin identified a generic Processing
Instruction starting with the ? character. Further
research determined that Processing Instructions are supported in XML documents,
but not in HTML documents.
After confirming that information with other sources, the Processing Instruction sequence
was removed from the default for the allowed_elements configuration value.
At the same time, research was performed on Declarations that start with the !
character. Like Processing Instructions, these elements are used in XML
documents for special instructions, but do not appear in HTML documents. The one
exception is the !DOCTYPE declaration, which
is included in HTML documents to ensure that the browser interprets the document
in the manner in which it was intended to be parsed.
Origination of Rule#
This rule is largely inspired by the MarkdownLint rule MD033.
Differences From MarkdownLint Rule#
The substantial difference from the original rule is that the original rule only
triggers if an alphabetic character follows the starting < character. While
that worked in most cases, it precluded the detection of HTML start conditions
number 2 to number 5 and the closing tag case for number 7.
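To make the gap concrete, the Python sketch below applies a letter-only check to one sample per case. The function letter_only_detects is a hypothetical stand-in for the behavior described above, not the original rule's code, and the samples are labeled with the CommonMark HTML block start conditions they correspond to.

```python
import string


def letter_only_detects(text: str) -> bool:
    """Fire only when an ASCII letter directly follows the '<' character."""
    return len(text) >= 2 and text[0] == "<" and text[1] in string.ascii_letters


SAMPLES = {
    "<div>": "ordinary start tag",
    "<!-- note -->": "start condition 2 (comment)",
    "<?xml version='1.0'?>": "start condition 3 (processing instruction)",
    "<!DOCTYPE html>": "start condition 4 (declaration)",
    "<![CDATA[data]]>": "start condition 5 (CDATA)",
    "</span>": "closing tag under start condition 7",
}

# Everything except the ordinary start tag slips past the letter-only check.
missed = [sample for sample in SAMPLES if not letter_only_detects(sample)]
for sample in missed:
    print("missed:", sample, "-", SAMPLES[sample])
```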
In creating this rule to work with all HTML tags, including the missing
HTML start conditions, the
allowed_elements configuration default value is set to !--,![CDATA[,!DOCTYPE so
that common HTML constructs do not trigger this rule by default.
To provide better support for the "image as a heading" scenario, the
allow_first_image_element configuration value was added to specifically
allow that scenario to not trigger this rule.