1. Problem
Imagine you have the following test.md Markdown file:
> **Note:** Single line note

Unrelated text

> **Warn:** Multi-line warning
Second line
Third line

Unrelated text

> **Alert:** Prettier multi-line alert
> Second line
> Third line

Unrelated text

> **note**: Irregular colon and lowercase 'note'

Unrelated text

> **warn** No colon and lowercase 'warn'

Unrelated text
For each blockquote-style admonition, you want to replace it with a Hugo shortcode that has the same content. The Hugo shortcode looks like:
{{% admonition type="note" %}}
Single line note
{{% /admonition %}}
2. Solution with GNU sed
Match any of 'Note', 'Alert', 'Warn', or their lowercase counterparts.
TYPES='[Nn]ote|[Aa]lert|[Ww]arn'
Match the start of a Markdown blockquote.
BLOCKQUOTE='> '
Match the start of blockquote admonitions allowing for a colon inside or outside of the emphasised type.
MARKER="${BLOCKQUOTE}\*\*(${TYPES}):?\*\*:?"
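As a quick sanity check of the marker pattern, the sketch below feeds a few sample first lines through `grep -E`. Using `grep` here is an assumption made for illustration (the article itself only uses sed), and the sample lines are shortened versions of those in test.md.

```shell
#!/usr/bin/env bash
TYPES='[Nn]ote|[Aa]lert|[Ww]arn'
BLOCKQUOTE='> '
MARKER="${BLOCKQUOTE}\*\*(${TYPES}):?\*\*:?"

# Count how many of the sample lines start with an admonition marker.
# The last line is a plain blockquote and should not be counted.
matches=$(printf '%s\n' \
  '> **Note:** Single line note' \
  '> **note**: Irregular colon' \
  '> **warn** No colon' \
  '> Plain blockquote' |
  grep -cE "^${MARKER}")

echo "${matches}"
```

The three admonition variants (colon inside, colon outside, no colon) are counted, while the plain blockquote is not.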
Match everything up to the end of the line (requires sed -z).
REST_OF_LINE='[^\n]+\n'
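To see why `-z` is needed, here is a minimal sketch that applies the same pattern with and without it. The two-line input is made up for illustration.

```shell
#!/usr/bin/env bash
REST_OF_LINE='[^\n]+\n'

# Without -z, sed handles one line at a time and the pattern space never
# contains a newline, so the pattern cannot match and nothing is removed.
without_z=$(printf 'one\ntwo\n' | sed -E "s/^${REST_OF_LINE}//")

# With -z, records are separated by NUL bytes, so the whole input is a
# single record and the first line (including its newline) is removed.
with_z=$(printf 'one\ntwo\n' | sed -Ez "s/^${REST_OF_LINE}//")

printf 'without -z:\n%s\n' "${without_z}"
printf 'with -z:\n%s\n' "${with_z}"
```

Both the `-z` option and `\n` inside a bracket expression are GNU sed features, so this is unlikely to behave the same way with other sed implementations.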
Match the opening and closing tags for an admonition.
ADMON_OPEN="\{\{% admonition type=\"[\(${TYPES}\)]+\" %\}\}"
ADMON_CLOSE='\{\{% \/admonition %\}\}'
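These two patterns are used in the final script as a sed address in front of `s/> //g`. One thing to keep in mind: with `-z` the whole file is a single record, so when the address matches, the substitution appears to apply to every `> ` in the file, not only to those between the tags. The sketch below illustrates this with a simplified, literal opening tag (an assumption for readability, not the article's pattern) and a made-up input.

```shell
#!/usr/bin/env bash
# Made-up input: one converted admonition followed by an ordinary blockquote.
input='{{% admonition type="alert" %}}
Prettier multi-line alert
> Second line
{{% /admonition %}}

> An ordinary blockquote
'

# Simplified address: literal "alert" tag instead of ${ADMON_OPEN}.
stripped=$(printf '%s' "${input}" |
  sed -Ez '/\{\{% admonition type="alert" %\}\}.+\{\{% \/admonition %\}\}/s/> //g')

printf '%s\n' "${stripped}"
```

For a file like test.md, which contains no other blockquotes, this makes no difference. For a file that mixes admonitions with ordinary blockquotes, it is worth checking the output.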
A complete regexp for matching blockquote admonitions. It matches the admonition marker, capturing the type in the first capture group, the rest of the first line in the second capture group, and all subsequent lines in the third capture group.
BLOCKQUOTE_ADMON="${MARKER} ?(${REST_OF_LINE})((${REST_OF_LINE})?+)\n"
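To make the capture groups visible, the following sketch replaces a match with labelled copies of each group. The labels (`TYPE:`, `FIRST:`, `REST:`, `END`) are hypothetical and exist only for this illustration.

```shell
#!/usr/bin/env bash
TYPES='[Nn]ote|[Aa]lert|[Ww]arn'
BLOCKQUOTE='> '
MARKER="${BLOCKQUOTE}\*\*(${TYPES}):?\*\*:?"
REST_OF_LINE='[^\n]+\n'
BLOCKQUOTE_ADMON="${MARKER} ?(${REST_OF_LINE})((${REST_OF_LINE})?+)\n"

# Groups 2 and 3 keep their trailing newlines, so no \n is added after them.
groups=$(printf '> **Warn:** Multi-line warning\nSecond line\nThird line\n\nUnrelated text\n' |
  sed -Ez "s/${BLOCKQUOTE_ADMON}/TYPE:\1\nFIRST:\2REST:\3END\n/")

printf '%s\n' "${groups}"
```

The blank line that terminates the admonition is consumed by the trailing `\n` of the pattern, which is why the replacement in the article adds blank lines back after the closing tag.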
Given the aforementioned capture groups, one can replace them with an admonition shortcode.
The \L and \E transform the type to lowercase.
SHORTCODE_ADMON='{{% admonition type="\L\1\E" %}}\n\2\3{{% \/admonition %}}\n\n'
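A small sketch of the case conversion on its own, using made-up input strings. `\L` and `\E` are GNU sed extensions to the replacement syntax.

```shell
#!/usr/bin/env bash
# \L lowercases everything that follows until \E ends the conversion.
lowered=$(printf 'type=Warn\n' | sed -E 's/type=([A-Za-z]+)/type="\L\1\E"/')

# Text after \E is left untouched.
partial=$(printf 'NOTE KEEP\n' | sed -E 's/([A-Z]+) ([A-Z]+)/\L\1\E \2/')

printf '%s\n' "${lowered}" "${partial}"
```

This is what lets the article's regexp accept both `Note` and `note` in the source while always emitting a lowercase `type` attribute.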
Wrapping it all up in a script.
#!/usr/bin/env bash

TYPES='[Nn]ote|[Aa]lert|[Ww]arn'
BLOCKQUOTE='> '
MARKER="${BLOCKQUOTE}\*\*(${TYPES}):?\*\*:?"
REST_OF_LINE='[^\n]+\n'
ADMON_OPEN="\{\{% admonition type=\"[\(${TYPES}\)]+\" %\}\}"
ADMON_CLOSE='\{\{% \/admonition %\}\}'
BLOCKQUOTE_ADMON="${MARKER} ?(${REST_OF_LINE})((${REST_OF_LINE})?+)\n"
SHORTCODE_ADMON='{{% admonition type="\L\1\E" %}}\n\2\3{{% \/admonition %}}\n\n'

sed -Ez -f - test.md <<EOF
s/${BLOCKQUOTE_ADMON}/${SHORTCODE_ADMON}/g
/${ADMON_OPEN}.+${ADMON_CLOSE}/s/> //g
EOF
This produces the following output Markdown: